WikisqlA large annotated semantic parsing corpus for developing natural language interfaces.
Stars: ✭ 965 (+614.81%)
CoarijCorpus of Annual Reports in Japan
Stars: ✭ 55 (-59.26%)
ProsodyHelsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+2.96%)
Oie ResourcesA curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (+109.63%)
BondBOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Stars: ✭ 96 (-28.89%)
DoccanoOpen source annotation tool for machine learning practitioners.
Stars: ✭ 5,600 (+4048.15%)
ChazutsuThe tool to make NLP datasets ready to use
Stars: ✭ 238 (+76.3%)
Char Rnn TensorflowMulti-layer Recurrent Neural Networks for character-level language models, implemented in TensorFlow
Stars: ✭ 58 (-57.04%)
MtntCode for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (-64.44%)
Nlp bahasa resourcesA Curated List of Datasets and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+17.04%)
FakenewscorpusA dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+88.89%)
Pytorch NlpBasic Utilities for PyTorch Natural Language Processing (NLP)
Stars: ✭ 1,996 (+1378.52%)
Text2sql DataA collection of datasets that pair questions with SQL queries.
Stars: ✭ 287 (+112.59%)
Pytreebank😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (-31.11%)
Ua GecUA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (-20%)
Dbg PdsDeutsche Boerse's Financial Trading Public Data Set
Stars: ✭ 124 (-8.15%)
Chars2vecCharacter-based word embeddings model based on RNN for handling real world texts
Stars: ✭ 130 (-3.7%)
Ember ImpaginationAn Ember Addon that puts the fun back in asynchronous, paginated datasets
Stars: ✭ 123 (-8.89%)
Fnc 1 BaselineA baseline implementation for FNC-1
Stars: ✭ 123 (-8.89%)
Awesome Italian Public DatasetsA selection of interesting open datasets from the Italian Public Administration and civic data use cases
Stars: ✭ 132 (-2.22%)
ClicrMachine reading comprehension on clinical case reports
Stars: ✭ 123 (-8.89%)
KeitaMy personal toolkit for PyTorch development.
Stars: ✭ 124 (-8.15%)
Konoha🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with small code changes.
Stars: ✭ 130 (-3.7%)
UdaUnsupervised Data Augmentation (UDA)
Stars: ✭ 1,877 (+1290.37%)
Hpatches BenchmarkPython & Matlab code for local feature descriptor evaluation with the HPatches dataset.
Stars: ✭ 129 (-4.44%)
Onepiece KgA knowledge graph project for ONEPIECE (《海贼王》)
Stars: ✭ 123 (-8.89%)
Zamia AiFree and open source A.I. system based on Python, TensorFlow and Prolog.
Stars: ✭ 133 (-1.48%)
Spacy Js🎀 JavaScript API for spaCy with Python REST API
Stars: ✭ 123 (-8.89%)
Learnpaddle2PaddlePaddle Fluid version tutorial series; CSDN blog column:
Stars: ✭ 129 (-4.44%)
HakeHAKE: Human Activity Knowledge Engine (CVPR'18/19/20, NeurIPS'20)
Stars: ✭ 132 (-2.22%)
Files2rougeCalculating ROUGE score between two files (line-by-line)
Stars: ✭ 120 (-11.11%)
ContactposeLarge dataset of hand-object contact, hand- and object-pose, and 2.9 M RGB-D grasp images.
Stars: ✭ 129 (-4.44%)
Nlp Pretrained ModelA collection of Natural language processing pre-trained models.
Stars: ✭ 122 (-9.63%)
Gis Dataset BrasilGeographic Information Systems (GIS) Dataset Brasil - Collection of ready-to-use shapefiles, GeoJSON, and TopoJSON files
Stars: ✭ 121 (-10.37%)
MedquadMedical Question Answering Dataset of 47,457 QA pairs created from 12 NIH websites
Stars: ✭ 129 (-4.44%)
TokenizerFast and customizable text tokenization library with BPE and SentencePiece support
Stars: ✭ 132 (-2.22%)
DialoglueDialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue
Stars: ✭ 120 (-11.11%)
OpenvehiclevisionAn open-source library for vehicle vision applications (written in MATLAB): lane marking detection, road segmentation
Stars: ✭ 120 (-11.11%)
Deep LyricsLyrics Generator aka Character-level Language Modeling with Multi-layer LSTM Recurrent Neural Network
Stars: ✭ 127 (-5.93%)
Githubrankingsspain⬆️ Rankings with the most active GitHub users in Spain (sorted by public contributions) 🇪🇸
Stars: ✭ 127 (-5.93%)
ScattertextBeautiful visualizations of how language differs among document types.
Stars: ✭ 1,722 (+1175.56%)
DiscobertCode for paper "Discourse-Aware Neural Extractive Text Summarization" (ACL20)
Stars: ✭ 120 (-11.11%)
PymetamapPython wrapper for MetaMap
Stars: ✭ 119 (-11.85%)