Prosody: Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+28.7%)
Coarij: Corpus of Annual Reports in Japan
Stars: ✭ 55 (-49.07%)
Nlp bahasa resources: A Curated List of Datasets and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+46.3%)
Fakenewscorpus: A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+136.11%)
Mams For Absa: A Multi-Aspect Multi-Sentiment Dataset for aspect-based sentiment analysis.
Stars: ✭ 135 (+25%)
Pytreebank: 😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (-13.89%)
Nlvr: Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
Stars: ✭ 192 (+77.78%)
Dialog corpus: Datasets for training Chinese and English chatbot (dialogue) systems
Stars: ✭ 1,662 (+1438.89%)
Clue: Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+2145.37%)
Chinese Names Corpus: Chinese personal names corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.
Stars: ✭ 3,053 (+2726.85%)
Nlp chinese corpus: Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+6062.96%)
Efaqa Corpus Zh: ❤️ Emotional First Aid Dataset, a psychological counseling Q&A and chatbot corpus
Stars: ✭ 170 (+57.41%)
Bond: BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Stars: ✭ 96 (-11.11%)
Oie Resources: A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (+162.04%)
Doccano: Open source annotation tool for machine learning practitioners.
Stars: ✭ 5,600 (+5085.19%)
Text2sql Data: A collection of datasets that pair questions with SQL queries.
Stars: ✭ 287 (+165.74%)
Typing Assistant: Typing Assistant provides the ability to autocomplete words and suggests predictions for the next word. This makes typing faster, more intelligent and reduces effort.
Stars: ✭ 32 (-70.37%)
Wikisql: A large annotated semantic parsing corpus for developing natural language interfaces.
Stars: ✭ 965 (+793.52%)
Mtnt: Code for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (-55.56%)
Ja.text8: Japanese text8 corpus for word embedding.
Stars: ✭ 79 (-26.85%)
Chazutsu: The tool to make NLP datasets ready to use
Stars: ✭ 238 (+120.37%)
Awesome Persian Nlp Ir: Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+325.93%)
Pytorch Nlp: Basic Utilities for PyTorch Natural Language Processing (NLP)
Stars: ✭ 1,996 (+1748.15%)
Quanteda: An R package for the Quantitative Analysis of Textual Data
Stars: ✭ 647 (+499.07%)
Char Rnn Tensorflow: Multi-layer Recurrent Neural Networks for character-level language models, implemented in TensorFlow
Stars: ✭ 58 (-46.3%)
Dataset List: Lists of text corpora and more (mainly Japanese)
Stars: ✭ 84 (-22.22%)
Papers: Some computer vision papers I have read, covering text generation from images, weakly supervised segmentation, and more
Stars: ✭ 99 (-8.33%)
Pytorchnlpbook: Code and data accompanying Natural Language Processing with PyTorch published by O'Reilly Media https://nlproc.info
Stars: ✭ 1,390 (+1187.04%)
Chinese nlu by using rasa nlu: Use RASA NLU to build a Chinese Natural Language Understanding System (NLU)
Stars: ✭ 99 (-8.33%)
Neuronblocks: NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Stars: ✭ 1,356 (+1155.56%)
Ios ml: List of Machine Learning, AI, NLP solutions for iOS. The most recent version of this article can be found on my blog.
Stars: ✭ 1,409 (+1204.63%)
Metaknowledge: A Python library for doing bibliometric and network analysis in science and health policy research
Stars: ✭ 102 (-5.56%)
Objectron: Objectron is a dataset of short, object-centric video clips. In addition, the videos also contain AR session metadata including camera poses, sparse point-clouds and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box. The 3D bounding box describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.
Stars: ✭ 1,352 (+1151.85%)
Dataset: Crop/Weed Field Image Dataset
Stars: ✭ 98 (-9.26%)
Repo 2016: R, Python and Mathematica Codes in Machine Learning, Deep Learning, Artificial Intelligence, NLP and Geolocation
Stars: ✭ 103 (-4.63%)
Open Semantic Entity Search Api: Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of entities like persons, organizations and places for (semi)automatic semantic tagging & analysis of documents by linked data knowledge graph like SKOS thesaurus, RDF ontology, database(s) or list(s) of names
Stars: ✭ 98 (-9.26%)
Cubicasa5k: CubiCasa5k floor plan dataset
Stars: ✭ 98 (-9.26%)
Linguistic Style Transfer: Neural network parametrized objective to disentangle and transfer style and content in text
Stars: ✭ 106 (-1.85%)
Chatgirl: ChatGirl is an AI chatbot based on a TensorFlow Seq2Seq model. (Includes a preprocessed English Twitter dataset plus training, inference, and utility code. Stars are welcome.) QQ group: 167122861
Stars: ✭ 105 (-2.78%)
Pynlp: A pythonic wrapper for Stanford CoreNLP.
Stars: ✭ 103 (-4.63%)
Exposure correction: Reference code for the paper "Learning Multi-Scale Photo Exposure Correction", CVPR 2021.
Stars: ✭ 98 (-9.26%)
The Nlp Pandect: A comprehensive reference for all topics related to Natural Language Processing
Stars: ✭ 1,349 (+1149.07%)
Wb srgb: White balance camera-rendered sRGB images (CVPR 2019) [Matlab & Python]
Stars: ✭ 101 (-6.48%)
Deepweeds: A Multiclass Weed Species Image Dataset for Deep Learning
Stars: ✭ 96 (-11.11%)
Texting: [ACL 2020] TensorFlow implementation for "Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks"
Stars: ✭ 103 (-4.63%)
Dataloaders: PyTorch and TensorFlow data loaders for several audio datasets
Stars: ✭ 97 (-10.19%)
Codesearchnet: Datasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (+1175.93%)