116 open source projects that are alternatives to or similar to DANeS

egret-wenda-corpus
A Public Corpus for Machine Learning
Stars: ✭ 41 (-35.94%)
Mutual labels:  corpus, corpus-data
Sejong Corpus
Korean sejong corpus download and simple analysis
Stars: ✭ 116 (+81.25%)
Mutual labels:  corpus
Typing Assistant
Typing Assistant provides the ability to autocomplete words and suggests predictions for the next word. This makes typing faster, more intelligent and reduces effort.
Stars: ✭ 32 (-50%)
Mutual labels:  corpus
Cluepretrainedmodels
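A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models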
Stars: ✭ 493 (+670.31%)
Mutual labels:  corpus
Blacklab
A corpus retrieval engine based on Apache Lucene
Stars: ✭ 69 (+7.81%)
Mutual labels:  corpus
Khcoder
KH Coder: for Quantitative Content Analysis or Text Mining
Stars: ✭ 126 (+96.88%)
Mutual labels:  corpus
Insuranceqa Corpus Zh
🚁 Insurance industry corpus, chatbot
Stars: ✭ 821 (+1182.81%)
Mutual labels:  corpus
Nlp bahasa resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+146.88%)
Mutual labels:  corpus
Pansori
Tools for ASR Corpus Generation from Online Video
Stars: ✭ 106 (+65.63%)
Mutual labels:  corpus
Bookcorpus
Crawl BookCorpus
Stars: ✭ 443 (+592.19%)
Mutual labels:  corpus
Fuzzdata
Fuzzing resources for feeding various fuzzers with input. 🔧
Stars: ✭ 376 (+487.5%)
Mutual labels:  corpus
Ja.text8
Japanese text8 corpus for word embedding.
Stars: ✭ 79 (+23.44%)
Mutual labels:  corpus
Code Docstring Corpus
Preprocessed Python functions and docstrings for automated code documentation (code2doc) and automated code generation (doc2code) tasks.
Stars: ✭ 137 (+114.06%)
Mutual labels:  corpus
Mitie chinese wikipedia corpus
Pre-trained Wikipedia corpus by MITIE
Stars: ✭ 43 (-32.81%)
Mutual labels:  corpus
Nlvr
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
Stars: ✭ 192 (+200%)
Mutual labels:  corpus
Company Names Corpus
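Company name corpus. Organization name corpus. Company short names, abbreviations, brand terms, and enterprise names. Can be used for Chinese word segmentation and organization named-entity recognition.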
Stars: ✭ 868 (+1256.25%)
Mutual labels:  corpus
Dialog corpus
Datasets for training Chinese and English dialogue (chatbot) systems
Stars: ✭ 1,662 (+2496.88%)
Mutual labels:  corpus
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+10300%)
Mutual labels:  corpus
Dialogue-Corpus
No description or website provided.
Stars: ✭ 27 (-57.81%)
Mutual labels:  corpus
Small Chinese Corpus
Some useful small Chinese corpus datasets
Stars: ✭ 462 (+621.88%)
Mutual labels:  corpus
Datasets
Poetry-related datasets developed by THUAIPoet (Jiuge) group.
Stars: ✭ 111 (+73.44%)
Mutual labels:  corpus
Corpora
A collection of small corpuses of interesting data for the creation of bots and similar stuff.
Stars: ✭ 4,293 (+6607.81%)
Mutual labels:  corpus
Wp2txt
WP2TXT extracts plain text data from a Wikipedia dump file (encoded in XML and compressed with Bzip2), stripping all the MediaWiki markup and other metadata.
Stars: ✭ 145 (+126.56%)
Mutual labels:  corpus
Lexicon Thai
Thai language lexicon
Stars: ✭ 96 (+50%)
Mutual labels:  corpus
Korpora
Korean corpus repository
Stars: ✭ 270 (+321.88%)
Mutual labels:  corpus
Medical-Names-Corpus
Medical corpus. Medical institution name corpus. Standard drug codes.
Stars: ✭ 26 (-59.37%)
Mutual labels:  corpus
Dataset List
lists of text corpus and more (mainly Japanese)
Stars: ✭ 84 (+31.25%)
Mutual labels:  corpus
Gossiping Chinese Corpus
Chinese question-answer corpus from the PTT Gossiping board
Stars: ✭ 137 (+114.06%)
Mutual labels:  corpus
Russian news corpus
Corpus of lemmatized (morphologically normalized) texts from Russian mass media
Stars: ✭ 76 (+18.75%)
Mutual labels:  corpus
Weibo terminater
Final Weibo crawler: scrape anything from Weibo, including comments, post contents, and followers. The Terminator.
Stars: ✭ 2,295 (+3485.94%)
Mutual labels:  corpus
Coarij
Corpus of Annual Reports in Japan
Stars: ✭ 55 (-14.06%)
Mutual labels:  corpus
Awesome Chatbot
Awesome Chatbot Projects, Corpus, Papers, Tutorials. Chinese Chatbot =>:
Stars: ✭ 1,785 (+2689.06%)
Mutual labels:  corpus
Chatterbot Corpus
A multilingual dialog corpus
Stars: ✭ 964 (+1406.25%)
Mutual labels:  corpus
megs
A merged version of multiple open-source German speech datasets.
Stars: ✭ 21 (-67.19%)
Mutual labels:  corpus
Lyrics Corpora
An unofficial Python API that allows users to create a corpus of lyrical text from their favorite artists and billboard charts
Stars: ✭ 13 (-79.69%)
Mutual labels:  corpus
Cluedatasetsearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
Stars: ✭ 2,112 (+3200%)
Mutual labels:  corpus
Naive Bayes Classifier
The Naive Bayes classifier is a classification algorithm. This project uses the Bernoulli and Multinomial Naive Bayes models to classify text documents as ham or spam.
Stars: ✭ 6 (-90.62%)
Mutual labels:  corpus
Efaqa Corpus Zh
❤️ Emotional First Aid Dataset: psychological counseling Q&A and chatbot corpus
Stars: ✭ 170 (+165.63%)
Mutual labels:  corpus
Seq2seq Chatbot
Chatbot in 200 lines of code using TensorLayer
Stars: ✭ 777 (+1114.06%)
Mutual labels:  corpus
Awesome Hungarian Nlp
A curated list of NLP resources for Hungarian
Stars: ✭ 121 (+89.06%)
Mutual labels:  corpus
Quanteda
An R package for the Quantitative Analysis of Textual Data
Stars: ✭ 647 (+910.94%)
Mutual labels:  corpus
rclc
Rich Context leaderboard competition, including the corpus and current SOTA for required tasks.
Stars: ✭ 20 (-68.75%)
Mutual labels:  corpus
Weixin public corpus
WeChat official account corpus
Stars: ✭ 465 (+626.56%)
Mutual labels:  corpus
Colibri Core
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Stars: ✭ 112 (+75%)
Mutual labels:  corpus
Awesome Persian Nlp Ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+618.75%)
Mutual labels:  corpus
Indonesian Nlp Resources
Data resources for NLP in Bahasa Indonesia
Stars: ✭ 143 (+123.44%)
Mutual labels:  corpus
Chinese Nlp Corpus
Collections of Chinese NLP corpus
Stars: ✭ 438 (+584.38%)
Mutual labels:  corpus
Ua Gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (+68.75%)
Mutual labels:  corpus
Wordless
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Stars: ✭ 378 (+490.63%)
Mutual labels:  corpus
Chinese Names Corpus
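Chinese personal name corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Can be used for Chinese word segmentation and person named-entity recognition.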
Stars: ✭ 3,053 (+4670.31%)
Mutual labels:  corpus
Cluecorpus2020
Large-scale Pre-training Corpus for Chinese (100G)
Stars: ✭ 278 (+334.38%)
Mutual labels:  corpus
Pubmed Rct
PubMed 200k RCT dataset: a large dataset for sequential sentence classification.
Stars: ✭ 101 (+57.81%)
Mutual labels:  corpus
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+298.44%)
Mutual labels:  corpus
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+3689.06%)
Mutual labels:  corpus
Indian ParallelCorpus
Curated list of publicly available parallel corpus for Indian Languages
Stars: ✭ 23 (-64.06%)
Mutual labels:  corpus
Chi Corpus
Mr. Chi's corpus
Stars: ✭ 96 (+50%)
Mutual labels:  corpus
Probabilistic-RNN-DA-Classifier
Probabilistic Dialogue Act Classification for the Switchboard Corpus using an LSTM model
Stars: ✭ 22 (-65.62%)
Mutual labels:  corpus
german-nouns
A list of ~100,000 German nouns and their grammatical properties, compiled from WiktionaryDE as a CSV file, plus a module to look up the data and parse compound words.
Stars: ✭ 101 (+57.81%)
Mutual labels:  corpus
Awesome Deeplearning Resources
Deep learning and deep reinforcement learning research papers and some code
Stars: ✭ 2,483 (+3779.69%)
Mutual labels:  corpus
Prosody
Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+117.19%)
Mutual labels:  corpus