1516 open source projects that are alternatives to or similar to Coarij

Awesome Hungarian Nlp
A curated list of NLP resources for Hungarian
Stars: ✭ 121 (+120%)
Prosody
Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Stars: ✭ 139 (+152.73%)
Ua Gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Stars: ✭ 108 (+96.36%)
Nlp bahasa resources
A curated list of datasets and usable library resources for NLP in Bahasa Indonesia
Stars: ✭ 158 (+187.27%)
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+363.64%)
Insuranceqa Corpus Zh
🚁 Insurance industry corpus, chatbot
Stars: ✭ 821 (+1392.73%)
Indonesian Nlp Resources
Data resources for NLP in Bahasa Indonesia
Stars: ✭ 143 (+160%)
Mutual labels:  dataset, corpus
Text2sql Data
A collection of datasets that pair questions with SQL queries.
Stars: ✭ 287 (+421.82%)
Hate Speech And Offensive Language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
Stars: ✭ 543 (+887.27%)
Char Rnn Tensorflow
Multi-layer Recurrent Neural Networks for character-level language models, implemented in TensorFlow
Stars: ✭ 58 (+5.45%)
Dataset List
Lists of text corpora and more (mainly Japanese)
Stars: ✭ 84 (+52.73%)
Mutual labels:  dataset, corpus
Dialog corpus
Datasets for training Chinese and English chatbot systems
Stars: ✭ 1,662 (+2921.82%)
Mutual labels:  dataset, corpus
Mams For Absa
A Multi-Aspect Multi-Sentiment Dataset for aspect-based sentiment analysis.
Stars: ✭ 135 (+145.45%)
Cluepretrainedmodels
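A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models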
Stars: ✭ 493 (+796.36%)
Mutual labels:  dataset, corpus
Pytreebank
😡😇 Stanford Sentiment Treebank loader in Python
Stars: ✭ 93 (+69.09%)
Species-Names-Corpus
Species names corpus. Plant names, animal names.
Stars: ✭ 23 (-58.18%)
Mutual labels:  corpus, dataset
Korean Hate Speech
Korean HateSpeech Dataset
Stars: ✭ 192 (+249.09%)
Ja.text8
Japanese text8 corpus for word embedding.
Stars: ✭ 79 (+43.64%)
Efaqa Corpus Zh
❤️ Emotional First Aid Dataset: psychological counseling Q&A and chatbot corpus
Stars: ✭ 170 (+209.09%)
Bond
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Stars: ✭ 96 (+74.55%)
Nlvr
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
Stars: ✭ 192 (+249.09%)
Pytorch Nlp
Basic Utilities for PyTorch Natural Language Processing (NLP)
Stars: ✭ 1,996 (+3529.09%)
Pandas Datareader
Extract data from a wide range of Internet sources into a pandas DataFrame.
Stars: ✭ 2,183 (+3869.09%)
Mutual labels:  dataset, finance
Chazutsu
The tool to make NLP datasets ready to use
Stars: ✭ 238 (+332.73%)
Medical-Names-Corpus
Medical corpus. Corpus of medical institution names. Drug standard codes.
Stars: ✭ 26 (-52.73%)
Mutual labels:  corpus, dataset
Chinese Names Corpus
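Chinese personal names corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, English names. Useful for Chinese word segmentation and person-name entity recognition.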
Stars: ✭ 3,053 (+5450.91%)
Mutual labels:  dataset, corpus
Oie Resources
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (+414.55%)
Awesome Persian Nlp Ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+736.36%)
Mtnt
Code for the collection and analysis of the MTNT dataset
Stars: ✭ 48 (-12.73%)
Quanteda
An R package for the Quantitative Analysis of Textual Data
Stars: ✭ 647 (+1076.36%)
Clue
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Stars: ✭ 2,425 (+4309.09%)
Mutual labels:  dataset, corpus
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+12001.82%)
Mutual labels:  dataset, corpus
Company Names Corpus
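Company names corpus. Organization names corpus. Company short names, abbreviations, brand terms, enterprise names. Useful for Chinese word segmentation and organization-name entity recognition.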
Stars: ✭ 868 (+1478.18%)
Mutual labels:  dataset, corpus
Wikisql
A large annotated semantic parsing corpus for developing natural language interfaces.
Stars: ✭ 965 (+1654.55%)
Awesome Financial Nlp
Research on natural language processing for the financial domain
Stars: ✭ 220 (+300%)
Doccano
Open source annotation tool for machine learning practitioners.
Stars: ✭ 5,600 (+10081.82%)
Gossiping Chinese Corpus
Chinese question-and-answer corpus from the PTT Gossiping board
Stars: ✭ 137 (+149.09%)
Mutual labels:  dataset, corpus
Weixin public corpus
WeChat official accounts corpus
Stars: ✭ 465 (+745.45%)
Typing Assistant
Typing Assistant autocompletes words and suggests predictions for the next word. This makes typing faster and more intelligent, and reduces effort.
Stars: ✭ 32 (-41.82%)
Market Reporter
Automatic Generation of Brief Summaries of Time-Series Data
Stars: ✭ 54 (-1.82%)
Codar
✅ CODAR is a framework built using PyTorch to analyze posts (text and media) and predict cyberbullying and offensive content. 💬📷
Stars: ✭ 52 (-5.45%)
Mutual labels:  dataset
Cdqa Annotator
⛔ [NOT MAINTAINED] A web-based annotator for closed-domain question answering datasets with SQuAD format.
Stars: ✭ 48 (-12.73%)
Multidigitmnist
Combine multiple MNIST digits to create datasets with 100/1000 classes for few-shot learning/meta-learning
Stars: ✭ 48 (-12.73%)
Mutual labels:  dataset
Nltk Book Resource
Notes and solutions to complement the official NLTK book
Stars: ✭ 54 (-1.82%)
Covid 19
Novel Coronavirus 2019 time series data on cases
Stars: ✭ 1,060 (+1827.27%)
Mutual labels:  dataset
Spacy Lookups Data
📂 Additional lookup tables and data resources for spaCy
Stars: ✭ 48 (-12.73%)
Paymint
The Paymint Wallet is a secure and user friendly Bitcoin wallet
Stars: ✭ 48 (-12.73%)
Mutual labels:  finance
Iob2corpus
Japanese IOB2 tagged corpus for Named Entity Recognition.
Stars: ✭ 51 (-7.27%)
Stocksight
Stock market analyzer and predictor using Elasticsearch, Twitter, News headlines and Python natural language processing and sentiment analysis
Stars: ✭ 1,037 (+1785.45%)
Greynir
The greynir.is natural language processing website for Icelandic
Stars: ✭ 47 (-14.55%)
Scdv
Text classification with Sparse Composite Document Vectors.
Stars: ✭ 54 (-1.82%)
Thot
Thot toolkit for statistical machine translation
Stars: ✭ 53 (-3.64%)
Images Web Crawler
This package is a complete tool for creating a large dataset of images (designed especially, but not only, for machine learning enthusiasts). It can crawl the web, download images, rename, resize, and convert the images, and merge folders.
Stars: ✭ 51 (-7.27%)
Mutual labels:  dataset
Pujangga
Pujangga - Indonesian Natural Language Processing Tool with REST API, an Interface for InaNLP and Deeplearning4j's Word2Vec
Stars: ✭ 47 (-14.55%)
Exemplar
An open relation extraction system
Stars: ✭ 46 (-16.36%)
Msgarch
MSGARCH R Package
Stars: ✭ 51 (-7.27%)
Mutual labels:  finance
Py Nltools
A collection of basic Python modules for spoken natural language processing
Stars: ✭ 46 (-16.36%)
Nagisa Tutorial Pycon2019
Code for the PyCon JP 2019 talk "Japanese Natural Language Processing with Python: Real-World Text Analysis via Sequence Labeling"
Stars: ✭ 46 (-16.36%)
Finance.js
A JavaScript library for common financial calculations
Stars: ✭ 1,070 (+1845.45%)
Mutual labels:  finance
Lingua Franca
Mycroft's multilingual text parsing and formatting library
Stars: ✭ 51 (-7.27%)