274 open source projects that are alternatives or similar to Mitie_chinese_wikipedia_corpus

Russian news corpus
Corpus of stemmed/lemmatized (morphologically normalized) texts from Russian mass media
Stars: ✭ 76 (+76.74%)
Mutual labels:  corpus, nlp-machine-learning
Babyai
BabyAI platform. A testbed for training agents to understand and execute language commands.
Stars: ✭ 490 (+1039.53%)
Mutual labels:  nlp-machine-learning
Ner
Named Entity Recognition
Stars: ✭ 288 (+569.77%)
Mutual labels:  nlp-machine-learning
Indian ParallelCorpus
Curated list of publicly available parallel corpora for Indian languages
Stars: ✭ 23 (-46.51%)
Mutual labels:  corpus
Nlp Conference Compendium
Compendium of the resources available from top NLP conferences.
Stars: ✭ 349 (+711.63%)
Mutual labels:  nlp-machine-learning
Tapas
End-to-end neural table-text understanding models.
Stars: ✭ 583 (+1255.81%)
Mutual labels:  nlp-machine-learning
Korpora
Korean corpus repository
Stars: ✭ 270 (+527.91%)
Mutual labels:  corpus
Click2analyze Androiddevchallenge
An app that analyzes text and fixes anomalies in messages that deviate from what is standard, normal, or expected. #AndroidDevChallenge
Stars: ✭ 20 (-53.49%)
Mutual labels:  nlp-machine-learning
Awesome Sentiment Analysis
Repository with everything necessary for sentiment analysis and related areas
Stars: ✭ 459 (+967.44%)
Mutual labels:  nlp-machine-learning
open-discourse
Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).
Stars: ✭ 47 (+9.3%)
Mutual labels:  corpus
dialogue-datasets
A collection of open dialogue corpora and some useful data-processing utilities.
Stars: ✭ 24 (-44.19%)
Mutual labels:  corpus
Fuzzdata
Fuzzing resources for feeding various fuzzers with input. 🔧
Stars: ✭ 376 (+774.42%)
Mutual labels:  corpus
Quanteda
An R package for the Quantitative Analysis of Textual Data
Stars: ✭ 647 (+1404.65%)
Mutual labels:  corpus
Contextualized Topic Models
A python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (+639.53%)
Mutual labels:  nlp-machine-learning
Lyrics Corpora
An unofficial Python API that allows users to create a corpus of lyrical text from their favorite artists and billboard charts
Stars: ✭ 13 (-69.77%)
Mutual labels:  corpus
Dstc8 Schema Guided Dialogue
The Schema-Guided Dialogue Dataset
Stars: ✭ 277 (+544.19%)
Mutual labels:  nlp-machine-learning
Nlp base
Basic models for natural language processing
Stars: ✭ 524 (+1118.6%)
Mutual labels:  nlp-machine-learning
Fakenewscorpus
A dataset of millions of news articles scraped from a curated list of data sources.
Stars: ✭ 255 (+493.02%)
Mutual labels:  corpus
Letslearnai.github.io
Lets Learn AI
Stars: ✭ 33 (-23.26%)
Mutual labels:  nlp-machine-learning
wordfish-python
Extract relationships from standardized terms in a corpus of interest with deep learning 🐟
Stars: ✭ 19 (-55.81%)
Mutual labels:  corpus
Small Chinese Corpus
Some useful small Chinese corpus datasets
Stars: ✭ 462 (+974.42%)
Mutual labels:  corpus
ConveRT-pytorch
ConveRT Paper Pytorch Implementation
Stars: ✭ 49 (+13.95%)
Mutual labels:  nlp-machine-learning
Insuranceqa Corpus Zh
🚁 Insurance industry corpus, chatbot
Stars: ✭ 821 (+1809.3%)
Mutual labels:  corpus
Chinese Nlp Corpus
Collection of Chinese NLP corpora
Stars: ✭ 438 (+918.6%)
Mutual labels:  corpus
fairseq-tagging
a Fairseq fork for sequence tagging/labeling tasks
Stars: ✭ 26 (-39.53%)
Mutual labels:  nlp-machine-learning
SpiCE-Corpus
An open-access corpus of conversational bilingual speech in Cantonese and English
Stars: ✭ 33 (-23.26%)
Mutual labels:  corpus
Wordless
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Stars: ✭ 378 (+779.07%)
Mutual labels:  corpus
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+15379.07%)
Mutual labels:  corpus
Text mining resources
Resources for learning about Text Mining and Natural Language Processing
Stars: ✭ 358 (+732.56%)
Mutual labels:  nlp-machine-learning
Sdtm mapper
AI SDTM mapping (R for ML, Python, TensorFlow for DL)
Stars: ✭ 27 (-37.21%)
Mutual labels:  nlp-machine-learning
Lingua
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Stars: ✭ 341 (+693.02%)
Mutual labels:  nlp-machine-learning
Deeppavlov
An open source library for deep learning end-to-end dialog systems and chatbots.
Stars: ✭ 5,525 (+12748.84%)
Mutual labels:  nlp-machine-learning
Dab
Data Augmentation by Backtranslation (DAB) ヽ( •_-)ᕗ
Stars: ✭ 294 (+583.72%)
Mutual labels:  nlp-machine-learning
Talismane
NLP framework: sentence detector, tokeniser, pos-tagger and dependency parser
Stars: ✭ 38 (-11.63%)
Mutual labels:  nlp-machine-learning
Cluecorpus2020
Large-scale pre-training corpus for Chinese, 100G
Stars: ✭ 278 (+546.51%)
Mutual labels:  corpus
Chinese models for spacy
Models for SpaCy that support Chinese
Stars: ✭ 543 (+1162.79%)
Mutual labels:  nlp-machine-learning
Data Science Hacks
Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.
Stars: ✭ 273 (+534.88%)
Mutual labels:  nlp-machine-learning
Company Names Corpus
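Company name corpus. Organization name corpus. Company short names, abbreviations, brand terms, and enterprise names. Can be used for Chinese word segmentation and organization named entity recognition.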
Stars: ✭ 868 (+1918.6%)
Mutual labels:  corpus
Customer satisfaction analysis
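An opinion-mining project based on UGC data from online homestay listings, covering data mining and NLP-related processing, and handling tasks such as data collection, topic extraction, and sentiment analysis. The goal is to overcome inconsistencies between user ratings and reviews and to evaluate satisfaction with online homestays in real time, including online review collection and visual sentiment analysis. It provides a Baidu Maps POI query entry point that supports automated batch queries of POI information; builds an LDA automatic topic clustering model on an online homestay corpus, using topic center words to find the corresponding topic attribute dictionary; uses user ratings as labels and applies the character-level TextCNN bundled with litNlp for sentiment analysis, treating the sentiment classification probability distribution as the sentiment trend; and finally displays homestay satisfaction across regions as a POI heat map. See the link for software versions.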
Stars: ✭ 262 (+509.3%)
Mutual labels:  nlp-machine-learning
Cluepretrainedmodels
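A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models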
Stars: ✭ 493 (+1046.51%)
Mutual labels:  corpus
Medical-Names-Corpus
Medical corpus. Medical institution name corpus. Drug standard codes.
Stars: ✭ 26 (-39.53%)
Mutual labels:  corpus
Tika Python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Stars: ✭ 997 (+2218.6%)
Mutual labels:  nlp-machine-learning
EdgarAllanPoetry
Computer-generated poetry
Stars: ✭ 22 (-48.84%)
Mutual labels:  corpus
Weixin public corpus
WeChat official account corpus
Stars: ✭ 465 (+981.4%)
Mutual labels:  corpus
fastmorph
Fast corpus search engine originally made for the Corpus of Written Tatar language
Stars: ✭ 14 (-67.44%)
Mutual labels:  corpus
Naive Bayes Classifier
The Naive Bayes classifier is a classification algorithm. It uses the Bernoulli and multinomial Naive Bayes equations to classify documents (text) as ham or spam.
Stars: ✭ 6 (-86.05%)
Mutual labels:  corpus
Species-Names-Corpus
Species name corpus. Plant names, animal names.
Stars: ✭ 23 (-46.51%)
Mutual labels:  corpus
Awesome Persian Nlp Ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+969.77%)
Mutual labels:  corpus
DeepSentiPers
Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Stars: ✭ 17 (-60.47%)
Mutual labels:  corpus
Chatterbot Corpus
A multilingual dialog corpus
Stars: ✭ 964 (+2141.86%)
Mutual labels:  corpus
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-48.84%)
Mutual labels:  corpus
Bookcorpus
Crawl BookCorpus
Stars: ✭ 443 (+930.23%)
Mutual labels:  corpus
fuzzing-corpus
My fuzzing corpus
Stars: ✭ 120 (+179.07%)
Mutual labels:  corpus
Rasa Ui
Rasa UI is a frontend for the Rasa Framework
Stars: ✭ 796 (+1751.16%)
Mutual labels:  nlp-machine-learning
knowledge-extraction-recipes-forms
Knowledge Extraction For Forms Accelerators & Examples
Stars: ✭ 144 (+234.88%)
Mutual labels:  nlp-machine-learning
Corpora
A collection of small corpuses of interesting data for the creation of bots and similar stuff.
Stars: ✭ 4,293 (+9883.72%)
Mutual labels:  corpus
Coursera Natural Language Processing Specialization
Programming assignments from all courses in the Coursera Natural Language Processing Specialization offered by deeplearning.ai.
Stars: ✭ 39 (-9.3%)
Mutual labels:  nlp-machine-learning
Typing Assistant
Typing Assistant provides the ability to autocomplete words and suggests predictions for the next word. This makes typing faster, more intelligent and reduces effort.
Stars: ✭ 32 (-25.58%)
Mutual labels:  corpus
Seq2seq Chatbot
Chatbot in 200 lines of code using TensorLayer
Stars: ✭ 777 (+1706.98%)
Mutual labels:  corpus