Neuralcoref✨Fast Coreference Resolution in spaCy with Neural Networks
Thinc🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
Customer ChatbotChinese intelligent customer-service chatbot demo with two parts, chitchat and professional Q&A (FAQ), plus support for custom components
ClafCLaF: Open-Source Clova Language Framework
FlowqaImplementation of conversational QA model: FlowQA (with slight improvement)
PyrougeA Python wrapper for the ROUGE summarization evaluation package
EmbeddingsFast, DB-backed pretrained word embeddings for natural language processing.
Pyss3A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI
Displacy Ent💥 displaCy-ent.js: An open-source named entity visualiser for the modern web
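ArxivnotesSummaries of NLP (natural language processing) papers I have read, written up in Issues. They are rough. 🚧 marks papers whose notes are still being edited (many are effectively abandoned). 🍡 marks papers with only an overview written (a dango, in the sense that it can be read quickly).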
GermanwordembeddingsToolkit to obtain and preprocess German corpora, train models using word2vec (gensim) and evaluate them with generated test sets
Acl PapersPaper summaries from the Association for Computational Linguistics (ACL)
SentimentanalysisSentiment analysis neural network trained by fine-tuning BERT, ALBERT, or DistilBERT on the Stanford Sentiment Treebank.
Vec4irWord Embeddings for Information Retrieval
Nlp learningLearn natural language processing (NLP) together with Python: language models, HMM, PCFG, Word2vec, cloze-style reading comprehension tasks, naive Bayes classifiers, TF-IDF, PCA, SVD
DetoxifyTrained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers.
Opencc4j🇨🇳Open Chinese Convert is an open-source project for conversion between Traditional Chinese and Simplified Chinese (Java).
VirgilioVirgilio is developed and maintained by these awesome people.
ExamplesJina examples and demos to help you get started
NeologdnJapanese text normalizer for mecab-neologd
Datastories Semeval2017 Task4Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Fairseq GecSource code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data
TextheroText preprocessing, representation and visualization from zero to hero.
Dkpro CoreCollection of software components for natural language processing (NLP) based on the Apache UIMA framework.
TriviaqaCode for the TriviaQA reading comprehension dataset
KomoranKorean Morphological Analyzer by shineware
Persian NerA large annotated corpus for Persian named entity recognition
Cargo SpellcheckChecks all your documentation for spelling and grammar mistakes with hunspell and an nlprule-based grammar checker
Kr WordrankA library that automatically extracts words/keywords from Korean text using unsupervised learning
R Net In KerasOpen R-NET implementation and detailed analysis: https://git.io/vd8dx
Nlp profilerA simple NLP library for profiling datasets with one or more text columns. Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.
StopwordsDefault English stopword lists from many different sources
ThuctcAn Efficient Chinese Text Classifier
EudexA blazingly fast phonetic reduction/hashing algorithm.
GsdmmGSDMM: Short text clustering
KcwsDeep Learning Chinese Word Segmentation
KashgariKashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification; it includes Word2Vec, BERT, and GPT2 language embeddings.
TockTock - the open source conversational AI toolkit
Knockknock🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code
Spark NlpState of the Art Natural Language Processing
CadmiumNatural Language Processing (NLP) library for Crystal
Gpt 2 Tensorflow2.0OpenAI GPT2 pre-training and sequence prediction implementation in TensorFlow 2.0