dnn-lstm-word-segment
Chinese Word Segmentation Based on Deep Learning and LSTM Neural Networks
Stars: ✭ 24 (+84.62%)
Symspell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Stars: ✭ 1,976 (+15100%)
Lac
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
Stars: ✭ 2,792 (+21376.92%)
youtokentome-ruby
High performance unsupervised text tokenization for Ruby
Stars: ✭ 17 (+30.77%)
Pytorch-NLU
Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese text, and sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, and word segmentation.
Stars: ✭ 151 (+1061.54%)
lxa5
Linguistica 5: Unsupervised Learning of Linguistic Structure
Stars: ✭ 27 (+107.69%)
ChineseBert
A Chinese BERT model specifically for question answering
Stars: ✭ 24 (+84.62%)
ltp4j
ltp4j: Language Technology Platform For Java
Stars: ✭ 165 (+1169.23%)
Thulac
An Efficient Lexical Analyzer for Chinese
Stars: ✭ 629 (+4738.46%)
foliapy
An extensive Python library for dealing with FoLiA (Format for Linguistic Annotation) documents, a rich XML-based format for linguistic annotation finding application in Natural Language Processing (NLP). This library was formerly part of PyNLPl.
Stars: ✭ 13 (+0%)
Thulac Java
An Efficient Lexical Analyzer for Chinese
Stars: ✭ 285 (+2092.31%)
THUCKE
THU Chinese Keyphrase Extraction Toolkit
Stars: ✭ 116 (+792.31%)
Information Extraction Chinese
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT
Stars: ✭ 1,888 (+14423.08%)
Electra with tensorflow
An implementation of ELECTRA following the paper "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators"
Stars: ✭ 13 (+0%)
Jcseg
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for the latest Lucene, Solr, and Elasticsearch.
Stars: ✭ 754 (+5700%)
ArabicProcessingCog
A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Stars: ✭ 19 (+46.15%)
Weatherbot
A Rasa-based Chinese weather inquiry chatbot with a Web UI
Stars: ✭ 186 (+1330.77%)
sembei
🍘 Word embeddings without word segmentation 🍘
Stars: ✭ 14 (+7.69%)
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (+330.77%)
mystem-scala
Morphological analyzer `mystem` (Russian language) wrapper for JVM languages
Stars: ✭ 21 (+61.54%)
Thulac Python
An Efficient Lexical Analyzer for Chinese
Stars: ✭ 1,619 (+12353.85%)
word2vec-tsne
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (+353.85%)
Chineseaddress ocr
Chinese address OCR for photographed documents, implemented using CTPN+CTC+Address Correction.
Stars: ✭ 309 (+2276.92%)
Chinesenlp
Datasets and SOTA results for every field of Chinese NLP
Stars: ✭ 1,206 (+9176.92%)
G2pc
g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese
Stars: ✭ 155 (+1092.31%)
berserker
Berserker - BERt chineSE woRd toKenizER
Stars: ✭ 17 (+30.77%)
Awesome Chinese Nlp
A curated list of resources for Chinese NLP
Stars: ✭ 6,599 (+50661.54%)
Chinese-Minority-PLM
CINO: Pre-trained Language Models for Chinese Minority Languages
Stars: ✭ 133 (+923.08%)
Segmentit
A Chinese word segmentation package usable in any JS environment, forked from leizongmin/node-segment
Stars: ✭ 139 (+969.23%)
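Fengshenbang-LM
Fengshenbang-LM (封神榜大模型) is an open-source large-model system led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, aiming to serve as infrastructure for Chinese AIGC and cognitive intelligence.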
Stars: ✭ 1,813 (+13846.15%)
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+51100%)
python-arpa
🐍 Python library for n-gram models in ARPA format
Stars: ✭ 35 (+169.23%)
Fancy Nlp
NLP for humans. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.
Stars: ✭ 233 (+1692.31%)
CISTEM
Stemmer for German
Stars: ✭ 33 (+153.85%)
wikipron
Massively multilingual pronunciation mining
Stars: ✭ 167 (+1184.62%)
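Chinese Chatbot
A Chinese chatbot trained on 100,000 dialogue pairs with an attention mechanism; it generates a meaningful reply to most general questions. The trained model is uploaded and can be run directly; if it fails to run, the author promises to eat a keyboard on a livestream.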
Stars: ✭ 124 (+853.85%)
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (+146.15%)
Ddparser
Baidu's open-source dependency parsing system
Stars: ✭ 537 (+4030.77%)
Thuctc
An Efficient Chinese Text Classifier
Stars: ✭ 179 (+1276.92%)
datastories-semeval2017-task6
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (+53.85%)
kaldi helpers
🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (+0%)
citation-function
Measuring the Evolution of a Scientific Field through Citation Frames
Stars: ✭ 40 (+207.69%)
Chinese nlu by using rasa nlu
Use RASA NLU to build a Chinese Natural Language Understanding System (NLU)
Stars: ✭ 99 (+661.54%)
datalinguist
Stanford CoreNLP in idiomatic Clojure.
Stars: ✭ 93 (+615.38%)
frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Stars: ✭ 70 (+438.46%)
Zhparser
zhparser is a PostgreSQL extension for full-text search of Chinese text
Stars: ✭ 418 (+3115.38%)