
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, plus keyword extraction, key-sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for the latest Lucene, Solr, and Elasticsearch.

Stars: ✭ 754 (+266.02%)

Mutual labels: chinese-nlp, chinese-word-segmentation

Chinese Word Vectors

100+ pre-trained Chinese word vectors

Stars: ✭ 9,548 (+4534.95%)

Mutual labels: chinese, chinese-word-segmentation

Cluedatasetsearch

Search all Chinese NLP datasets, with commonly used English NLP datasets included

Stars: ✭ 2,112 (+925.24%)

Mutual labels: chinese, sentiment-analysis

G2pc

g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese

Stars: ✭ 155 (-24.76%)

Mutual labels: chinese-nlp, chinese-word-segmentation

Datastories Semeval2017 Task4

Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".

Stars: ✭ 184 (-10.68%)

Mutual labels: sentiment-analysis

Pyhanlp

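Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin conversion, Simplified/Traditional Chinese conversion, natural language processing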

Stars: ✭ 2,564 (+1144.66%)

Mutual labels: chinese-word-segmentation

Jszhuyin

JS Zhuyin: a "smart" Chinese Zhuyin input method in JavaScript with automatic character selection.

Stars: ✭ 184 (-10.68%)

Mutual labels: chinese

Crypto trader

Q-Learning Based Cryptocurrency Trader and Portfolio Optimizer for the Poloniex Exchange

Stars: ✭ 184 (-10.68%)

Mutual labels: sentiment-analysis


nlp4han

A Chinese natural language processing toolkit. See the Wiki for more information.

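Features

Sentence segmentation
- Rule-based Chinese sentence segmenter
Word segmentation
- Character-based maximum entropy Chinese word segmenter
- Combined Chinese word segmenter and POS tagger
POS tagging
- Baseline Chinese POS tagger
- Single-step word-based maximum entropy Chinese POS tagger
- Single-step character-based maximum entropy Chinese POS tagger
- Combined Chinese word segmenter and POS tagger
- HMM-based Chinese POS tagger
N-gram language model
HMM model
Named entity recognition
- Character-based named entity recognition
- Named entity recognition based on word segmentation
- Named entity recognition based on word segmentation and POS tagging
Chunking / shallow parsing
- Word-based maximum entropy Chinese base chunk tagging
- Maximum entropy Chinese base chunk tagging based on words and POS tags
- Combined Chinese POS tagging and base chunk tagging
- SVM-based Chinese chunk tagging
Dependency parsing
- Dependency parsing based on maximum spanning tree (MST) and maximum entropy
- Transition-based dependency parsing
Phrase-structure (constituency) parsing
- Maximum entropy phrase-structure (constituency) parsing
- CKY-based PCFG phrase-structure (constituency) parsing
- Head-driven phrase-structure parsing
- Unlexicalized phrase-structure parsing based on latent annotations
Semantic role labeling
- Maximum entropy semantic role labeling
Coreference resolution
- Coreference resolution based on the Hobbs algorithm
Sentiment analysis
- Naive Bayes document sentiment analysis
- Sentence sentiment analysis based on rules and phrase-structure trees
GUI tools
- Phrase-structure tree editor built on nlp4han functionality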
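
Character-based word segmenters such as the one listed above are commonly framed as per-character tagging: a classifier labels each character, and words are then read off the label sequence. The sketch below shows only that decoding step. It assumes a B/M/E/S tag scheme (begin, middle, end, single), which is a common convention and not something this README specifies for nlp4han; the class `BmesDecoder` and its method are hypothetical names, not part of the nlp4han API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of nlp4han: turns per-character B/M/E/S tags into words.
final class BmesDecoder {

    // Assumes one tag per Java char, i.e. text made of Basic Multilingual Plane characters.
    static List<String> decode(String text, char[] tags) {
        if (text.length() != tags.length) {
            throw new IllegalArgumentException("one tag per character is required");
        }
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < tags.length; i++) {
            current.append(text.charAt(i));
            // E closes a multi-character word; S marks a single-character word.
            if (tags[i] == 'E' || tags[i] == 'S') {
                words.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush a trailing word left open by an incomplete tag sequence.
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // The tags are hand-written here; a trained classifier would normally predict them.
        System.out.println(decode("自然语言处理", "BEBEBE".toCharArray()));
    }
}
```

In a real segmenter the tag sequence would come from the trained model (maximum entropy, in the case of the feature named above), and the decoder would stay the same regardless of which classifier produced the tags.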

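Changelog

2018.12.16, Coreference resolution based on the Hobbs algorithm
2018.12, Unlexicalized phrase-structure parsing based on latent annotations
2018.11, Integrated the phrase-structure tree editor into nlp4han-tools, using nlp4han's Chinese word segmentation, POS tagging, and parsing functions.
2018.10, SVM-based Chinese chunk tagging
2018.9, Head-driven phrase-structure parsing
2018.7, CKY-based PCFG phrase-structure (constituency) parsing
2018.6, Transition-based dependency parsing
2018.5, Naive Bayes document sentiment analysis, Sentence sentiment analysis based on rules and phrase-structure trees
2018.3, Maximum entropy semantic role labeling
2018.2, Maximum entropy phrase-structure (constituency) parsing, HMM model
2018.1, Dependency parsing based on maximum spanning tree (MST) and maximum entropy
2017.12, Combined Chinese POS tagging and base chunk tagging, N-gram language model
2017.11, Maximum entropy Chinese base chunk tagging based on words and POS tags, Word-based maximum entropy Chinese base chunk tagging
2017.10, Named entity recognition based on word segmentation and POS tagging
2017.9, Named entity recognition based on word segmentation, Character-based named entity recognition
2017.8, HMM-based Chinese POS tagger
2017.7, Combined Chinese word segmenter and POS tagger
2017.6, Single-step character-based maximum entropy Chinese POS tagger
2017.5, Single-step word-based maximum entropy Chinese POS tagger
2017.4, Baseline Chinese POS tagger
2017.3, Combined Chinese word segmenter and POS tagger
2017.2, Character-based maximum entropy Chinese word segmenter
2016.12, Rule-based Chinese sentence segmenter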


Stars: ✭ 206
