htw2012 / chinese-nlp-ner

Licence: other

A BLSTM-CRF solution for Chinese named entity recognition

Labels

nlp chinese chinese-nlp ner chinese-ner

Projects that are alternatives of or similar to chinese-nlp-ner

Nlp chinese corpus

Large Scale Chinese Corpus for NLP

Stars: ✭ 6,656 (+47442.86%)

Mutual labels: chinese, chinese-nlp

📙 Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and characters.

Stars: ✭ 8,705 (+62078.57%)

Mutual labels: chinese, chinese-nlp

Bert Chinese Ner

Chinese NER using the pre-trained language model BERT

Stars: ✭ 758 (+5314.29%)

Mutual labels: chinese, ner

A preprocessing toolkit for Chinese NLP tasks: accurate, efficient, and with zero barrier to use

Stars: ✭ 449 (+3107.14%)

Mutual labels: chinese, ner

A Chinese word segmentation package usable in any JS environment, forked from leizongmin/node-segment

Stars: ✭ 139 (+892.86%)

Mutual labels: chinese, chinese-nlp

Bert Ner Pytorch

Chinese NER(Named Entity Recognition) using BERT(Softmax, CRF, Span)

Stars: ✭ 654 (+4571.43%)

Mutual labels: chinese, ner

Cnn Question Classification Keras

Chinese Question Classifier (Keras Implementation) on BQuLD

Stars: ✭ 28 (+100%)

Mutual labels: chinese, chinese-nlp

Chinese named entity recognition, entity extraction, tensorflow, pytorch, BiLSTM+CRF

Stars: ✭ 938 (+6600%)

Mutual labels: chinese, ner

Cluedatasetsearch

Search all Chinese NLP datasets, with commonly used English NLP datasets included

Stars: ✭ 2,112 (+14985.71%)

Mutual labels: chinese, ner

Chinese Open Information Extraction (Tree-based Triple Relation Extraction Module)

Stars: ✭ 98 (+600%)

Mutual labels: chinese, chinese-nlp

zhparser is a PostgreSQL extension for full-text search of Chinese language

Stars: ✭ 418 (+2885.71%)

Mutual labels: chinese, chinese-nlp

details

Stars: ✭ 252 (+1700%)

Mutual labels: chinese, ner

Albert Chinese Ner

Chinese NER using the pre-trained language model ALBERT

Stars: ✭ 302 (+2057.14%)

Mutual labels: chinese, ner

CLUENER2020: Fine Grained Named Entity Recognition for Chinese

Stars: ✭ 689 (+4821.43%)

Mutual labels: chinese, ner

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

Stars: ✭ 1,295 (+9150%)

Mutual labels: chinese, ner

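Chinese NLP toolkit [sentence splitting / word segmentation / POS tagging / chunking / syntactic parsing / semantic analysis / NER / N-grams / HMM / pronoun resolution / sentiment analysis / spell checking]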

Stars: ✭ 206 (+1371.43%)

Mutual labels: chinese, chinese-nlp

All about Chinese NER

Stars: ✭ 241 (+1621.43%)

Mutual labels: ner, chinese-ner

bert tokenization for java

This is a Java version of the Chinese tokenization described in BERT.

Stars: ✭ 39 (+178.57%)

Mutual labels: chinese-nlp

Convert asian text to web fonts

Stars: ✭ 14 (+0%)

Mutual labels: chinese

pbrt: consolidated Chinese translation of Physically Based Rendering: From Theory To Implementation

Stars: ✭ 221 (+1478.57%)

Mutual labels: chinese

chinese-nlp-ner

A BLSTM-CRF solution for Chinese named entity recognition. It mainly covers:

Data processing
Model construction
Model training
Model testing
Service deployment, offered in two ways (thrift and flask)

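Notes on some key points of the implementation:

The question of granularity.

Both character-level and word-level input were tried. Overall the difference is small, about 1%. In practical applications where the entity words are relatively long, word-level input works slightly better.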
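To make the two granularities concrete, here is a minimal sketch of how the same sentence could be labelled at character level and at word level. The B/M/E tag names follow the README's own example; the `S-` tag for a single-token entity, the `O` tag for non-entity tokens, the sample sentence, and the hand-written word segmentation are all assumptions for illustration and are not taken from the project.

```python
from typing import List, Tuple


def tag_tokens(tokens: List[str], entity: str, label: str) -> List[Tuple[str, str]]:
    """Tag the first run of consecutive tokens that spells out `entity`.

    Assumed scheme: B-/M-/E- for multi-token entities, S- for a
    single-token entity, O for everything else.
    """
    tags = ["O"] * len(tokens)
    for start in range(len(tokens)):
        text = ""
        for end in range(start, len(tokens)):
            text += tokens[end]
            if text == entity:
                if end == start:
                    tags[start] = f"S-{label}"
                else:
                    tags[start] = f"B-{label}"
                    for i in range(start + 1, end):
                        tags[i] = f"M-{label}"
                    tags[end] = f"E-{label}"
                return list(zip(tokens, tags))
            if not entity.startswith(text):
                break  # this start position cannot spell the entity
    return list(zip(tokens, tags))


sentence = "刘德华来了"  # hypothetical sample sentence

# Character granularity: one token per character.
char_tagged = tag_tokens(list(sentence), "刘德华", "Per")

# Word granularity: tokens from a (hand-written) word segmentation.
word_tagged = tag_tokens(["刘德华", "来", "了"], "刘德华", "Per")

print(char_tagged)
print(word_tagged)
```

At character level the entity spans three tags, while at word level it collapses into one token, so a long entity gives the word-level model fewer tag transitions to get right.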
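Adding external features.

The main goal is to enrich what the Word Embedding layer of the standard blstm+crf can learn. In practice, adding word features and part-of-speech (POS) tag information improves performance by 2%+. Overall, the more external features are added, the better the results tend to be.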
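The README does not say how the extra features are combined with the embeddings. One common option is to concatenate a per-character embedding with small embeddings for each external feature before feeding the result to the BLSTM. The sketch below illustrates that idea only; the lookup tables, their values, the word-boundary feature, and the `nr` POS tag are hypothetical.

```python
from typing import Dict, List

# Toy lookup tables. In a real model these would be learned parameters;
# every value here is a made-up placeholder.
CHAR_EMB: Dict[str, List[float]] = {
    "刘": [0.1, 0.2],
    "德": [0.3, 0.4],
    "华": [0.5, 0.6],
}
# Word feature, assumed here to be the character's position inside its word.
SEG_EMB: Dict[str, List[float]] = {
    "B": [1.0, 0.0],
    "M": [0.0, 1.0],
    "E": [1.0, 1.0],
    "S": [0.0, 0.0],
}
# POS tag feature ("nr" is assumed to mean a person name).
POS_EMB: Dict[str, List[float]] = {"nr": [0.7], "v": [0.2]}
UNKNOWN_CHAR = [0.0, 0.0]


def build_inputs(chars: List[str], seg_tags: List[str], pos_tags: List[str]) -> List[List[float]]:
    """Concatenate char, word-boundary, and POS embeddings per character."""
    if not (len(chars) == len(seg_tags) == len(pos_tags)):
        raise ValueError("all feature sequences must have the same length")
    rows = []
    for ch, seg, pos in zip(chars, seg_tags, pos_tags):
        rows.append(CHAR_EMB.get(ch, UNKNOWN_CHAR) + SEG_EMB[seg] + POS_EMB[pos])
    return rows


features = build_inputs(list("刘德华"), ["B", "M", "E"], ["nr", "nr", "nr"])
for row in features:
    print(row)
```

Each added feature simply widens the per-character input vector, which is why further features can be bolted on without changing the rest of the network.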
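Computing the entity probability.

This mainly serves as a confidence estimate at application time. The calculation is: (the product of the output tag probabilities) raised to the power 1/len. For example:

Take the output entity "刘德华". Its three characters correspond to the output tags B-Per, M-Per, E-Per, so the entity probability is:
P(B-Per,M-Per,E-Per) = P(B-Per) * P(M-Per|B-Per) * P(E-Per|B-Per,M-Per)

To account for the effect of entity length, the result is raised to the power 1/len:
Pr(B-Per,M-Per,E-Per) = pow(P(B-Per,M-Per,E-Per), 1/len(B-Per,M-Per,E-Per))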
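The length-normalised score above is the geometric mean of the per-tag probabilities. A minimal sketch follows, assuming the individual factors (P(B-Per), P(M-Per|B-Per), and so on) have already been obtained from the model; how the project extracts them is not described here. The function name and the sample probabilities are hypothetical.

```python
import math
from typing import Sequence


def entity_confidence(tag_probs: Sequence[float]) -> float:
    """Return (product of tag probabilities) ** (1 / len(tag_probs))."""
    if not tag_probs:
        raise ValueError("an entity needs at least one tag probability")
    if any(p <= 0.0 or p > 1.0 for p in tag_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    # Sum logs instead of multiplying directly, so long entities
    # do not underflow to zero.
    log_sum = sum(math.log(p) for p in tag_probs)
    return math.exp(log_sum / len(tag_probs))


# Made-up factors for the tags B-Per, M-Per, E-Per.
probs = [0.9, 0.8, 0.95]
score = entity_confidence(probs)
print(round(score, 4))
```

Without the 1/len exponent, longer entities would always score lower than short ones simply because more factors below 1 are multiplied together; the root puts entities of different lengths on a comparable scale.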

Stars: ✭ 14
