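Java implementations of the algorithms from the later chapters of Li Hang's "Statistical Learning Methods": the EM algorithm for the boxes-and-balls problem, extended to GMM training; then HMM word segmentation (including HMM parameter training) and CRF word segmentation (using a model trained with CRF++); finally BiLSTM+CRF implemented with TensorFlow, plus a XinAnalyzer wrapper for Lucene.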

Stars: ✭ 21 (+61.54%)

Mutual labels: crf

deepseg

Chinese word segmentation in tensorflow 2.x

Stars: ✭ 23 (+76.92%)

Mutual labels: crf

video-quality-metrics

Test specified presets/CRF values for the x264 or x265 encoder. Compares VMAF/SSIM/PSNR numerically & via graphs.

Stars: ✭ 87 (+569.23%)

Mutual labels: crf

ChineseNER

All about Chinese NER

Stars: ✭ 241 (+1753.85%)

Mutual labels: crf

Gumbel-CRF

Implementation of NeurIPS 20 paper: Latent Template Induction with Gumbel-CRFs

Stars: ✭ 51 (+292.31%)

Mutual labels: crf

Pytorch Bert Crf Ner

BERT+CRF based Named Entity Recognition model for Korean, built with KoBERT and CRF

Stars: ✭ 236 (+1715.38%)

Mutual labels: crf

korean ner tagging challenge

KU_NERDY, 이동엽 and 임희석 (gold prize, 2017 Korean Language Information Processing System Competition), Conference on Hangul and Korean Language Information Processing

Stars: ✭ 30 (+130.77%)

Mutual labels: crf

Fancy Nlp

NLP for human. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.

Stars: ✭ 233 (+1692.31%)

Mutual labels: crf

mahjong

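Open-source Chinese word segmentation toolkit: Chinese segmentation Web API, Lucene Chinese segmentation, and mixed Chinese/English segmentation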

Stars: ✭ 40 (+207.69%)

Mutual labels: crf

keras-crf-layer

Implementation of CRF layer in Keras.

Stars: ✭ 76 (+484.62%)

Mutual labels: crf

Hierarchical-Word-Sense-Disambiguation-using-WordNet-Senses

Word Sense Disambiguation using Word Specific models, All word models and Hierarchical models in Tensorflow

Stars: ✭ 33 (+153.85%)

Mutual labels: crf

BiLSTM-CRF-NER-PyTorch

This repo contains a PyTorch implementation of a BiLSTM-CRF model for named entity recognition task.

Stars: ✭ 109 (+738.46%)

Mutual labels: crf


crf-seg

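crf-seg is a Java toolkit that applies CRF models to natural language processing (NLP). Its goal is to make NLP more widely used in production environments. crf-seg offers efficient performance, a clear architecture, an up-to-date corpus, and support for custom corpora and custom models.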

author: xuming (shibing624)

environment: JDK 1.8

Demo page: http://www.borntowin.cn:8080/xmnlp

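The CRF model recognizes new words well and handles traditional Chinese characters and proper nouns well, but it is comparatively expensive to run. The author describes it as the most accurate Chinese word segmentation model currently available, and it can be used in production.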

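Usage

The model file is not included in the source code and must be downloaded separately from the network drive at http://pan.baidu.com/s/1skQW35j, then placed under data/model/segment.

Calling crf-seg is straightforward: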

System.out.println(Xmnlp.crfSegment("你好，欢迎使用CRF分词工具！"));

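Training a custom model

Use GenerateBMESDemo (in org.xm.xmnlp.demo under test) to generate a sequence-labeling set from your own data, then use CRF++ to build a CRF model from it.
A reference segmented-corpus file and a test file that I generated with GenerateBMESDemo are provided as format references. Network drive download: http://pan.baidu.com/s/1eStd0jg .
The annotated People's Daily 2014 word segmentation data is provided. Network drive download: http://pan.baidu.com/s/1gfae4Zh (password: l506). Please respect the copyright and credit the source when redistributing.
Linux and Windows builds of the CRF++ model training tool are provided. Network drive download: http://pan.baidu.com/s/1skKkTgL .
Use command-line arguments to tell CRF++ to also write a txt-format model (the -t flag), for example: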

crf_learn -f 3 -c 4.0 template train.bmes.txt crf-simple.model -t
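
The training file passed to crf_learn above (train.bmes.txt) is a BMES-tagged sequence file. As a rough illustration of what such a file contains, the sketch below converts a whitespace-segmented sentence into one "character, tab, tag" line per character, with a blank line ending the sentence, which is the column layout CRF++ expects. This is not the project's GenerateBMESDemo: the class and method names are hypothetical, and the exact columns that GenerateBMESDemo emits may differ, so check the reference files from the network drive before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of crf-seg.
class BmesSketch {
    /** Tags one word per character: S for a single-character word, otherwise B, M..., E. */
    public static List<String> tagWord(String word) {
        List<String> lines = new ArrayList<>();
        int[] cps = word.codePoints().toArray(); // code points, so surrogate pairs stay whole
        for (int i = 0; i < cps.length; i++) {
            String tag;
            if (cps.length == 1) {
                tag = "S";
            } else if (i == 0) {
                tag = "B";
            } else if (i == cps.length - 1) {
                tag = "E";
            } else {
                tag = "M";
            }
            lines.add(new String(Character.toChars(cps[i])) + "\t" + tag);
        }
        return lines;
    }

    /** Tags a sentence whose words are separated by whitespace; a blank line closes it. */
    public static String tagSentence(String segmented) {
        StringBuilder sb = new StringBuilder();
        for (String word : segmented.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields only the closing blank line
            }
            for (String line : tagWord(word)) {
                sb.append(line).append('\n');
            }
        }
        sb.append('\n'); // sentence boundary
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two two-character words ("fenci gongju"), written as Unicode escapes
        // so the file compiles regardless of the source encoding.
        System.out.print(tagSentence("\u5206\u8bcd \u5de5\u5177"));
    }
}
```

The template file given to crf_learn then decides which character features are built from the first column; the last column is treated as the label.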

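After training, set the configuration item CRFSegmentModelPath to the path of the generated crf-simple.model.txt. The first run produces a corresponding crf-simple.model.txt.bin file; later loads read directly from this bin cache, which is much faster.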
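
The load-or-build caching behaviour described above can be sketched as follows. This is only an assumption about the general pattern, not crf-seg's actual loader: the class name is hypothetical, the "model" is reduced to the non-empty lines of the text file, and the real .bin format used by crf-seg is not documented here.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a "<model>.txt.bin" cache; not part of crf-seg.
class ModelCacheSketch {
    public static List<String> load(Path txtModel) throws IOException {
        Path bin = Paths.get(txtModel.toString() + ".bin");

        // Fast path: a cache written by an earlier run already exists.
        if (Files.exists(bin)) {
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(Files.newInputStream(bin)))) {
                int n = in.readInt();
                List<String> cached = new ArrayList<>(n);
                for (int i = 0; i < n; i++) {
                    cached.add(in.readUTF());
                }
                return cached;
            }
        }

        // Slow path: parse the text model (here: keep the non-empty lines).
        List<String> parsed = new ArrayList<>();
        for (String line : Files.readAllLines(txtModel, StandardCharsets.UTF_8)) {
            if (!line.trim().isEmpty()) {
                parsed.add(line);
            }
        }

        // Write the cache so the next load can skip parsing.
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(bin)))) {
            out.writeInt(parsed.size());
            for (String s : parsed) {
                out.writeUTF(s);
            }
        }
        return parsed;
    }
}
```

One consequence of this pattern is that a stale cache shadows a retrained model, so delete the .bin file after regenerating the txt model if the loader does not compare timestamps.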
