
NCRF++, a neural sequence labeling toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.

Stars: ✭ 1,767 (+609.64%)

Mutual labels: ner, crf, sequence-labeling

A Pytorch Tutorial To Sequence Labeling

Empower Sequence Labeling with Task-Aware Neural Language Model | a PyTorch Tutorial to Sequence Labeling

Stars: ✭ 257 (+3.21%)

Mutual labels: crf, sequence-labeling, pos-tagging

Lm Lstm Crf

Empower Sequence Labeling with Task-Aware Language Model

Stars: ✭ 778 (+212.45%)

Mutual labels: ner, crf, sequence-labeling

Named entity recognition

Chinese named entity recognition (including implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)

Stars: ✭ 995 (+299.6%)

Mutual labels: ner, crf, sequence-labeling

Lightner

Inference with state-of-the-art models (pre-trained by LD-Net / AutoNER / VanillaNER / ...)

Stars: ✭ 102 (-59.04%)

Mutual labels: ner, sequence-labeling

Min nlp practice

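Chinese and English word segmentation (CWS), POS tagging and named entity recognition, implemented with a CNN, bi-directional LSTM and CRF model on character embeddings. The network performs Chinese and English word segmentation, POS tagging and entity recognition in one integrated model. The project mainly includes raw text data, data conversion, training scripts and pre-trained models, and can be used for sequence labeling research. Note: the only logic you need to implement is converting your own data into the sequence model's format. Word segmentation accuracy is about 93%, POS tagging accuracy about 90%, and entity tagging accuracy (on this sample) about 85%.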

Stars: ✭ 107 (-57.03%)

Mutual labels: ner, crf

Pytorch Bert Crf Ner

Korean named entity recognizer built with KoBERT and CRF (BERT+CRF based Named Entity Recognition model for Korean)

Stars: ✭ 236 (-5.22%)

Mutual labels: ner, crf

Multilstm

keras attentional bi-LSTM-CRF for Joint NLU (slot-filling and intent detection) with ATIS

Stars: ✭ 122 (-51%)

Mutual labels: ner, crf

Ner

Named entity recognition (NER) survey: papers, models, code (BiLSTM-CRF/BERT-CRF) and a summary of competition resources, updated continuously

Stars: ✭ 118 (-52.61%)

Mutual labels: ner, crf

Malaya

Natural Language Toolkit for bahasa Malaysia, https://malaya.readthedocs.io/

Stars: ✭ 239 (-4.02%)

Mutual labels: ner, pos-tagging

Etagger

reference tensorflow code for named entity tagging

Stars: ✭ 100 (-59.84%)

Mutual labels: ner, crf

Nlp Journey

Documents, papers and code related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in tensorflow 2.0.

Stars: ✭ 1,290 (+418.07%)

Mutual labels: ner, crf

Daguan 2019 rank9

datagrand 2019 information extraction competition rank9

Stars: ✭ 121 (-51.41%)

Mutual labels: ner, crf

Torchcrf

An Implementation of CRF (Conditional Random Fields) in PyTorch 1.0

Stars: ✭ 58 (-76.71%)

Mutual labels: ner, crf

Ner Slot filling

Entity extraction and intent recognition for Chinese natural language (Natural Language Understanding), with a choice of Bi-LSTM + CRF or IDCNN + CRF

Stars: ✭ 151 (-39.36%)

Mutual labels: ner, crf

Macadam

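Macadam is a natural language processing toolkit built on Tensorflow (Keras) and bert4keras, focused on text classification, sequence labeling and relation extraction. It supports embeddings such as RANDOM, WORD2VEC, FASTTEXT, BERT, ALBERT, ROBERTA, NEZHA, XLNET, ELECTRA and GPT-2; text classification algorithms such as FineTune, FastText, TextCNN, CharCNN, BiRNN, RCNN, DCNN, CRNN, DeepMoji, SelfAttention, HAN and Capsule; and sequence labeling algorithms such as CRF, Bi-LSTM-CRF, CNN-LSTM, DGCNN, Bi-LSTM-LAN, Lattice-LSTM-Batch and MRC.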

Stars: ✭ 149 (-40.16%)

Mutual labels: ner, sequence-labeling

Clinical Ner

Named entity recognition for Chinese electronic medical records

Stars: ✭ 151 (-39.36%)

Mutual labels: ner, crf


NER-LSTM-CNNs-CRF

An LSTM-CNNs-CRF implementation in PyTorch, tested on the CoNLL-2003 dataset, following the paper End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF.
The PyTorch 0.3.1 code is released here: [PyTorch0.3.1]

Requirement

PyTorch: 1.0.1
Python: 3.6
Cuda: 8.0 or 9.0 (optional; enables GPU acceleration)

Usage

Modify the config file first; for details, see the Config directory.
Train:
	The best model is saved during training.
	---> sh run_train_p.sh
Test:
	After training finishes, decode the test data and write the decoded result to a file.
	---> sh run_test.sh
Eval:
	Use the conlleval script in the Tools directory to calculate the F-score of the decoded result file.
	---> sh run_eval.sh

Config

[Embed]
	pretrained_embed = True (default: False)
	nnembed = True
	pretrained_embed_file = embed file path
[Data]
	max_count = -1  ## Number of sentences loaded(-1 represents all)
[Save]
	save_pkl = True (save pkl file for test, default True)
	save_best_model = True (save best performance result for test, default True)
[Model]
	use_crf = True  ## CRF 
	use_char = True  ## CNN
	model_bilstm = True  ##BiLSTM
	embed_dim = 100, lstm_hiddens = 200
	dropout_emb = 0.5, dropout = 0.5
	max_char_len = 20, char_dim = 30, conv_filter_sizes = 3, conv_filter_nums = 30
[Optimizer]
	sgd = True
	learning_rate = 0.015 , weight_decay = 1.0e-8
	use_lr_decay = True, lr_rate_decay = 0.05, min_lrate = 0.000005, max_patience = 1
[Train]
	batch_size = 10
	early_max_patience = 10 (early stop max patience)

This describes the main configuration options; for more detail, refer to the config.cfg file and the config readme.
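As a rough illustration of how an INI-style file with these sections could be read, here is a minimal sketch using Python's standard `configparser`. It is not the repository's own loader. The sample text, the `load_config` helper, and the placeholder embedding path are assumptions for illustration, and it assumes one key per line with `##` used for inline comments; the real config.cfg may be laid out differently.

```python
import configparser

# Hypothetical excerpt modeled on the options listed above; the real
# config.cfg may use different keys, values, or layout.
SAMPLE_CFG = """
[Embed]
pretrained_embed = True
pretrained_embed_file = path/to/embeddings.txt

[Data]
max_count = -1  ## Number of sentences loaded (-1 represents all)

[Model]
use_crf = True
use_char = True
embed_dim = 100
lstm_hiddens = 200
dropout = 0.5

[Optimizer]
learning_rate = 0.015
weight_decay = 1.0e-8

[Train]
batch_size = 10
"""


def load_config(text):
    """Parse INI-style text, treating '##' as an inline comment marker."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("##",))
    parser.read_string(text)
    return parser


cfg = load_config(SAMPLE_CFG)

# Typed getters convert the raw strings into Python values.
use_crf = cfg.getboolean("Model", "use_crf")
embed_dim = cfg.getint("Model", "embed_dim")
max_count = cfg.getint("Data", "max_count")
learning_rate = cfg.getfloat("Optimizer", "learning_rate")
weight_decay = cfg.getfloat("Optimizer", "weight_decay")

print(use_crf, embed_dim, max_count, learning_rate, weight_decay)
```

The typed getters raise `ValueError` on malformed values, which surfaces config typos before training starts.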

Model

- BiLSTM
- CNN
- CRF

Data

The number of sentences:

Data	Train	Dev	Test
conll2003	14987	3466	3684

The data uses the BIES tagging scheme; a data sample is in the Data directory.
The tag conversion script is available here: [tagSchemeConverter.py]
The Conll2003 dataset can be downloaded from Conll2003.
Pre-trained embeddings can be downloaded from glove.6B.zip.
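To make the BIES scheme concrete, the sketch below converts a BIO-tagged sentence to BIES. It is not the repository's tagSchemeConverter.py; the function name `bio_to_bies` is hypothetical, and it assumes the input is already valid BIO (IOB2), where every entity starts with a `B-` tag.

```python
def bio_to_bies(tags):
    """Convert a BIO tag sequence to BIES.

    B = begin, I = inside, E = end, S = single-token entity; O is unchanged.
    """
    converted = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # The entity continues only if the next tag is I- with the same label.
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == "I-" + label
        if prefix == "B":
            converted.append(("B-" if continues else "S-") + label)
        elif prefix == "I":
            converted.append(("I-" if continues else "E-") + label)
        else:
            raise ValueError(f"unexpected tag: {tag!r}")
    return converted


example = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG", "I-ORG"]
print(bio_to_bies(example))
# ['B-PER', 'E-PER', 'O', 'S-LOC', 'O', 'B-ORG', 'I-ORG', 'E-ORG']
```

Marking entity ends and single-token entities explicitly gives the CRF layer more informative transitions than plain BIO.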

Performance

Performance on Conll2003, evaluated with the conlleval script in Tools.
The log is in final_log.

Model	% P	% R	% F1
BLSTM	88.61	88.50	88.56
BLSTM-CRF	90.33	88.81	89.56
BLSTM-CNN	89.23	90.97	90.09
BLSTM-CNN-CRF	91.42	91.24	91.33
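The F1 column is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported P and R values as a sanity check. The helper name `f1_score` is hypothetical, and because conlleval computes its scores from raw entity counts rather than from rounded percentages, the recomputed value may differ from the table in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the table above.
reported = {
    "BLSTM": (88.61, 88.50, 88.56),
    "BLSTM-CRF": (90.33, 88.81, 89.56),
    "BLSTM-CNN": (89.23, 90.97, 90.09),
    "BLSTM-CNN-CRF": (91.42, 91.24, 91.33),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, reported = {f1:.2f}")
```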

Question

If you have any questions, you can open an issue or email [email protected]{gmail.com, 163.com}.
If you have any good suggestions, you can open a PR or email me.


Stars: ✭ 249
