bamtercelboo / Pytorch_ner_bilstm_cnn_crf

Licence: apache-2.0
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, implemented in PyTorch

Programming Languages

python

Projects that are alternatives of or similar to Pytorch ner bilstm cnn crf

fairseq-tagging
a Fairseq fork for sequence tagging/labeling tasks
Stars: ✭ 26 (-89.56%)
Mutual labels:  ner, pos-tagging, sequence-labeling
Ntagger
reference pytorch code for named entity tagging
Stars: ✭ 58 (-76.71%)
Mutual labels:  ner, crf, sequence-labeling
Hscrf Pytorch
ACL 2018: Hybrid semi-Markov CRF for Neural Sequence Labeling (http://aclweb.org/anthology/P18-2038)
Stars: ✭ 284 (+14.06%)
Mutual labels:  ner, crf, sequence-labeling
Ncrfpp
NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Stars: ✭ 1,767 (+609.64%)
Mutual labels:  ner, crf, sequence-labeling
A Pytorch Tutorial To Sequence Labeling
Empower Sequence Labeling with Task-Aware Neural Language Model | a PyTorch Tutorial to Sequence Labeling
Stars: ✭ 257 (+3.21%)
Mutual labels:  crf, sequence-labeling, pos-tagging
Lm Lstm Crf
Empower Sequence Labeling with Task-Aware Language Model
Stars: ✭ 778 (+212.45%)
Mutual labels:  ner, crf, sequence-labeling
Named entity recognition
Chinese named entity recognition (with concrete implementations of several models: HMM, CRF, BiLSTM, BiLSTM+CRF)
Stars: ✭ 995 (+299.6%)
Mutual labels:  ner, crf, sequence-labeling
Lightner
Inference with state-of-the-art models (pre-trained by LD-Net / AutoNER / VanillaNER / ...)
Stars: ✭ 102 (-59.04%)
Mutual labels:  ner, sequence-labeling
Min nlp practice
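Chinese & English CWS, POS, and NER implemented using a CNN, bi-directional LSTM, and CRF model with char embedding. The network combines CNN pooling over character vectors, a bidirectional LSTM, and a CRF, and may handle Chinese and English word segmentation, part-of-speech tagging, and entity recognition in a single model. It mainly includes raw text data, data conversion, training scripts, and pretrained models, and can be used for sequence labeling research. Note: the only logic that needs to be implemented is converting user data into the sequence model's format. Word segmentation accuracy is about 93%, POS tagging accuracy about 90%, and entity tagging (on this sample) about 85%.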
Stars: ✭ 107 (-57.03%)
Mutual labels:  ner, crf
Pytorch Bert Crf Ner
Korean named entity recognizer built with KoBERT and CRF (BERT+CRF based Named Entity Recognition model for Korean)
Stars: ✭ 236 (-5.22%)
Mutual labels:  ner, crf
Multilstm
keras attentional bi-LSTM-CRF for Joint NLU (slot-filling and intent detection) with ATIS
Stars: ✭ 122 (-51%)
Mutual labels:  ner, crf
Ner
Named entity recognition (NER) survey: papers, models, code (BiLSTM-CRF/BERT-CRF), and a summary of competition resources, updated continuously
Stars: ✭ 118 (-52.61%)
Mutual labels:  ner, crf
Malaya
Natural Language Toolkit for bahasa Malaysia, https://malaya.readthedocs.io/
Stars: ✭ 239 (-4.02%)
Mutual labels:  ner, pos-tagging
Etagger
reference tensorflow code for named entity tagging
Stars: ✭ 100 (-59.84%)
Mutual labels:  ner, crf
Nlp Journey
Documents, papers and code related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.
Stars: ✭ 1,290 (+418.07%)
Mutual labels:  ner, crf
Daguan 2019 rank9
datagrand 2019 information extraction competition rank9
Stars: ✭ 121 (-51.41%)
Mutual labels:  ner, crf
Torchcrf
An Implementation of CRF (Conditional Random Fields) in PyTorch 1.0
Stars: ✭ 58 (-76.71%)
Mutual labels:  ner, crf
Ner Slot filling
Entity extraction and intent recognition for Chinese natural language (Natural Language Understanding), with a choice of Bi-LSTM + CRF or IDCNN + CRF
Stars: ✭ 151 (-39.36%)
Mutual labels:  ner, crf
Macadam
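Macadam is a natural language processing toolkit built on Tensorflow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction. It supports embeddings such as RANDOM, WORD2VEC, FASTTEXT, BERT, ALBERT, ROBERTA, NEZHA, XLNET, ELECTRA, and GPT-2; text classification algorithms such as FineTune, FastText, TextCNN, CharCNN, BiRNN, RCNN, DCNN, CRNN, DeepMoji, SelfAttention, HAN, and Capsule; and sequence labeling algorithms such as CRF, Bi-LSTM-CRF, CNN-LSTM, DGCNN, Bi-LSTM-LAN, Lattice-LSTM-Batch, and MRC.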
Stars: ✭ 149 (-40.16%)
Mutual labels:  ner, sequence-labeling
Clinical Ner
Named entity recognition for Chinese electronic medical records
Stars: ✭ 151 (-39.36%)
Mutual labels:  ner, crf

NER-LSTM-CNNs-CRF

Requirement

PyTorch: 1.0.1
Python: 3.6
CUDA: 8.0 or 9.0 (optional, for GPU acceleration)

Usage

Modify the config file; see the Config directory for details.
Train:
	The best neural network model is saved during training.
	---> sh run_train_p.sh
Test:
	After training finishes, decode the test data and write the decoded result to a file.
	---> sh run_test.sh
Eval:
	Use the conlleval script in the Tools directory to calculate the F-score of the decoded result file.
	---> sh run_eval.sh

Config

[Embed]
	pretrained_embed = True(default: False)
	nnembed = True
	pretrained_embed_file = embed file path
[Data]
	max_count = -1  ## Number of sentences loaded(-1 represents all)
[Save]
	save_pkl = True(save pkl file for test, default True)
	save_best_model = True(save best performance result for test, default True)
[Model]
	use_crf = True  ## CRF 
	use_char = True  ## CNN
	model_bilstm = True  ##BiLSTM
	embed_dim = 100, lstm_hiddens = 200
	dropout_emb = 0.5, dropout = 0.5
	max_char_len = 20, char_dim = 30, conv_filter_sizes = 3, conv_filter_nums = 30
[Optimizer]
	sgd = True
	learning_rate = 0.015 , weight_decay = 1.0e-8
	use_lr_decay = True, lr_rate_decay = 0.05, min_lrate = 0.000005, max_patience = 1
[Train]
	batch_size = 10
	early_max_patience = 10(early stop max patience)

This describes the main configuration options; for more detail, see the config.cfg file and the config README.
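
As a rough illustration of how options like these might be read, the sketch below parses an INI-style string with Python's standard configparser module. It assumes one key per line (the listing above groups several keys per line for brevity). The section and key names are copied from the listing, but the actual config.cfg layout and the project's own loader may differ.

```python
import configparser

# Hypothetical excerpt; the real config.cfg may use a different layout or keys.
SAMPLE_CFG = """
[Model]
use_crf = True
use_char = True
model_bilstm = True
embed_dim = 100
lstm_hiddens = 200
dropout = 0.5

[Optimizer]
sgd = True
learning_rate = 0.015
weight_decay = 1.0e-8

[Train]
batch_size = 10
early_max_patience = 10
"""


def load_config(text):
    """Parse INI-style text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


cfg = load_config(SAMPLE_CFG)
# Typed getters convert the raw strings to bool, int, and float.
use_crf = cfg.getboolean("Model", "use_crf")
embed_dim = cfg.getint("Model", "embed_dim")
learning_rate = cfg.getfloat("Optimizer", "learning_rate")
weight_decay = cfg.getfloat("Optimizer", "weight_decay")
batch_size = cfg.getint("Train", "batch_size")
```

Using the typed getters keeps the conversion from string to bool, int, or float in one place, so a malformed value fails at load time rather than in the middle of training.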

Model

  • BiLSTM
    • CNN
    • CRF
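
To make the role of the CRF layer concrete, here is a minimal pure-Python sketch of Viterbi decoding, the standard way a linear-chain CRF picks the best tag sequence from per-token emission scores and tag-to-tag transition scores. It is not taken from this repository: the function name, the toy tag set, and all scores are made up for illustration. In the model, the emission scores would come from the BiLSTM output.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_score, best_path) for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    num_tags = len(transitions)
    # Best score of any path ending in each tag at the first position.
    scores = list(emissions[0])
    backpointers = []

    for t in range(1, len(emissions)):
        new_scores = []
        pointers = []
        for j in range(num_tags):
            # Pick the best previous tag i for the current tag j.
            best_i = max(range(num_tags),
                         key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j]
                              + emissions[t][j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final tag.
    last = max(range(num_tags), key=lambda j: scores[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


# Toy example with made-up scores; tag indices: 0 = O, 1 = B-PER, 2 = E-PER.
TAGS = ["O", "B-PER", "E-PER"]
transitions = [
    [0.0, 0.0, -10.0],    # O -> E-PER is heavily penalized
    [-10.0, -10.0, 0.0],  # B-PER should be followed by E-PER
    [0.0, 0.0, -10.0],    # E-PER -> E-PER is heavily penalized
]
emissions = [
    [0.1, 2.0, 0.0],
    [1.0, 0.0, 0.9],
    [2.0, 0.0, 0.0],
]
best_score, best_path = viterbi_decode(emissions, transitions)
decoded = [TAGS[i] for i in best_path]
```

In this toy case a per-token argmax over the emissions would pick B-PER, O, O, which is not a valid tag sequence, while the transition scores steer the decoder to B-PER, E-PER, O.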

Data

  • The number of sentences:
    Data       Train  Dev   Test
    conll2003  14987  3466  3684
  • The data uses the BIES tagging scheme; a data sample is in the Data directory.

  • A tag conversion script is available here: [tagSchemeConverter.py]

  • Conll2003 dataset can be downloaded from Conll2003

  • Pre-Trained Embedding can be downloaded from glove.6B.zip
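
For readers unfamiliar with the tagging scheme, the sketch below converts BIO tags to BIES-style tags (B = begin, I = inside, E = end, S = single-token entity, with O kept for non-entity tokens). It is an independent illustration, not the contents of tagSchemeConverter.py. The function name is hypothetical, and it assumes well-formed BIO input.

```python
def bio_to_bies(tags):
    """Convert a list of BIO tags to BIES-style tags (O is kept as-is)."""
    converted = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            converted.append(("B-" if continues else "S-") + label)
        else:  # prefix == "I"
            converted.append(("I-" if continues else "E-") + label)
    return converted


example = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG", "I-ORG"]
converted = bio_to_bies(example)
```

Marking entity ends (E) and single-token entities (S) explicitly gives the tagger more boundary information than plain BIO, which is the usual motivation for this scheme.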

Performance

  • Performance on CoNLL-2003, evaluated with the conlleval script in Tools

  • Logs are in final_log

Model          % P    % R    % F1
BLSTM          88.61  88.50  88.56
BLSTM-CRF      90.33  88.81  89.56
BLSTM-CNN      89.23  90.97  90.09
BLSTM-CNN-CRF  91.42  91.24  91.33
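
The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the rounded P and R values in the table. Small differences in the last digit are possible, presumably because the reported F1 was computed from unrounded values (an assumption).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
results = {
    "BLSTM": (88.61, 88.50, 88.56),
    "BLSTM-CRF": (90.33, 88.81, 89.56),
    "BLSTM-CNN": (89.23, 90.97, 90.09),
    "BLSTM-CNN-CRF": (91.42, 91.24, 91.33),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in results.items()}
for name, (p, r, reported) in results.items():
    print(f"{name}: reported {reported:.2f}, recomputed {recomputed[name]:.2f}")
```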

Question

  • If you have any questions, you can open an issue or email [email protected]{gmail.com, 163.com}.

  • If you have any good suggestions, you can open a PR or email me.
