tianlinyang / stack-lstm-ner

Licence: other
Transition-based NER system

stack-lstm-ner

PyTorch implementation of a transition-based NER system [1].

Requirements

  • Python 3.x
  • PyTorch 0.3.0

Task

Given a sentence, assign a tag to each word. A classic application is Named Entity Recognition (NER). Here is an example:

John   lives in New   York
B-PER  O     O  B-LOC I-LOC

The corresponding sequence of transition actions:

SHIFT
REDUCE(PER)
OUT
OUT
SHIFT
SHIFT
REDUCE(LOC)
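The BIO tags and the actions above correspond one-to-one: each token in an entity span is SHIFTed onto the stack, the span is closed with REDUCE(type), and O tokens emit OUT. A minimal sketch of this mapping (an illustrative helper, not code from the repository):

```python
def bio_to_actions(tags):
    """Convert a BIO tag sequence into SHIFT/REDUCE/OUT transition actions."""
    actions = []
    pending = None  # entity type of the span currently on the stack
    for tag in tags:
        if tag == "O":
            if pending:  # close any open span before emitting OUT
                actions.append(f"REDUCE({pending})")
                pending = None
            actions.append("OUT")
        else:
            prefix, etype = tag.split("-", 1)
            if prefix == "B" and pending:  # a new entity starts: close the old one
                actions.append(f"REDUCE({pending})")
            actions.append("SHIFT")
            pending = etype
    if pending:  # close a span that runs to the end of the sentence
        actions.append(f"REDUCE({pending})")
    return actions

print(bio_to_actions(["B-PER", "O", "O", "B-LOC", "I-LOC"]))
# ['SHIFT', 'REDUCE(PER)', 'OUT', 'OUT', 'SHIFT', 'SHIFT', 'REDUCE(LOC)']
```

Running it on the example sentence above reproduces the listed action sequence.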

Data format

The training data must be in the following two-column format (identical to the CoNLL 2003 dataset), with one token and its tag per line and a blank line between sentences.

A default test file is provided to help you get started.

John B-PER
lives O
in O
New B-LOC
York I-LOC
. O
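A file in this format can be parsed with a few lines of Python. This is an illustrative reader (not the repository's own loader), assuming one `token tag` pair per line and blank lines between sentences:

```python
def read_conll(path):
    """Read a two-column CoNLL-style file into (words, tags) sentence pairs."""
    sentences, words, tags = [], [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # blank line ends the current sentence
                if words:
                    sentences.append((words, tags))
                    words, tags = [], []
                continue
            fields = line.split()
            words.append(fields[0])   # token is the first column
            tags.append(fields[-1])   # tag is the last column
    if words:  # flush a trailing sentence with no final blank line
        sentences.append((words, tags))
    return sentences
```

For the example above, the first sentence comes back as `(['John', 'lives', 'in', 'New', 'York', '.'], ['B-PER', 'O', 'O', 'B-LOC', 'I-LOC', 'O'])`.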

Training

To train the model, run train.py with the following parameters:

--rand_embedding     # randomly initialize the word embeddings
--emb_file           # path to the word-embedding file
--char_structure     # character-level encoder: 'lstm' or 'cnn'
--train_file         # path to the training file
--dev_file           # path to the development file
--test_file          # path to the test file
--gpu                # GPU id; set to -1 for CPU mode
--update             # optimizer: 'sgd' or 'adam'
--batch_size         # batch size, default=100
--singleton_rate     # rate at which low-frequency words are replaced with '<unk>'
--checkpoint         # path for checkpoints and the saved model
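Putting the flags together, a typical invocation might look like the following (all file paths and the embedding file are placeholders; adjust them to your setup):

```shell
python train.py \
  --emb_file embeddings/glove.6B.100d.txt \
  --char_structure lstm \
  --train_file data/eng.train \
  --dev_file data/eng.testa \
  --test_file data/eng.testb \
  --gpu 0 \
  --update sgd \
  --batch_size 100 \
  --checkpoint checkpoints/
```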

Decoding

To tag a raw file, simply run predict.py with the following parameters:

--load_arg           # path to the saved JSON file with all args
--load_check_point   # path to the saved model
--test_file          # path to the test file
--test_file_out      # path for the tagged output file
--batch_size         # batch size
--gpu                # GPU id; set to -1 for CPU mode

Please be aware that when using the model in stack_lstm.py, --batch_size must be 1.
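A decoding run, using the stack_lstm.py model and therefore --batch_size 1, might look like this (the checkpoint and file names are placeholders):

```shell
python predict.py \
  --load_arg checkpoints/args.json \
  --load_check_point checkpoints/model.pt \
  --test_file data/raw.txt \
  --test_file_out data/raw.tagged.txt \
  --gpu -1 \
  --batch_size 1
```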

Result

When the models are trained only on the CoNLL 2003 English NER dataset, the results are summarized below.

Model                Variant                              F1      Time(h)
Lample et al. 2016   pretrain                             86.67
                     pretrain + dropout                   87.96
                     pretrain + dropout + char            90.33
Our Implementation   pretrain + dropout
                     pretrain + dropout + char (BiLSTM)
                     pretrain + dropout + char (CNN)

Author

Huimeng Zhang: [email protected]

References

[1] Lample et al., "Neural Architectures for Named Entity Recognition", NAACL 2016.
