All Projects → randomrandom → deep-atrous-ner

randomrandom / deep-atrous-ner

Licence: other
Deep-Atrous-CNN-NER: Word level model for Named Entity Recognition

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to deep-atrous-ner

KoBERT-NER
NER Task with KoBERT (with Naver NLP Challenge dataset)
Stars: ✭ 76 (+117.14%)
Mutual labels:  named-entity-recognition, ner
TweebankNLP
[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Stars: ✭ 84 (+140%)
Mutual labels:  named-entity-recognition, ner
PhoNER COVID19
COVID-19 Named Entity Recognition for Vietnamese (NAACL 2021)
Stars: ✭ 55 (+57.14%)
Mutual labels:  named-entity-recognition, ner
Pytorch Bert Crf Ner
KoBERT와 CRF로 만든 한국어 개체명인식기 (BERT+CRF based Named Entity Recognition model for Korean)
Stars: ✭ 236 (+574.29%)
Mutual labels:  named-entity-recognition, ner
molminer
Python library and command-line tool for extracting compounds from scientific literature. Written in Python.
Stars: ✭ 38 (+8.57%)
Mutual labels:  named-entity-recognition, ner
Bert ner
Ner with Bert
Stars: ✭ 240 (+585.71%)
Mutual labels:  named-entity-recognition, ner
NER-and-Linking-of-Ancient-and-Historic-Places
An NER tool for ancient place names based on Pleiades and Spacy.
Stars: ✭ 26 (-25.71%)
Mutual labels:  named-entity-recognition, ner
Bertner
ChineseNER based on BERT, with BiLSTM+CRF layer
Stars: ✭ 195 (+457.14%)
Mutual labels:  named-entity-recognition, ner
lingvo--Ner-ru
Named entity recognition (NER) in Russian texts / Определение именованных сущностей (NER) в тексте на русском языке
Stars: ✭ 38 (+8.57%)
Mutual labels:  named-entity-recognition, ner
scikitcrf NER
Python library for custom entity recognition using Sklearn CRF
Stars: ✭ 17 (-51.43%)
Mutual labels:  named-entity-recognition, ner
Ner Datasets
Datasets to train supervised classifiers for Named-Entity Recognition in different languages (Portuguese, German, Dutch, French, English)
Stars: ✭ 220 (+528.57%)
Mutual labels:  named-entity-recognition, ner
anonymization-api
How to build and deploy an anonymization API with FastAPI
Stars: ✭ 51 (+45.71%)
Mutual labels:  named-entity-recognition, ner
Spacy Lookup
Named Entity Recognition based on dictionaries
Stars: ✭ 212 (+505.71%)
Mutual labels:  named-entity-recognition, ner
Ner Bert Pytorch
PyTorch solution of named entity recognition task Using Google AI's pre-trained BERT model.
Stars: ✭ 249 (+611.43%)
Mutual labels:  named-entity-recognition, ner
Monpa
MONPA 罔拍是一個提供正體中文斷詞、詞性標註以及命名實體辨識的多任務模型
Stars: ✭ 203 (+480%)
Mutual labels:  named-entity-recognition, ner
neural name tagging
Code for "Reliability-aware Dynamic Feature Composition for Name Tagging" (ACL2019)
Stars: ✭ 39 (+11.43%)
Mutual labels:  named-entity-recognition, ner
Bert Sklearn
a sklearn wrapper for Google's BERT model
Stars: ✭ 182 (+420%)
Mutual labels:  named-entity-recognition, ner
Persian Ner
پیکره بزرگ شناسایی موجودیت‌های نامدار فارسی برچسب خورده
Stars: ✭ 183 (+422.86%)
Mutual labels:  named-entity-recognition, ner
SynLSTM-for-NER
Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021.
Stars: ✭ 26 (-25.71%)
Mutual labels:  named-entity-recognition, ner
CrossNER
CrossNER: Evaluating Cross-Domain Named Entity Recognition (AAAI-2021)
Stars: ✭ 87 (+148.57%)
Mutual labels:  named-entity-recognition, ner

Deep-Atrous-CNN-NER: Word level model for Named Entity Recognition

A Deep Atrous CNN architecture suitable for Named Entity Recognition on input with variable length, which achieves state of the art results. Up to 10x times faster during prediction time.

The architecture replaces the predominant LSTM-based architectures for Named Entity Recognition tasks. Instead it uses fully convolutional model with dilated convolutions, which are resolution perserving. The architecture is inspired by the ByteNet model described in Neural Machine Translation in Linear Time.

The model has several components:

  1. Embedding layer for word representation
  2. 1x1 convolutions for depth decompression
  3. An atrous-cnn part which is similar to the ByteNet encoder described in Neural Machine Translation in Linear Time
  4. 1x1 Convolutions for depth compression
  5. A fully connected layer and SoftMax for numerical stability

Tailored cross-entropy function is applied at the end. Dropout is applied within every ResNet block in the atrous-cnn component.

The network support embedding initialization with pre-trained GloVe vectors (GloVe: Gloval Vectors for Word Representations) which handle even rare words quite well compared to word2vec.

To speed up training the model pre-processes any input into "clean" file, which then utilizes for training. The data is read by line from the "clean" files for better memory management. All input data is split into the appropriate buckets and dynamic padding is applied, which provides better accuracy and speed up during training. The input pipeline can read from multiple data sources which makes addition of more data sources easy as long as they are preprocessed in the right format. The model can be trained on multiple GPUs if the hardware provides this capability.

(Some images are cropped from WaveNet: A Generative Model for Raw Audio, Neural Machine Translation in Linear Time and Tensorflow's Reading Data Tutorial)

Version

Current version : 0.0.0.1

Dependencies ( VERSION MUST BE MATCHED EXACTLY! )

  1. numpy==1.13.0
  2. pandas==0.20.2
  3. protobuf==3.3.0
  4. python-dateutil==2.6.0
  5. scipy==0.19.1
  6. six==1.10.0
  7. sklearn==0.18.2
  8. sugartensor==1.0.0.2
  9. tensorflow==1.2.0
  10. tqdm==4.14.0

Installation

  1. python3.5 -m pip install -r requirements.txt
  2. install tensorflow or tensorflow-gpu, depending on whether your machine supports GPU configurations

Dataset & Preprocessing

Currently the only supported dataset is the CoNLL-2003, which can be found within the repo, additional instructions how to obtain and preprocess it can be found here

The CoNLL-2003 dataset contains around 15,000 sententences (~203,000 tokens), along with a validation and test sets consisting of 3466 sentences (~51,000 tokens) and 3684 sentences (~46,400 tokens) respectively. From these tokens there are mainly 5 different entities: Person (PER), Location(LOC), Organization(ORG), Miscellaneous (MISC) along with non entity elements tagged as (O).

Training the network

Before training the network you need to preprocess all the files. Do this by running python preprocess.py

The model can be trained across multiple GPUs to speed up the computations. In order to start the training:

Execute


python train.py ( <== Use all available GPUs )
or
CUDA_VISIBLE_DEVICES=0,1 python train.py ( <== Use only GPU 0, 1 )

Currently the model achieves up to 94 f1-score on the validation set.

Monitoring and Debugging the training

In order to monitor the training, accuracy and other interesting metrics like gradients, activations, distributions, etc. across layers do the following:

# when in the project's root directory
bash launch_tensorboard.sh

then open your browser http://localhost:6008/

(kudos to sugartensor for the great tf wrapper which handles all the monitoring out of the box)

During training you can also monitor the f1-scores on the validation set, which are written after every epoch on the console in a similar format:


Epoch 49 - f1 scores of the meaningful classes: [ 0.93855503  0.8487395   0.93145357  0.87792642]
Epoch 49 - total f1 score: 0.9051094890510949
Epoch 50 - f1 scores of the meaningful classes: [ 0.94793926  0.86325211  0.95052474  0.88527397]
Epoch 50 - total f1 score: 0.9179461364208408
Improved F1 score, max model saved in file: asset/train/max_model.ckpt

If the f1-score exceeds the previous best result a max_model.ckpt is saved automatically.

Testing

You can load any previously saved max_model.ckpt for further testing on different datasets, by running:


python test.py ( <== Use all available GPUs )
or
CUDA_VISIBLE_DEVICES=0,1 python test.py ( <== Use only GPU 0, 1 )

It will produces a similar output:

Precision scores of the meaningful classes: [ 0.9712689   0.89571279  0.89779375  0.80020146] 
Recall scores of the meaningful classes: [ 0.956563845  0.87803279  0.92972093  0.80620112]
F1 scores of the meaningful classes: [ 0.96567506  0.88388911  0.91051454  0.80530973]
Total precision score: 0.9188072859744991  
Total recall score: 0.8992679076693969   
Total f1 score: 0.9083685761886454

Which in the case of the CoNLL-2003 score represents the following table:

Class Precision Recall F1
PER 97.22 95.88 96.56
ORG 89.99 87.92 91.94
LOC 95.89 96.99 96.42
MISC 80.43 89.62 89.58
Total 93.92 95.88 94.35

The script will run by default on the testb dataset.

Future works

  1. Increase the number of supported datasets
  2. Put everything into Docker
  3. Create a REST API for an easy deploy as a service

My other repositories

  1. Deep-Atrous-CNN-Text-Network

Citation

If you find this code useful please cite me in your work:


George Stoyanov. Deep-Atrous-CNN-NER. 2017. GitHub repository. https://github.com/randomrandom.

Authors

George Stoyanov ([email protected])

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].