deepmipt / Ner

Licence: apache-2.0
Named Entity Recognition

This repository is outdated; please use https://github.com/deepmipt/DeepPavlov instead.

Neural Networks for Named Entity Recognition

This repo contains several neural network architectures for named entity recognition from the paper "Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition" https://arxiv.org/pdf/1709.09686.pdf, which is inspired by the LSTM+CRF architecture from https://arxiv.org/pdf/1603.01360.pdf.

The NER class from ner/network.py provides methods for constructing, training, and running inference with neural networks for Named Entity Recognition.

We provide a pre-trained CNN model for Russian Named Entity Recognition. The model was trained on three datasets:

  • Gareev corpus [1] (obtainable by request to authors)
  • FactRuEval 2016 [2]
  • NE3 (extended Persons-1000) [3, 4]

The pre-trained model recognizes the following entity types:

  • Persons (PER)
  • Locations (LOC)
  • Organizations (ORG)

A usage example for the pre-trained model is provided in example.ipynb.

Remark: at the training stage the corpora were lemmatized and lowercased, so input text must be tokenized, lemmatized, and lowercased before it is fed into the model.
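The preprocessing order described in the remark can be sketched as follows. This is a minimal illustration, not the repository's own pipeline: the names `tokenize` and `preprocess`, the regular expression, and the toy lemma dictionary are all assumptions, and the source does not say which lemmatizer was used during training.

```python
import re
from typing import Callable, List

# Hypothetical tokenizer: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> List[str]:
    """Split text into word tokens and individual punctuation marks."""
    return TOKEN_RE.findall(text)


def preprocess(text: str, lemmatize: Callable[[str], str] = lambda w: w) -> List[str]:
    """Tokenize, lemmatize, then lowercase, in that order.

    `lemmatize` maps one token to its lemma. The default is the identity
    function, a placeholder only: plug in a real Russian lemmatizer here.
    """
    return [lemmatize(token).lower() for token in tokenize(text)]


# Toy lemma dictionary, for illustration only.
TOY_LEMMAS = {"саммите": "саммит"}
tokens = preprocess("На саммите в США.", lambda w: TOY_LEMMAS.get(w, w))
print(tokens)
```

Passing the lemmatizer as a function keeps the sketch independent of any particular morphology library.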

The F1 measure of the presented model, along with those of other published solutions, is given in the table below:

| Models | Gareev's dataset | Persons-1000 | FactRuEval 2016 |
|---|---|---|---|
| Gareev et al. [1] | 75.05 | | |
| Malykh et al. [5] | 62.49 | | |
| Trofimov [6] | | 95.57 | |
| Rubaylo et al. [7] | | | 78.13 |
| Sysoev et al. [8] | | | 74.67 |
| Ivanitsky et al. [9] | | | 87.88 |
| Mozharova et al. [10] | | 97.21 | |
| Our (Bi-LSTM+CRF) | 87.17 | 99.26 | 82.10 |
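For readers unfamiliar with the metric, here is a minimal sketch of an entity-level F1 computation. It assumes exact-match scoring, where a predicted entity counts only if its boundaries and type both match the gold annotation. The source does not state which scoring script produced the numbers above, so the function name `entity_f1` and the toy data are illustrative assumptions.

```python
from typing import Set, Tuple

Entity = Tuple[int, int, str]  # (start token index, end token index, type)


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Exact-match F1: harmonic mean of precision and recall over entities."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data: the LOC entity matches, the ORG entity has a wrong right boundary.
gold = {(4, 5, "LOC"), (6, 9, "ORG")}
predicted = {(4, 5, "LOC"), (6, 8, "ORG")}
score = entity_f1(gold, predicted)
print(f"{100 * score:.2f}")  # reported as a percentage, as in the table
```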

Usage

Installing

The toolkit is implemented in Python 3 and requires a number of packages. To install all needed packages use:

$ pip3 install -r requirements.txt

or

$ pip3 install git+https://github.com/deepmipt/ner

Warning: the requirements file does not specify the GPU version of TensorFlow. Install it separately if you need GPU support.

Command-Line Interface

The simplest way to use the pre-trained Russian NER model is via the command-line interface:

$ echo "На конспирологическом саммите в США глава Федерального Бюро Расследований сделал невероятное заявление" | ./ner.py

На O
конспирологическом O
саммите O
в O
США B-LOC
глава O
Федерального B-ORG
Бюро I-ORG
Расследований I-ORG
сделал O
невероятное O
заявление O
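The output above uses BIO tags: `B-` marks the first token of an entity, `I-` a continuation, and `O` a token outside any entity. The following sketch groups such output into whole entities. It assumes one whitespace-separated `token TAG` pair per line, as shown above; the helper names `parse_cli_output` and `bio_to_entities` are hypothetical and not part of the repository.

```python
from typing import List, Tuple


def parse_cli_output(output: str) -> List[Tuple[str, str]]:
    """Parse 'token TAG' lines into (token, tag) pairs, skipping blank lines."""
    pairs = []
    for line in output.splitlines():
        if line.strip():
            token, tag = line.rsplit(None, 1)
            pairs.append((token, tag))
    return pairs


def bio_to_entities(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) tuples."""
    entities, current, current_type = [], [], None
    for token, tag in pairs:
        prefix, _, etype = tag.partition("-")
        if prefix == "I" and current and etype == current_type:
            current.append(token)  # continue the open entity
            continue
        if current:  # any other tag closes the open entity
            entities.append((" ".join(current), current_type))
            current, current_type = [], None
        if prefix in ("B", "I"):  # a stray I- tag is treated like B-
            current, current_type = [token], etype
    if current:
        entities.append((" ".join(current), current_type))
    return entities


sample = """в O
США B-LOC
глава O
Федерального B-ORG
Бюро I-ORG
Расследований I-ORG
сделал O"""
entities = bio_to_entities(parse_cli_output(sample))
print(entities)
```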

For interactive usage, simply run:

$ ./ner.py

Usage as module

>>> import ner
>>> extractor = ner.Extractor()
>>> for m in extractor("На конспирологическом саммите в США глава Федерального Бюро Расследований сделал невероятное заявление"):
...     print(m)
Match(tokens=[Token(span=(32, 35), text='США')], span=Span(start=32, end=35), type='LOC')
Match(tokens=[Token(span=(42, 54), text='Федерального'), Token(span=(55, 59), text='Бюро'), Token(span=(60, 73), text='Расследований')], span=Span(start=42, end=73), type='ORG')
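The spans in the output above are character offsets into the input string, with an exclusive end, so slicing the text with them recovers each entity. The sketch below demonstrates this using namedtuple stand-ins that mirror the printed representation. These stand-ins are an assumption made so the example runs without the package; the real `Match`, `Token`, and `Span` classes come from `ner` and may differ.

```python
from collections import namedtuple

# Stand-ins mirroring the printed repr; not the package's actual classes.
Token = namedtuple("Token", ["span", "text"])
Span = namedtuple("Span", ["start", "end"])
Match = namedtuple("Match", ["tokens", "span", "type"])

text = ("На конспирологическом саммите в США глава Федерального Бюро "
        "Расследований сделал невероятное заявление")

matches = [
    Match(tokens=[Token(span=(32, 35), text="США")],
          span=Span(start=32, end=35), type="LOC"),
    Match(tokens=[Token(span=(42, 54), text="Федерального"),
                  Token(span=(55, 59), text="Бюро"),
                  Token(span=(60, 73), text="Расследований")],
          span=Span(start=42, end=73), type="ORG"),
]

# Collect the surface form of each entity, grouped by entity type.
by_type = {}
for match in matches:
    surface = text[match.span.start:match.span.end]
    by_type.setdefault(match.type, []).append(surface)

print(by_type)
```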

Training

To see how to train the network and what data format is required, see the training_example.ipynb Jupyter notebook.

Literature

[1] - Rinat Gareev, Maksim Tkachenko, Valery Solovyev, Andrey Simanovsky, Vladimir Ivanov: Introducing Baselines for Russian Named Entity Recognition. Computational Linguistics and Intelligent Text Processing, 329 – 342 (2013).

[2] - https://github.com/dialogue-evaluation/factRuEval-2016

[3] - http://ai-center.botik.ru/Airec/index.php/ru/collections/28-persons-1000

[4] - http://labinform.ru/pub/named_entities/descr_ne.htm

[5] - Malykh et al.: Reproducing Russian NER Baseline Quality without Additional Data. In proceedings of the 3rd International Workshop on Concept Discovery in Unstructured Data, Moscow, Russia, 54 – 59 (2016).

[7] - Rubaylo A. V., Kosenko M. Y.: Software utilities for natural language information retrieval. Almanac of modern science and education, Volume 12 (114), 87 – 92 (2016).

[8] - Sysoev A. A., Andrianov I. A.: Named Entity Recognition in Russian: the Power of Wiki-Based Approach. dialog-21.ru

[9] - Ivanitskiy Roman, Alexander Shipilo, Liubov Kovriguina: Russian Named Entities Recognition and Classification Using Distributed Word and Phrase Representations. In SIMBig, 150 – 156 (2016).

[10] - Mozharova V., Loukachevitch N.: Two-stage approach in Russian named entity recognition. In Intelligence, Social Media and Web (ISMW FRUCT), 2016 International FRUCT Conference, 1 – 6 (2016).
