Implementation of Nested Named Entity Recognition

Some files are part of NeuroNLP2.

Requirements

We tested this code with the following library versions:

Python (3.7)
PyTorch (1.3.0)
Numpy (1.17.3)
AdaBound (0.0.5)
StanfordNLP (0.2.0) for accessing the Java Stanford CoreNLP Server (3.9.2)
Transformers (2.1.1)
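
To compare a local environment against the versions listed above, a small check like the following may help. It is a sketch, not part of this repository: the PyPI distribution names (torch, numpy, adabound, stanfordnlp, transformers) are assumptions, and the helper names are made up for illustration. The Java CoreNLP server version is not checked.

```python
import sys

try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Python 3.7 needs the importlib_metadata backport installed
    from importlib_metadata import version, PackageNotFoundError

# Versions from the list above; the distribution names are assumed
TESTED_VERSIONS = {
    "torch": "1.3.0",
    "numpy": "1.17.3",
    "adabound": "0.0.5",
    "stanfordnlp": "0.2.0",
    "transformers": "2.1.1",
}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def compare_versions(tested, lookup=installed_version):
    """Return {package: (tested, installed)} for every package that differs."""
    mismatches = {}
    for name, wanted in tested.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "(tested with 3.7)")
    for name, (wanted, found) in compare_versions(TESTED_VERSIONS).items():
        print(f"{name}: tested with {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean the code will fail; it only shows where the environment differs from the tested one.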

Running experiments

Testing this library with sample data

Run gen_data.py to generate the processed data files for training; they will be placed in the "./data/" directory
```
python gen_data.py
```
Run train.py to start training
```
python train.py
```

Reproducing our experiment on the ACE-2004 dataset

Put the ACE-2004 corpus into the "../ACE2004/" directory
Put this .tgz file into the "../" directory and extract it
Run parse_ace2004.py to extract the sentences for training; they will be placed in the "./data/ace2004/" directory
```
python parse_ace2004.py
```
Run gen_data_for_ace2004.py to prepare the processed data files for training; they will be placed in the "./data/" directory
```
python gen_data_for_ace2004.py
```
Run train.py to start training
```
python train.py
```

Reproducing our experiment on the ACE-2005 dataset

Put the ACE-2005 corpus into the "../ACE2005/" directory
Put this .tgz file into the "../" directory and extract it
Run parse_ace2005.py to extract the sentences for training; they will be placed in the "./data/ace2005/" directory
```
python parse_ace2005.py
```
Run gen_data_for_ace2005.py to prepare the processed data files for training; they will be placed in the "./data/" directory
```
python gen_data_for_ace2005.py
```
Run train.py to start training
```
python train.py
```

Reproducing our experiment on the GENIA dataset

Put the GENIA corpus into the "../GENIA/" directory
Run parse_genia.py to extract the sentences for training; they will be placed in the "./data/genia/" directory
```
python parse_genia.py
```
Run gen_data_for_genia.py to prepare the processed data files for training; they will be placed in the "./data/" directory
```
python gen_data_for_genia.py
```
Run train.py to start training
```
python train.py
```
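
Each experiment above runs the same kind of sequence: parse the corpus (not needed for the sample data), generate the processed files, then train. The steps could be chained with a small driver such as the sketch below. The script names come from the steps above; the driver itself, the dataset keys, and the function names are hypothetical and not part of this repository. It assumes the corpora are already in place and that it is run from the repository root.

```python
import subprocess
import sys

# Script names are taken from the steps above; the dataset keys are illustrative
PIPELINES = {
    "sample": ["gen_data.py", "train.py"],
    "ace2004": ["parse_ace2004.py", "gen_data_for_ace2004.py", "train.py"],
    "ace2005": ["parse_ace2005.py", "gen_data_for_ace2005.py", "train.py"],
    "genia": ["parse_genia.py", "gen_data_for_genia.py", "train.py"],
}


def build_commands(dataset, python=sys.executable):
    """Return the commands for one dataset, in the order listed above."""
    if dataset not in PIPELINES:
        raise ValueError(f"unknown dataset {dataset!r}; choose from {sorted(PIPELINES)}")
    return [[python, script] for script in PIPELINES[dataset]]


def run_pipeline(dataset, runner=subprocess.run):
    """Run each step in turn, stopping at the first step that fails."""
    for command in build_commands(dataset):
        print("running:", " ".join(command))
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit
        runner(command, check=True)
```

For example, `run_pipeline("genia")` would run parse_genia.py, gen_data_for_genia.py, and train.py in that order, and stop early if any of them exits with an error.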

Configuration

The model and training settings are defined in config.py
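
Before changing a setting, it can be useful to list what config.py currently defines. The sketch below is a generic inspection helper, not part of this repository. It assumes the settings are plain attributes, either at module level or on an object defined in the module; the layout of config.py is not described here, so adapt as needed.

```python
import importlib
import types


def public_settings(obj):
    """Return {name: value} for attributes that look like plain settings.

    Skips private names, callables (functions, classes) and imported modules.
    """
    return {
        name: value
        for name, value in vars(obj).items()
        if not name.startswith("_")
        and not callable(value)
        and not isinstance(value, types.ModuleType)
    }


def show_config(module_name="config"):
    """Import the named module and print its settings in alphabetical order."""
    module = importlib.import_module(module_name)
    for name in sorted(public_settings(module)):
        print(f"{name} = {getattr(module, name)!r}")
```

Calling `show_config()` from the repository root prints the module-level values. If the settings live on an object inside the module, pass that object to `public_settings` instead.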

Citation

Please cite our paper:

@article{shibuya-hovy-2020-nested,
  title = "Nested Named Entity Recognition via Second-best Sequence Learning and Decoding",
  author = "Shibuya, Takashi and Hovy, Eduard",
  journal = "Transactions of the Association for Computational Linguistics",
  volume = "8",
  year = "2020",
  doi = "10.1162/tacl_a_00334",
  pages = "605--620",
}
