
kolloldas / Torchnlp

Licence: apache-2.0
Easy to use NLP library built on PyTorch and TorchText

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Torchnlp

Etagger
reference tensorflow code for named entity tagging
Stars: ✭ 100 (-57.08%)
Mutual labels:  crf, transformer
NLP-paper
🎨 NLP (natural language processing) tutorial 🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-90.13%)
Mutual labels:  crf, transformer
Sentimentanalysis
Sentiment analysis neural network trained by fine-tuning BERT, ALBERT, or DistilBERT on the Stanford Sentiment Treebank.
Stars: ✭ 186 (-20.17%)
Mutual labels:  transformer
Multigraph transformer
transformer, multi-graph transformer, graph, graph classification, sketch recognition, sketch classification, free-hand sketch, official code of the paper "Multi-Graph Transformer for Free-Hand Sketch Recognition"
Stars: ✭ 231 (-0.86%)
Mutual labels:  transformer
Bert Chainer
Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
Stars: ✭ 205 (-12.02%)
Mutual labels:  transformer
Kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
Stars: ✭ 190 (-18.45%)
Mutual labels:  transformer
Sttn
[ECCV'2020] STTN: Learning Joint Spatial-Temporal Transformations for Video Inpainting
Stars: ✭ 211 (-9.44%)
Mutual labels:  transformer
Transformer Clinic
Understanding the Difficulty of Training Transformers
Stars: ✭ 179 (-23.18%)
Mutual labels:  transformer
Fancy Nlp
NLP for human. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.
Stars: ✭ 233 (+0%)
Mutual labels:  crf
Linear Attention Transformer
Transformer based on a variant of attention that is linear complexity in respect to sequence length
Stars: ✭ 205 (-12.02%)
Mutual labels:  transformer
Self Attention Cv
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
Stars: ✭ 209 (-10.3%)
Mutual labels:  transformer
Pytorch Transformer
pytorch implementation of Attention is all you need
Stars: ✭ 199 (-14.59%)
Mutual labels:  transformer
Graphtransformer
Graph Transformer Architecture. Source code for "A Generalization of Transformer Networks to Graphs", DLG-AAAI'21.
Stars: ✭ 187 (-19.74%)
Mutual labels:  transformer
Yin
The efficient and elegant JSON:API 1.1 server library for PHP
Stars: ✭ 214 (-8.15%)
Mutual labels:  transformer
Graph Transformer
Transformer for Graph Classification (Pytorch and Tensorflow)
Stars: ✭ 191 (-18.03%)
Mutual labels:  transformer
Nn
🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Stars: ✭ 5,720 (+2354.94%)
Mutual labels:  transformer
Fairseq Image Captioning
Transformer-based image captioning extension for pytorch/fairseq
Stars: ✭ 180 (-22.75%)
Mutual labels:  transformer
Lumen Api Starter
An API starter project built on Lumen 8, with a carefully designed directory structure, a consistent and standardized response data format, and a best-practice Repository-pattern architecture.
Stars: ✭ 197 (-15.45%)
Mutual labels:  transformer
Hardware Aware Transformers
[ACL 2020] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Stars: ✭ 206 (-11.59%)
Mutual labels:  transformer
Jddc solution 4th
4th place solution for the 2018 JDDC competition
Stars: ✭ 235 (+0.86%)
Mutual labels:  transformer

TorchNLP

TorchNLP is a deep learning library for NLP tasks. Built on PyTorch and TorchText, it is an attempt to provide reusable components that work across tasks. Currently it can be used for Named Entity Recognition (NER) and Chunking tasks with a Bidirectional LSTM CRF model and a Transformer network model. It supports any dataset that uses the CoNLL 2003 format. More tasks will be added shortly.

High Level Workflow

  1. Define the NLP task
  2. Extend the Model class and implement the forward() and loss() methods to return predictions and loss respectively (see the sketch after this list)
  3. Use the HParams class to easily define the hyperparameters for the model
  4. Define a data function to return dataset iterators, vocabularies etc. using the TorchText API. Check conll.py for an example
  5. Set up the Evaluator and Trainer classes to use the model, dataset iterators and metrics. Check ner.py for details
  6. Run the trainer for the desired number of epochs along with an early stopping criterion
  7. Use the evaluator to evaluate the trained model on a specific dataset split
  8. Run inference on the trained model using the available input processors
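
As a rough illustration of step 2, a custom model could look something like the sketch below. This is only a sketch under assumptions: the import path, the base-class constructor and the exact return values expected from forward() and loss() are guesses, so check the Model class and the existing taggers in the source for the real contract.

import torch.nn as nn
from torchnlp.common.model import Model  # import path is an assumption

class MyTagger(Model):
    # Hypothetical task model; signatures below are illustrative only
    def __init__(self, hparams, vocab_size, num_tags):
        super().__init__(hparams)
        self.embed = nn.Embedding(vocab_size, hparams.hidden_size)
        self.out = nn.Linear(hparams.hidden_size, num_tags)

    def forward(self, inputs):
        # Predictions: a score for every tag at every token position
        return self.out(self.embed(inputs))

    def loss(self, batch, compute_predictions=False):
        # Loss consumed by the Trainer; the batch field names are placeholders
        logits = self.forward(batch.inputs_word)
        loss = nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), batch.labels.view(-1))
        predictions = logits.argmax(dim=-1) if compute_predictions else None
        return loss, predictions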

Boilerplate Components

  • Model: Handles loading and saving of models as well as the associated hyperparameters
  • HParams: Generic class to define hyperparameters. Can be persisted
  • Trainer: Trains a given model on a dataset. Supports features like predefined learning rate decay schedules and early stopping (see the sketch after this list)
  • Evaluator: Evaluates the model on a dataset and multiple predefined or custom metrics
  • get_input_processor_words: Use during inference to quickly convert input strings into a format that can be processed by a model
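
To make the roles of these components concrete, a task script might wire them together roughly as in the sketch below. Nothing here should be read as TorchNLP's exact API: the import paths, constructor signatures and the data helper are assumptions, and ner.py remains the reference implementation.

# Illustrative sketch only; import paths and signatures are assumptions
from torchnlp.common.hparams import HParams
from torchnlp.common.train import Trainer
from torchnlp.common.evaluation import Evaluator

hparams = HParams(batch_size=100, hidden_size=128, learning_rate=0.2)

# Step 4 of the workflow: a data function built with TorchText returning
# dataset iterators and vocabularies (hypothetical helper and return layout)
train_iter, val_iter, test_iter, vocabs = build_dataset_iterators(hparams.batch_size)

# MyTagger is the hypothetical Model subclass sketched earlier
model = MyTagger(hparams, len(vocabs['words']), len(vocabs['tags']))

trainer = Trainer('ner-conll2003', model, train_iter, val_iter)  # assumed signature
trainer.train(num_epochs=100)                                    # early stopping applies

evaluator = Evaluator(model, test_iter)                          # assumed signature
evaluator.evaluate()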

Available Models

  • transformer.Encoder, transformer.Decoder: Transformer network implementation from "Attention Is All You Need"
  • CRF: Conditional Random Field layer which can be used as the final output
  • TransformerTagger: Sequence tagging model implemented using the Transformer network and CRF
  • BiLSTMTagger: Sequence tagging model implemented using bidirectional LSTMs and CRF

Installation

TorchNLP requires a minimum of Python 3.5 and PyTorch 0.4.0 to run. Check the PyTorch site for installation steps. Clone this repository and install the remaining dependencies, such as TorchText:

pip install -r requirements.txt

From the root of the project, verify the setup by running the tests with PyTest:

pytest

Install this project:

python setup.py install

Usage

TorchNLP is designed to be used inside the Python interpreter, making it easier to experiment without typing cumbersome command-line arguments.

NER Task

The NER task can be run on any dataset that conforms to the CoNLL 2003 format. To use the CoNLL 2003 NER dataset, place the dataset files in the following directory structure within your workspace root:

.data
  |
  |---conll2003
          |
          |---eng.train.txt
          |---eng.testa.txt
          |---eng.testb.txt

eng.testa.txt is used as the validation dataset and eng.testb.txt is used as the test dataset.

Start the NER module in the Python shell, which sets up the necessary imports:

python -i -m torchnlp.ner
Task: Named Entity Recognition

Available models:
-------------------
TransformerTagger

    Sequence tagger using the Transformer network (https://arxiv.org/pdf/1706.03762.pdf)
    Specifically it uses the Encoder module. For character embeddings (per word) it uses
    the same Encoder module above which an additive (Bahdanau) self-attention layer is added

BiLSTMTagger

    Sequence tagger using bidirectional LSTM. For character embeddings per word
    uses (unidirectional) LSTM


Available datasets:
-------------------
    conll2003: Conll 2003 (Parser only. You must place the files)

>>>

Train the Transformer model on the CoNLL 2003 dataset:

>>> train('ner-conll2003', TransformerTagger, conll2003)

The first argument is the task name. You need to use the same task name during evaluation and inference. By default the train function uses the F1 metric with a window of 5 epochs to perform early stopping. To change the early stopping criterion, set the PREFS global variable as follows:

>>> PREFS.early_stopping='lowest_3_loss'

This will now use validation loss as the stopping criterion with a window of 3 epochs. The model files are saved under the taskname-modelname directory; in this case, ner-conll2003-TransformerTagger.

Evaluate the trained model on the testb dataset split:

>>> evaluate('ner-conll2003', TransformerTagger, conll2003, 'test')

It will display metrics like accuracy, sequence accuracy, F1, etc.

Run the trained model interactively for the ner task:

>>> interactive('ner-conll2003', TransformerTagger)
...
Ctrl+C to quit
> Tom went to New York
I-PER O O I-LOC I-LOC

You can similarly train the bidirectional LSTM CRF model by using the BiLSTMTagger class. Customizing hyperparameters is quite straightforward. Let's look at the hyperparameters for TransformerTagger:

>>> h2 = hparams_transformer_ner()
>>> h2

Hyperparameters:
 filter_size=128
 optimizer_adam_beta2=0.98
 learning_rate=0.2
 learning_rate_warmup_steps=500
 input_dropout=0.2
 embedding_size_char=16
 dropout=0.2
 hidden_size=128
 optimizer_adam_beta1=0.9
 embedding_size_word=300
 max_length=256
 attention_dropout=0.2
 relu_dropout=0.2
 batch_size=100
 num_hidden_layers=1
 attention_value_channels=0
 attention_key_channels=0
 use_crf=True
 embedding_size_tags=100
 learning_rate_decay=noam_step
 embedding_size_char_per_word=100
 num_heads=4
 filter_size_char=64

Now let's disable the CRF layer:

>>> h2.update(use_crf=False)

Hyperparameters:
filter_size=128
optimizer_adam_beta2=0.98
learning_rate=0.2
learning_rate_warmup_steps=500
input_dropout=0.2
embedding_size_char=16
dropout=0.2
hidden_size=128
optimizer_adam_beta1=0.9
embedding_size_word=300
max_length=256
attention_dropout=0.2
relu_dropout=0.2
batch_size=100
num_hidden_layers=1
attention_value_channels=0
attention_key_channels=0
use_crf=False
embedding_size_tags=100
learning_rate_decay=noam_step
embedding_size_char_per_word=100
num_heads=4
filter_size_char=64

Use it to re-train the model:

>>> train('ner-conll2003-nocrf', TransformerTagger, conll2003, hparams=h2)

Along with the model, the hyperparameters are also saved, so there is no need to pass the HParams object during evaluation. Also note that by default it will not overwrite any existing model directory (it will rename the old one instead). To change that behavior, set the PREFS variable:

>>> PREFS.overwrite_model_dir = True

The PREFS variable is automatically persisted in prefs.json.

Chunking Task

The CoNLL 2000 dataset is available for the Chunking task. The dataset is automatically downloaded from the public repository, so there is no need to download it manually.

Start the Chunking task:

python -i -m torchnlp.chunk

Train the Transformer model:

>>> train('chunk-conll2000', TransformerTagger, conll2000)

There is no validation partition provided in the repository, so 10% of the training set is used for validation.

Evaluate the model on the test set:

>>> evaluate('chunk-conll2000', TransformerTagger, conll2000, 'test')

Standalone Use

The transformer.Encoder, transformer.Decoder and CRF modules can be independently imported as they only depend on PyTorch:

from torchnlp.modules.transformer import Encoder
from torchnlp.modules.transformer import Decoder
from torchnlp.modules.crf import CRF

Please refer to the comments within the source code for more details on usage.
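
For instance, a standalone Encoder could be run over a batch of pre-embedded tokens roughly as in the sketch below. The keyword argument names are assumptions loosely based on the hyperparameters listed earlier and may not match the actual constructors, so treat the module docstrings as authoritative.

import torch
from torchnlp.modules.transformer import Encoder
from torchnlp.modules.crf import CRF

# Constructor arguments are assumptions; check the module source for the
# real signatures and defaults
encoder = Encoder(embedding_size=300, hidden_size=128, num_layers=1,
                  num_heads=4, filter_size=128)

embedded = torch.randn(8, 20, 300)   # (batch, sequence length, embedding size)
hidden = encoder(embedded)           # contextualised token representations

crf = CRF(10)                        # assumed: takes the number of output tags
# Typical pattern: project the encoder outputs to per-tag scores, then let the
# CRF layer score and decode the best tag sequence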
