
kolloldas / Torchnlp

Licence: apache-2.0
Easy to use NLP library built on PyTorch and TorchText

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Torchnlp

Etagger
reference tensorflow code for named entity tagging
Stars: ✭ 100 (-57.08%)
Mutual labels:  crf, transformer
NLP-paper
🎨 NLP (natural language processing) tutorial 🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-90.13%)
Mutual labels:  crf, transformer
Sentimentanalysis
Sentiment analysis neural network trained by fine-tuning BERT, ALBERT, or DistilBERT on the Stanford Sentiment Treebank.
Stars: ✭ 186 (-20.17%)
Mutual labels:  transformer
Multigraph transformer
transformer, multi-graph transformer, graph, graph classification, sketch recognition, sketch classification, free-hand sketch, official code of the paper "Multi-Graph Transformer for Free-Hand Sketch Recognition"
Stars: ✭ 231 (-0.86%)
Mutual labels:  transformer
Bert Chainer
Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
Stars: ✭ 205 (-12.02%)
Mutual labels:  transformer
Kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
Stars: ✭ 190 (-18.45%)
Mutual labels:  transformer
Sttn
[ECCV'2020] STTN: Learning Joint Spatial-Temporal Transformations for Video Inpainting
Stars: ✭ 211 (-9.44%)
Mutual labels:  transformer
Transformer Clinic
Understanding the Difficulty of Training Transformers
Stars: ✭ 179 (-23.18%)
Mutual labels:  transformer
Fancy Nlp
NLP for human. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.
Stars: ✭ 233 (+0%)
Mutual labels:  crf
Linear Attention Transformer
Transformer based on a variant of attention that is linear complexity in respect to sequence length
Stars: ✭ 205 (-12.02%)
Mutual labels:  transformer
Self Attention Cv
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
Stars: ✭ 209 (-10.3%)
Mutual labels:  transformer
Pytorch Transformer
pytorch implementation of Attention is all you need
Stars: ✭ 199 (-14.59%)
Mutual labels:  transformer
Graphtransformer
Graph Transformer Architecture. Source code for "A Generalization of Transformer Networks to Graphs", DLG-AAAI'21.
Stars: ✭ 187 (-19.74%)
Mutual labels:  transformer
Yin
The efficient and elegant JSON:API 1.1 server library for PHP
Stars: ✭ 214 (-8.15%)
Mutual labels:  transformer
Graph Transformer
Transformer for Graph Classification (Pytorch and Tensorflow)
Stars: ✭ 191 (-18.03%)
Mutual labels:  transformer
Nn
🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Stars: ✭ 5,720 (+2354.94%)
Mutual labels:  transformer
Fairseq Image Captioning
Transformer-based image captioning extension for pytorch/fairseq
Stars: ✭ 180 (-22.75%)
Mutual labels:  transformer
Lumen Api Starter
An API starter project built on Lumen 8, with a carefully designed directory structure, a consistent and standardized response data format, and a best-practice Repository-pattern architecture.
Stars: ✭ 197 (-15.45%)
Mutual labels:  transformer
Hardware Aware Transformers
[ACL 2020] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Stars: ✭ 206 (-11.59%)
Mutual labels:  transformer
Jddc solution 4th
4th place solution for the 2018 JDDC competition
Stars: ✭ 235 (+0.86%)
Mutual labels:  transformer

TorchNLP

TorchNLP is a deep learning library for NLP tasks. Built on PyTorch and TorchText, it is an attempt to provide reusable components that work across tasks. Currently it can be used for Named Entity Recognition (NER) and Chunking tasks with a Bidirectional LSTM CRF model and a Transformer network model. It supports any dataset that uses the CoNLL 2003 format. More tasks will be added shortly.

High Level Workflow

  1. Define the NLP task
  2. Extend the Model class and implement the forward() and loss() methods to return predictions and loss respectively (see the sketch after this list)
  3. Use the HParams class to easily define the hyperparameters for the model
  4. Define a data function to return dataset iterators, vocabularies etc. using the TorchText API. Check conll.py for an example
  5. Set up the Evaluator and Trainer classes to use the model, dataset iterators and metrics. Check ner.py for details
  6. Run the trainer for the desired number of epochs along with an early stopping criterion
  7. Use the evaluator to evaluate the trained model on a specific dataset split
  8. Run inference on the trained model using the available input processors
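
As a rough illustration of step 2, a custom model could look something like the sketch below. This is only a sketch under assumptions: the import path, the base-class constructor and the exact return values expected from forward() and loss() are guesses, so check the Model class and the existing taggers in the source for the real contract.

import torch.nn as nn
from torchnlp.common.model import Model  # import path is an assumption

class MyTagger(Model):
    # Hypothetical task model; signatures below are illustrative only
    def __init__(self, hparams, vocab_size, num_tags):
        super().__init__(hparams)
        self.embed = nn.Embedding(vocab_size, hparams.hidden_size)
        self.out = nn.Linear(hparams.hidden_size, num_tags)

    def forward(self, inputs):
        # Predictions: a score for every tag at every token position
        return self.out(self.embed(inputs))

    def loss(self, batch, compute_predictions=False):
        # Loss consumed by the Trainer; the batch field names are placeholders
        logits = self.forward(batch.inputs_word)
        loss = nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), batch.labels.view(-1))
        predictions = logits.argmax(dim=-1) if compute_predictions else None
        return loss, predictions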

Boilerplate Components

  • Model: Handles loading and saving of models as well as the associated hyperparameters
  • HParams: Generic class to define hyperparameters. Can be persisted
  • Trainer: Trains a given model on a dataset. Supports features like predefined learning rate decay schedules and early stopping (see the sketch after this list)
  • Evaluator: Evaluates the model on a dataset and multiple predefined or custom metrics
  • get_input_processor_words: Use during inference to quickly convert input strings into a format that can be processed by a model
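
To make the roles of these components concrete, a task script might wire them together roughly as in the sketch below. Nothing here should be read as TorchNLP's exact API: the import paths, constructor signatures and the data helper are assumptions, and ner.py remains the reference implementation.

# Illustrative sketch only; import paths and signatures are assumptions
from torchnlp.common.hparams import HParams
from torchnlp.common.train import Trainer
from torchnlp.common.evaluation import Evaluator

hparams = HParams(batch_size=100, hidden_size=128, learning_rate=0.2)

# Step 4 of the workflow: a data function built with TorchText returning
# dataset iterators and vocabularies (hypothetical helper and return layout)
train_iter, val_iter, test_iter, vocabs = build_dataset_iterators(hparams.batch_size)

# MyTagger is the hypothetical Model subclass sketched earlier
model = MyTagger(hparams, len(vocabs['words']), len(vocabs['tags']))

trainer = Trainer('ner-conll2003', model, train_iter, val_iter)  # assumed signature
trainer.train(num_epochs=100)                                    # early stopping applies

evaluator = Evaluator(model, test_iter)                          # assumed signature
evaluator.evaluate()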

Available Models

  • transformer.Encoder, transformer.Decoder: Transformer network implementation from "Attention Is All You Need"
  • CRF: Conditional Random Field layer which can be used as the final output
  • TransformerTagger: Sequence tagging model implemented using the Transformer network and CRF
  • BiLSTMTagger: Sequence tagging model implemented using bidirectional LSTMs and CRF

Installation

TorchNLP requires a minimum of Python 3.5 and PyTorch 0.4.0 to run. Check the PyTorch site for installation steps. Clone this repository and install the remaining dependencies, such as TorchText:

pip install -r requirements.txt

From the root of the project, verify the setup by running the tests with PyTest:

pytest

Install this project:

python setup.py install

Usage

TorchNLP is designed to be used inside the Python interpreter, making it easier to experiment without typing cumbersome command-line arguments.

NER Task

The NER task can be run on any dataset that conforms to the CoNLL 2003 format. To use the CoNLL 2003 NER dataset, place the dataset files in the following directory structure within your workspace root:

.data
  |
  |---conll2003
          |
          |---eng.train.txt
          |---eng.testa.txt
          |---eng.testb.txt

eng.testa.txt is used as the validation dataset and eng.testb.txt is used as the test dataset.

Start the NER module in the Python shell, which sets up the necessary imports:

python -i -m torchnlp.ner
Task: Named Entity Recognition

Available models:
-------------------
TransformerTagger

    Sequence tagger using the Transformer network (https://arxiv.org/pdf/1706.03762.pdf)
    Specifically it uses the Encoder module. For character embeddings (per word) it uses
    the same Encoder module above which an additive (Bahdanau) self-attention layer is added

BiLSTMTagger

    Sequence tagger using bidirectional LSTM. For character embeddings per word
    uses (unidirectional) LSTM


Available datasets:
-------------------
    conll2003: Conll 2003 (Parser only. You must place the files)

>>>

Train the Transformer model on the CoNLL 2003 dataset:

>>> train('ner-conll2003', TransformerTagger, conll2003)

The first argument is the task name. You need to use the same task name during evaluation and inference. By default the train function uses the F1 metric with a window of 5 epochs to perform early stopping. To change the early stopping criterion, set the PREFS global variable as follows:

>>> PREFS.early_stopping='lowest_3_loss'

This will now use validation loss as the stopping criterion with a window of 3 epochs. The model files are saved under the taskname-modelname directory; in this case, ner-conll2003-TransformerTagger.

Evaluate the trained model on the testb dataset split:

>>> evaluate('ner-conll2003', TransformerTagger, conll2003, 'test')

It will display metrics like accuracy, sequence accuracy, F1, etc.

Run the trained model interactively for the ner task:

>>> interactive('ner-conll2003', TransformerTagger)
...
Ctrl+C to quit
> Tom went to New York
I-PER O O I-LOC I-LOC

You can similarly train the bidirectional LSTM CRF model by using the BiLSTMTagger class. Customizing hyperparameters is quite straightforward. Let's look at the hyperparameters for TransformerTagger:

>>> h2 = hparams_transformer_ner()
>>> h2

Hyperparameters:
 filter_size=128
 optimizer_adam_beta2=0.98
 learning_rate=0.2
 learning_rate_warmup_steps=500
 input_dropout=0.2
 embedding_size_char=16
 dropout=0.2
 hidden_size=128
 optimizer_adam_beta1=0.9
 embedding_size_word=300
 max_length=256
 attention_dropout=0.2
 relu_dropout=0.2
 batch_size=100
 num_hidden_layers=1
 attention_value_channels=0
 attention_key_channels=0
 use_crf=True
 embedding_size_tags=100
 learning_rate_decay=noam_step
 embedding_size_char_per_word=100
 num_heads=4
 filter_size_char=64

Now let's disable the CRF layer:

>>> h2.update(use_crf=False)

Hyperparameters:
filter_size=128
optimizer_adam_beta2=0.98
learning_rate=0.2
learning_rate_warmup_steps=500
input_dropout=0.2
embedding_size_char=16
dropout=0.2
hidden_size=128
optimizer_adam_beta1=0.9
embedding_size_word=300
max_length=256
attention_dropout=0.2
relu_dropout=0.2
batch_size=100
num_hidden_layers=1
attention_value_channels=0
attention_key_channels=0
use_crf=False
embedding_size_tags=100
learning_rate_decay=noam_step
embedding_size_char_per_word=100
num_heads=4
filter_size_char=64

Use it to re-train the model:

>>> train('ner-conll2003-nocrf', TransformerTagger, conll2003, hparams=h2)

Along with the model, the hyperparameters are also saved, so there is no need to pass the HParams object during evaluation. Also note that by default it will not overwrite any existing model directory (it will rename the old one instead). To change that behavior, set the PREFS variable:

>>> PREFS.overwrite_model_dir = True

The PREFS variable is automatically persisted in prefs.json.

Chunking Task

The CoNLL 2000 dataset is available for the Chunking task. The dataset is automatically downloaded from the public repository, so there is no need to download it manually.

Start the Chunking task:

python -i -m torchnlp.chunk

Train the Transformer model:

>>> train('chunk-conll2000', TransformerTagger, conll2000)

There is no validation partition provided in the repository, so 10% of the training set is used for validation.

Evaluate the model on the test set:

>>> evaluate('chunk-conll2000', TransformerTagger, conll2000, 'test')

Standalone Use

The transformer.Encoder, transformer.Decoder and CRF modules can be independently imported as they only depend on PyTorch:

from torchnlp.modules.transformer import Encoder
from torchnlp.modules.transformer import Decoder
from torchnlp.modules.crf import CRF

Please refer to the comments within the source code for more details on usage.
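
For instance, a standalone Encoder could be run over a batch of pre-embedded tokens roughly as in the sketch below. The keyword argument names are assumptions loosely based on the hyperparameters listed earlier and may not match the actual constructors, so treat the module docstrings as authoritative.

import torch
from torchnlp.modules.transformer import Encoder
from torchnlp.modules.crf import CRF

# Constructor arguments are assumptions; check the module source for the
# real signatures and defaults
encoder = Encoder(embedding_size=300, hidden_size=128, num_layers=1,
                  num_heads=4, filter_size=128)

embedded = torch.randn(8, 20, 300)   # (batch, sequence length, embedding size)
hidden = encoder(embedded)           # contextualised token representations

crf = CRF(10)                        # assumed: takes the number of output tags
# Typical pattern: project the encoder outputs to per-tag scores, then let the
# CRF layer score and decode the best tag sequence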
