sebastianruder / Sluice Networks

Code for Sluice networks: Learning what to share between loosely related tasks

Projects that are alternatives of or similar to Sluice Networks

Autogluon
AutoGluon: AutoML for Text, Image, and Tabular Data
Stars: ✭ 3,920 (+2803.7%)
Mutual labels:  natural-language-processing, transfer-learning
Learn To Select Data
Code for Learning to select data for transfer learning with Bayesian Optimization
Stars: ✭ 140 (+3.7%)
Mutual labels:  natural-language-processing, transfer-learning
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (+320%)
Mutual labels:  natural-language-processing, transfer-learning
Bert Sklearn
a sklearn wrapper for Google's BERT model
Stars: ✭ 182 (+34.81%)
Mutual labels:  natural-language-processing, transfer-learning
Spacy Transformers
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Stars: ✭ 919 (+580.74%)
Mutual labels:  natural-language-processing, transfer-learning
Chars2vec
Character-based word embeddings model based on RNN for handling real world texts
Stars: ✭ 130 (-3.7%)
Mutual labels:  natural-language-processing
Mk Tfjs
Play MK.js with TensorFlow.js
Stars: ✭ 133 (-1.48%)
Mutual labels:  transfer-learning
Rasa Chatbot Templates
RASA chatbot use case boilerplate
Stars: ✭ 127 (-5.93%)
Mutual labels:  natural-language-processing
Neuraldialog Larl
PyTorch implementation of latent space reinforcement learning for E2E dialog published at NAACL 2019. It is released by Tiancheng Zhao (Tony) from Dialog Research Center, LTI, CMU
Stars: ✭ 127 (-5.93%)
Mutual labels:  natural-language-processing
Mams For Absa
A Multi-Aspect Multi-Sentiment Dataset for aspect-based sentiment analysis.
Stars: ✭ 135 (+0%)
Mutual labels:  natural-language-processing
Shot
code released for our ICML 2020 paper "Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation"
Stars: ✭ 134 (-0.74%)
Mutual labels:  transfer-learning
Uda
Unsupervised Data Augmentation (UDA)
Stars: ✭ 1,877 (+1290.37%)
Mutual labels:  natural-language-processing
Konoha
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Stars: ✭ 130 (-3.7%)
Mutual labels:  natural-language-processing
Awesome Ai Services
An overview of the AI-as-a-service landscape
Stars: ✭ 133 (-1.48%)
Mutual labels:  natural-language-processing
Medquad
Medical Question Answering Dataset of 47,457 QA pairs created from 12 NIH websites
Stars: ✭ 129 (-4.44%)
Mutual labels:  natural-language-processing
Imagenet
Pytorch Imagenet Models Example + Transfer Learning (and fine-tuning)
Stars: ✭ 134 (-0.74%)
Mutual labels:  transfer-learning
Deep Lyrics
Lyrics Generator aka Character-level Language Modeling with Multi-layer LSTM Recurrent Neural Network
Stars: ✭ 127 (-5.93%)
Mutual labels:  natural-language-processing
Persian Stopwords
Persian (Farsi) Stop Words List
Stars: ✭ 131 (-2.96%)
Mutual labels:  natural-language-processing
Zamia Ai
Free and open source A.I. system based on Python, TensorFlow and Prolog.
Stars: ✭ 133 (-1.48%)
Mutual labels:  natural-language-processing
Tensorflow 1.4 Billion Password Analysis
Deep Learning model to analyze a large corpus of clear text passwords.
Stars: ✭ 1,720 (+1174.07%)
Mutual labels:  natural-language-processing

Sluice networks: Learning what to share between loosely related tasks

Sebastian Ruder, Joachim Bingel, Isabelle Augenstein, Anders Søgaard (2017). Sluice networks: Learning what to share between loosely related tasks. arXiv preprint arXiv:1705.08142.

[Figure: A Sluice Network with two tasks]

Installation instructions

The code works with Python 3.5. The main requirement is DyNet (and its dependencies).

DyNet can be installed by following the instructions here (use the detailed build instructions, not the TL;DR version).

Besides that, we use the progress package to display training progress; it can be installed via pip install progress.
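
For reference, a minimal environment setup might look like the two commands below; note that installing DyNet straight from PyPI is an assumption here, and the detailed build instructions linked above may be preferable on your platform:

pip install dynet
pip install progress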

Repository structure

  • constants.py: Contains constants used across all files.
  • predictors.py: Contains classes for sequence predictors and layers.
  • PTB2chunks.py: A script to extract chunks from the PTB format.
  • run_sluice_net.py: Script to train, load, and evaluate SluiceNetwork.
  • sluice_net.py: The main logic for the SluiceNetwork.
  • utils.py: Utility methods for data processing.

Model

The Bi-LSTM we use as the basis for SluiceNetwork builds on the state-of-the-art hierarchical Bi-LSTM tagger by Plank et al. (2016); you can find their repo here. For most of the Bi-LSTM hyperparameters, we adopt their choices.
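
For intuition only, here is a minimal NumPy sketch (not the repository's actual DyNet code) of the two sharing mechanisms SluiceNetwork adds on top of that tagger: cross-stitch (alpha) units, which linearly mix the hidden states of the task-specific layers, and layer-stitch (beta) units, which form a weighted sum over a task's layer outputs before the final classifier. All sizes and values are toy placeholders:

import numpy as np

# Cross-stitch (alpha) unit: each task's next layer reads a learned
# linear combination of both tasks' current hidden states.
h_a = np.random.randn(8)           # hidden state of task A's layer (toy size)
h_b = np.random.randn(8)           # hidden state of task B's layer
alpha = np.array([[0.9, 0.1],      # learned 2x2 mixing matrix, initialized
                  [0.1, 0.9]])     # near-identity (mostly task-specific)
h_a_mixed = alpha[0, 0] * h_a + alpha[0, 1] * h_b
h_b_mixed = alpha[1, 0] * h_a + alpha[1, 1] * h_b

# Layer-stitch (beta) unit: the input to a task's classifier is a
# learned weighted sum of that task's outputs at all layers.
layers = [np.random.randn(8) for _ in range(3)]  # outputs of 3 Bi-LSTM layers
beta = np.array([0.2, 0.3, 0.5])                 # learned layer weights
h_final = sum(b * h for b, h in zip(beta, layers))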

Example usage

python run_sluice_net.py --dynet-autobatch 1 --dynet-seed 123 \
                         --task-names chunk pos --h-layers 3  --pred-layer 3 3 \
                         --cross-stitch --layer-connect stitch \
                         --num-subspaces 2 --constraint-weight 0.1 \
                         --constrain-matrices 1 2 --patience 3 \
                         --train-dir ontonotes-5.0/train \
                         --dev-dir ontonotes-5.0/development \
                         --test-dir ontonotes-5.0/test \
                         --train bc --test bn mz nw wb \
                         --model-dir model/chunk_3_pos_3_bc_mz --log-dir logs
  • --dynet-autobatch 1: use DyNet auto-batching
  • --dynet-seed 123: provide a seed to DyNet
  • --task-names chunk pos: run the model with chunking as the main task and POS tagging as the auxiliary task
  • --h-layers 3: use 3 layers in the model
  • --cross-stitch: use cross-stitch (alpha) units
  • --layer-connect stitch: use layer-stitch (beta) units before the FC layer
  • --num-subspaces 2: use two subspaces
  • --constraint-weight 0.1: use the subspace orthogonality constraint with a weight of 0.1 (see the sketch after this list)
  • --constrain-matrices 1 2: place the constraint on the LSTM matrices with indices 1 and 2
  • --patience 3: use a patience of 3 for early stopping
  • --train-dir, --dev-dir, --test-dir: use the specified directories for training, development, and testing (the train, development, and test directories of the CoNLL formatted OntoNotes 5.0 data)
  • --train bc: train the model on the bc domain
  • --test bn mz nw wb: test the model on the bn, mz, nw, and wb domains
  • --model-dir: the directory of the model
  • --log-dir: the directory for logging
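
To make the subspace options concrete, here is a hedged toy sketch (illustrative shapes, not the repository's DyNet implementation) of how the orthogonality penalty controlled by --constraint-weight could be computed between the two subspaces of one constrained matrix:

import numpy as np

# Split a weight matrix into one block per subspace and penalize their
# overlap via the squared Frobenius norm of the cross-product; the
# penalty is zero when the two blocks are mutually orthogonal.
W = np.random.randn(16, 32)          # one constrained matrix (toy shape)
W1, W2 = W[:8], W[8:]                # --num-subspaces 2: one block each
penalty = np.sum((W1 @ W2.T) ** 2)   # ||W1 W2^T||_F^2
loss_term = 0.1 * penalty            # scaled by --constraint-weight 0.1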

Data

We use the English OntoNotes v5.0 data in the format used by the CoNLL 2011/2012 shared task.

To obtain the data, follow these steps:

  1. Obtain the OntoNotes v5.0 data from the LDC. Most universities have a copy of it.
  2. Convert the data to the CoNLL 2011/2012 format using the scripts here.

After these steps, you will have a file tree that looks like this:

conll-formatted-ontonotes-5.0/
└── data
    ├── development
    │   └── data
    │       └── english
    │           └── annotations
    │               ├── bc
    │               ├── bn
    │               ├── mz
    │               ├── nw
    │               ├── pt
    │               ├── tc
    │               └── wb
    ├── test
    │   └── data
    │       └── english
    │           └── annotations
    │               ├── bc
    │               ├── bn
    │               ├── mz
    │               ├── nw
    │               ├── pt
    │               ├── tc
    │               └── wb
    └── train
        └── data
            └── english
                └── annotations
                    ├── bc
                    ├── bn
                    ├── mz
                    ├── nw
                    ├── pt
                    ├── tc
                    └── wb

A leaf folder such as bc has the following structure:

bc
├── cctv
│   └── 00
├── cnn
│   └── 00
├── msnbc
│   └── 00
├── p2.5_a2e
│   └── 00
├── p2.5_c23
│   └── 00
└── phoenix
    └── 00

Each 00 folder should then contain *.gold_skel files with the CoNLL skeleton annotations and *.gold_conll files with the word forms and annotations in CoNLL format.

The *.gold_conll files contain annotations for POS tagging, parsing, word sense disambiguation, named entity recognition (NER), semantic role labeling (SRL), and coreference resolution (see here). We only use the POS tags, NER labels, and SRL predicate labels in our experiments.
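
As an illustration, a minimal reader for the POS annotations might look like the sketch below. It assumes the standard CoNLL-2011/2012 column layout (word form in column 4, POS tag in column 5) and is not the repository's actual loader; see utils.py for that:

# Hedged sketch: read (word, POS) pairs from a *.gold_conll file.
def read_word_pos(path):
    sentences, current = [], []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line.startswith('#'):   # skip #begin/#end document markers
                continue
            if not line:               # a blank line ends the sentence
                if current:
                    sentences.append(current)
                    current = []
                continue
            cols = line.split()
            current.append((cols[3], cols[4]))  # word form, POS tag
    if current:
        sentences.append(current)
    return sentences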

To obtain annotations for chunking, run the provided PTB2chunks.py script with the following command:

python PTB2chunks.py --original-folder ORIG_FOLDER --conll-folder CONLL_FOLDER

where ORIG_FOLDER is the root of the original OntoNotes 5.0 data directory (the directory whose first subdirectory is annotations) and CONLL_FOLDER is the conll-formatted-ontonotes-5.0/ directory described above.

The script then extracts the chunk annotations and creates *.chunks files next to the *.gold_skel and *.gold_conll files.

Reference

If you make use of the contents of this repository, please cite the following paper:

@article{ruder2017sluice,
  title={Sluice networks: Learning what to share between loosely related tasks},
  author={Ruder, Sebastian and Bingel, Joachim and Augenstein, Isabelle and S{\o}gaard, Anders},
  journal={arXiv preprint arXiv:1705.08142},
  year={2017}
}