
License: BSD-3-Clause


GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling


Introduction

This repository contains the code for our proposed GCDT, which deepens the state transition path at each position in a sentence and further assigns every token a global representation learned from the entire sentence. [paper]. The implementation is based on THUMT.

Usage

  • Trim GloVe
sh trim_glove.sh path_to_glove

path_to_glove is the path to your decompressed GloVe embedding file.
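The trimming step keeps training light by loading only the embedding rows actually needed. The script itself is not reproduced here; the following is a minimal Python sketch of the idea, assuming the standard GloVe text format (one word followed by its vector per line) and a hypothetical task vocabulary:

```python
# Hypothetical sketch of GloVe trimming: keep only the embedding rows
# whose word appears in the task vocabulary, so training loads a much
# smaller file. Data below is invented for illustration.
def trim_glove(glove_lines, vocab):
    """Yield only the GloVe lines whose first token is in `vocab`."""
    for line in glove_lines:
        word = line.split(" ", 1)[0]
        if word in vocab:
            yield line

glove = [
    "the 0.1 0.2 0.3",
    "cat 0.4 0.5 0.6",
    "zymurgy 0.7 0.8 0.9",
]
vocab = {"the", "cat"}
trimmed = list(trim_glove(glove, vocab))
print(trimmed)  # rows for "the" and "cat" survive; "zymurgy" is dropped
```

In practice the vocabulary would be collected from the training, evaluation, and test files so that no in-task word loses its pretrained vector.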

  • Training
sh train.sh task_name

task_name is the name of the task, either ner or chunking.

  • Evaluation and Testing
sh test.sh task_name test_type

Set test_type to testa for evaluation and testb for testing. Please note there is no evaluation set for the chunking task.

Requirements

  • TensorFlow 1.12
  • Python 3.5

Citation

Please cite the following paper if you use the code:

@InProceedings{Liu:19,
  author    = {Yijin Liu and Fandong Meng and Jinchao Zhang and Jinan Xu and Yufeng Chen and Jie Zhou},
  title     = {GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling},
  booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
}

FAQ

  • Why not evaluate along with training?

    For training efficiency, we first train a model for a specified number of steps, then restore checkpoints for evaluation and testing. For CoNLL03, we report the test-set score of the checkpoint that performs best on the evaluation set. For CoNLL2000, we compute the score on the test set directly.
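    The checkpoint-selection protocol described above can be sketched in a few lines (all checkpoint names and scores below are invented for illustration):

```python
# Illustrative sketch of the evaluation protocol: choose the checkpoint
# with the best dev (testa) score, then report its test (testb) score.
# All numbers and names here are made up.
dev_scores = {"ckpt-10000": 90.1, "ckpt-20000": 91.4, "ckpt-30000": 91.0}
test_scores = {"ckpt-10000": 89.5, "ckpt-20000": 90.8, "ckpt-30000": 90.2}

best_ckpt = max(dev_scores, key=dev_scores.get)   # checkpoint with best dev score
reported = test_scores[best_ckpt]                  # test score actually reported
print(best_ckpt, reported)
```

    For CoNLL2000 (chunking), which has no evaluation set, this selection step is skipped and the test score is reported directly.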

  • How to get BERT embeddings?

    We provide a simple tool to generate BERT embeddings for sequence labeling tasks. Then set bert_emb_path to the correct path and set use_bert to True in train.sh.
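    The repository's BERT tool is not reproduced here, but one step such a tool typically performs for sequence labeling is pooling subword (WordPiece) embeddings back to one vector per original token, so the labeler sees exactly one embedding per word. A generic NumPy sketch, with an invented alignment:

```python
import numpy as np

# Generic sketch (not the repository's actual tool): average the subword
# vectors belonging to each original token, producing one embedding per
# word for the sequence labeler. The spans below are invented.
def pool_subwords(subword_embs, token_spans):
    """Mean-pool subword vectors into per-token vectors.

    subword_embs: (num_subwords, dim) array of BERT outputs
    token_spans:  list of (start, end) subword index ranges, one per token
    """
    return np.stack([subword_embs[s:e].mean(axis=0) for s, e in token_spans])

# Toy example: 4 subwords forming 2 original tokens.
embs = np.array([[1.0, 1.0], [3.0, 3.0], [2.0, 2.0], [4.0, 4.0]])
spans = [(0, 2), (2, 4)]  # token 0 = subwords 0-1, token 1 = subwords 2-3
pooled = pool_subwords(embs, spans)  # pooled == [[2., 2.], [3., 3.]]
print(pooled.shape)
```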
