tnq177 / improving_lexical_choice_in_nmt

Licence: MIT License
Tested with Python 2.7.3 and TensorFlow 1.1
Talk: https://www3.nd.edu/~tnguye28/naacl18.pdf

General

This is the code for the paper Improving Lexical Choice in Neural Machine Translation (accepted at NAACL HLT 2018). The branches are:

  • master: baseline NMT
  • tied_embedding: baseline NMT with tied embedding
  • fixnorm: fixnorm model in paper
  • fixnorm_lex: fixnorm+lex model in paper
  • arthur: apply the method of Arthur et al. on top of tied_embedding NMT

To train a model:

  • write a configuration function in configurations.py
  • run: python -m nmt --proto your_config_func
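As a rough illustration of the two training steps above, the sketch below shows one way a name passed via --proto could be mapped to a configuration function. Everything here is an assumption made for illustration: the function name my_config_func, the dictionary keys, and the helper resolve_config are hypothetical and are not taken from this repository's configurations.py.

```python
import types


def my_config_func():
    """Hypothetical configuration function; the keys below are assumed, not the repo's."""
    return {
        "model_name": "your_model_name",  # assumed: names the folder under nmt/saved_models/
        "max_epochs": 10,                 # assumed hyperparameter, for illustration only
    }


def resolve_config(namespace, proto):
    """Look up a configuration function by name and call it.

    `namespace` stands in for the configurations module; `proto` is the
    string given on the command line after --proto.
    """
    func = getattr(namespace, proto, None)
    if not callable(func):
        raise ValueError("no configuration function named %r" % proto)
    return func()


# Stand-in for `import configurations`, so the sketch runs on its own.
configurations = types.ModuleType("configurations")
configurations.my_config_func = my_config_func

config = resolve_config(configurations, "my_config_func")
print(config["model_name"])
```

Under this assumption, misspelling the function name after --proto fails early with a clear error instead of partway through training.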

Depending on your config function, the code creates a directory under nmt/saved_models/your_model_name and saves everything there: dev validation outputs, dev perplexities, train perplexities, the best model checkpoint, and the latest checkpoint so far (I've only tested saving 1 best checkpoint and am not sure about > 1). Use the saved checkpoint to translate any other input.

To translate with UNK replacement:

  • run: python -m nmt --proto your_config_func --mode translate --unk-repl --model-file path_your_saved_checkpoint.cpkt --input-file path_to_input_file

Remember that a checkpoint is stored as several files (data file, meta file, ...). Pass only the common .cpkt path to --model-file and leave off the trailing extension.
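A minimal sketch of the point about extensions, assuming the checkpoint was written in TensorFlow's usual layout where the files end in .index, .meta, or .data-NNNNN-of-NNNNN. The helpers checkpoint_prefix and translate_command are hypothetical names introduced here and are not part of this repository. The command-line flags are the ones shown above.

```python
import re

# Suffixes TensorFlow appends to a checkpoint prefix (assumed V2 checkpoint layout).
_CKPT_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Strip a trailing checkpoint extension, leaving the shared prefix."""
    return _CKPT_SUFFIX.sub("", path)


def translate_command(proto, checkpoint_file, input_file):
    """Build the argument list for translating with UNK replacement."""
    return [
        "python", "-m", "nmt",
        "--proto", proto,
        "--mode", "translate",
        "--unk-repl",
        "--model-file", checkpoint_prefix(checkpoint_file),
        "--input-file", input_file,
    ]


cmd = translate_command(
    "your_config_func",
    "path_your_saved_checkpoint.cpkt.meta",  # any one of the checkpoint's files
    "path_to_input_file",
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.call, which avoids quoting problems when paths contain spaces.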

References

Code & scripts might be inspired/borrowed from some sources:
