
hassyGo / NLG-RL

Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction

Programming Languages

python

Projects that are alternatives to or similar to NLG-RL

Matterport3dsimulator
AI Research Platform for Reinforcement Learning from Real Panoramic Images.
Stars: ✭ 260 (+340.68%)
Mutual labels:  natural-language-processing, reinforcement-learning, rl
Machine learning examples
A collection of machine learning examples and tutorials.
Stars: ✭ 6,466 (+10859.32%)
Mutual labels:  natural-language-processing, reinforcement-learning
Pplm
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Stars: ✭ 674 (+1042.37%)
Mutual labels:  natural-language-processing, natural-language-generation
Ciff
Cornell Instruction Following Framework
Stars: ✭ 23 (-61.02%)
Mutual labels:  natural-language-processing, reinforcement-learning
Demos
Some JavaScript projects published as demos, mostly ML or data science
Stars: ✭ 55 (-6.78%)
Mutual labels:  natural-language-processing, reinforcement-learning
This Word Does Not Exist
This Word Does Not Exist
Stars: ✭ 640 (+984.75%)
Mutual labels:  natural-language-processing, natural-language-generation
Nlg Eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Stars: ✭ 822 (+1293.22%)
Mutual labels:  natural-language-processing, natural-language-generation
Rosettastone
Hearthstone simulator using C++ with some reinforcement learning
Stars: ✭ 510 (+764.41%)
Mutual labels:  reinforcement-learning, rl
Pqg Pytorch
Paraphrase Generation model using pair-wise discriminator loss
Stars: ✭ 33 (-44.07%)
Mutual labels:  natural-language-processing, natural-language-generation
Conversational Ai
Conversational AI Reading Materials
Stars: ✭ 34 (-42.37%)
Mutual labels:  natural-language-processing, reinforcement-learning
Blocks
Blocks World -- Simulator, Code, and Models (Misra et al. EMNLP 2017)
Stars: ✭ 39 (-33.9%)
Mutual labels:  natural-language-processing, reinforcement-learning
Amazon Sagemaker Examples
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
Stars: ✭ 6,346 (+10655.93%)
Mutual labels:  reinforcement-learning, rl
Fast abs rl
Code for the ACL 2018 paper "Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting" (Chen and Bansal)
Stars: ✭ 569 (+864.41%)
Mutual labels:  natural-language-processing, reinforcement-learning
Dl Nlp Readings
My Reading Lists of Deep Learning and Natural Language Processing
Stars: ✭ 656 (+1011.86%)
Mutual labels:  natural-language-processing, reinforcement-learning
Leakgan
Code for the AAAI 2018 paper "Long Text Generation via Adversarial Training with Leaked Information". Text generation using GANs and hierarchical reinforcement learning.
Stars: ✭ 533 (+803.39%)
Mutual labels:  natural-language-processing, reinforcement-learning
Coursera
Quizzes and assignments from Coursera courses
Stars: ✭ 774 (+1211.86%)
Mutual labels:  natural-language-processing, reinforcement-learning
Convai Baseline
ConvAI baseline solution
Stars: ✭ 49 (-16.95%)
Mutual labels:  natural-language-processing, natural-language-generation
Ml Mipt
Open Machine Learning course at MIPT
Stars: ✭ 480 (+713.56%)
Mutual labels:  natural-language-processing, reinforcement-learning
Rnnlg
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.
Stars: ✭ 487 (+725.42%)
Mutual labels:  natural-language-processing, natural-language-generation
Rl Baselines Zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Stars: ✭ 839 (+1322.03%)
Mutual labels:  reinforcement-learning, rl

NLG-RL

This repository provides PyTorch implementations of our method [1], which accelerates reinforcement learning for sentence generation by reducing the action space via vocabulary prediction.
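
The core idea is to let a vocabulary predictor select a small set of K candidate target words for each input sentence, and to restrict the decoder's softmax (and hence the RL action space) to those K words. The sketch below is a conceptual illustration only, not the actual code in ./code_0.2 or ./code_0.4; the module structure, tensor shapes, and top-K selection are simplifying assumptions.

    import torch
    import torch.nn as nn

    class VocabularyPredictor(nn.Module):
        # Scores every target word given a sentence-level source encoding and
        # keeps only the K most likely words as the per-sentence vocabulary.
        def __init__(self, source_dim, target_vocab_size):
            super().__init__()
            self.scorer = nn.Linear(source_dim, target_vocab_size)

        def forward(self, source_encoding, K):
            logits = self.scorer(source_encoding)       # (batch, vocab)
            _, topk_ids = logits.topk(K, dim=-1)        # (batch, K)
            return topk_ids

    def small_softmax(decoder_state, output_embedding, topk_ids):
        # Output distribution over only the K predicted words,
        # instead of the full target vocabulary.
        candidate_rows = output_embedding[topk_ids]     # (batch, K, hidden)
        scores = torch.bmm(candidate_rows,
                           decoder_state.unsqueeze(-1)).squeeze(-1)  # (batch, K)
        return torch.softmax(scores, dim=-1)

Because every softmax, and every action sampled during REINFORCE, now ranges over K words rather than the full vocabulary, both training time and GPU memory drop; this is the source of the acceleration reported in [1].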

Requirements

  • Python 3.x
  • PyTorch 0.2.x or 0.4.x
  • Numba

Training NMT models

Please cd into ./code_0.2 or ./code_0.4 (for PyTorch 0.2.x and 0.4.x, respectively) to run experiments. The input data format is plain text with one sentence per line for each language; line i of the source-language file corresponds to line i of the target-language file.

Small softmax

  • Vocabulary predictor (minimal usage with the default settings used in our paper)
    python train_vocgen.py --train_source XX --train_target YY --dev_source ZZ --dev_target WW

  • NMT with cross-entropy (minimal usage after training the vocabulary predictor)
    python train_nmt.py --train_source XX --train_target YY --dev_source ZZ --dev_target WW

  • NMT with REINFORCE and cross-entropy
    python train_nmt_rl.py --train_source XX --train_target YY --dev_source ZZ --dev_target WW

Full softmax (standard baseline)

  • NMT with cross-entropy (minimal usage without using the vocabulary predictor)
    python train_nmt.py --K -1 --train_source XX --train_target YY --dev_source ZZ --dev_target WW

  • NMT with REINFORCE and cross-entropy
    python train_nmt_rl.py --K -1 --train_source XX --train_target YY --dev_source ZZ --dev_target WW

Notes

A new feature in ./code_0.4

I added an option --batch_split_size to train_nmt.py and train_nmt_rl.py to avoid the need for multiple GPUs, mainly for the full-softmax setting. The idea is simple: at each mini-batch iteration, the mini-batch is further split into N smaller chunks by setting --batch_split_size N, after sorting the mini-batch examples by source token length. This reduces GPU memory consumption and can sometimes even speed up training, because less padding computation is needed. Note that the partial derivatives of the N chunks are accumulated, so this is different from simply reducing the mini-batch size; the sketch below illustrates the idea. Some experimental results are shown in our NAACL 2019 camera-ready paper.
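
The following is a rough conceptual sketch of the chunked update, not the actual code in train_nmt.py or train_nmt_rl.py; the function and variable names, and the generic model/optimizer/criterion objects, are placeholders. The point is that gradients from all chunks are accumulated before a single optimizer step, so the update still corresponds to the full mini-batch.

    # Conceptual sketch of --batch_split_size N (placeholder names, assuming a
    # generic PyTorch model/optimizer/criterion; not the actual implementation).
    def train_step(model, optimizer, criterion, batch_src, batch_tgt, num_chunks):
        # Sort the mini-batch by source length so each chunk needs less padding.
        order = sorted(range(len(batch_src)), key=lambda i: len(batch_src[i]))
        batch_src = [batch_src[i] for i in order]
        batch_tgt = [batch_tgt[i] for i in order]

        optimizer.zero_grad()
        chunk_size = (len(batch_src) + num_chunks - 1) // num_chunks
        total_loss = 0.0
        for start in range(0, len(batch_src), chunk_size):
            src_chunk = batch_src[start:start + chunk_size]
            tgt_chunk = batch_tgt[start:start + chunk_size]
            # If the loss is averaged per example, scale it by
            # len(src_chunk) / len(batch_src) to match the unsplit update.
            loss = criterion(model(src_chunk), tgt_chunk)
            loss.backward()              # gradients accumulate across chunks
            total_loss += loss.item()
        optimizer.step()                 # one update for the whole mini-batch
        return total_loss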

BLEU scores

By default, the BLEU scores computed by this codebase assume no additional (de-)tokenization. If you need to compute BLEU scores in a more specific way, modify bleu.sh to incorporate your own (de-)tokenization tools.

Reference

[1] Kazuma Hashimoto and Yoshimasa Tsuruoka. 2019. Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019). arXiv:1809.01694 [cs.CL].

Questions?

Issues and PRs are welcome.

E-mail: hassy [at] logos.t.u-tokyo.ac.jp or k.hashimoto [at] salesforce.com
