UFAL-DSG / Tgen

Statistical NLG for spoken dialogue systems
Projects that are alternatives of or similar to Tgen

Nndial
NNDial is an open source toolkit for building end-to-end trainable task-oriented dialogue models. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.
Stars: ✭ 332 (+85.47%)
Mutual labels:  dialogue, dialogue-systems, natural-language-generation
Rnnlg
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.
Stars: ✭ 487 (+172.07%)
Mutual labels:  dialogue, dialogue-systems, natural-language-generation
Multiwoz
Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)
Stars: ✭ 384 (+114.53%)
Mutual labels:  dialogue, seq2seq, dialogue-systems
Seq2seq Chatbot For Keras
This repository contains a new generative model of chatbot based on seq2seq modeling.
Stars: ✭ 322 (+79.89%)
Mutual labels:  dialogue, seq2seq
Unit Dmkit
Stars: ✭ 279 (+55.87%)
Mutual labels:  dialogue, dialogue-systems
Trade Dst
Source code for transferable dialogue state generator (TRADE, Wu et al., 2019). https://arxiv.org/abs/1905.08743
Stars: ✭ 287 (+60.34%)
Mutual labels:  dialogue, seq2seq
classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Stars: ✭ 61 (-65.92%)
Mutual labels:  seq2seq, natural-language-generation
Meld
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
Stars: ✭ 373 (+108.38%)
Mutual labels:  dialogue, dialogue-systems
Practical Pytorch
Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained
Stars: ✭ 4,329 (+2318.44%)
Mutual labels:  seq2seq, natural-language-generation
Augmented seq2seq
enhance seq2seq model for open ended dialog generation
Stars: ✭ 29 (-83.8%)
Mutual labels:  seq2seq, dialogue-systems
Nlg Eval
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Stars: ✭ 822 (+359.22%)
Mutual labels:  dialogue, natural-language-generation
Convai Baseline
ConvAI baseline solution
Stars: ✭ 49 (-72.63%)
Mutual labels:  dialogue-systems, natural-language-generation
Dstc8 Schema Guided Dialogue
The Schema-Guided Dialogue Dataset
Stars: ✭ 277 (+54.75%)
Mutual labels:  dialogue, dialogue-systems
Dialog Generation Paper
A list of recent papers regarding dialogue generation
Stars: ✭ 265 (+48.04%)
Mutual labels:  dialogue, dialogue-systems
Dstc7 End To End Conversation Modeling
Grounded conversational dataset for end-to-end conversational AI (official DSTC7 data)
Stars: ✭ 141 (-21.23%)
Mutual labels:  dialogue, dialogue-systems
dialogue-datasets
collect the open dialog corpus and some useful data processing utils.
Stars: ✭ 24 (-86.59%)
Mutual labels:  dialogue, dialogue-systems
DlgSystem
Dialogue Plugin System for Unreal Engine | 🪞 Mirror of https://bit.ly/DlgSource
Stars: ✭ 136 (-24.02%)
Mutual labels:  dialogue, dialogue-systems
TalkerMakerDeluxe
A FOSS Branching Game Dialogue Editor
Stars: ✭ 90 (-49.72%)
Mutual labels:  dialogue, dialogue-systems
Cakechat
CakeChat: Emotional Generative Dialog System
Stars: ✭ 1,361 (+660.34%)
Mutual labels:  seq2seq, dialogue-systems
Dialogue Understanding
This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
Stars: ✭ 77 (-56.98%)
Mutual labels:  dialogue, dialogue-systems

TGen

A statistical natural language generator for spoken dialogue systems

TGen is a statistical natural language generator, with two different algorithms supported:

  1. A statistical sentence planner based on A*-style search, with a candidate plan generator and a perceptron ranker
  2. A sequence-to-sequence (seq2seq) recurrent neural network architecture based on the TensorFlow toolkit
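
As an illustration of option 1, a plain perceptron ranker scores each candidate plan by a weighted sum of its features and nudges the weights toward the gold-standard plan and away from the best-scoring wrong candidate. This is a minimal sketch with made-up feature names; TGen's actual ranker operates on much richer features of candidate sentence plans.

```python
# Minimal perceptron-ranker sketch; feature names are hypothetical.
def score(weights, feats):
    """Weighted feature sum for one candidate plan."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())

def perceptron_update(weights, gold_feats, best_wrong_feats, lr=1.0):
    """Move weights toward the gold plan, away from the top wrong candidate."""
    for f, v in gold_feats.items():
        weights[f] = weights.get(f, 0.0) + lr * v
    for f, v in best_wrong_feats.items():
        weights[f] = weights.get(f, 0.0) - lr * v
    return weights

weights = {}
gold = {"has_subject": 1.0, "slot_food_realized": 1.0}
wrong = {"has_subject": 1.0, "extra_node": 1.0}
weights = perceptron_update(weights, gold, wrong)

# after one update, the gold plan outscores the wrong candidate
assert score(weights, gold) > score(weights, wrong)
```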

Both algorithms can be trained from pairs of source meaning representations (dialogue acts) and target sentences. The newer seq2seq approach is preferable: it yields higher performance in terms of both speed and quality.
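
For instance, a training pair couples a dialogue act with its target sentence. The pairs below are purely illustrative, including the dialogue-act syntax; the actual input formats TGen expects are described in USAGE.md.

```python
# Hypothetical (dialogue act, target sentence) training pairs;
# the concrete MR notation here is illustrative, not TGen's real format.
training_pairs = [
    ("inform(name=X-name, food=Italian, area=riverside)",
     "X-name is an Italian restaurant near the riverside."),
    ("confirm(food=Chinese)",
     "Did you say you are looking for Chinese food?"),
]

# each source meaning representation maps to exactly one target sentence
assert all(len(pair) == 2 for pair in training_pairs)
```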

Both algorithms support generating sentence plans (deep syntax trees), which are subsequently converted to text using the existing surface realizer from the Treex NLP toolkit. The seq2seq algorithm also supports direct string generation.
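
Conceptually, a sentence plan is a tree of lemmas that a surface realizer turns into text. The toy node class below only sketches that idea; TGen's real sentence plans are Treex-style deep syntax trees, and surface realization is handled by Treex, not by a naive traversal like this.

```python
# Toy "deep syntax tree" node with a deliberately naive linearization.
class TNode:
    def __init__(self, lemma, children=None):
        self.lemma = lemma
        self.children = children or []

    def linearize(self):
        # naive ordering: all children before the head lemma
        words = [child.linearize() for child in self.children]
        words.append(self.lemma)
        return " ".join(words)

plan = TNode("restaurant", [TNode("X-name"), TNode("Italian")])
assert plan.linearize() == "X-name Italian restaurant"
```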

For more details on the algorithms, please refer to our papers:

  • For seq2seq generation, see our ACL 2016 paper.
  • For an improved version of the seq2seq generation that takes previous user utterance into account to generate a more contextually-appropriate response, see our SIGDIAL 2016 paper.
  • For the old A*-search-based generation, see our ACL 2015 paper.

Installation and Usage

Please refer to USAGE.md for instructions on how to use TGen.

Notice

  • TGen is highly experimental and only tested on a few datasets, so bugs are inevitable. If you find a bug, feel free to contact me or open an issue.
  • If you do not require a specific version of TGen, we recommend installing the current master version, which has the latest bugfixes and all the functionality of the ACL2016/SIGDIAL2016 version.
    • To get the version used in our ACL 2015 paper (A*-search only), see this release.
    • To get the version used in our ACL 2016 and SIGDIAL 2016 papers (seq2seq approach for generating sentence plans or strings, optionally using previous context), see this release.

Citing TGen

If you use or refer to the seq2seq generation in TGen, please cite this paper:

  • Ondřej Dušek and Filip Jurčíček (2016): Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany.

If you use or refer to the context-aware improved seq2seq generation, please cite this paper:

  • Ondřej Dušek and Filip Jurčíček (2016): A Context-aware Natural Language Generator for Dialogue Systems. In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Los Angeles, CA, USA.

If you use or refer to the morphology-aware generation (designed for Czech), please cite this paper (link coming soon):

  • Ondřej Dušek and Filip Jurčíček (2019): Neural Generation for Czech: Data and Baselines. In Proceedings of INLG, Tokyo, Japan.

If you use or refer to the A*-search generation in TGen, please cite this paper:

  • Ondřej Dušek and Filip Jurčíček (2015): Training a Natural Language Generator From Unaligned Data. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pages 451–461, Beijing, China.

License

Author: Ondřej Dušek

Copyright © 2014-2019 Institute of Formal and Applied Linguistics, Charles University, Prague.

Licensed under the Apache License, Version 2.0 (see LICENSE.txt).

Acknowledgements

Work on this project was funded by the Ministry of Education, Youth and Sports of the Czech Republic under the grant agreement LK11221 and core research funding, SVV projects 260 104 and 260 333, and GAUK grant 2058214 of Charles University in Prague, as well as Charles University project PRIMUS/19/SCI/10. It used language resources stored and distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (projects LM201001 and LM2015071).
