ricsinaruto / Seq2seqChatbots

License: MIT
A wrapper around tensor2tensor to flexibly train, interact, and generate data for neural chatbots.

Programming Languages

python

Projects that are alternatives to or similar to Seq2seqChatbots

Conversation Tensorflow
TensorFlow implementation of Conversation Models
Stars: ✭ 143 (-69.31%)
Mutual labels:  chatbot, conversation, dataset, seq2seq
Watbot
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Stars: ✭ 64 (-86.27%)
Mutual labels:  chatbot, conversation, dialog
Tensorflow seq2seq chatbot
Stars: ✭ 81 (-82.62%)
Mutual labels:  chatbot, neural-networks, seq2seq
Dialog corpus
Datasets for training Chinese and English chatbot/dialog systems
Stars: ✭ 1,662 (+256.65%)
Mutual labels:  chatbot, dataset, dialog
Tensorflow Seq2seq Dialogs
Build conversation Seq2Seq models with TensorFlow
Stars: ✭ 43 (-90.77%)
Mutual labels:  chatbot, neural-networks, seq2seq
Gossiping Chinese Corpus
A Chinese Q&A corpus from the PTT Gossiping board
Stars: ✭ 137 (-70.6%)
Mutual labels:  chatbot, dataset, dialog
Multiturndialogzoo
Multi-turn dialogue baselines written in PyTorch
Stars: ✭ 106 (-77.25%)
Mutual labels:  chatbot, seq2seq, transformer
DSTC6-End-to-End-Conversation-Modeling
DSTC6: End-to-End Conversation Modeling Track
Stars: ✭ 56 (-87.98%)
Mutual labels:  chatbot, dialog, conversation
Chatbot Watson Android
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Stars: ✭ 169 (-63.73%)
Mutual labels:  chatbot, conversation, dialog
Tensorflow Ml Nlp
Natural language processing with TensorFlow and machine learning (from logistic regression to a Transformer chatbot)
Stars: ✭ 176 (-62.23%)
Mutual labels:  chatbot, seq2seq, transformer
Xpersona
XPersona: Evaluating Multilingual Personalized Chatbot
Stars: ✭ 54 (-88.41%)
Mutual labels:  chatbot, dialog, transformer
pytorch-transformer-chatbot
A simple chitchat chatbot built with the Transformer API introduced in PyTorch v1.2
Stars: ✭ 44 (-90.56%)
Mutual labels:  chatbot, transformer, seq2seq
chatbot
🤖️ A task-oriented chatbot based on PyTorch (supports private and Docker deployment)
Stars: ✭ 77 (-83.48%)
Mutual labels:  chatbot, seq2seq
neural-chat
An AI chatbot using seq2seq
Stars: ✭ 30 (-93.56%)
Mutual labels:  conversation, seq2seq
dialogue-datasets
collect the open dialog corpus and some useful data processing utils.
Stars: ✭ 24 (-94.85%)
Mutual labels:  dialog, conversation
Deepqa
My tensorflow implementation of "A neural conversational model", a Deep learning based chatbot
Stars: ✭ 2,811 (+503.22%)
Mutual labels:  chatbot, seq2seq
SpaceFusion
NAACL'19: "Jointly Optimizing Diversity and Relevance in Neural Response Generation"
Stars: ✭ 73 (-84.33%)
Mutual labels:  chatbot, conversation
AITQA
resources for the IBM Airlines Table-Question-Answering Benchmark
Stars: ✭ 12 (-97.42%)
Mutual labels:  dataset, transformer
Ergo
🧠 A tool that makes AI easier.
Stars: ✭ 264 (-43.35%)
Mutual labels:  dataset, neural-networks
Komputation
Komputation is a neural network framework for the Java Virtual Machine written in Kotlin and CUDA C.
Stars: ✭ 295 (-36.7%)
Mutual labels:  neural-networks, seq2seq

Seq2seqChatbots

Paper 2 · Paper 1 · Poster · Code 1 · Code 2 · Notes · Documentation · Blog
A wrapper around tensor2tensor to flexibly train, interact, and generate data for neural chatbots.
The wiki contains my notes and summaries of over 150 recent publications related to neural dialog modeling.

Features

💾   Run your own trainings or experiment with pre-trained models
✅   4 different dialog datasets integrated with tensor2tensor
🔀   Seamlessly works with any model or hyperparameter set in tensor2tensor
🚀    Easily extendable base class for dialog problems

Setup

Run setup.py, which installs the required packages and steps you through downloading additional data:

python setup.py

You can download all trained models used in this paper from here. Each training contains two checkpoints: one at the validation loss minimum and another after 150 epochs. The data and trainings folder structures match each other exactly.
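
As an illustration, the matching layout might look like the sketch below (the problem subdirectory names are hypothetical):

data/
  DailyDialog/        ← generated data for a problem
trainings/
  DailyDialog/        ← checkpoints from trainings on the same problem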

Usage

python t2t_csaky/main.py --mode=train

The mode argument can be one of the following four: {generate_data, train, decode, experiment}. In experiment mode you can specify what to do inside the experiment function of the run file. A detailed explanation of what each mode does is given below.

Config

You can control the flags and parameters of each mode directly in this file. For each run that you initiate, this file is copied to the appropriate directory, so you can quickly access the parameters of any run. There are some flags that you have to set for every mode (the FLAGS dictionary in the config file):

  • t2t_usr_dir: Path to the directory where my code resides. You don't have to change this, unless you rename the directory.
  • data_dir: The path to the directory where you want to generate the source and target pairs and other data. The dataset will be downloaded one level above this directory, into a raw_data folder.
  • problem: This is the name of a registered problem that tensor2tensor needs. Detailed in the generate_data section below. All paths should be from the root of the repo.
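
For illustration, a minimal FLAGS dictionary covering just these three flags might look like the sketch below (the values are placeholders, not settings taken from the repo):

FLAGS = {
  "t2t_usr_dir": "t2t_csaky",           # directory containing the custom code
  "data_dir": "data/DailyDialog/base",  # placeholder path for the generated pairs
  "problem": "daily_dialog_chatbot",    # name of a registered problem
}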

Generate Data

This mode downloads and preprocesses the data and generates source and target pairs. Currently there are 6 registered problems that you can use besides the ones provided by tensor2tensor.
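
With the problem flag set in the config file, data generation is started the same way as training, just with a different mode:

python t2t_csaky/main.py --mode=generate_data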

The PROBLEM_HPARAMS dictionary in the config file contains problem specific parameters that you can set before generating data:

  • num_train_shards/num_dev_shards: Set these if you want the generated train or dev data to be sharded over several files.
  • vocabulary_size: Size of the vocabulary to use for the problem. Words outside this vocabulary will be replaced with the unknown token.
  • dataset_size: Number of utterance pairs to use, if you don't want to use the full dataset (0 means use the full dataset).
  • dataset_split: Specify a train-val-test split for the problem.
  • dataset_version: Only relevant to the opensubtitles dataset: since there are several versions, you can specify the year of the dataset that you want to download.
  • name_vocab_size: Only relevant to the cornell problem with separate names. You can set the size of the vocabulary containing only the personas.
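
Putting these parameters together, a PROBLEM_HPARAMS entry might look like the sketch below (the values are illustrative placeholders, and the exact format of dataset_split is an assumption):

PROBLEM_HPARAMS = {
  "num_train_shards": 1,     # no sharding of the generated training data
  "num_dev_shards": 1,
  "vocabulary_size": 16384,  # out-of-vocabulary words become the unknown token
  "dataset_size": 0,         # 0 means use the full dataset
  "dataset_split": {"train": 80, "val": 10, "test": 10},  # assumed percentage format
  "dataset_version": 2012,   # opensubtitles only: year to download
  "name_vocab_size": 3000,   # cornell-with-separate-names only
}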

Train

This mode allows you to train a model with the specified problem and hyperparameters. The code just calls the tensor2tensor training script, so any model that is in tensor2tensor can be used. Besides these, there is also a subclassed model with small modifications:

  • gradient_checkpointed_seq2seq: A small modification of the LSTM-based seq2seq model so that custom hparams can be used entirely. Before calculating the softmax, the LSTM hidden units are projected to 2048 linear units as here (a minimal sketch of this projection follows below). Finally, I tried to implement gradient checkpointing for this model, but it is currently removed since it didn't give good results.
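
A minimal sketch of the projection described above, assuming a TensorFlow 1.x setup (an illustration, not the repo's actual code):

import tensorflow as tf

def project_before_softmax(lstm_outputs, vocab_size):
  # lstm_outputs: a [batch, time, hidden_units] tensor from the LSTM decoder.
  # First project the hidden units to 2048 linear (no activation) units...
  projected = tf.layers.dense(lstm_outputs, 2048, activation=None)
  # ...then compute the vocabulary logits that feed the softmax.
  return tf.layers.dense(projected, vocab_size)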

There are several additional flags that you can specify for a training run in the FLAGS dictionary in the config file, some of which are:

  • train_dir: Name of the directory where the training checkpoint files will be saved.
  • model: Name of the model: either the subclassed model above or a model defined in tensor2tensor.
  • hparams: Specify a registered hparams_set, or leave empty if you want to define hparams in the config file. In order to specify hparams for a seq2seq or transformer model, you can use the SEQ2SEQ_HPARAMS and TRANSFORMER_HPARAMS dictionaries in the config file (check it for more details).
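
As an example, a training run with a tensor2tensor Transformer might add entries like these to FLAGS (the path is a placeholder; leaving hparams empty defers to the TRANSFORMER_HPARAMS dictionary, as described above):

FLAGS = {
  # ...the flags from the Config section above...
  "train_dir": "trainings/DailyDialog/transformer",  # placeholder checkpoint directory
  "model": "transformer",                            # a tensor2tensor model name
  "hparams": "",                                     # empty: use TRANSFORMER_HPARAMS
}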

Decode

With this mode you can decode from the trained models. The following parameters affect the decoding (in the FLAGS dictionary in the config file):

  • decode_mode: Can be interactive, where you chat with the model from the command line; file, where you provide a file of source utterances for which to generate responses; or dataset, which randomly samples the provided validation data and outputs responses.
  • decode_dir: Directory where you can place the file to decode from; output responses are saved here as well.
  • input_file_name: Name of the file (placed in decode_dir) that you have to provide in file mode.
  • output_file_name: Name of the file, inside decode_dir, where the output responses will be saved.
  • beam_size: Size of the beam when using beam search.
  • return_beams: If False, return only the top beam; otherwise return beam_size number of beams.
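
For example, to chat with a trained model from the command line, you might set the decoding flags like this sketch (values are placeholders) and then run decode mode:

FLAGS = {
  # ...the flags from the sections above...
  "decode_mode": "interactive",  # chat with the model on the command line
  "beam_size": 10,               # placeholder beam width
  "return_beams": False,         # output only the top beam
}

python t2t_csaky/main.py --mode=decode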

Results & Examples

The following results are from these two papers.

Loss and Metrics of Transformer Trained on Cornell


TRF is the Transformer model, RT denotes randomly selected responses from the training set, and GT denotes ground-truth responses. For an explanation of the metrics, see the paper.

Responses from Transformer and Seq2seq Trained on Cornell and Opensubtitles

S2S is a simple seq2seq model with LSTMs trained on Cornell; the others are Transformer models. Opensubtitles F was pre-trained on Opensubtitles and finetuned on Cornell.

Loss and Metrics of Transformer Trained on DailyDialog


TRF is the Transformer model, RT denotes randomly selected responses from the training set, and GT denotes ground-truth responses. For an explanation of the metrics, see the paper.

Responses from Transformer Trained on DailyDialog

Contributing

Check the issues for some additions where help is appreciated. Any contributions are welcome ❤️
Please try to follow the code style used in the repo (flake8, 2-space indentation, 80-character lines, thorough commenting, etc.).

New problems can be registered by subclassing WordChatbot, or even better, by subclassing CornellChatbotBasic or OpensubtitleChatbot, since these implement some additional functionality. Usually it's enough to override the preprocess and create_data functions. Check the documentation for more details and see daily_dialog_chatbot for an example; a minimal sketch follows below.
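
A minimal sketch of what registering a new problem might look like; the base class import path and the method signatures are assumptions based on the description above:

from tensor2tensor.utils import registry

# Assumed import path for the base class described above.
from t2t_csaky.problems.word_chatbot import WordChatbot

@registry.register_problem
class MyDialogChatbot(WordChatbot):
  """Hypothetical new dialog problem."""

  def preprocess(self, line):
    # Clean one raw line of dialog (assumed signature).
    return line.strip().lower()

  def create_data(self, train_mode):
    # Build the source-target utterance pairs for the given split (assumed signature).
    pass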

New models and hyperparameters can be added by following the tensor2tensor tutorial.

Authors

License

This project is licensed under the MIT License - see the LICENSE file for details.
Please include a link to this repo if you use it in your work and consider citing the following paper:

@InProceedings{Csaky:2017,
  title = {Deep Learning Based Chatbot Models},
  author = {Csaky, Richard},
  year = {2019},
  publisher = {National Scientific Students' Associations Conference},
  url = {https://tdk.bme.hu/VIK/DownloadPaper/asdad},
  note = {https://tdk.bme.hu/VIK/DownloadPaper/asdad}
}