oswaldoludwig / Seq2seq Chatbot For Keras

Licence: apache-2.0
This repository contains a new generative model of chatbot based on seq2seq modeling.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Seq2seq Chatbot For Keras

Mlds2018spring
Machine Learning and having it Deep and Structured (MLDS) in 2018 spring
Stars: ✭ 124 (-61.49%)
Mutual labels:  chatbot, gan, generative-adversarial-network, seq2seq
Tensorflow Tutorials
Provides source code for practicing TensorFlow step by step, from the basics to applications
Stars: ✭ 2,096 (+550.93%)
Mutual labels:  chatbot, gan, seq2seq
Dynamic Seq2seq
A Chinese chatbot based on the seq2seq model
Stars: ✭ 303 (-5.9%)
Mutual labels:  chatbot, seq2seq
Deepqa
My tensorflow implementation of "A neural conversational model", a Deep learning based chatbot
Stars: ✭ 2,811 (+772.98%)
Mutual labels:  chatbot, seq2seq
Deep Generative Prior
Code for deep generative prior (ECCV2020 oral)
Stars: ✭ 308 (-4.35%)
Mutual labels:  gan, generative-adversarial-network
DLSS
Deep Learning Super Sampling with Deep Convolutional Generative Adversarial Networks.
Stars: ✭ 88 (-72.67%)
Mutual labels:  generative-adversarial-network, gan
UEGAN
[TIP2020] Pytorch implementation of "Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network"
Stars: ✭ 68 (-78.88%)
Mutual labels:  generative-adversarial-network, gan
Alae
[CVPR2020] Adversarial Latent Autoencoders
Stars: ✭ 3,178 (+886.96%)
Mutual labels:  gan, generative-adversarial-network
TextBoxGAN
Generate text boxes from input words with a GAN.
Stars: ✭ 50 (-84.47%)
Mutual labels:  generative-adversarial-network, gan
Makegirlsmoe web
Create Anime Characters with MakeGirlsMoe
Stars: ✭ 3,144 (+876.4%)
Mutual labels:  gan, generative-adversarial-network
Dcgan
The Simplest DCGAN Implementation
Stars: ✭ 286 (-11.18%)
Mutual labels:  gan, generative-adversarial-network
Trade Dst
Source code for transferable dialogue state generator (TRADE, Wu et al., 2019). https://arxiv.org/abs/1905.08743
Stars: ✭ 287 (-10.87%)
Mutual labels:  dialogue, seq2seq
ezgan
An extremely simple generative adversarial network, built with TensorFlow
Stars: ✭ 36 (-88.82%)
Mutual labels:  generative-adversarial-network, gan
keras-3dgan
Keras implementation of 3D Generative Adversarial Network.
Stars: ✭ 20 (-93.79%)
Mutual labels:  generative-adversarial-network, gan
Few Shot Patch Based Training
The official implementation of our SIGGRAPH 2020 paper Interactive Video Stylization Using Few-Shot Patch-Based Training
Stars: ✭ 313 (-2.8%)
Mutual labels:  gan, generative-adversarial-network
DeepFlow
Pytorch implementation of "DeepFlow: History Matching in the Space of Deep Generative Models"
Stars: ✭ 24 (-92.55%)
Mutual labels:  generative-adversarial-network, gan
Seq2seq chatbot links
Links to the implementations of neural conversational models for different frameworks
Stars: ✭ 270 (-16.15%)
Mutual labels:  chatbot, seq2seq
Seq2seq chatbot
A TensorFlow implementation of a simple dialogue system based on the seq2seq model, with embedding, attention, and beam search; the dataset is the Cornell Movie Dialogs corpus
Stars: ✭ 308 (-4.35%)
Mutual labels:  chatbot, seq2seq
ADL2019
Applied Deep Learning (2019 Spring) @ NTU
Stars: ✭ 20 (-93.79%)
Mutual labels:  generative-adversarial-network, gan
MNIST-invert-color
Invert the color of MNIST images with PyTorch
Stars: ✭ 13 (-95.96%)
Mutual labels:  generative-adversarial-network, gan

Seq2seq Chatbot for Keras

This repository contains a new generative chatbot model based on seq2seq modeling. Further details on this model can be found in Section 3 of the paper End-to-end Adversarial Learning for Generative Conversational Agents. If you publish work that uses ideas or pieces of code from this repository, please cite this paper.

The trained model available here was trained on a small dataset of ~8K pairs of context (the last two utterances of the dialogue up to the current point) and respective response. The data were collected from dialogues of online English courses. This trained model can be fine-tuned on a closed-domain dataset for real-world applications.

The canonical seq2seq model became popular in neural machine translation, a task in which the words of the input and output sequences follow different prior probability distributions, since the input and output utterances are written in different languages. The architecture presented here assumes the same prior distribution for input and output words and therefore shares an embedding layer (pre-trained GloVe word embeddings) between the encoding and decoding processes. To improve context sensitivity, the thought vector (i.e. the encoder output) encodes the last two utterances of the conversation up to the current point. To avoid forgetting the context during answer generation, the thought vector is concatenated to a dense vector that encodes the incomplete answer generated so far. The resulting vector is fed to dense layers that predict the current token of the answer. See Section 3.1 of our paper for a better insight into the advantages of our model.
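
The following is a minimal sketch of this arrangement using the Keras functional API; the layer sizes, sequence length, and variable names are illustrative assumptions and do not reproduce the exact configuration in train_bot.py:

from keras.layers import Input, Embedding, LSTM, Dense, concatenate
from keras.models import Model

# Illustrative dimensions (not the repository's actual settings)
vocab_size, emb_dim, sent_len, hidden = 7000, 100, 50, 300

# Embedding layer shared by both branches, initialized elsewhere with GloVe vectors
shared_emb = Embedding(vocab_size, emb_dim, input_length=sent_len)

# Left branch: encodes the context (the last two utterances of the dialogue)
context_in = Input(shape=(sent_len,), name='context')
thought_vector = LSTM(hidden)(shared_emb(context_in))

# Right branch: encodes the incomplete answer generated so far
answer_in = Input(shape=(sent_len,), name='incomplete_answer')
answer_vector = LSTM(hidden)(shared_emb(answer_in))

# The two encodings are concatenated and fed to dense layers
merged = concatenate([thought_vector, answer_vector])
hidden_out = Dense(vocab_size // 2, activation='relu')(merged)
next_token = Dense(vocab_size, activation='softmax')(hidden_out)  # distribution over the next token

model = Model(inputs=[context_in, answer_in], outputs=next_token)
model.compile(optimizer='adam', loss='categorical_crossentropy')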

The algorithm iterates by appending the predicted token to the incomplete answer and feeding it back to the right-hand-side input layer of the model shown below.

[Figure: the proposed model, with the context encoder and the incomplete-answer encoder arranged in parallel]

As can be seen in the figure above, the two LSTMs are arranged in parallel, while the canonical seq2seq has the recurrent layers of the encoder and decoder arranged in series. Recurrent layers are unfolded during backpropagation through time, resulting in a large number of nested functions and, therefore, a higher risk of vanishing gradients, which is worsened by the cascade of recurrent layers of the canonical seq2seq model, even in the case of gated architectures such as LSTMs. I believe this is one of the reasons why my model behaves better during training than the canonical seq2seq.

The following pseudocode explains the algorithm.

[Figure: pseudocode of the token-by-token answer generation algorithm]
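
As a rough illustration of that loop, the sketch below performs greedy decoding; the helper pad() and the token ids bos_id and eos_id are hypothetical and only meant to convey the idea, not the repository's actual code:

import numpy as np

def pad(ids, maxlen=50):
    # Truncate/right-pad a list of token ids and add a batch dimension
    ids = ids[-maxlen:]
    return np.array([ids + [0] * (maxlen - len(ids))])

def generate_answer(model, context_ids, bos_id, eos_id, max_answer_len=50):
    answer_ids = [bos_id]                      # start with an "empty" answer
    for _ in range(max_answer_len - 1):
        # Predict the next-token distribution given the context and the partial answer
        probs = model.predict([pad(context_ids), pad(answer_ids)])[0]
        next_id = int(np.argmax(probs))        # greedy choice
        if next_id == eos_id:
            break
        answer_ids.append(next_id)             # feed the token back into the answer input
    return answer_ids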

The training of this new model converges in few epochs. Using our dataset of 8K training examples, only 100 epochs were required to reach a categorical cross-entropy loss of 0.0318, at a cost of 139 s/epoch on a GTX 980 GPU. The performance of this trained model (provided in this repository) seems as convincing as that of a vanilla seq2seq model trained on the ~300K training examples of the Cornell Movie Dialogs Corpus, while requiring much less computational effort to train.

To chat with the pre-trained model:

  1. Download the Python file "conversation.py", the vocabulary file "vocabulary_movie", and the net weights "my_model_weights20", which can be found here;
  2. Run conversation.py.

To chat with the new model trained by our new GAN-based training algorithm:

  1. Download the Python file "conversation_discriminator.py", the vocabulary file "vocabulary_movie", and the net weights "my_model_weights20.h5", "my_model_weights.h5", and "my_model_weights_discriminator.h5", which can be found here;
  2. Run conversation_discriminator.py.

This model has better performance using the same training data. The discriminator of the GAN-based model is used to select the best answer between two models: one trained by teacher forcing and one trained by our new GAN-like training method, whose details can be found in this paper.
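
Conceptually, the selection made by conversation_discriminator.py looks like the sketch below, reusing generate_answer() and pad() from the earlier sketch; the discriminator inputs and the variable names are assumptions, not the script's actual code:

# model_tf: trained by teacher forcing; model_gan: trained with the GAN-like method
answer_tf = generate_answer(model_tf, context_ids, bos_id, eos_id)
answer_gan = generate_answer(model_gan, context_ids, bos_id, eos_id)

# Score both candidate answers with the discriminator and keep the higher-scoring one
score_tf = float(discriminator.predict([pad(context_ids), pad(answer_tf)])[0])
score_gan = float(discriminator.predict([pad(context_ids), pad(answer_gan)])[0])
best_answer = answer_tf if score_tf >= score_gan else answer_gan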

To train a new model or to fine tune on your own data:

  1. If you want to train from scratch, delete the file my_model_weights20.h5. To fine-tune on your own data, keep this file;
  2. Download the GloVe folder 'glove.6B' and include this folder in the directory of the chatbot (you can find this folder here). This algorithm applies transfer learning by using a pre-trained word embedding, which is fine-tuned during training (see the sketch after this list);
  3. Run split_qa.py to split the content of your training data into two files ('context' and 'answers'), and run get_train_data.py to store the padded sentences in the files 'Padded_context' and 'Padded_answers';
  4. Run train_bot.py to train the chatbot (using a GPU is recommended; to do so, type: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32,exception_verbosity=high python train_bot.py).
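
For reference, a common way to build the GloVe embedding matrix that initializes the shared embedding layer looks roughly like the sketch below; the file name, dimension, and vocabulary handling are assumptions, and the repository's own loading code may differ:

import numpy as np

def load_glove_matrix(vocabulary, glove_path='glove.6B/glove.6B.100d.txt', emb_dim=100):
    # Read the GloVe vectors into a dictionary: word -> vector
    glove = {}
    with open(glove_path) as f:
        for line in f:
            parts = line.split()
            glove[parts[0]] = np.asarray(parts[1:], dtype='float32')
    # Build the embedding matrix in vocabulary order; unknown words stay at zero
    matrix = np.zeros((len(vocabulary), emb_dim))
    for i, word in enumerate(vocabulary):
        if word in glove:
            matrix[i] = glove[word]
    return matrix

# The matrix is then passed to the shared Embedding layer and left trainable,
# so the embeddings are fine-tuned together with the rest of the model:
# shared_emb = Embedding(vocab_size, emb_dim, weights=[embedding_matrix], trainable=True)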

Name your training data file "data.txt". This file must contain one dialogue utterance per line. If your dataset is large, set the variable num_subsets (in line 29 of train_bot.py) to a larger number.
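
For illustration, here is a hedged sketch of how context/answer pairs can be formed from such a file, following the "last two utterances as context" convention described above (the actual split_qa.py may differ in details):

# Read data.txt, one utterance per line
with open('data.txt') as f:
    utterances = [line.strip() for line in f if line.strip()]

contexts, answers = [], []
for i in range(2, len(utterances)):
    # Context = the two utterances preceding the current point; answer = the next utterance
    contexts.append(utterances[i - 2] + ' ' + utterances[i - 1])
    answers.append(utterances[i])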

weights_file = 'my_model_weights20.h5'
weights_file_GAN = 'my_model_weights.h5'
weights_file_discrim = 'my_model_weights_discriminator.h5'

A nice overview of the current implementations of neural conversational models for different frameworks (along with some results) can be found here.

Our model can also be applied to other NLP tasks, such as text summarization; see, for example, Alternate 2: Recursive Model A. We encourage the application of our model to other tasks; in that case, we kindly ask you to cite our work as can be seen in this document, registered in July 2017.

This code runs on Ubuntu 14.04.3 LTS with Python 2.7.6, Theano 0.9.0, and Keras 2.0.4. Using a different configuration may require some minor adaptations.
