
andi611 / Conditional-SeqGAN-Tensorflow

License: MIT
Conditional Sequence Generative Adversarial Network trained with policy gradient, implemented in TensorFlow

Programming Languages

  • Python
  • Shell

Projects that are alternatives of or similar to Conditional-SeqGAN-Tensorflow

Chatbot
A Chinese chatbot that you can train yourself: train the chatbot you want on your own corpus, suitable for scenarios such as intelligent customer service, online Q&A, and smart chat. Currently includes seq2seq, seqGAN, TF 2.0, and PyTorch versions.
Stars: ✭ 2,441 (+5093.62%)
Mutual labels:  nlp-machine-learning, seqgan
Naive-Bayes-Evening-Workshop
Companion code for Introduction to Python for Data Science: Coding the Naive Bayes Algorithm evening workshop
Stars: ✭ 23 (-51.06%)
Mutual labels:  nlp-machine-learning
empythy
Automated NLP sentiment predictions - batteries included, or use your own data
Stars: ✭ 17 (-63.83%)
Mutual labels:  nlp-machine-learning
Pix2Pix
Image to Image Translation using Conditional GANs (Pix2Pix) implemented using Tensorflow 2.0
Stars: ✭ 29 (-38.3%)
Mutual labels:  conditional-gan
DeepLearningReading
Deep Learning and Machine Learning mini-projects. Current Project: Deepmind Attentive Reader (rc-data)
Stars: ✭ 78 (+65.96%)
Mutual labels:  nlp-machine-learning
Very-deep-cnn-tensorflow
Very deep CNN for text classification
Stars: ✭ 18 (-61.7%)
Mutual labels:  nlp-machine-learning
topic modelling financial news
Topic modelling on financial news with Natural Language Processing
Stars: ✭ 51 (+8.51%)
Mutual labels:  nlp-machine-learning
lidtk
Language Identification Toolkit
Stars: ✭ 17 (-63.83%)
Mutual labels:  nlp-machine-learning
brand-sentiment-analysis
Scripts utilizing Heartex platform to build brand sentiment analysis from the news
Stars: ✭ 21 (-55.32%)
Mutual labels:  nlp-machine-learning
AI-Sentiment-Analysis-on-IMDB-Dataset
Sentiment Analysis using Stochastic Gradient Descent on 50,000 Movie Reviews Compiled from the IMDB Dataset
Stars: ✭ 55 (+17.02%)
Mutual labels:  nlp-machine-learning
Engine
The Centrifuge processes, filters, and saves the relevant documents as recommendations for the relevant users
Stars: ✭ 20 (-57.45%)
Mutual labels:  nlp-machine-learning
Machine-Learning-Models
In this repository I implemented machine learning methods ranging from simple to complex, aiming for template-style code.
Stars: ✭ 30 (-36.17%)
Mutual labels:  nlp-machine-learning
anuvada
Interpretable Models for NLP using PyTorch
Stars: ✭ 102 (+117.02%)
Mutual labels:  nlp-machine-learning
Entity Embedding
Reference implementation of the paper "Word Embeddings for Entity-annotated Texts"
Stars: ✭ 19 (-59.57%)
Mutual labels:  nlp-machine-learning
kex
Kex is a python library for unsupervised keyword extraction from a document, providing an easy interface and benchmarks on 15 public datasets.
Stars: ✭ 46 (-2.13%)
Mutual labels:  nlp-machine-learning
gans-2.0
Generative Adversarial Networks in TensorFlow 2.0
Stars: ✭ 76 (+61.7%)
Mutual labels:  conditional-gan
Deception-Detection-on-Amazon-reviews-dataset
An SVM model that classifies reviews as real or fake. Both the review text and the additional features in the data set were used to build a model that achieved over 85% accuracy without any deep learning techniques.
Stars: ✭ 42 (-10.64%)
Mutual labels:  nlp-machine-learning
Quora QuestionPairs DL
Kaggle competition: using deep learning to solve Quora's question pairs problem
Stars: ✭ 54 (+14.89%)
Mutual labels:  nlp-machine-learning
coursera-gan-specialization
Programming assignments and quizzes from all courses within the GANs specialization offered by deeplearning.ai
Stars: ✭ 277 (+489.36%)
Mutual labels:  conditional-gan
ShortText-Fasttext
ShortText classification
Stars: ✭ 12 (-74.47%)
Mutual labels:  nlp-machine-learning

Daisy: Dialog Analogous Intellectual System

Machine Learning: Chatbot

Conditional Sequence Generative Adversarial Network Trained with Policy Gradient, Implemented in TensorFlow

Requirements:

  • TensorFlow r1.5.1
  • Python 3.6
  • NumPy 1.13.3
  • tempfile (Optional)
  • gtts (Optional)
  • pygame (Optional)

Introduction

Apply policy gradient to generative adversarial nets to improve the quality of chatbot dialog generation.

  • Generator: a seq2seq model with attention and embedding
  • Discriminator: a hierarchical encoder with a two-class classifier
  • Policy gradient update: the reward computed by the discriminator is used to update the generator (see the sketch after this list)
  • Monte Carlo rollout: computes a reward for every generation step
  • Teacher forcing: adds maximum likelihood estimation training to the GAN training
  • Sampled softmax: reduces the computational cost of the regular softmax
  • Model checkpointing: supports early stopping and saving the model with the best loss
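
Below is a minimal, TensorFlow 1.x-style sketch of the reward-weighted policy gradient update described above. The tensor names, toy sizes, and stand-in output projection are illustrative assumptions, not the repository's actual graph; the real generator is the attention seq2seq model in the source files.

import numpy as np
import tensorflow as tf

# Toy sizes; the real values come from the project's configuration.
BATCH, STEPS, VOCAB, HIDDEN = 4, 5, 50, 16

# Hypothetical inputs: decoder hidden states, the tokens the generator sampled at each
# step, and the per-step rewards estimated by the discriminator via Monte Carlo rollout.
decoder_states = tf.placeholder(tf.float32, [BATCH, STEPS, HIDDEN])
sampled_tokens = tf.placeholder(tf.int32, [BATCH, STEPS])
rewards = tf.placeholder(tf.float32, [BATCH, STEPS])

# Stand-in output projection (the real generator is a seq2seq model with attention).
output_proj = tf.get_variable('output_proj', [HIDDEN, VOCAB])
logits = tf.tensordot(decoder_states, output_proj, axes=[[2], [0]])  # [BATCH, STEPS, VOCAB]

# Policy gradient objective: reward-weighted negative log-likelihood of the sampled tokens.
neg_log_prob = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=sampled_tokens, logits=logits)
pg_loss = tf.reduce_mean(tf.reduce_sum(neg_log_prob * rewards, axis=1))
train_op = tf.train.AdamOptimizer(1e-4).minimize(pg_loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op, feed_dict={
        decoder_states: np.random.randn(BATCH, STEPS, HIDDEN).astype(np.float32),
        sampled_tokens: np.random.randint(VOCAB, size=(BATCH, STEPS)),
        rewards: np.random.rand(BATCH, STEPS).astype(np.float32),
    })

Each sampled token's log-probability is scaled by the reward its rollout received, so replies the discriminator judges realistic are reinforced while unrealistic ones are suppressed.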

Preparation

  1. Download the corpus from:
1) https://www.cs.cornell.edu/%7Ecristian/Cornell_Movie-Dialogs_Corpus.html (official site)
2) https://drive.google.com/file/d/13TvKXXrKVg9X7IEayu7eYpRbxfZ5GDE0/view?usp=sharing (backup link)
  2. Extract the .zip file and save its contents to the path below:
config.corpus_path = '../cornell_movie_dialog_corpus'

This default path can be modified by changing the '--corpus_dir' option in 'config.py'.

  3. Run preprocessing on the raw corpus file:
python3 preprocess_data.py

This saves the pre-processed files to the path below:

config.data_dir = '../data'

This default path can be modified by changing the '--data_dir' option in 'config.py'.

  4. After pre-processing, 5 files will be created (a loading sketch follows this list):
1. processed_corpus.txt : encoder input for training
2. word2idx.pkl : word-to-index dictionary mapping saved with pickle
3. idx2word.pkl : index-to-word dictionary mapping saved with pickle
4. train_encode.pkl : model-ready training data: tokenized and index-labeled text ready to be fed to the encoder
5. train_decode.pkl : model-ready training data: tokenized and index-labeled text ready to be fed to the decoder
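
The snippet below is a hedged sketch of loading these outputs for inspection. The file names come from the list above; the exact Python objects stored inside each pickle are an assumption based on their descriptions.

import os
import pickle

data_dir = '../data'  # config.data_dir

# Vocabulary mappings produced by preprocess_data.py.
with open(os.path.join(data_dir, 'word2idx.pkl'), 'rb') as f:
    word2idx = pickle.load(f)
with open(os.path.join(data_dir, 'idx2word.pkl'), 'rb') as f:
    idx2word = pickle.load(f)

# Model-ready encoder/decoder training data (assumed to be sequences of word indices).
with open(os.path.join(data_dir, 'train_encode.pkl'), 'rb') as f:
    train_encode = pickle.load(f)
with open(os.path.join(data_dir, 'train_decode.pkl'), 'rb') as f:
    train_decode = pickle.load(f)

print('vocabulary size:', len(word2idx))
print('training pairs:', len(train_encode), len(train_decode))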

Training

  1. To run MLE pre-training on the generator model:
python3 run.py --pre_train --force_save

(Optional: '--force_save' saves the model after every epoch; otherwise the default model checkpointing is used and only the model with the minimum loss is saved.)

  2. To run generative adversarial training with policy gradient:
python3 run.py --gan_train --force_save

(Optional: '--force_save' saves the model after every epoch; otherwise the default model checkpointing is used and only the model with the minimum loss is saved; see the checkpointing sketch below.)
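
For reference, the snippet below sketches the "save only on a new minimum loss" checkpoint pattern that the default behaviour describes. The variable, path, and loss values are placeholders; the repository's real logic lives in its training code.

import os
import tensorflow as tf

dummy_var = tf.get_variable('dummy', shape=[1])  # stand-in for the real model variables
saver = tf.train.Saver(max_to_keep=1)
os.makedirs('./model', exist_ok=True)

force_save = False        # corresponds to the '--force_save' flag
best_loss = float('inf')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch, epoch_loss in enumerate([1.2, 0.9, 1.1, 0.8]):  # fake per-epoch losses
        if force_save or epoch_loss < best_loss:
            best_loss = min(best_loss, epoch_loss)
            saver.save(sess, './model/chatbot.ckpt', global_step=epoch)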

Inference

  1. To chat with the pre-trained model:
python3 run.py --pre_chat --speak

(Optional: '--speak' enables an audio response.)

  2. To chat with the GAN-trained model:
python3 run.py --gan_chat --speak

(Optional: '--speak' enables an audio response; see the audio sketch below.)
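
The sketch below shows one way the optional gtts / pygame / tempfile dependencies listed in the requirements could produce such an audio response; it is illustrative and may differ from the repository's own '--speak' implementation.

import tempfile
import time

from gtts import gTTS
import pygame

def speak(text):
    # Synthesize speech with Google Text-to-Speech and play it back with pygame.
    tts = gTTS(text=text, lang='en')
    tmp = tempfile.NamedTemporaryFile(suffix='.mp3', delete=False)
    tmp.close()
    tts.save(tmp.name)
    pygame.mixer.init()
    pygame.mixer.music.load(tmp.name)
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():
        time.sleep(0.1)

speak('Hello, I am Daisy.')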

  3. To evaluate the BLEU score of the pre-trained model:
python3 run.py --pre_evaluate

  4. To evaluate the BLEU score of the GAN-trained model (an illustrative BLEU computation follows these commands):
python3 run.py --gan_evaluate
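
As a point of reference, the snippet below computes a sentence-level BLEU score with NLTK (not listed in the requirements above); the repository's own evaluation may use a different BLEU formulation.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1
reference = [['i', 'am', 'fine', 'thank', 'you']]   # tokenized ground-truth reply
candidate = ['i', 'am', 'fine', 'thanks']           # tokenized model reply
print('BLEU:', sentence_bleu(reference, candidate, smoothing_function=smooth))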

Trained Models

  1. To use existing models, place the model files under the 'model' directory. Trained models (pre-trained / GAN-trained) can be downloaded from:
https://drive.google.com/drive/folders/1C0A53qqQpdGyKIEB7HVb6BqlBD5gvEJy?usp=sharing

Other

  1. Source code:
1. configuration.py
2. data_loader.py
3. discriminator.py
4. generator.py
5. train.py
6. test.py
7. run.py
8. tf_seq2seq_model.py