
sekharvth / Simple Chatbot Keras

License: MIT
Design and build a chatbot using data from the Cornell Movie Dialogues corpus, using Keras

Programming Languages

Python

Projects that are alternatives of or similar to Simple Chatbot Keras

Audio Classification using LSTM
Classification of Urban Sound Audio Dataset using LSTM-based model.
Stars: ✭ 47 (+56.67%)
Mutual labels:  lstm, lstm-neural-networks
DSTC6-End-to-End-Conversation-Modeling
DSTC6: End-to-End Conversation Modeling Track
Stars: ✭ 56 (+86.67%)
Mutual labels:  chatbot, lstm
lstm-electric-load-forecast
Electric load forecast using Long-Short-Term-Memory (LSTM) recurrent neural network
Stars: ✭ 56 (+86.67%)
Mutual labels:  lstm, lstm-neural-networks
Chinese Chatbot
A Chinese chatbot trained on 100,000 dialogue pairs with an attention mechanism; it generates a meaningful reply to most ordinary questions. The trained model is included and can be run directly (the author jokes they will livestream eating their keyboard if it fails to run).
Stars: ✭ 124 (+313.33%)
Mutual labels:  chatbot, lstm
Predictive Maintenance Using Lstm
Example of Multiple Multivariate Time Series Prediction with LSTM Recurrent Neural Networks in Python with Keras.
Stars: ✭ 352 (+1073.33%)
Mutual labels:  lstm, lstm-neural-networks
DrowsyDriverDetection
This is a project implementing Computer Vision and Deep Learning concepts to detect drowsiness of a driver and sound an alarm if drowsy.
Stars: ✭ 82 (+173.33%)
Mutual labels:  lstm, lstm-neural-networks
OCR
Optical character recognition using deep learning
Stars: ✭ 25 (-16.67%)
Mutual labels:  lstm, lstm-neural-networks
Chameleon recsys
Source code of CHAMELEON - A Deep Learning Meta-Architecture for News Recommender Systems
Stars: ✭ 202 (+573.33%)
Mutual labels:  lstm, lstm-neural-networks
Personality Detection
Implementation of a hierarchical CNN based model to detect Big Five personality traits
Stars: ✭ 338 (+1026.67%)
Mutual labels:  lstm, lstm-neural-networks
Paraphraser
Paraphrase generation at the sentence level
Stars: ✭ 283 (+843.33%)
Mutual labels:  lstm, lstm-neural-networks
Nlp Models Tensorflow
Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0
Stars: ✭ 1,603 (+5243.33%)
Mutual labels:  chatbot, lstm
Seq2seq Chatbot
Chatbot in 200 lines of code using TensorLayer
Stars: ✭ 777 (+2490%)
Mutual labels:  chatbot, lstm
Tensorflow seq2seq chatbot
Stars: ✭ 81 (+170%)
Mutual labels:  chatbot, lstm
lstm-numpy
Vanilla LSTM with numpy
Stars: ✭ 17 (-43.33%)
Mutual labels:  lstm, lstm-neural-networks
Lstm Siamese Text Similarity
⚛️ It is keras based implementation of siamese architecture using lstm encoders to compute text similarity
Stars: ✭ 216 (+620%)
Mutual labels:  lstm, lstm-neural-networks
Conversational-AI-Chatbot-using-Practical-Seq2Seq
A simple open domain generative based chatbot based on Recurrent Neural Networks
Stars: ✭ 17 (-43.33%)
Mutual labels:  chatbot, lstm-neural-networks
Pytorch Kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+6890%)
Mutual labels:  lstm, lstm-neural-networks
Lstm anomaly thesis
Anomaly detection for temporal data using LSTMs
Stars: ✭ 178 (+493.33%)
Mutual labels:  lstm, lstm-neural-networks
object-tracking
Multiple Object Tracking System in Keras + (Detection Network - YOLO)
Stars: ✭ 89 (+196.67%)
Mutual labels:  lstm, lstm-neural-networks
Stockpriceprediction
Stock Price Prediction using Machine Learning Techniques
Stars: ✭ 700 (+2233.33%)
Mutual labels:  lstm, lstm-neural-networks

Chatbot using Keras

Design and build a simple chatbot using data from the Cornell Movie Dialogues corpus, using Keras

Most of the ideas used in this model come from the original seq2seq model made by the Keras team. Their post also serves as a brilliant tutorial on how the architecture works and how it is built: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html

In short, the input sequence (the question asked to the chatbot) is passed into the encoder LSTM, which outputs its final states. These final states are passed into the decoder LSTM, along with the output sequence (the reply to the question, in the training data). The target of the decoder LSTM is the same as the actual reply, but shifted one time step to the left. That is, if the reply (i.e., the input to the decoder LSTM) is 'I am fine', the target at the first time step (input 'I') is 'am', and the target at the second time step (input 'am') is 'fine', and so on.
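As a rough illustration, here is a minimal sketch of that training setup in the Keras functional API, closely following the tutorial linked above. The layer sizes are hypothetical, and the Embedding layers are an assumption for a word-level variant (the tutorial itself feeds one-hot vectors straight into the LSTMs):

```python
# Minimal seq2seq training sketch (hypothetical sizes, not this repo's code).
from keras.layers import Input, LSTM, Dense, Embedding
from keras.models import Model

VOCAB_SIZE = 50000  # number of unique words (assumed)
LATENT_DIM = 256    # LSTM hidden size (assumed)

# Encoder: read the question and keep only the final states.
encoder_inputs = Input(shape=(None,))
enc_emb = Embedding(VOCAB_SIZE, LATENT_DIM)(encoder_inputs)
_, state_h, state_c = LSTM(LATENT_DIM, return_state=True)(enc_emb)
encoder_states = [state_h, state_c]

# Decoder: read the reply (starting with 'BOS'), initialised with the
# encoder's final states; predict the reply shifted one step to the left.
decoder_inputs = Input(shape=(None,))
dec_emb = Embedding(VOCAB_SIZE, LATENT_DIM)(decoder_inputs)
decoder_lstm = LSTM(LATENT_DIM, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(dec_emb, initial_state=encoder_states)
decoder_outputs = Dense(VOCAB_SIZE, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
```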

In inference mode, the 'BOS' (beginning of sentence) tag is the initial input to the decoder LSTM, together with the final states of the encoder LSTM (obtained by passing the new query through the encoder). The word predicted at each time step is fed back as the input for the next time step, along with the decoder states from the current time step. This process repeats until the 'EOS' (end of sentence) tag is generated.
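A sketch of that decoding loop, assuming encoder_model and decoder_model have been rebuilt from the trained layers exactly as in the tutorial, and that word_to_idx / idx_to_word are hypothetical lookup tables between words and integer indices:

```python
import numpy as np

def decode_sequence(input_seq, encoder_model, decoder_model,
                    word_to_idx, idx_to_word, max_len=20):
    # Encode the new query into the initial decoder states.
    states = encoder_model.predict(input_seq)

    # Start decoding from the 'BOS' tag.
    target = np.array([[word_to_idx['BOS']]])
    decoded = []
    for _ in range(max_len):
        output_tokens, h, c = decoder_model.predict([target] + states)
        # Greedily pick the most likely next word.
        next_idx = int(np.argmax(output_tokens[0, -1, :]))
        word = idx_to_word[next_idx]
        if word == 'EOS':
            break
        decoded.append(word)
        # Feed this word and the current states into the next time step.
        target = np.array([[next_idx]])
        states = [h, c]
    return ' '.join(decoded)
```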

The tutorial above, however, uses a character-level model, which at first puzzled me, especially since most of the literature on the subject overwhelmingly adopts word-level models. But when I started with a word-level model, I quickly discovered why the Keras team opted for the character-level one.

With a word-level model, the vocabulary (number of unique words) of the entire data set (the Cornell Movie Dialogues corpus in this case) is more than 50,000, and the training data amounts to roughly 300k utterances (150,000 question-reply pairs). When defining the targets for the decoder LSTM, the one-hot array has shape (num_examples, max_length_of_sentences, vocab_size). In effect, that means (150000, 20, 50000), which raises memory errors. At the character level, vocab_size drops from 50,000 to somewhere in the range of 70-80 (26 lowercase letters, 26 uppercase letters, 10 digits, plus symbols like '!', '?', etc.), which has a much better chance of fitting in memory. The downside is that a character-level model takes an enormous number of epochs to converge, and can realistically only be trained on a powerful GPU, which is beyond my current means.
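A quick back-of-the-envelope check of that memory argument, assuming one-hot float32 targets (the character-level figure is optimistic, since the maximum sequence length in characters would be several times longer than in words):

```python
num_examples = 150000  # question-reply pairs
max_len = 20           # max sentence length, in words
bytes_per_value = 4    # float32

word_level = num_examples * max_len * 50000 * bytes_per_value
char_level = num_examples * max_len * 80 * bytes_per_value

print(f"word level: {word_level / 1e9:.0f} GB")  # ~600 GB
print(f"char level: {char_level / 1e9:.1f} GB")  # ~1 GB
```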

The model shown here is the simplest possible version; for further improvement (a definite requirement), more tweaking is needed: increase the number of LSTM layers, introduce Dropout, experiment with different optimizers, and so on.
