vsuthichai / Paraphraser

License: MIT
Paraphrase generation at the sentence level

Paraphraser

This project lets users generate paraphrases of sentences through a clean and simple API. A demo can be seen here: pair-a-phrase

The paraphraser was developed under the Insight Data Science Artificial Intelligence program.

Model

The underlying model is a bidirectional LSTM encoder and an LSTM decoder with attention, trained using TensorFlow. The trained model can be downloaded here: paraphrase model
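To make the attention step concrete, here is a minimal sketch of how a decoder state can be turned into a weighted summary of the encoder states. It is not the project's code: the README does not say which scoring function the model uses, so plain dot-product scoring is assumed here, and Python lists stand in for TensorFlow tensors. The names `softmax` and `attend` are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for a single decoder step.

    Assumption: dot-product scoring. The actual model may score
    encoder states differently.
    """
    # One score per source position: how well it matches the decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    # Context vector: the weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context
```

With a bidirectional encoder, each entry of `encoder_states` would typically be the forward and backward LSTM outputs for one token joined together, so the decoder state needs a matching dimension (or a learned projection) for the dot product to be defined.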

Prerequisites

  • Python 3.5
  • TensorFlow 1.4.1
  • spaCy

Inference Execution

Download the model checkpoint from the link above and run:

python inference.py --checkpoint=<checkpoint_path/model-171856>
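Note that a TensorFlow 1.x checkpoint such as `model-171856` is a path prefix shared by several files, not a single file. The sketch below is a hypothetical pre-flight check (it is not part of this repo) that confirms the downloaded files are in place before `inference.py` is run. The function names are illustrative assumptions.

```python
import glob
import os


def checkpoint_files(prefix):
    """List the files that share a TensorFlow 1.x checkpoint prefix."""
    # tf.train.Saver writes <prefix>.index, <prefix>.meta and one or more
    # <prefix>.data-XXXXX-of-YYYYY shards side by side.
    return sorted(glob.glob(glob.escape(prefix) + ".*"))


def is_complete_checkpoint(prefix):
    """True if the index file and at least one data shard are present."""
    base = os.path.basename(prefix)
    names = [os.path.basename(p) for p in checkpoint_files(prefix)]
    has_index = base + ".index" in names
    has_data = any(n.startswith(base + ".data-") for n in names)
    return has_index and has_data
```

If `is_complete_checkpoint` returns `False`, the likely cause is that the archive was only partially extracted or that the path points at a directory rather than at the `model-171856` prefix.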

Datasets

The dataset used to train this model is an aggregation of many different public datasets. To name a few:

  • para-nmt-5m
  • Quora question pair
  • SNLI
  • Semeval
  • And more!

I have not included the aggregated dataset as part of this repo. If you're curious and would like to know more, contact me. Pretrained embeddings come from John Wieting's para-nmt-50m project.

Training

Training ran for 2 epochs on an Nvidia GTX 1080 and was evaluated using the BLEU score. In the TensorBoard training curves, the grey curve is the training set and the orange curve is the dev set.
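For readers unfamiliar with BLEU, the following is a minimal sketch of the metric for a single candidate sentence against a single reference. It is an illustration only: the README does not say how BLEU was computed for this model (corpus-level or sentence-level, tokenization, smoothing), so this is the plain unsmoothed formula with up to 4-grams, and `sentence_bleu` here is a hypothetical helper rather than a library call.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed BLEU of one candidate against one reference (token lists)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clipped counts: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precision_sum += math.log(overlap / total) / max_n
    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_precision_sum)
```

A paraphrase that matches the reference exactly scores 1.0, and one that shares no words with it scores 0.0. Because good paraphrases often use different words from the reference, BLEU tends to understate paraphrase quality and is best read as a rough signal.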

TODOs

  • pip installable package
  • Explore a greater number of layers
  • Recurrent layer dropout
  • Greater dataset augmentation
  • Try residual layer
  • Model compression
  • Byte pair encoding for out-of-vocabulary words

Citations

@inproceedings { wieting-17-millions, 
    author = {John Wieting and Kevin Gimpel}, 
    title = {Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations}, 
    booktitle = {arXiv preprint arXiv:1711.05732}, year = {2017} 
}

@inproceedings { wieting-17-backtrans, 
    author = {John Wieting and Jonathan Mallinson and Kevin Gimpel}, 
    title = {Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext}, 
    booktitle = {Proceedings of Empirical Methods in Natural Language Processing}, 
    year = {2017} 
}