
Vamsi995 / Paraphrase Generator

License: MIT
A paraphrase generator built using the T5 model which produces paraphrased English sentences.

Projects that are alternatives of or similar to Paraphrase Generator

Ipython Notebooks
Some iPython Notebooks I have created for personal learning
Stars: ✭ 54 (-1.82%)
Mutual labels:  jupyter-notebook
Pytorch Udacity Scholarship
Notes from the PyTorch Udacity / Facebook scholarship course
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Pyplotz
A light weight wrapper for matplotlib users with Chinese characters supported
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Graph Pointer Network
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Ctr model zoo
some ctr model, implemented by PyTorch, such as Factorization Machines, Field-aware Factorization Machines, DeepFM, xDeepFM, Deep Interest Network
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Ko en neural machine translation
Korean English NMT(Neural Machine Translation) with Gluon
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Github Paper
Plos in Computational Biology paper related with github for researchers, code, source and document
Stars: ✭ 54 (-1.82%)
Mutual labels:  jupyter-notebook
Autoaugment
Unofficial implementation of the ImageNet, CIFAR 10 and SVHN Augmentation Policies learned by AutoAugment using pillow
Stars: ✭ 1,084 (+1870.91%)
Mutual labels:  jupyter-notebook
Introduction To Machine Learning
Introductory Course on Machine Learning in Python
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Reinforcement Learning
Implementation of Reinforcement Learning algorithms in Python, based on Sutton's & Barto's Book (Ed. 2)
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Da detection
Progressive Domain Adaptation for Object Detection
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Deepfly3d
Motion capture (markerless 3D pose estimation) pipeline and helper GUI for tethered Drosophila.
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Sta 663 2018
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Vietnamese Electra
Electra pre-trained model using Vietnamese corpus
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Darknetpy
darknetpy is a simple binding for darknet's yolo detector
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Whitehat
Information about my experiences on ethical hacking 💀
Stars: ✭ 54 (-1.82%)
Mutual labels:  jupyter-notebook
Mri Analysis Pytorch
MRI analysis using PyTorch and MedicalTorch
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook
Ds and ml projects
Data Science & Machine Learning projects and tutorials in python from beginner to advanced level.
Stars: ✭ 56 (+1.82%)
Mutual labels:  jupyter-notebook
Timeseriesanalysiswithpython
Stars: ✭ 1,083 (+1869.09%)
Mutual labels:  jupyter-notebook
Text nn
Text classification models. Used a submodule for other projects.
Stars: ✭ 55 (+0%)
Mutual labels:  jupyter-notebook

Paraphrase Generator with T5

A paraphrase generator built with transformers that takes an English sentence as input and produces a set of paraphrased sentences. This is a conditional text-generation task in NLP. The model is T5ForConditionalGeneration from the Hugging Face transformers library. It is trained on Google's PAWS dataset and published on the Hugging Face model hub under the name Vamsi/T5_Paraphrase_Paws.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

  • Streamlit library
  • Hugging Face transformers library
  • PyTorch
  • TensorFlow

Installing

  • Streamlit
$ pip install streamlit
  • Hugging Face transformers library
$ pip install transformers
  • TensorFlow
$ pip install --upgrade tensorflow
  • PyTorch
Follow the installation instructions at https://pytorch.org/ and pick a build compatible with your system.

Running the web app

  • Clone the repository
$ git clone [repolink]
  • Run the Streamlit app
$ cd Streamlit
$ streamlit run paraphrase.py
  • Run the Flask app
$ cd Server
$ python server.py

The first server call takes some time because it downloads the model parameters. Later calls are faster because the parameters are stored in the cache.

General Usage

Both PyTorch and TensorFlow versions of the model are available.

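import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Run on the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("Vamsi/T5_Paraphrase_Paws")
# The model must live on the same device as its inputs.
model = AutoModelForSeq2SeqLM.from_pretrained("Vamsi/T5_Paraphrase_Paws").to(device)

sentence = "This is something which i cannot understand at all"

# Prefix the sentence with "paraphrase: " so the model knows the task.
text = "paraphrase: " + sentence + " </s>"

# A single sentence needs no padding; the attention mask is still returned.
encoding = tokenizer(text, return_tensors="pt")
input_ids = encoding["input_ids"].to(device)
attention_masks = encoding["attention_mask"].to(device)

# Sample five candidates with combined top-k and top-p (nucleus) sampling.
outputs = model.generate(
    input_ids=input_ids, attention_mask=attention_masks,
    max_length=256,
    do_sample=True,
    top_k=200,
    top_p=0.95,
    early_stopping=True,
    num_return_sequences=5
)

# Decode each generated sequence back into plain text.
for output in outputs:
    line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(line)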
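
Because the snippet above samples its outputs, some of the five candidates may repeat each other or simply echo the input. The following is a minimal sketch of two helpers that could wrap the snippet: one formats the input the same way the snippet does, the other filters the decoded lines. The names build_prompt and unique_paraphrases are hypothetical and are not part of the repository.

```python
def build_prompt(sentence):
    """Format a sentence like the usage snippet: task prefix plus end marker."""
    return "paraphrase: " + sentence.strip() + " </s>"


def unique_paraphrases(candidates, original):
    """Drop empty lines, duplicates, and echoes of the input, keeping order.

    Comparison is case-insensitive and ignores surrounding whitespace.
    """
    seen = {original.strip().lower()}
    kept = []
    for candidate in candidates:
        key = candidate.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(candidate.strip())
    return kept
```

In the loop above, the decoded lines could be collected into a list and passed through unique_paraphrases(lines, sentence) before printing.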

Dockerfile

The repository also contains a minimal reproducible Dockerfile that can be used to spin up a server with the API endpoints to perform text paraphrasing.

Note: The Dockerfile uses the built-in Flask development server, hence it's not recommended for production usage. It should be replaced with a production-ready WSGI server.

After cloning the repository, starting the local server takes two commands:

docker build -t paraphrase .
docker run -p 5000:5000 paraphrase

The API is then available on localhost:5000:

curl -XPOST localhost:5000/run_forward \
-H 'content-type: application/json' \
-d '{"sentence": "What is the best paraphrase of a long sentence that does not say much?", "decoding_params": {"tokenizer": "", "max_len": 512, "strategy": "", "top_k": 168, "top_p": 0.95, "return_sen_num": 3}}'
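
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the JSON fields of the curl call. The helper names build_request and paraphrase are hypothetical, and because the response format of the endpoint is not documented here, the body is returned as raw text rather than parsed.

```python
import json
import urllib.request

# Endpoint taken from the curl example; assumes the container runs locally.
API_URL = "http://localhost:5000/run_forward"


def build_request(sentence, top_k=168, top_p=0.95, return_sen_num=3, max_len=512):
    """Build a POST request carrying the same JSON fields as the curl call."""
    payload = {
        "sentence": sentence,
        "decoding_params": {
            "tokenizer": "",
            "max_len": max_len,
            "strategy": "",
            "top_k": top_k,
            "top_p": top_p,
            "return_sen_num": return_sen_num,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def paraphrase(sentence, timeout=120):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_request(sentence), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server running, print(paraphrase("What is the best paraphrase of a long sentence that does not say much?")) shows whatever the endpoint returns. The generous default timeout allows for the first call, which downloads the model parameters.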

Built With

Authors

Acknowledgments
