
borisdayma / Huggingtweets

Tweet Generation with Huggingface

Projects that are alternatives of or similar to Huggingtweets

Pytorch Computer Vision Cookbook
PyTorch Computer Vision Cookbook, Published by Packt
Stars: ✭ 122 (-1.61%)
Mutual labels:  jupyter-notebook
Simplegesturerecognition
A very simple gesture recognition technique using opencv and python
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Gdeltpyr
Python-based framework to retrieve Global Database of Events, Language, and Tone (GDELT) version 1.0 and version 2.0 data.
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Oc Nn
Repository for the One class neural networks paper
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Pygru4rec
PyTorch Implementation of Session-based Recommendations with Recurrent Neural Networks(ICLR 2016, Hidasi et al.)
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Predictive Maintenance
Data Wrangling, EDA, Feature Engineering, Model Selection, Regression, Binary and Multi-class Classification (Python, scikit-learn)
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Curso Series Temporales
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Dash Sample Apps
Open-source demos hosted on Dash Gallery
Stars: ✭ 2,090 (+1585.48%)
Mutual labels:  jupyter-notebook
Rl Quadcopter
Teach a Quadcopter How to Fly!
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Error Detection
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Flask Rest Setup
Notes on Flask REST API and tutorial
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Code2pix
code2pix: Generating Graphical User Interfaces from Code (A Differentiable Compiler)
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Carnd Lenet Lab
Implement the LeNet deep neural network model with TensorFlow.
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Data Science
Every week, new material will be available to guide your study of data science =)
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Nb2mail
Send a notebook as an email
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Mnist Coreml Training
Training MNIST with CoreML on Device
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
Software Training
RoboJackets Software Training
Stars: ✭ 124 (+0%)
Mutual labels:  jupyter-notebook
100 Days Of Nlp
Stars: ✭ 125 (+0.81%)
Mutual labels:  jupyter-notebook
Ipyvolume
3d plotting for Python in the Jupyter notebook based on IPython widgets using WebGL
Stars: ✭ 1,696 (+1267.74%)
Mutual labels:  jupyter-notebook
Jupyterlab Demo
Demonstrations of JupyterLab
Stars: ✭ 122 (-1.61%)
Mutual labels:  jupyter-notebook

HuggingTweets - Train a model to generate tweets

Create a tweet generator based on your favorite Twitter user in just 5 minutes

Make my own model with the demo →

or access existing models →

Introduction

I developed HuggingTweets to try to predict Elon Musk's next breakthrough 😉

huggingtweets illustration

This project fine-tunes a pre-trained neural network on a user's tweets using HuggingFace Transformers, an awesome open source library for Natural Language Processing. The resulting model can then generate new tweets for you!

Training and results are automatically logged into W&B through the HuggingFace integration.
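
For illustration, here is a minimal sketch of what such a fine-tuning run can look like. This is not the project's actual code: the tweets.txt file (one tweet per line), the hyperparameters, and the report_to="wandb" flag (recent Transformers versions) are my assumptions.

```python
# Hypothetical sketch: fine-tune GPT-2 on a file of scraped tweets with
# HuggingFace Transformers, streaming metrics to W&B via the built-in integration.
from transformers import (AutoTokenizer, AutoModelForCausalLM, TextDataset,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Build a causal language-modeling dataset from the tweets (one per line).
dataset = TextDataset(tokenizer=tokenizer, file_path="tweets.txt", block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="output",
    num_train_epochs=4,
    per_device_train_batch_size=1,
    logging_steps=10,
    report_to="wandb",          # HuggingFace integration: metrics go to W&B
    run_name="huggingtweets-demo",
)

trainer = Trainer(model=model, args=args, data_collator=collator, train_dataset=dataset)
trainer.train()
trainer.save_model("output")    # reload later for generation
```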

Usage

To test the demo, click on the link below and share your predictions!

Open In Colab

You can also run the demo locally by installing the dependencies with pipenv or pip and opening huggingtweets-demo.ipynb

Results

My favorite sample is definitely from Andrej Karpathy, with the start-of-sentence prompt "I don't like":

I don't like this :) 9:20am: Forget this little low code and preprocessor optimization. Even if it's neat, for top-level projects. 9:27am: Other useful code examples? It's not kind of best code, :) 9:37am: Python drawing bug like crazy, restarts regular web browsing ;) 9:46am: Okay, I don't mind. Maybe I should try that out! I'll investigate it :) 10:00am: I think I should try Shigemitsu's imgur page. Or the minimalist website if you're after 10/10 results :) Also maybe Google ImageNet on "Yelp" instead :) 10:05am: Looking forward to watching it talk!

I had a lot of fun running predictions on other people too!
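
As an illustration of how such samples can be produced, here is a hedged sketch using the Transformers text-generation pipeline. It assumes the fine-tuned model was saved to a local "output" directory (as in the sketch above); the prompt and sampling parameters are illustrative.

```python
# Hypothetical sketch: sample new tweets from the fine-tuned model with a prompt.
from transformers import pipeline

generator = pipeline("text-generation", model="output", tokenizer="gpt2")

samples = generator(
    "I don't like",            # start-of-sentence prompt, as in the sample above
    max_length=60,
    num_return_sequences=3,
    do_sample=True,            # sampling rather than greedy decoding
    top_p=0.95,
)
for s in samples:
    print(s["generated_text"])
```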

How does it work?

To understand how the model was developed, check my W&B report.

You can also explore the development version huggingtweets-dev.ipynb or use the following link.

Open In Colab

The files required to run W&B sweeps are in the dev folder.
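
For reference, a W&B sweep over fine-tuning hyperparameters can be launched along these lines. This is a hypothetical sketch, not the project's sweep configuration: the search space, metric name, and train() entry point are illustrative assumptions.

```python
# Hypothetical sketch: define and run a W&B hyperparameter sweep.
import wandb

sweep_config = {
    "method": "random",
    "metric": {"name": "train/loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 5e-4},
        "num_train_epochs": {"values": [2, 3, 4]},
    },
}

def train():
    # A real run would read wandb.config and pass these values to the Trainer.
    with wandb.init() as run:
        print(run.config)

sweep_id = wandb.sweep(sweep_config, project="huggingtweets")
wandb.agent(sweep_id, function=train, count=10)
```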

Future research

I still have more research to do:

  • evaluate how to "merge" two different personalities;
  • test training top layers vs bottom layers to see how it affects learning of the lexical field (subject of content) vs word prediction, and memorization vs creativity;
  • augment text data with adversarial approaches;
  • pre-train on a large Twitter dataset of many people;
  • explore few-shot learning approaches, as we have limited data per user though there are probably only a few writing styles;
  • implement a pipeline to continuously train the network on new tweets;
  • cluster users and identify topics, writing style…

About

Built by Boris Dayma


My main goals with this project are:

  • to experiment with how to train, deploy and maintain neural networks in production;
  • to make AI accessible to everyone;
  • to have fun!

For more details, visit the project repository.


Disclaimer: this project is not meant to be used to publish any false generated information, but to perform research on Natural Language Generation.

FAQ

  1. Does this project pose a risk of being used for disinformation?

    Large NLP models can be misused to publish false data. OpenAI performed a staged release of GPT-2 to study any potential misuse of their models.

    I want to ensure the latest AI technologies are accessible to everyone, to promote fairness and prevent social inequality.

    HuggingTweets shall not be used to create inappropriate content, nor for any illicit or unethical purposes. Any text generated from other users' tweets must be explicitly referenced as such and cannot be published with the intent of hiding its origin. No generated content can be published against a person unwilling to have their data used in this way.

  2. Why is the demo in Colab instead of being a standalone web app?

    It actually looks much better with Voilà, as the code cells are hidden and automatically executed. Also, we can easily deploy it for free on Binder.

    However, training such large neural networks requires a GPU (not available on Binder, and not cheap), and I wanted to make HuggingTweets accessible to everybody. Google Colab generously offers free GPUs, so it is the perfect place to host the demo.

Resources

Got questions about W&B?

If you have any questions about using W&B to track your model performance and predictions, please reach out to the Slack community.

Acknowledgements

I was able to make the first version of this program in just a few days.

It would not have been possible without these people and these open-source tools:

  • W&B for the great tracking & visualization tools for ML experiments;
  • HuggingFace for providing a great framework for Natural Language Understanding;
  • Tweepy for providing a great API to interact with Twitter (used in the dev notebook);
  • Chris Van Pelt for hacking with me on the demo;
  • Lavanya Shukla and Carey Phelps for their continuous feedback;
  • Google Colab for letting people access free GPU!