cyrilou242 / RapLyrics-Back

License: MIT

Model training, custom generative function, and serving for raplyrics.eu, a rap-music lyrics generation project.


RapLyrics-Back

Part of the raplyrics.eu project: generating AI-powered rap lyrics.

lyrics generation with curl

This repository contains:

  • a hand-curated, human-cleaned rap lyrics dataset
  • the implementation of a generative RNN model (based on minimaxir's textgenrnn)
  • a custom generative function and a serving implementation

Getting started

Setup

Get the repo. Clone from GitHub:

$ git clone https://github.com/cyrilou242/RapLyrics-Back  

Set up a virtualenv and install the required libraries.

Note: make sure you are running Python 3.6 (check with python --version); otherwise some libraries (especially tensorflow) may not be available or may not behave as expected.

$ cd RapLyrics-Back  
$ mkvirtualenv --python `which python3.6` -r requirements.txt RapLyrics_Back  

Train the model

The training is based on minimaxir's textgenrnn, with small pre-processing tweaks. The library is still under active development, so you may want to use it instead of our fork:

$ rm -r textgenrnn  
$ pip install textgenrnn    

train.py contains the training parameters and the path to the training dataset.
Have a look at it and launch your own training. To train on the provided rap lyrics dataset, leave everything unchanged. Training creates both a word-level model and a character-level model.

Available parameters:

  • new_model: True to create a new model, False to resume training an existing one.
  • rnn_layers: number of recurrent LSTM layers in the model (default: 2)
  • rnn_size: number of cells in each LSTM layer (default: 128)
  • rnn_bidirectional: whether to use bidirectional LSTMs, which process sequences both forwards and backwards; recommended if the input text follows a specific schema (default: False)
  • max_length: maximum number of previous characters/words used to predict the next token; should be reduced for word-level models (default: 40)
  • max_words: maximum number of words (by frequency) to consider for training (default: 10000)
  • dim_embeddings: dimensionality of the character/word embeddings (default: 100)
  • num_epochs: number of training epochs (the model is saved at each epoch by default)
  • word_level: whether to train the model at the word level (default: False)
  • dropout: proportion of characters to ignore in each sequence; may lead to better results. Do not use with a word-level model.
  • train_size: proportion of the data to use as the training set; the remainder becomes a validation set. Default is 1 (no validation set).
  • gen_epochs: during training, a sample generation is produced at each multiple of gen_epochs. Default is 1 (a sample at every epoch).
  • max_gen_length: length of the samples generated during training
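For reference, these parameters can be gathered in a plain dict and forwarded to the training call. The sketch below is illustrative only: the dataset path, model name, and num_epochs value are placeholders, and the commented-out call assumes the upstream textgenrnn API rather than this repo's train.py.

```python
# Sketch of a training configuration mirroring the defaults listed above.
# The num_epochs value, model name, and dataset path are placeholders.
train_cfg = {
    "new_model": True,           # start a model from scratch
    "rnn_layers": 2,             # default: 2
    "rnn_size": 128,             # default: 128
    "rnn_bidirectional": False,  # default: False
    "max_length": 40,            # default: 40 (reduce for word-level models)
    "max_words": 10000,          # default: 10000
    "dim_embeddings": 100,       # default: 100
    "num_epochs": 10,            # placeholder; model is saved at each epoch
    "word_level": False,         # character-level model
    "train_size": 1.0,           # 1.0 = no validation split
    "gen_epochs": 1,             # sample generation at every epoch
}

# With the upstream library installed (pip install textgenrnn):
# from textgenrnn import textgenrnn
# textgen = textgenrnn(name="rap_char")
# textgen.train_from_file("path/to/lyrics.txt", **train_cfg)
```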

3 files will be generated by the training:

  • Model_name_config.json: the parameters given for the training
  • Model_name_vocab.json: the vocabulary of the dataset
  • Model_name_weights.hdf5: the model weights

Put these files in the weights folder.
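A small helper can tidy these files away after a run. This is an illustrative sketch, not part of the repo: the model-name prefix and directory layout are assumptions to adapt to your own training run.

```python
import shutil
from pathlib import Path

def collect_model_files(model_name: str, src: str = ".", dst: str = "weights"):
    """Move the three files produced by training into the weights folder.

    model_name is the prefix used by the training run (placeholder here).
    Returns the list of file names that were moved.
    """
    Path(dst).mkdir(exist_ok=True)
    moved = []
    for suffix in ("config.json", "vocab.json", "weights.hdf5"):
        f = Path(src) / f"{model_name}_{suffix}"
        if f.exists():
            shutil.move(str(f), str(Path(dst) / f.name))
            moved.append(f.name)
    return moved
```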

Serve the model

The paths to the model files are defined in the first lines of server.py.
Change them to use your custom model.

The parameters of the custom generative function can also be tweaked.
(Documentation for this function will be issued in a separate document.)

To test the serving, launch the development flask server:

$ python app.py

Make a request to the server in a different terminal:

$ curl '127.0.0.1:5000/apiUS' -X POST -H "Content-Type: application/x-www-form-urlencoded" -d "input=low life"  

{"output":"low life down the whip\nand i don't mean it 'cause i never had a death\ni guess i have a story of the bitches change\nfor the life\nfor the life that i know will you get through\n'cause i don't give a fuck i got the way before i started inhale"}
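The same request can be made from Python with only the standard library. A minimal sketch, assuming the development server above is running on 127.0.0.1:5000; the function and constant names are illustrative, not part of the repo:

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://127.0.0.1:5000/apiUS"  # development server from the step above

def encode_seed(seed: str) -> bytes:
    # The endpoint expects an application/x-www-form-urlencoded "input" field.
    return urllib.parse.urlencode({"input": seed}).encode("utf-8")

def generate(seed: str) -> str:
    req = urllib.request.Request(API_URL, data=encode_seed(seed), method="POST")
    with urllib.request.urlopen(req) as resp:  # blocks until the server replies
        return json.loads(resp.read())["output"]

# generate("low life")  # requires the flask dev server to be running
```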

For serving in production, use a production WSGI server; we recommend gunicorn.
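As a deployment sketch, a typical gunicorn invocation might look like the following; the module and app names (app:app), worker count, and bind address are assumptions to adapt to your setup:

```shell
# Illustrative only: assumes the Flask object defined in app.py is named "app".
pip install gunicorn
gunicorn --workers 2 --bind 0.0.0.0:5000 app:app
```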

Next steps:

  • Further research and testing with character-level models, which look very promising.