
dlebech / lyrics-generator

License: MIT License
Generating lyrics with a recurrent neural network

Programming Languages

Python
139335 projects - #7 most used programming language
JavaScript
184084 projects - #8 most used programming language
HTML
75241 projects
CSS
56736 projects
Dockerfile
14818 projects

Projects that are alternatives of or similar to lyrics-generator

Meetup-Content
Entirety.ai Intuition to Implementation Meetup Content.
Stars: ✭ 33 (-8.33%)
Mutual labels:  recurrent-neural-networks
Deep-Learning
This repo provides projects on deep-learning mainly using Tensorflow 2.0
Stars: ✭ 22 (-38.89%)
Mutual labels:  recurrent-neural-networks
LSM
Liquid State Machines in Python and NEST
Stars: ✭ 39 (+8.33%)
Mutual labels:  recurrent-neural-networks
Conversational-AI-Chatbot-using-Practical-Seq2Seq
A simple open domain generative based chatbot based on Recurrent Neural Networks
Stars: ✭ 17 (-52.78%)
Mutual labels:  recurrent-neural-networks
classifying-cancer
A Python-Tensorflow neural network for classifying cancer data
Stars: ✭ 30 (-16.67%)
Mutual labels:  recurrent-neural-networks
DeepSeparation
Keras Implementation and Experiments with Deep Recurrent Neural Networks for Source Separation
Stars: ✭ 19 (-47.22%)
Mutual labels:  recurrent-neural-networks
rnn darts fastai
Implement Differentiable Architecture Search (DARTS) for RNN with fastai
Stars: ✭ 21 (-41.67%)
Mutual labels:  recurrent-neural-networks
CS231n
PyTorch/Tensorflow solutions for Stanford's CS231n: "CNNs for Visual Recognition"
Stars: ✭ 47 (+30.56%)
Mutual labels:  recurrent-neural-networks
tictactoe-ai-tfjs
Train your own TensorFlow.js Tic Tac Toe
Stars: ✭ 45 (+25%)
Mutual labels:  tensorflowjs
tensorflowjs-webcam-transfer-learning
Tensorflowjs Webcam Transfer Learning
Stars: ✭ 49 (+36.11%)
Mutual labels:  tensorflowjs
digit-recognizer-live
Recognize Digits using Deep Neural Networks in Google Chrome live!
Stars: ✭ 29 (-19.44%)
Mutual labels:  tensorflowjs
dl-relu
Deep Learning using Rectified Linear Units (ReLU)
Stars: ✭ 20 (-44.44%)
Mutual labels:  recurrent-neural-networks
STORN-keras
This is a STORN (Stochastical Recurrent Neural Network) implementation for keras!
Stars: ✭ 23 (-36.11%)
Mutual labels:  recurrent-neural-networks
color-pop
🌈 Automatic Color Pop effect on any image inspired by Google Photos
Stars: ✭ 21 (-41.67%)
Mutual labels:  tensorflowjs
entailment-neural-attention-lstm-tf
(arXiv:1509.06664) Reasoning about Entailment with Neural Attention.
Stars: ✭ 43 (+19.44%)
Mutual labels:  recurrent-neural-networks
imessage-chatbot
💬 Recurrent neural network -- generates messages in your style of speech! Trained on imessage data. Sqlite3, TensorFlow, Flask, Twilio SMS, AWS.
Stars: ✭ 33 (-8.33%)
Mutual labels:  recurrent-neural-networks
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (+19.44%)
Mutual labels:  recurrent-neural-networks
bodymoji
Draws an emoji on your face! Powered by Nuxt.js, Tensorflow.js and Posenet
Stars: ✭ 21 (-41.67%)
Mutual labels:  tensorflowjs
spikeRNN
No description or website provided.
Stars: ✭ 28 (-22.22%)
Mutual labels:  recurrent-neural-networks
splat
Motion-controlled Fruit Ninja clone using Three.js & Tensorflow.js
Stars: ✭ 84 (+133.33%)
Mutual labels:  tensorflowjs

Lyrics Generator


This is a small experiment in generating lyrics with a recurrent neural network, trained with Keras and Tensorflow 2.

It works in the browser with Tensorflow.js! Try it here.

The model can be trained at either the word level or the character level; each has its own pros and cons.

Pre-trained models

A few pre-trained models can be found here.

Train the model

Install dependencies

Requires Python 3.7+.

pip install -r requirements.txt

The requirements file has been reduced in size, so if any of the scripts fail, just install the missing packages :-)

Get the data

  • Create a song dataset. See "Create your own song dataset" below.
    • Save the dataset as songdata.csv in a data sub-directory.
    • Alternatively, you can name it anything you like and use the --songdata-file parameter when training.
  • Download the GloVe embeddings (a download sketch follows this list).
    • Save the glove.6B.50d.txt file in a data sub-directory.
    • Alternatively, you can create your own word2vec embedding (see below).
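If you prefer to script the GloVe download, a minimal sketch along these lines should work (the glove.6B.zip URL below is an assumption based on the usual Stanford NLP hosting location; adjust it if the file has moved):

# Sketch: download the GloVe archive and extract the 50-dimensional vectors into data/.
# The URL is an assumption; the archive is large, so this takes a while.
import io
import urllib.request
import zipfile
from pathlib import Path

GLOVE_URL = "https://nlp.stanford.edu/data/glove.6B.zip"
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

with urllib.request.urlopen(GLOVE_URL) as response:
    archive = zipfile.ZipFile(io.BytesIO(response.read()))
    archive.extract("glove.6B.50d.txt", path=data_dir)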

Create your own song dataset

The code expects an input dataset to be stored at data/songdata.csv by default (this can be changed in config.py or via the CLI parameter --songdata-file).

The file should be in CSV format with the following columns (case sensitive):

  • artist
    • A string, e.g. "The Beatles"
  • text
    • A string with the entire lyrics for one song, including newlines.

You can have any number of other columns; they will just be ignored.
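As a minimal illustration of the expected layout, a tiny songdata.csv could be written like this (the artist name and lyrics are made-up placeholders):

# Sketch: write a minimal songdata.csv with the required "artist" and "text" columns.
# The artist and lyrics below are placeholders, not real data.
import csv
from pathlib import Path

Path("data").mkdir(exist_ok=True)
rows = [
    {"artist": "The Placeholders", "text": "la la la\nsinging in the rain\nla la la"},
]

with open("data/songdata.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["artist", "text"])
    writer.writeheader()
    writer.writerows(rows)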

A sample dataset with a simple text is provided in sample.csv. To check that things are working, you can train using that file:

python -m lyrics.train --songdata-file sample.csv --early-stopping-patience 50 --artists '*'

Dataset suggestions

  • Download the billboardHot100_1999-2019.csv file from the Data on Songs from Billboard 1999-2019 dataset.
    • Put it into the data/ folder and run the python scripts/billboard.py script, which will prepare the file for training.
    • (Optional) pip install fasttext to detect language. If it is not installed, the language is not detected.
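The exact behaviour of the billboard script is not shown here, but language detection with fasttext typically looks like the sketch below (the lid.176.bin language-identification model is a separate download from the fastText website):

# Sketch: detect the language of a lyric snippet with fasttext.
# Assumes the pre-trained lid.176.bin language-identification model has been downloaded.
import fasttext

model = fasttext.load_model("lid.176.bin")
labels, probabilities = model.predict("you are my fire, the one desire")
print(labels, probabilities)  # e.g. ('__label__en',) with a confidence score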

(Optional) Create a word2vec embedding matrix

If you have the songdata.csv file from above, you can simply create the word2vec vectors like this:

python -m lyrics.embedding --name-suffix _myembedding

This will create word2vec_myembedding.model and word2vec_myembedding.txt files in the default data directory data/. Use -h to see other options, such as artists and a custom songdata file.
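Both the GloVe file and the generated .txt file use a plain-text layout with one token per line followed by its vector values, so they can be inspected or loaded with a small helper like this (a sketch; the leading header line that gensim writes in word2vec text format, if present, is skipped):

# Sketch: load a plain-text embedding file into a dict of numpy vectors.
# Works for glove.6B.50d.txt and, assuming the same layout, the generated word2vec .txt.
import numpy as np

def load_embeddings(path):
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) == 2:  # gensim's "vocab_size dimension" header line
                continue
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors

embeddings = load_embeddings("data/word2vec_myembedding.txt")
print(len(embeddings), "tokens,", len(next(iter(embeddings.values()))), "dimensions")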

Run the training

python -m lyrics.train -h

This command by default takes care of all the training. Warning: it takes a very long time on a normal CPU!

Check -h for options. For example, if you want to use a different embedding than the GloVe embedding:

python -m lyrics.train --embedding-file ./embeddings.txt

The embeddings are still assumed to be 50-dimensional.
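If you are unsure whether a custom embedding file has the right dimensionality, a quick check like this can save a failed training run (./embeddings.txt is just the example path from above):

# Sketch: confirm that every vector in an embedding file is 50-dimensional.
# A leading header line, if any, shows up as 1 in the set and can be ignored.
with open("./embeddings.txt", encoding="utf-8") as f:
    dims = {len(line.rstrip().split(" ")) - 1 for line in f if line.strip()}
print("vector dimensions found:", dims)  # should print {50}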

The output model and tokenizer are stored in a timestamped folder like export/2020-01-01T010203 by default.

Note: During experimentation, I found that raising the batch size to something like 2048 speeds up processing, but whether this is feasible depends on your hardware resources, of course.

Training on GPU

I have found it easier to train on GPU by using Docker and nvidia-docker, rather than trying to install CUDA myself. To do this, first make sure you have nvidia-docker set up correctly, and then:

docker build -t lyrics-gpu .
docker run --rm -it --gpus all -v $PWD:/tf/src -u $(id -u):$(id -g) lyrics-gpu bash

Then run the normal commands from there, e.g. python -m lyrics.train.

Tip: You might want to use the parameter --gpu-speedup! Just note that this will disable the Tensorflowjs compatibility, regardless of whether you have set the --tfjs-compatible flag.

Tip: If you get a cryptic Tensorflow error like errors_impl.CancelledError: [_Derived_]RecvAsync is cancelled. while training on GPU, try prepending the train command with TF_FORCE_GPU_ALLOW_GROWTH=true, e.g.:

TF_FORCE_GPU_ALLOW_GROWTH=true python -m lyrics.train --transform-words --num-lines-to-include=10 --artists '*' --gpu-speedup
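Setting the environment variable has the same effect as enabling memory growth from Python at the top of a script; this is standard TensorFlow 2 API, not something specific to this project:

# Sketch: enable GPU memory growth programmatically instead of via
# TF_FORCE_GPU_ALLOW_GROWTH=true (plain TensorFlow 2 API).
import tensorflow as tf

for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)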

Use transformer network

To use the Universal Sentence Encoder or BERT architecture, use the --transformer-network parameter:

python -m lyrics.train --transformer-network [use|bert]

Note: These models are not going to work in Tensorflow JS currently, so they should only be used from the command line.

Note: I have not been able to get any results with BERT. It is only included for illustration purposes.

Character-level predictions

In the default training mode, the model predicts the next word, given a sequence of words. Changing the model to predict the next character can be done using the --char-level flag.

python -m lyrics.train --char-level
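Conceptually, the switch only changes how the text is tokenized; with the Keras tokenizer this corresponds to the char_level flag, as in the generic illustration below (not the project's exact preprocessing code):

# Sketch: word-level vs character-level tokenization with the Keras Tokenizer.
# Illustration only; the project's own preprocessing may differ in detail.
from tensorflow.keras.preprocessing.text import Tokenizer

text = ["you are my fire"]

word_tokenizer = Tokenizer()
word_tokenizer.fit_on_texts(text)
print(word_tokenizer.texts_to_sequences(text))   # one integer per word

char_tokenizer = Tokenizer(char_level=True)
char_tokenizer.fit_on_texts(text)
print(char_tokenizer.texts_to_sequences(text))   # one integer per character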

Create new lyrics

python -m cli lyrics model.h5 tokenizer.pickle

Try python -m cli lyrics -h to find out more. For example, using --randomness and --text is recommended.

If you want to add newlines to the seed text via --text, you need to add a space on each side. For example, this works in Bash:

--text $'you are my fire \n the one desire'
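Under the hood, generation boils down to loading the exported model and tokenizer and repeatedly predicting the next token. The sketch below illustrates that loop for a word-level model; it is a hypothetical simplification, not the project's cli module, which also handles options such as --randomness:

# Sketch: a minimal greedy word-level generation loop with the exported artifacts.
# Hypothetical illustration; the real CLI supports more options (e.g. --randomness).
import pickle
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.sequence import pad_sequences

model = tf.keras.models.load_model("model.h5")
with open("tokenizer.pickle", "rb") as f:
    tokenizer = pickle.load(f)

sequence_length = model.input_shape[1]  # assumes a fixed-length input layer
index_to_word = {index: word for word, index in tokenizer.word_index.items()}

text = "you are my fire"
for _ in range(20):
    encoded = tokenizer.texts_to_sequences([text])[0]
    encoded = pad_sequences([encoded], maxlen=sequence_length)
    probabilities = model.predict(encoded, verbose=0)[0]
    next_index = int(np.argmax(probabilities))
    text += " " + index_to_word.get(next_index, "")

print(text)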

Export to Tensorflow JS (used for the app)

Note: Make sure to use the --tfjs-compatible flag during training!

python -m cli export model.h5 tokenizer.pickle

This creates a sub-directory export/js with the relevant files (which can be used for the app).
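The export is essentially a Keras-to-TensorFlow.js conversion. If you need to run it outside the CLI, the tensorflowjs Python package provides an equivalent converter (a sketch; the paths are just examples):

# Sketch: convert a trained Keras model to the TensorFlow.js layers format.
# Requires the tensorflowjs package (pip install tensorflowjs); paths are examples.
import tensorflow as tf
import tensorflowjs as tfjs

model = tf.keras.models.load_model("model.h5")
tfjs.converters.save_keras_model(model, "export/js")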

Single-page "app" for creating lyrics

Note: Make sure to use the --tfjs-compatible flag during training!

The lyrics-tfjs sub-directory has a simple web page that can be used to create lyrics in the browser. The code expects data to be found in a data/ sub-directory. This includes the words.json file, model.json, and any extra files generated by the Tensorflow export.

Demo.

Development

Make sure to get all dependencies:

pip install -r requirements_dev.txt

Testing

python -m pytest --cov=lyrics tests/