mosheman5 / timbre_painting

Licence: other
Hierarchical fast and high-fidelity audio generation

Projects that are alternatives of or similar to timbre painting

char-VAE
Inspired by the neural style algorithm in the computer vision field, we propose a high-level language model with the aim of adapting the linguistic style.
Stars: ✭ 18 (-73.13%)
Mutual labels:  generative-model
generative deep learning
Generative Deep Learning Sessions led by Anugraha Sinha (Machine Learning Tokyo)
Stars: ✭ 24 (-64.18%)
Mutual labels:  generative-model
DMXOPL
YMF262-enhanced FM patch set for Doom and source ports.
Stars: ✭ 42 (-37.31%)
Mutual labels:  sound
aura
A fast and lightweight 3D audio engine for Kha.
Stars: ✭ 31 (-53.73%)
Mutual labels:  sound
debeat
Sound Library for the Defold Engine
Stars: ✭ 20 (-70.15%)
Mutual labels:  sound
useAudioPlayer
Custom React hook & context for controlling browser audio
Stars: ✭ 176 (+162.69%)
Mutual labels:  sound
roover
🐱 A lightweight audio library for React apps.
Stars: ✭ 70 (+4.48%)
Mutual labels:  sound
ArduinoProtonPack
Arduino Code for a GhostBusters Proton Pack
Stars: ✭ 57 (-14.93%)
Mutual labels:  sound
uBraids SE
8HP Eurorack module | Voltage-controlled digital oscillator
Stars: ✭ 15 (-77.61%)
Mutual labels:  sound
PdWebParty
An app that allows Pd users to run patches in a web browser and share them with a web link
Stars: ✭ 37 (-44.78%)
Mutual labels:  sound
vae-torch
Variational autoencoder for anomaly detection (in PyTorch).
Stars: ✭ 38 (-43.28%)
Mutual labels:  generative-model
graph-nvp
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Stars: ✭ 69 (+2.99%)
Mutual labels:  generative-model
opendev
OpenDev is a non-profit project that tries to collect as many resources (assets) of free use for the development of video games and applications.
Stars: ✭ 34 (-49.25%)
Mutual labels:  sound
Eisenkraut
A multi-channel and hi-res capable audio file editor.
Stars: ✭ 50 (-25.37%)
Mutual labels:  sound
Simple-Unity-Audio-Manager
A decentralized audio playing system for Unity, designed for simplicity and built to scale!
Stars: ✭ 100 (+49.25%)
Mutual labels:  sound
Mezzanine
A game engine that supports high performance 3d graphics physics and sound
Stars: ✭ 18 (-73.13%)
Mutual labels:  sound
ac-audio-extractor
Audio Commons Audio Extractor
Stars: ✭ 33 (-50.75%)
Mutual labels:  sound
TriangleGAN
TriangleGAN, ACM MM 2019.
Stars: ✭ 28 (-58.21%)
Mutual labels:  generative-model
AudioFile
Audiofile library for Scala.
Stars: ✭ 20 (-70.15%)
Mutual labels:  sound
Generalized-PixelVAE
PixelVAE with or without regularization
Stars: ✭ 64 (-4.48%)
Mutual labels:  generative-model

Hierarchical Timbre-Painting and Articulation Generation

Open In Colab

This repository provides an official PyTorch implementation of "Hierarchical Timbre-Painting and Articulation Generation"

Our method generates high-fidelity audio for a target instrument, based on f0 and loudness signals.

During training, the loudness and f0 signals are extracted from the ground-truth signal, which enables us to convert the melody of any input instrument to the trained instrument, a task also known as Timbre Transfer.
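As a rough illustration of the loudness side of this conditioning, a per-frame loudness track can be computed from the waveform; the sketch below uses plain RMS-in-dB as a simplified stand-in (the function name, frame sizes, and the omission of perceptual weighting are all illustrative, not the repository's implementation):

```python
import numpy as np

def frame_loudness_db(audio, frame_size=1024, hop=256, eps=1e-8):
    """Per-frame RMS loudness in dB (simplified; no perceptual weighting)."""
    n_frames = 1 + (len(audio) - frame_size) // hop
    loud = np.empty(n_frames)
    for i in range(n_frames):
        frame = audio[i * hop : i * hop + frame_size]
        loud[i] = 20 * np.log10(np.sqrt(np.mean(frame ** 2)) + eps)
    return loud

# toy example: one second of a 440 Hz sine at 16 kHz
sr = 16000
t = np.arange(sr) / sr
audio = 0.5 * np.sin(2 * np.pi * 440 * t)
loudness = frame_loudness_db(audio)  # one dB value per frame
```

A constant-amplitude sine gives a nearly flat loudness track around 20*log10(0.5/sqrt(2)) ≈ -9 dB.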

Audio Samples | Paper | Pretrained Models | Timbre Transfer Colab Demo

We suggest separating the generation process into two consecutive phases:

  • Articulation - We generate the backbone of the audio and the transitions between notes. This is done at a low sample rate from the given conditions, the loudness and f0 inputs. We use a sine excitation based on the extracted f0 signal, hence using the generator as a Neural-Source-Filtering network rather than a classic GAN generator conditioned on random noise.
  • Timbre Painting - The next phase is composed of timbre-painting networks: each network gets as input the previously generated audio and serves as a learnable upsampling network. Each timbre-painting network adds sample-rate-specific details to the audio clip.
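The sine excitation driving the articulation phase can be sketched as follows: upsample a frame-rate f0 track to sample rate and accumulate phase. This is a minimal assumed sketch (function name, hop size, and nearest-neighbour upsampling are illustrative choices, not the repository's exact code):

```python
import numpy as np

def sine_excitation(f0_hz, sr=16000, hop=64):
    """Build a sine excitation from a frame-rate f0 track (Hz) by
    upsampling it to sample rate and accumulating instantaneous phase."""
    f0_up = np.repeat(f0_hz, hop)              # nearest-neighbour upsample
    phase = 2 * np.pi * np.cumsum(f0_up) / sr  # integrate frequency -> phase
    return np.sin(phase)

# constant 220 Hz track, 100 frames -> 6400 samples of a 220 Hz sine
excitation = sine_excitation(np.full(100, 220.0))
```

Unvoiced regions can be handled by setting f0 to zero, which freezes the phase and produces silence-friendly input for the filtering network.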

Dependencies

The needed packages are listed in requirements.txt.

Using a virtual environment is recommended:

virtualenv -p python3 .venv
source .venv/bin/activate
pip install -r requirements.txt

To use distributed runs, please install apex.

Usage

Hydra is used for configuration and experiment management; for more info refer to https://hydra.cc/

1. Cloning the repository

$ git clone https://github.com/mosheman5/timbre_painting.git
$ cd timbre_painting

2. Data Preparation

URMP Dataset

To download the URMP dataset used in our paper, please fill out the form.

After downloading, extract the contents of the file to a folder named urmp and run the following script to preprocess the data:

python create_data.py

Other datasets

To train the model on any other dataset of monophonic instruments, copy the audio files to the data_tmp directory, each instrument in a separate folder, and run:

python create_data.py urmp=null

Default parameters are given at conf/data_config.yaml, overrides should be given in command line.

Please note that the default parameters are defined for the URMP dataset; for other datasets some tuning might be needed (especially the data_processor.params.confidence_threshold and data_processor.params.silence_thresh_dB parameters).

3. Training

3.1 Single GPU

To train with the original paper's parameters, run:

python main.py

Default parameters are given at conf/runs/main.yaml, overrides should be given in command line.

For example, the following line runs an experiment on a dataset folder named 'flute' for 400 epochs with a batch size of 4:

python main.py paths.input_data=data.flute optim.epochs=400 optim.batch_size=4

Results are saved in the folder outputs/main/${%Y-%m-%d_%H-%M-%S}.

3.2 Multiple GPUs / Machines

DDP is supported in the code via the Apex package. To run in distributed mode, use the following template:

python -m torch.distributed.launch --use_env --nproc_per_node {# of gpus} main.py {argument overrides}

You can set CUDA_VISIBLE_DEVICES=0,1 to choose the GPUs to run on; in this example, GPUs 0 and 1 on the machine.

4. Timbre Transfer

To transfer the timbre of your files using a trained network, run:

python timbre_painting.py trained_dirpath={path/to/trained_model} input_dirpath={path/to/audio_sample_folder}

Default parameters are given at conf/transfer_config.yaml.

The generated files are saved in the experiment folder, in the generation subdirectory. Each input is generated in 5 versions, with octave shifts ranging over [-2, 2].
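The five octave variants presumably correspond to scaling the extracted f0 track before synthesis; a minimal sketch of that scaling (variable names and the exact mechanism are assumptions, not the script's code):

```python
import numpy as np

f0 = np.array([220.0, 220.0, 440.0])      # toy f0 track in Hz
octave_shifts = range(-2, 3)              # the 5 versions: -2 .. +2

# doubling/halving frequency per octave: f0 * 2**shift
variants = {n: f0 * 2.0 ** n for n in octave_shifts}
# variants[-2] is the track two octaves down: 55, 55, 110 Hz
```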

Pretrained Models

Pretrained models of instruments from the URMP dataset are summarized in the table below. The models can be downloaded from the attached Google Drive links. Download a model, extract it, and follow the Timbre Transfer instructions to generate audio.

Instrument: Violin | Saxophone | Trumpet | Cello

Citation

If you found this code useful, please cite the following paper:

@inproceedings{michelashvili2020timbre-painting,
  title={Hierarchical Timbre-Painting and Articulation Generation},
  author={Michael Michelashvili and Lior Wolf},
  booktitle={21st International Society for Music Information Retrieval Conference (ISMIR 2020)},
  year={2020}
}

Code References

Acknowledgement

Credit to Adam Polyak for the PyTorch CREPE pitch-extraction implementation and for helpful discussions.
