Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → maxfrenzel → Samplevae

maxfrenzel / Samplevae

Multi-purpose tool for sound design and music production implemented in TensorFlow.

Labels

jupyter-notebook

Projects that are alternatives of or similar to Samplevae

Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016

Stars: ✭ 86 (-2.27%)

Mutual labels: jupyter-notebook

Few Shot Text Classification

Code for reproducing the results from the paper Few Shot Text Classification with a Human in the Loop

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Game Theory And Python

Game Theory and Python, a workshop investigating repeated games using the prisoner's dilemma

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Detection Hackathon Apt29

Place for resources used during the Mordor Detection hackathon event featuring APT29 ATT&CK evals datasets

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Deep Learning Notes

Experiments with Deep Learning

Stars: ✭ 1,278 (+1352.27%)

Mutual labels: jupyter-notebook

Simple Qa Emnlp 2018

Code for my EMNLP 2018 paper "SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach"

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Deep Learning Boot Camp

A community run, 5-day PyTorch Deep Learning Bootcamp

Stars: ✭ 1,270 (+1343.18%)

Mutual labels: jupyter-notebook

Deprecated Boot Camps

DEPRECATED: please see individual lesson repositories for current material.

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Implementation of the paper - Generative Recurrent Networks for De Novo Drug Design.

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Generative Adversarial Networks for High Energy Physics extended to a multi-layer calorimeter simulation

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Lstm autoencoder classifier

An LSTM Autoencoder for rare event classification

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Airbnb Dynamic Pricing Optimization

[BA project] Dynamic Pricing Optimization for Airbnb listing to optimize yearly profit for host. Use Clustering for competitive analysis, kNN regression for demand forecasting, and find dynamic optimal price with Optimization model.

Stars: ✭ 85 (-3.41%)

Mutual labels: jupyter-notebook

PyTorch tutorials A to Z

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Pascal Voc Python

Repository for reading Pascal VOC data in Python, rather than requiring MATLAB to read the XML files.

Stars: ✭ 86 (-2.27%)

Mutual labels: jupyter-notebook

P5 vehicledetection unet

p5_VehicleDetection_Unet

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Distributed deep learning on Hadoop and Spark clusters.

Stars: ✭ 1,272 (+1345.45%)

Mutual labels: jupyter-notebook

Curso data science

Código para el curso "Aprende Data Science y Machine Learning con Python"

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Repo2docker Action

GitHub Action for repo2docker

Stars: ✭ 88 (+0%)

Mutual labels: jupyter-notebook

Faster Rcnn Densecap Torch

Faster-RCNN based on Densecap(deprecated)

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

Stars: ✭ 87 (-1.14%)

Mutual labels: jupyter-notebook

View All Similar Projects ➔

SampleVAE - A multi-purpose tool for sound design and music production

Deep learning-based tool that allows for various types of new sample generation, as well as sound classification, and searching for similar samples in an existing sample library. The deep learning part is implemented in TensorFlow and consists mainly of a Variational Autoencoder (VAE) with Inverse Autoregressive Flows (IAF) and an optional classifier network on top of the VAE's encoder's hidden state. A short article about the tool can be found here: https://towardsdatascience.com/samplevae-a-multi-purpose-ai-tool-for-music-producers-and-sound-designers-e966c7562f22?source=friends_link&sk=588d13c6080568aca63f98e4d3835c87

Making a dataset for training

Use the make_dataset.py script to generate a new dataset for trainig. The main parameters are data_dir and dataset_name. The former is a directory (or multiple directories) in which to look for samples (files ending in .wav, .aiff, or .mp3; only need to specify root directory, the script looks into all sub directories). The latter should a unique name for this dataset. For example, to search for files in the directories /Users/Shared/Decoded Forms Library/Samples and /Users/Shared/Maschine 2 Library, and create the dataset called NativeInstruments, run the command

  python make_dataset.py --data_dir '/Users/Shared/Decoded Forms Library/Samples' '/Users/Shared/Maschine 2 Library' --dataset_name NativeInstruments

By default, the data is split randomly into 90% train and 10% validation data. The optional train_ratio parameter (defaults to 0.9) can be used to specify a different split.

Creating a dataset that includes class information

To add a classifier to the model, use make_dataset_classifier.py instead. This script works essentially in the same way as make_dataset.py, but it treats the immediate sub-directories in data_dir as class names, and assumes all samples within them belong to that resepective class.

Currently only simple multiclass classification is supported. There is also no weighting of the classes happening so you should make sure that classes are as balanced as possible. Also, the current version does the train/validation split randomly; making sure the split happens evenly across classes is a simple future improvement.

Training a model

To train a model, use the train.py script. The main parameters of interest are logdir and dataset.

logdir specifies a unique name for the model to be trained, and creates a directory in which model checkpoints and other files are saved. Training can later be resumed from this.

dataset refers to a dataset created through the make_dataset.py script, e.g. NativeInstruments in the example above.

To train a model called model_NI on the above dataset, use

  python train.py --logdir model_NI --dataset NativeInstruments

On first training on a new dataset, all the features have to be calculated. This may take a while. When restarting the training later, or training a new model with same dataset and audio parameters, existing features are loaded.

Training automatically stops when the model converges. If no improvement is found on the validation data for several test steps, the learning rate is lowered. Once it goes below a threshold, training stops completely. Alternatively one can manually stop the training at any point.

When resuming a previously aborted model training, the dataset does not have to be specified, the script will automatically use the same dataset (and other audio and model parameters).

If the dataset contains classification data, a confusion matrix is plotted and stored in logdir at every test step.

Pre-trained Models

Three trained models are provided:

model_general: A model trained on slightly over 60k samples of all types. This is the same dataset that was used in my NeuralFunk project (https://towardsdatascience.com/neuralfunk-combining-deep-learning-with-sound-design-91935759d628). This model does not have a classifier.

model_drum_classes: A model trained on roughly 10k drum sounds, with a classifier of 9 different drum classes (e.g. kick, snare, etc).

model_drum_machines: A model trained on roughly 4k drum sounds, with a classifier of 71 different drum machine classes (e.g. Ace Tone Rhythm Ace, Akai XE8, Akai XR10 etc). Note that this is a tiny dataset with a very large number of classes, each only containing a handful of examples. This model is only included as an example of what's possible, not as a really useful model in itself.

Running the sound sample tool with a trained model

To use the sample tool, start a python environment and run

from tool_class import *

You can now instantiate a SoundSampleTool class. For example to instatiate a tool based on the above model, run.

tool = SoundSampleTool('model_NI', library_dir='/Users/MySampleLibrary')

The parameter library_dir is optional and specifies a sample library root directory, /Users/MySampleLibrary in the example above. It is required to perform similarity search on this sample library. If specified, an attempt is made to load embeddings for this library. If none are found, new embeddings are calculated which may take a while (depending on sample library size).

Once completely initialised, the tool can be used for sample generation and similarity search.

Generating samples

To generate new samples, use the generate function.

To sample a random point in latent space, decode it, and store the audio to generated.wav, run

tool.generate(out_file='generated.wav')

To encode one or multiple files, pass the filenames as a list of strings to the audio_files parameter. If the parameter weights is not specified, the resulting embeddings will be averaged over before decoding into a single audio file. Alternatively, a list of numbers can be passed to weights to set the respective weights of each input sample in the average. By default, the weights get normalised. E.g. the following code combines an example kick and snare, with a 1:2 ratio:

tool.generate(out_file='generated.wav', audio_files=['/Users/Shared/Maschine 2 Library/Samples/Drums/Kick/Kick Accent 1.wav','/Users/Shared/Decoded Forms Library/Samples/Drums/Snare/Snare Anodyne 3.wav'], weigths=[1,2])

Weight normalisation can be turned off by passing normalize_weights=False. This allows for arbitrary vector arithmetic with the embedding vectors, e.g. using a negative weight to subtract one vector from another.

Additionally the variance parameter (default: 0) can be used to add some Gaussian noise before decoding, to add random variation to the samples.

Finding similar samples

Assuming the tool was initialised with a library_dir, we can look for similar samples in the library.

similar_files, onsets, distances = tool.find_similar(target_file, num_similar=10)

will look for the 10 most similar samples to the sample in the audio file target_file and return them as a list, with similar_files[0] being the most similar, etc. onsets contains the onset (in seconds) of where in the file the similar sample starts. And distances contains the Euclidean distances to the target file in embedding space. By default, the results are also printed on screen.

Classifying samples

Assuming the model was trained on a dataset with class data, we can use the tool to make class predictions for new samples.

probabilities, predicted_class = tool.predict(target_file)

will run the classifier on the provided audio file and return two items, the probability distribution over classes and the name of the most probable class.

To full list of class names (in the same order as their respective probabilities) can be accessed via tool.class_names.

The offset parameter (default: 0.0) can be used to not apply the classification to the first 2 seconds of the audio file, but a later slice of audio.

Note on sample length and audio segmentation

Currently, the tool/models treat all samples as 2 second long clips. Shorter files get padded, longer files crop.

For the purpose of building the library, an additional parameter, library_segmentation, can be set to True when initialising the tool. If False, files in the library are simply considered as their first 2 second. However, if True, the segments within longer files are considered as individual samples for the purpose of the library and similarity search. Note that while this is implemented and technically working, the segmentation currently seems too sensitive.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 88

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (8) 🔗