Audio Notebooks

A collection of Jupyter Notebooks related to audio processing.

The notebooks act like interactive utility scripts for converting between different representations. Data is usually stored in data/project/, where project is the dataset you're working with. Generally, if you change data_root near the top of a notebook and run the rest of the notebook, it will do something useful.
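As a minimal sketch of the data_root convention (the 'project' name is a placeholder, and a temporary directory stands in for data/project/ so the snippet is self-contained):

```python
import os
import tempfile
import numpy as np

# In the notebooks this would be 'data/project/'; a temporary
# directory stands in here so the sketch runs anywhere.
data_root = os.path.join(tempfile.mkdtemp(), 'project')
os.makedirs(data_root, exist_ok=True)

# One notebook might write a matrix of fixed-length samples here...
np.save(os.path.join(data_root, 'samples.npy'),
        np.zeros((10, 4410), dtype=np.float32))

# ...and a later notebook picks it up after changing only data_root.
samples = np.load(os.path.join(data_root, 'samples.npy'))
print(samples.shape)
```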

Setup

librosa currently needs some extra help on OS X; make sure to follow librosa's installation instructions first.

$ brew install ffmpeg # for loading and saving audio
$ git clone https://github.com/kylemcdonald/AudioNotebooks.git
$ cd AudioNotebooks
$ pip install -r requirements.txt
$ jupyter notebook

Terminology

Here are some words used in the names of the notebooks, and what they mean:

  • Samples refers to one-shot sounds, usually less than 1-2 seconds long. These can be loaded from a directory like data/project/samples/, or from a precomputed numpy matrix like data/project/samples.npy. When they are stored in a .npy file, all the samples are necessarily truncated or padded to the same length.
  • Multisamples refers to audio that needs to be segmented into samples.
  • Fingerprints refer to small images, usually 32x32 pixels, each representing a small chunk of time such as 250ms or 500ms. They are calculated with CQT, STFT, or another frequency-domain analysis technique, and are useful for running t-SNE or training neural nets.
  • Spritesheets are single files with multiple sounds, either visually as fingerprints or sonically as a sequence of sounds, organized carefully so they can be chopped up again later.
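The fingerprint idea can be sketched with plain numpy (the notebooks use librosa's CQT/STFT; this toy version and its frame sizes are stand-ins, not the notebooks' exact parameters):

```python
import numpy as np

def fingerprint(chunk, n_fft=64, hop=None):
    """Toy STFT fingerprint: a 32x32 magnitude image for one short chunk."""
    if hop is None:
        # choose a hop so we end up with roughly 32 frames
        hop = max(1, (len(chunk) - n_fft) // 31)
    frames = np.stack([chunk[i:i + n_fft]
                       for i in range(0, len(chunk) - n_fft + 1, hop)][:32])
    # windowed magnitude spectrum for each frame
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
    return spec[:, :32].T  # keep 32 frequency bins -> (32, 32) image

sr = 44100
chunk = np.sin(2 * np.pi * 440 * np.arange(int(0.25 * sr)) / sr)  # 250 ms tone
fp = fingerprint(chunk)
print(fp.shape)  # (32, 32)
```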

Some formats in use:

  • .npy are numpy matrices. Numpy can load and save these very quickly, even for large datasets.
  • .tsv are tab-separated files with one sample per line, usually with normalized numbers in each column. These are good for loading into openFrameworks apps, or into the browser.
  • .txt are like .tsv but only have one item per line, usually a single string. Also good for loading into openFrameworks apps, or into the browser.
  • .pkl are Pickle files, Python's native serialization format, used for saving and loading data structures that hold lists of objects with many different kinds of values (not just numbers or strings).
  • .h5 is the format Keras uses to save the weights of a neural net.
  • .json is good for taking what would usually go into a Pickle file and saving it in a format that can be loaded on the web. It's also one of the formats used by Keras, as part of a saved model.
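A small round-trip through these formats (filenames and the 5x2 toy matrix are placeholders; a temporary directory stands in for a project folder):

```python
import json
import os
import tempfile
import numpy as np

tmp = tempfile.mkdtemp()

# .npy: fast binary storage for a matrix
points = np.random.rand(5, 2).astype(np.float32)
np.save(os.path.join(tmp, 'points.npy'), points)

# .tsv: one sample per line, tab-separated normalized numbers
with open(os.path.join(tmp, 'points.tsv'), 'w') as f:
    for x, y in points:
        f.write(f'{x:.6f}\t{y:.6f}\n')

# .txt: one string per line, e.g. filenames matching the rows above
with open(os.path.join(tmp, 'filenames.txt'), 'w') as f:
    f.write('\n'.join(f'sample_{i}.wav' for i in range(len(points))))

# .json: the same structure in a web-friendly form
with open(os.path.join(tmp, 'points.json'), 'w') as f:
    json.dump(points.tolist(), f)

reloaded = np.load(os.path.join(tmp, 'points.npy'))
print(np.allclose(points, reloaded))  # True
```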

Example Workflows

Audio spritesheet

  1. Collect Samples
  2. Samples to Audio Spritesheet
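The core of an audio spritesheet can be sketched in a few lines: fixed-length samples are concatenated into one long buffer, so sample i can be recovered later from the offset i * sample_length (the lengths below are illustrative, not the notebooks' defaults):

```python
import numpy as np

sample_length = 4410  # e.g. 100 ms at 44.1 kHz
samples = np.random.randn(8, sample_length).astype(np.float32)

spritesheet = samples.reshape(-1)  # one long 1-D buffer

# chop the spritesheet back up: sample i lives at offset i * sample_length
i = 3
recovered = spritesheet[i * sample_length:(i + 1) * sample_length]
print(np.array_equal(recovered, samples[i]))  # True
```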

t-SNE embedding for samples

  1. Collect Samples
  2. Samples to Fingerprints
  3. Fingerprints to t-SNE (with mode = "fingerprints")
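Step 3 can be sketched with scikit-learn's t-SNE standing in for the notebooks' own wrappers (the fingerprint data here is random, just to show the shapes):

```python
import numpy as np
from sklearn.manifold import TSNE

# flatten 32x32 fingerprints into vectors, then embed them in 2-D
fingerprints = np.random.rand(50, 32, 32).astype(np.float32)
flat = fingerprints.reshape(len(fingerprints), -1)  # (50, 1024)

xy = TSNE(n_components=2, perplexity=10, init='random',
          random_state=0).fit_transform(flat)

# normalize to [0, 1], the range the .tsv outputs use
xy = (xy - xy.min(0)) / (xy.max(0) - xy.min(0))
print(xy.shape)  # (50, 2)
```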

The standard workflow is to create a t-SNE embedding from fingerprints, but it's also possible to create an embedding after learning a classifier:

  1. Collect Samples
  2. Samples to Fingerprints
  3. Collect Metadata
  4. Metadata to Labels
  5. Fingerprints and Labels to Classifier
  6. Fingerprints to t-SNE (with mode = "combined")
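Step 5 can be sketched with a linear classifier standing in for the notebooks' Keras net (the data, labels, and class names are all made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
fingerprints = rng.random((40, 32, 32)).astype(np.float32)
labels = rng.integers(0, 2, size=40)  # e.g. "kick" vs "snare"

# train on flattened fingerprints
flat = fingerprints.reshape(len(fingerprints), -1)
clf = LogisticRegression(max_iter=1000).fit(flat, labels)

# the classifier's learned representation can then feed the
# "combined" t-SNE mode alongside the raw fingerprints
scores = clf.decision_function(flat)
print(scores.shape)  # (40,)
```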

t-SNE embedding for phonemes

Right now this only works for extracting phonemes from transcribed speech, using Gentle.

  1. Gentle to Samples (with save_wav = True)
  2. Samples to Fingerprints
  3. Fingerprints to t-SNE

It's also possible to use Sphinx for speech that does not have transcriptions, but it can be significantly slower:

  1. Sphinx to Samples
  2. Collect Samples
  3. Samples to Fingerprints
  4. Fingerprints to t-SNE

t-SNE grid fingerprints spritesheet

By virtue of creating a rectangular grid, you may lose some points. This technique only works for roughly 10-20k points at most.

  1. Collect Samples
  2. Samples to Fingerprints
  3. Fingerprints to t-SNE
  4. Run the example-data app from ofxAssignment or use CloudToGrid to convert a 2d t-SNE embedding to a grid embedding.
  5. Fingerprints to Spritesheet

If you only want a spritesheet without any sorting, skip step 4 and only run step 5 partially.
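The grid-snapping in step 4 amounts to solving an assignment problem; here is a small sketch using scipy's Hungarian-algorithm solver (CloudToGrid and ofxAssignment use the same idea at larger scale; the 4x4 grid is illustrative):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

n_side = 4
points = np.random.rand(n_side * n_side, 2)  # a toy 2-D t-SNE embedding

# target positions: an evenly spaced square grid in [0, 1]^2
gx, gy = np.meshgrid(np.linspace(0, 1, n_side), np.linspace(0, 1, n_side))
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)  # (16, 2)

# cost[i, j] = squared distance from point i to grid cell j
cost = ((points[:, None, :] - grid[None, :, :]) ** 2).sum(-1)
rows, cols = linear_sum_assignment(cost)

snapped = grid[cols]  # each point's assigned grid position, one per cell
print(snapped.shape)  # (16, 2)
```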

Predict tags given tagged audio

  1. Collect Samples
  2. Samples to Fingerprints
  3. Collect Metadata
  4. Metadata to Labels
  5. Fingerprints and Labels to Classifier