micah5 / pyAudioClassification

License: MIT
🎶 dead simple audio classification


pyAudioClassification

Dead simple audio classification


Who is this for? 👩‍💻 👨‍💻

People who just want to classify some audio quickly, without having to dive into the world of audio analysis. If you need something a little more involved, check out pyAudioAnalysis or panotti.

Quick install

pip install pyaudioclassification

Requirements

  • Python 3
  • Keras
  • Tensorflow
  • librosa
  • NumPy
  • Soundfile
  • tqdm
  • matplotlib

Quick start

from pyaudioclassification import feature_extraction, train, predict
features, labels = feature_extraction(<training_data_path>)
model = train(features, labels)
pred = predict(model, <prediction_data_path>)

Or, if you're feeling reckless, you could just string them together like so:

pred = predict(train(feature_extraction(<training_data_path>)), <prediction_data_path>)

A full example with saving, loading & some dummy data can be found here.


Read below for a more detailed look at each of these calls.

Detailed Guide

Step 1: Preprocessing 🐶 🐱

First, add all your audio files to a directory in the following structure:

data/
├── <class_name>/
│   ├── <file_name>
│   └── ...
└── ...

For example, if you were trying to classify dog and cat sounds, it might look like this:

data/
├── cat/
│   ├── cat1.ogg
│   ├── cat2.ogg
│   ├── cat3.wav
│   └── cat4.wav
└── dog/
    ├── dog1.ogg
    ├── dog2.ogg
    ├── dog3.wav
    └── dog4.wav
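Under the hood, each top-level folder is one class and the files inside it are that class's examples. The folder-scanning sketch below is purely illustrative (it is not the library's actual code), but it shows how folder names map to integer labels:

```python
import os
import tempfile

# Build a tiny dummy data/ tree like the one above.
root = tempfile.mkdtemp()
for cls, files in {"cat": ["cat1.ogg", "cat2.wav"],
                   "dog": ["dog1.ogg", "dog2.wav"]}.items():
    os.makedirs(os.path.join(root, cls))
    for name in files:
        open(os.path.join(root, cls, name), "w").close()

# Each subdirectory name is a class; sorted order gives stable integer labels.
classes = sorted(d for d in os.listdir(root)
                 if os.path.isdir(os.path.join(root, d)))
label_of = {cls: i for i, cls in enumerate(classes)}

samples = [(os.path.join(root, cls, f), label_of[cls])
           for cls in classes
           for f in sorted(os.listdir(os.path.join(root, cls)))]

print(classes)        # ['cat', 'dog']
print(samples[0][1])  # 0 (first file belongs to class 'cat')
```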

Great, now we need to preprocess this data. Just call feature_extraction(<data_path>) and it'll return our input and target data. Something like this:

features, labels = feature_extraction('/Users/mac2015/data/')

(If you don't want progress printed to stdout, just pass verbose=False as an argument.)


Depending on how much data you have, this process could take a while, so it might be a good idea to save your progress. You can save and load with NumPy:

import numpy as np

np.save('%s.npy' % <file_name>, features)
features = np.load('%s.npy' % <file_name>)
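For instance, a concrete save/load round trip with a dummy feature array (the filename and array shape here are illustrative):

```python
import os
import tempfile
import numpy as np

features = np.random.rand(10, 193)               # dummy feature matrix
path = os.path.join(tempfile.mkdtemp(), 'features.npy')
np.save(path, features)                          # serialize to a .npy file
restored = np.load(path)                         # load it back unchanged

print(np.array_equal(features, restored))        # True
```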

Step 2: Training 💪

Next step is to train your model on the data. You can just call...

model = train(features, labels)

...but depending on your dataset, you might need to play around with some of the hyper-parameters to get the best results.

Options

  • epochs: The number of training epochs (full passes over the data). Default is 50.

  • lr: Learning rate. Increase to speed up training; decrease for more stable, accurate results if your loss is 'jumping'. Default is 0.01.

  • optimiser: Any Keras optimizer name. Default is 'SGD'.

  • print_summary: Prints a summary of the model you'll be training. Default is False.

  • loss_type: Classification type. Default is categorical for >2 classes, and binary otherwise.

You can add any of these as optional arguments, for example train(features, labels, lr=0.05)
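The loss_type default described above can be sketched as a small helper. This function is an illustration of the rule ('binary' for two classes, 'categorical' otherwise), not the library's internal code:

```python
import numpy as np

def pick_loss(labels, loss_type=None):
    """Illustrative sketch of the loss_type default: 'binary' for
    two classes, 'categorical' otherwise, unless explicitly forced."""
    n_classes = len(np.unique(labels))
    if loss_type is None:
        loss_type = 'binary' if n_classes <= 2 else 'categorical'
    return loss_type, n_classes

print(pick_loss(np.array([0, 1, 2, 2, 1, 0])))   # ('categorical', 3)
print(pick_loss(np.array([0, 1, 1, 0])))         # ('binary', 2)
```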


Again, you probably want to save your model once it's done training. You can do this with Keras:

from keras.models import load_model

model.save('my_model.h5')
model = load_model('my_model.h5')

Step 3: Prediction 🙏 🙌

Now the fun part: try your trained model on new data!

pred = predict(model, <data_path>)

Your <data_path> should point to a new, untested audio file.

Binary

If you have 2 classes (or if you force selected 'binary' as a type), pred will just be a single number for each file.

The closer it is to 0, the closer the prediction is for the first class, and the closer it is to 1 the closer the prediction is to the second class.

So for our cat/dog example, if it returns 0.2 it's 80% sure the sound is a cat, and if it returns 0.8 it's 80% sure it's a dog.
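That interpretation can be written as a tiny helper. The class names and the 0.5 threshold here belong to the cat/dog example, not to the library's API:

```python
def interpret_binary(score, classes=('cat', 'dog')):
    # Scores below 0.5 favor the first class, above 0.5 the second.
    if score < 0.5:
        return classes[0], 1.0 - score   # confidence in the first class
    return classes[1], score             # confidence in the second class

print(interpret_binary(0.2))  # ('cat', 0.8)
print(interpret_binary(0.8))  # ('dog', 0.8)
```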

Categorical

If you have more than 2 classes (or if you force selected 'categorical' as a type), pred will be an array for each sound file.

It'll look something like this

[[1.6454633e-06 3.7017996e-11 9.9999821e-01 1.5900606e-07]]

The index of each item in the array will correspond to the prediction for that class.
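Picking the winning class is just an argmax over that array. Using the sample output above:

```python
import numpy as np

pred = np.array([[1.6454633e-06, 3.7017996e-11, 9.9999821e-01, 1.5900606e-07]])
best = int(np.argmax(pred[0]))                 # index of the highest probability
print(best)                                    # 2
print(round(float(pred[0][best]) * 100, 1))    # 100.0 (percent confidence)
```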


You can pretty print the predictions by showing them in a leaderboard, like so:

print_leaderboard(pred, <training_data_path>)

It looks like this:

1. Cow 100.0% (index 2)
2. Rooster 0.0% (index 0)
3. Frog 0.0% (index 3)
4. Pig 0.0% (index 1)
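If you want the same style of output without the library call, a minimal sketch is below. It assumes class names are taken in the same sorted-folder order the labels were assigned in; the names here match the sample output above and are not part of the library's API:

```python
import numpy as np

def leaderboard(pred, class_names):
    # Sort class indices by descending probability and print rankings.
    order = np.argsort(pred)[::-1]
    for rank, idx in enumerate(order, start=1):
        print('%d. %s %.1f%% (index %d)'
              % (rank, class_names[idx], pred[idx] * 100, idx))

pred = np.array([1.6454633e-06, 3.7017996e-11, 9.9999821e-01, 1.5900606e-07])
leaderboard(pred, ['Rooster', 'Pig', 'Cow', 'Frog'])
# 1. Cow 100.0% (index 2)
# 2. Rooster 0.0% (index 0)
# 3. Frog 0.0% (index 3)
# 4. Pig 0.0% (index 1)
```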
