vishalshar / Audio-Classification-using-CNN-MLP

Licence: other
Multi-class audio classification using Deep Learning (MLP, CNN): The objective of this project is to build a multi-class classifier that identifies the sound of a bee, a cricket, or noise.

Audio-Classification-using-CNN-MLP

Multi-class audio classification using Deep Learning (CNN, MLP)

Citation

If you find this project helpful, please cite as below:

@software{vishal_sharma_2020_3988690,
  author       = {Vishal Sharma},
  title        = {{vishalshar/Audio-Classification-using-CNN-MLP: 
                   first release}},
  month        = Aug,
  year         = 2020,
  publisher    = {Zenodo},
  version      = {v1.0.0},
  doi          = {10.5281/zenodo.3988690},
  url          = {https://doi.org/10.5281/zenodo.3988690}
}

Project Objectives:

The objective of this project is to build a multi-class classifier that identifies the sound of a bee, a cricket, or noise.

Dataset Description:

The dataset contains a total of 9,914 audio samples: 3,300 of bees, 3,500 of crickets, and 3,114 of noise. Each audio sample is approximately 2 seconds long and has 44,100 amplitude samples per second. The dataset was merged, and experiments were performed on a nominal 80%-20% train/test split; the counts in the table below work out to roughly 76%-24%.

|       | Bee   | Cricket | Noise | Total |
|-------|-------|---------|-------|-------|
| Train | 2,402 | 3,000   | 2,180 | 7,582 |
| Test  | 898   | 500     | 934   | 2,332 |
| Total | 3,300 | 3,500   | 3,114 | 9,914 |

Audio Data Preprocessing:

The audio has a very high frame rate: on average, each file contains about 80,000 frames (amplitude values). At that rate there is a lot of data per clip, so some preprocessing is needed. The frame rate and length of each clip were reduced using interpolation. Each audio sample was reduced to a 15k sample rate and a total length of 22,000 samples (approximately a quarter of the original audio).
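As a rough illustration of length reduction by interpolation, here is a minimal sketch using plain linear interpolation. The function name `resample_linear` and the synthetic ramp signal are assumptions made for this example; the project's own preprocessing code may use a different interpolation method or library (NumPy's `np.interp` would be a common choice in practice).

```python
def resample_linear(signal, target_len):
    """Resample a 1-D sequence to target_len points by linear interpolation."""
    n = len(signal)
    if n == 0 or target_len <= 0:
        raise ValueError("signal must be non-empty and target_len positive")
    if n == 1 or target_len == 1:
        return [float(signal[0])] * target_len
    # Map each output index onto a fractional position in the input.
    scale = (n - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        left = min(int(pos), n - 1)
        right = min(left + 1, n - 1)
        frac = pos - left
        # Weighted average of the two neighbouring input samples.
        out.append(signal[left] * (1.0 - frac) + signal[right] * frac)
    return out


# Hypothetical 2-second clip at 44,100 samples/sec (a simple ramp, not real audio).
clip = [i / 88_199 for i in range(88_200)]
reduced = resample_linear(clip, 22_000)
print(len(reduced))  # 22000
```

Plain interpolation applies no low-pass filter before reducing the rate, so high-frequency content can alias; whether that matters here depends on the recordings.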

Using ANN

In initial experiments, a plain ANN did not perform well. After several further experiments, a Multi-Layer Perceptron (MLP) model was built on an intuition borrowed from CNNs. Before the audio data is fed into the network, it is max pooled by 3 different pooling layers, and the output of the pooling layers is given as input to the fully connected layers, as shown in the figure below. To merge the features extracted by the different pooling layers, the outputs of the fully connected layers are merged.

Core Idea:

[Figure: Sample bee audio and the expected feature extraction using pooling layers and merged fully connected layers]
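The pooling stage of this idea can be sketched in a few lines. The pool sizes `(2, 4, 8)` and the helper names below are assumptions chosen for illustration; the description above does not state the sizes actually used.

```python
def max_pool_1d(signal, pool_size):
    """Non-overlapping 1-D max pooling; a trailing partial window is dropped."""
    usable = len(signal) - len(signal) % pool_size
    return [max(signal[i:i + pool_size]) for i in range(0, usable, pool_size)]


def multi_scale_views(signal, pool_sizes=(2, 4, 8)):
    """Pool the same signal at several window sizes (sizes are assumed)."""
    return {size: max_pool_1d(signal, size) for size in pool_sizes}


toy_signal = [1, 5, 2, 4, 3, 9, 0, 7]
views = multi_scale_views(toy_signal)
for size, pooled in views.items():
    print(size, pooled)
# 2 [5, 4, 9, 7]
# 4 [5, 9]
# 8 [9]
```

Each pooled view summarizes the waveform at a different resolution. In the model described above, the views would then pass through fully connected layers whose outputs are merged before classification.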

Performance:

The model was trained for 500 epochs using Adaptive Moment Estimation (Adam) as the optimizer, with a learning rate of 0.0005. Figure 9 displays accuracy during training.

|          | Training | Testing |
|----------|----------|---------|
| Accuracy | 91.11%   | 88.25%  |
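For readers unfamiliar with Adam, the sketch below shows one update step for a single scalar parameter, using the learning rate of 0.0005 quoted above. The values `beta1=0.9`, `beta2=0.999`, and `eps=1e-8` are the commonly used defaults and are an assumption here, as is the toy objective; the project relies on its deep learning framework's optimizer rather than hand-written code like this.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.0005, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy objective (assumed for illustration): f(w) = w**2, so the gradient is 2*w.
w, m, v = 1.0, 0.0, 0.0
for t in range(1, 501):
    w, m, v = adam_step(w, 2.0 * w, m, v, t)
print(w)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by about one learning rate regardless of the gradient's magnitude.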

[Figure: Accuracy]

Using CNN

The classifier was built from convolution layers; the network architecture is shown in Fig 6. Both convolution layers use 64 filters, with filter sizes of 10 and 3 respectively, and are followed by 3 fully connected layers. Details of the activation functions are in the code. Max pooling is applied after each convolution layer. Overfitting was observed during training. To counter it, dropout with a keep probability of 50% was applied after the first two fully connected layers, and L2 regularization was added to both of those layers. The input length is fixed at 22,000 with 1 channel. It was also observed during training that, without downsampling, the model could not generalize well between bee and noise data. Adding downsampling helped the model generalize.
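To get a feel for the size of this network, the sketch below traces the signal length through the two convolution and pooling stages. The input length, channel count, filter count, and filter sizes come from the description above; the pool size of 2, the stride of 1, and the absence of padding are assumptions, since the description does not state them.

```python
def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of a 1-D convolution with no padding."""
    return (length - kernel_size) // stride + 1


def pool1d_out_len(length, pool_size):
    """Output length of non-overlapping max pooling (partial window dropped)."""
    return length // pool_size


def conv1d_params(in_channels, filters, kernel_size):
    """Trainable weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


POOL_SIZE = 2   # assumption: not stated in the project description
FILTERS = 64    # from the text: 64 filters in both convolution layers

length, channels = 22_000, 1    # from the text: 22,000 samples, 1 channel
conv_params = 0
for kernel_size in (10, 3):     # from the text: filter sizes 10 and 3
    conv_params += conv1d_params(channels, FILTERS, kernel_size)
    length = conv1d_out_len(length, kernel_size)
    length = pool1d_out_len(length, POOL_SIZE)
    channels = FILTERS
    print(f"after kernel {kernel_size}: length {length}, channels {channels}")

flattened = length * channels
print("flattened features:", flattened)        # 351744
print("convolution parameters:", conv_params)  # 13056
```

Under these assumptions the convolution layers themselves are small, and most of the trainable weights sit in the first fully connected layer that consumes the flattened features. That is consistent with regularizing the fully connected layers using dropout and L2.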

Performance:

The model was trained for 500 epochs using Adaptive Moment Estimation (Adam) as the optimizer, with a learning rate of 0.0001.

|          | Training | Testing |
|----------|----------|---------|
| Accuracy | 99.88%   | 99.45%  |

[Figure: Accuracy]
