
Hguimaraes / Gtzan.keras

Licence: MIT
Music genre classification on the GTZAN dataset using CNNs

Programming Languages

python

Projects that are alternatives of or similar to Gtzan.keras

Keras transfer cifar10
Object classification with CIFAR-10 using transfer learning
Stars: ✭ 120 (-9.09%)
Mutual labels:  cnn
Hyperdensenet
This repository contains the code of HyperDenseNet, a hyper-densely connected CNN to segment medical images in multi-modal image scenarios.
Stars: ✭ 124 (-6.06%)
Mutual labels:  cnn
Asr syllable
Research on acoustic models for speech recognition based on convolutional neural networks
Stars: ✭ 127 (-3.79%)
Mutual labels:  cnn
What Is My Girlfriend Thinking
Guess what my girlfriend is thinking with Tensorflow.js. A Tensorflow.js application for facial classification.
Stars: ✭ 122 (-7.58%)
Mutual labels:  cnn
Deeplearning python
Deep Learning
Stars: ✭ 123 (-6.82%)
Mutual labels:  cnn
Libfacedetection
An open source library for face detection in images. The face detection speed can reach 1000FPS.
Stars: ✭ 10,852 (+8121.21%)
Mutual labels:  cnn
3d Densenet
3D Dense Connected Convolutional Network (3D-DenseNet for action recognition)
Stars: ✭ 118 (-10.61%)
Mutual labels:  cnn
Object Localization
Object localization in images using simple CNNs and Keras
Stars: ✭ 130 (-1.52%)
Mutual labels:  cnn
Cnn Inference Engine Quick View
A quick view of high-performance convolution neural networks (CNNs) inference engines on mobile devices.
Stars: ✭ 124 (-6.06%)
Mutual labels:  cnn
Pytorch convlstm
convolutional lstm implementation in pytorch
Stars: ✭ 126 (-4.55%)
Mutual labels:  cnn
Lenet 5
PyTorch implementation of LeNet-5 with live visualization
Stars: ✭ 122 (-7.58%)
Mutual labels:  cnn
Pytorch Sift
PyTorch implementation of SIFT descriptor
Stars: ✭ 123 (-6.82%)
Mutual labels:  cnn
Deeplearning Notes
Notes for Deep Learning Specialization Courses led by Andrew Ng.
Stars: ✭ 126 (-4.55%)
Mutual labels:  cnn
Unified Gesture And Fingertip Detection
A Unified Convolutional Neural Network Approach of Gesture Recognition and Fingertip Detection.
Stars: ✭ 121 (-8.33%)
Mutual labels:  cnn
I3d finetune
TensorFlow code for finetuning I3D model on UCF101.
Stars: ✭ 128 (-3.03%)
Mutual labels:  cnn
Classifier multi label textcnn
multi-label, classifier, text classification, multi-label text classification, BERT, ALBERT, multi-label-classification
Stars: ✭ 116 (-12.12%)
Mutual labels:  cnn
Captcha
CNN-based recognition of whole CAPTCHA images
Stars: ✭ 125 (-5.3%)
Mutual labels:  cnn
Noiseface
Noise-Tolerant Paradigm for Training Face Recognition CNNs
Stars: ✭ 132 (+0%)
Mutual labels:  cnn
Id Cnn Cws
Source codes and corpora of paper "Iterated Dilated Convolutions for Chinese Word Segmentation"
Stars: ✭ 129 (-2.27%)
Mutual labels:  cnn
Motionblur Detection By Cnn
Stars: ✭ 126 (-4.55%)
Mutual labels:  cnn

gtzan.keras

Music genre classification using convolutional neural networks, implemented in TensorFlow 2.0 with the Keras API.

Overview

tl;dr: Compare the classic approach of extracting features and feeding them to a classifier (e.g., an SVM) against the deep learning approach of running a CNN on a representation of the audio (a mel spectrogram) to both extract features and classify. Both approaches are available as Jupyter notebooks in the nbs folder.

Summary of the deep learning approach:

  1. Shuffle the input and split it into train and test sets (70%/30%)
  2. Read the audio files as mel spectrograms, splitting them into 1.5 s windows with 50% overlap, which results in a dataset with shape (samples x time x frequency x channels)
  3. Train the CNN and evaluate it on the test set using a majority voting approach
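
Steps 1 and 2 above can be sketched in plain Python. This is a minimal illustration, not the code from the notebooks: the function names `split_train_test` and `window_starts`, the seed, the 22050 Hz sample rate, and the 30 s clip length are all assumptions made for the example, and the windows are cut from a raw signal rather than from a mel spectrogram to keep the sketch free of dependencies.

```python
import random


def split_train_test(items, test_fraction=0.3, seed=42):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def window_starts(n_samples, window, overlap=0.5):
    """Start indices of fixed-size windows overlapping by the given fraction."""
    hop = int(window * (1 - overlap))
    return list(range(0, n_samples - window + 1, hop))


# Assumed numbers: a 30 s clip sampled at 22050 Hz, cut into 1.5 s windows.
sample_rate = 22050
clip = [0.0] * (30 * sample_rate)
window = int(1.5 * sample_rate)

starts = window_starts(len(clip), window)
windows = [clip[s:s + window] for s in starts]
print(f"{len(windows)} windows of {window} samples each")
```

Each window inherits the label of the clip it was cut from, so one clip contributes many training samples. Splitting by clip before windowing keeps windows of the same song from landing in both the train and the test set.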

Results

To compare results across multiple architectures, we took two approaches to this problem. The first is the classic approach of extracting features and then using a classifier. The second, which is implemented in the src folder, is a deep learning approach that feeds a mel spectrogram to a CNN.

The nbs folder shows how we extracted the features. Here are the current results on the test set:

Model                  Accuracy
Decision Tree          0.5160
Random Forest          0.6760
ElasticNet             0.6880
Logistic Regression    0.7640
SVM (RBF)              0.7880

For the deep learning approach, we tested a simple custom architecture that can be found in the nbs folder.

Model     Accuracy
CNN 2D    0.832


Dataset

To get the dataset:

  1. Download the GTZAN dataset here

Extract the file into the data folder of this project. The structure should look like this:

├── data/
   ├── genres
      ├── blues
      ├── classical
      ├── country
      .
      .
      .
      ├── rock
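
A quick way to confirm the extracted layout matches the tree above is to count the files in each genre folder. The sketch below is a hypothetical helper, not part of the project: the name `list_genres` is an assumption, and the script expects to be run from the project root so that `data` resolves to the folder shown above.

```python
from pathlib import Path


def list_genres(data_dir):
    """Map each genre folder under data_dir/genres to its number of files."""
    genres_dir = Path(data_dir) / "genres"
    if not genres_dir.is_dir():
        raise FileNotFoundError(f"expected a 'genres' folder inside {data_dir}")
    return {
        genre.name: sum(1 for f in genre.iterdir() if f.is_file())
        for genre in sorted(genres_dir.iterdir())
        if genre.is_dir()
    }


# Report what was found, or explain what is missing.
try:
    for genre, count in list_genres("data").items():
        print(f"{genre}: {count} files")
except FileNotFoundError as err:
    print(err)
```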

How to run

The models are provided as .joblib or .h5 files in the models folder. You just need to apply them to your own file as described below.

If you want to run the training process yourself, run the provided notebooks in the nbs folder.

To apply the model on a test file, you need to run:

$ cd src/
$ python app.py -t MODEL_TYPE -m ../models/PATH_TO_MODEL -s PATH_TO_SONG

where MODEL_TYPE is either ml (the classical machine learning approach) or dl (the deep learning approach).
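
For readers curious how such a command line could be parsed, here is a minimal argparse sketch. Only the short flags -t, -m, and -s and the ml/dl choices come from the command above. The long option names, the help texts, and the `build_parser` helper are hypothetical and may differ from the actual app.py.

```python
import argparse


def build_parser():
    """Build a parser accepting the three flags used in the command above."""
    parser = argparse.ArgumentParser(description="Classify the genre of a song")
    # Long option names below are assumptions made for this sketch.
    parser.add_argument("-t", "--type", choices=["ml", "dl"], required=True,
                        help="ml = classical machine learning, dl = deep learning")
    parser.add_argument("-m", "--model", required=True,
                        help="path to a .joblib or .h5 model file")
    parser.add_argument("-s", "--song", required=True,
                        help="path to the audio file to classify")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args([
    "-t", "dl",
    "-m", "../models/custom_cnn_2d.h5",
    "-s", "../data/samples/iza_meu_talisma.mp3",
])
print(args.type, args.model, args.song)
```

Restricting -t with `choices` makes argparse reject anything other than ml or dl before any model is loaded.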

Usage example:

$ python app.py -t dl -m ../models/custom_cnn_2d.h5 -s ../data/samples/iza_meu_talisma.mp3

and the output will be:

../data/samples/iza_meu_talisma.mp3 is a pop song
most likely genres are: [('pop', 0.43), ('hiphop', 0.39), ('country', 0.08)]
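
An output of this shape can be produced by aggregating per-window predictions for a song. The sketch below is one possible way to do it under stated assumptions: the page does not say whether the reported scores are averaged probabilities or vote shares, so averaging is an assumption here, and the function names and the toy probabilities are invented for illustration.

```python
from collections import Counter


def majority_vote(window_probs, labels):
    """Return the label that wins the largest number of windows."""
    winners = [
        labels[max(range(len(probs)), key=probs.__getitem__)]
        for probs in window_probs
    ]
    return Counter(winners).most_common(1)[0][0]


def top_k_mean(window_probs, labels, k=3):
    """Average the probabilities over windows and return the k best labels."""
    n = len(window_probs)
    means = [sum(column) / n for column in zip(*window_probs)]
    ranked = sorted(zip(labels, means), key=lambda pair: pair[1], reverse=True)
    return [(label, round(score, 2)) for label, score in ranked[:k]]


# Toy data: three windows of one song, three candidate genres.
labels = ["country", "hiphop", "pop"]
window_probs = [
    [0.1, 0.3, 0.6],
    [0.1, 0.5, 0.4],
    [0.2, 0.2, 0.6],
]

print(f"song is a {majority_vote(window_probs, labels)} song")
print(f"most likely genres are: {top_k_mean(window_probs, labels)}")
```

Voting over windows makes the final label less sensitive to a single atypical 1.5 s segment than classifying one window alone.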