
hyphmongo / deepcurator

Licence: other
A convolutional neural network trained to recognize good* electronic music

Programming Languages

python

Projects that are alternatives of or similar to deepcurator

vinyl-shelf-finder
app that manages a Discogs.com user records collection
Stars: ✭ 41 (+7.89%)
Mutual labels:  discogs
scott
💼 The Podcast Regional Manager
Stars: ✭ 21 (-44.74%)
Mutual labels:  audio-analysis
Preprocessing-Method-for-STEMI-Detection
Official source code of "Preprocessing Method for Performance Enhancement in CNN-based STEMI Detection from 12-lead ECG"
Stars: ✭ 12 (-68.42%)
Mutual labels:  convolutional-neural-network
DeTraC COVId19
Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network
Stars: ✭ 34 (-10.53%)
Mutual labels:  convolutional-neural-network
minirocket
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification
Stars: ✭ 166 (+336.84%)
Mutual labels:  convolutional-neural-network
MixingBear
Package for automatic beat-mixing of music files in Python 🐻🎚
Stars: ✭ 73 (+92.11%)
Mutual labels:  audio-analysis
Keras-MultiClass-Image-Classification
Multiclass image classification using Convolutional Neural Network
Stars: ✭ 48 (+26.32%)
Mutual labels:  convolutional-neural-network
Cat-Dog-CNN-Classifier
Convolutional Neural Network to classify images as either cat or dog, along with using attention heatmaps for localization. Written in python with keras.
Stars: ✭ 17 (-55.26%)
Mutual labels:  convolutional-neural-network
mtg-jamendo-dataset
Metadata, scripts and baselines for the MTG-Jamendo dataset
Stars: ✭ 140 (+268.42%)
Mutual labels:  audio-analysis
DSMSCN
[MultiTemp 2019] Official Tensorflow implementation for Change Detection in Multi-temporal VHR Images Based on Deep Siamese Multi-scale Convolutional Neural Networks.
Stars: ✭ 63 (+65.79%)
Mutual labels:  convolutional-neural-network
SimpNet-Tensorflow
A Tensorflow Implementation of the SimpNet Convolutional Neural Network Architecture
Stars: ✭ 16 (-57.89%)
Mutual labels:  convolutional-neural-network
mdct
A fast MDCT implementation using SciPy and FFTs
Stars: ✭ 42 (+10.53%)
Mutual labels:  audio-analysis
discogstagger
Console based audio-file metadata tagger that uses the Discogs.com API v2 (JSON based). Relies on the Mutagen and discogs-client libraries. Currently supports FLAC and MP3 file types.
Stars: ✭ 65 (+71.05%)
Mutual labels:  discogs
tensorflow-image-classifier
Easily train an image classifier and then use it to label/tag other images
Stars: ✭ 29 (-23.68%)
Mutual labels:  convolutional-neural-network
coursera-ai-for-medicine-specialization
Programming assignments, labs and quizzes from all courses in the Coursera AI for Medicine Specialization offered by deeplearning.ai
Stars: ✭ 80 (+110.53%)
Mutual labels:  convolutional-neural-network
PolyphonicPianoTranscription
Recurrent Neural Network for generating piano MIDI-files from audio (MP3, WAV, etc.)
Stars: ✭ 146 (+284.21%)
Mutual labels:  convolutional-neural-network
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (+34.21%)
Mutual labels:  convolutional-neural-network
tsunami
A simple but powerful audio editor
Stars: ✭ 41 (+7.89%)
Mutual labels:  audio-analysis
Image-Denoising-with-Deep-CNNs
Use deep Convolutional Neural Networks (CNNs) with PyTorch, including investigating DnCNN and U-net architectures
Stars: ✭ 54 (+42.11%)
Mutual labels:  convolutional-neural-network
DumpTS
Extract elementary stream from all kinds of media files, show inside media meta information and reconstruct Transport-Stream, ISOBMFF, Matroska and MMT media files
Stars: ✭ 25 (-34.21%)
Mutual labels:  audio-analysis

DeepCurator

A convolutional neural network trained to recognize good* electronic music. Using freely available data from Discogs, 'the largest music database and marketplace in the world', a dataset was crafted from the best- and worst-rated items tagged as 'Electronic'.

Items were scored by taking the have/want ratio, average rating, number of ratings, and recommended sale price, and combining them into a normalized score. The top 15,000 scoring items were labelled as 'good' and the bottom 15,000 as 'not-good' (for lack of a better term).
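The exact weighting used to combine these metrics is not spelled out here, so the sketch below is illustrative: the factors, the direction of the want/have ratio, and the min-max normalisation are all assumptions.

```python
# Hypothetical scoring sketch. The real create_labels.py may weight these
# metrics differently; everything below is an illustrative assumption.

def raw_score(haves, wants, rating_avg, rating_count, price):
    """Combine Discogs metrics into a single unnormalised score."""
    # Want-to-have ratio as a demand proxy (the direction is assumed:
    # more wants relative to haves pushes the score up).
    demand = wants / max(haves, 1)
    return demand * rating_avg * rating_count * price

def normalise(scores):
    """Min-max normalise raw scores into the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]
```

Sorting items by the normalised score and taking the top and bottom 15,000 would then yield the 'good' and 'not-good' labels.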

For each item, the audio was extracted using the associated YouTube link provided by community submissions on Discogs. To provide an input to the network, log-scaled mel spectrograms were generated for 30-second slices from each track.
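The slicing step can be sketched as below. The spectrogram itself would typically come from a library such as librosa; here it is assumed to already exist as a (n_mels, n_frames) array, and the frame rate is an assumed parameter.

```python
import numpy as np

def slice_spectrogram(log_mel, frames_per_second, slice_seconds=30):
    """Split a log-scaled mel spectrogram of shape (n_mels, n_frames)
    into consecutive (n_mels, slice_frames) windows, dropping any
    trailing remainder shorter than one slice."""
    slice_frames = int(frames_per_second * slice_seconds)
    n_slices = log_mel.shape[1] // slice_frames
    return [log_mel[:, i * slice_frames:(i + 1) * slice_frames]
            for i in range(n_slices)]
```

Each returned window is then a fixed-size input the network can train on.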

After training, the neural network can be given an unseen clip of audio and assign it a score. The higher the score, the higher the model's confidence that the audio belongs in the 'good' category.

*Good is subjective, since taste is subjective. The scoring method is not perfect, since it partly relies on the popularity of a release to assign a category. Also, rarer items can carry an increased price because of their scarcity rather than their audio qualities, which inflates the score erroneously. The scoring metric is better viewed as 'what will be sought-after on Discogs' rather than 'what is objectively good'.

Installation

This project uses Pipenv to manage dependencies. To install all dependencies, run pipenv install. To activate the virtual environment, run pipenv shell.

Data files are stored on S3. To re-create the results, or to train your own model, a configuration file is needed. Create config.yml in the root of the project and fill in the following values:

s3:
  region: 'your-bucket-region'
  access_key: 'aws-access-key'
  secret: 'aws-access-secret'
  bucket: 'bucket-to-store-data'

Training the model

The Discogs dataset used to generate the labels is provided in dataset.csv. It contains over 90,000 items, each with a Discogs ID, Haves, Wants, Rating Average, Rating Count, Price, and YouTube ID. From this, a set of labels can be created by running python create_labels.py. This step is optional, as a pre-generated labels file is provided; however, you may wish to run it if you want to modify the scoring function and re-score the items.

To train the model on your own dataset, you will need to create your own labels.csv file. Each line is a comma-separated pair of YouTube ID and category (1 or 0).
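A minimal way to produce a file in that shape is sketched below; the video IDs are placeholders, not real dataset entries.

```python
import csv

# Write a labels.csv in the expected "YouTube ID,category" shape.
# The IDs below are hypothetical placeholders.
rows = [
    ("dQw4w9WgXcQ", 1),  # 1 = 'good'
    ("aBcDeFgHiJk", 0),  # 0 = 'not-good'
]

with open("labels.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```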

Once you have a set of labels, run the download_data.py script. It's recommended to run this on an EC2 instance with a high number of cores and high network bandwidth. Multiprocessing is used to automatically scale to the number of available cores. Downloading 30,000 YouTube videos and extracting the mp3 from each can take a couple of hours even on a high-spec cloud instance.
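The fan-out pattern can be sketched as below. The worker here is a stub: the real script would fetch each video and extract an mp3 (e.g. via youtube-dl and ffmpeg) before uploading to S3. A thread pool is shown for simplicity; a process-based multiprocessing.Pool with the same interface would use every core for the CPU-heavy extraction step.

```python
import os
from multiprocessing.dummy import Pool  # thread pool; swap in
# multiprocessing.Pool for process-based parallelism across cores

def download_and_extract(youtube_id):
    """Stub worker: the real script would download the video, extract
    the mp3, and upload it to the configured S3 bucket."""
    return f"{youtube_id}.mp3"  # placeholder result

def download_all(youtube_ids):
    """Map the worker over every ID, one task per pool worker at a time."""
    with Pool(os.cpu_count()) as pool:
        return pool.map(download_and_extract, youtube_ids)
```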

Spectrograms are generated by running process_audio.py; again, a high-spec EC2 instance is recommended. This will grab each audio file from your specified bucket and generate a number of 30-second slices of the spectrogram for each track.

Finally, to train the network run train.py.

Making predictions

A small Flask server is provided so predictions against unseen audio can be made by posting to a REST endpoint. A trained model is already included in the repo, so you can skip the training process entirely if you wish.

To start the server, run server.py. This loads the trained model, ready for incoming requests. The predictor takes in an mp3 file, splits it into n slices, where n is the length of the audio divided by 30 seconds, and creates an average rating from the individual ratings of each slice. You can try it in your terminal by running:

curl -X POST -F audio=@"audio.mp3" 'http://localhost:5000/rate'

A score between 0 and 100 will be returned; anything over 50 is what the model deems 'good'.
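The slice-and-average logic described above can be sketched as follows, with the per-slice model prediction left abstract (each slice score is assumed to already be on the 0-100 scale).

```python
def n_slices(duration_seconds, slice_seconds=30):
    """Number of whole 30-second slices the predictor scores for a track."""
    return int(duration_seconds // slice_seconds)

def rate_track(slice_scores):
    """Average per-slice scores (each 0-100) into one track rating."""
    if not slice_scores:
        raise ValueError("no slices to score")
    return sum(slice_scores) / len(slice_scores)
```

A 95-second upload would therefore be scored as three slices, and the returned rating is simply their mean.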
