
mnpinto / audiotagging2019

Licence: MIT
6th place solution to the Freesound Audio Tagging 2019 Kaggle competition

Programming Languages

python

Projects that are alternatives of or similar to audiotagging2019

Open Solution Toxic Comments
Open solution to the Toxic Comment Classification Challenge
Stars: ✭ 154 (+633.33%)
Mutual labels:  kaggle-competition
StoreItemDemand
(117th place - Top 26%) Deep learning using Keras and Spark for the "Store Item Demand Forecasting" Kaggle competition.
Stars: ✭ 24 (+14.29%)
Mutual labels:  kaggle-competition
argus-tgs-salt
Kaggle | 14th place solution for TGS Salt Identification Challenge
Stars: ✭ 73 (+247.62%)
Mutual labels:  kaggle-competition
Severstal Steel Defect Detection
Kaggle Segmentation Challenge
Stars: ✭ 160 (+661.9%)
Mutual labels:  kaggle-competition
kaggle-house-prices-advanced-regression-techniques
Repository for source code of kaggle competition: House Prices: Advanced Regression Techniques
Stars: ✭ 37 (+76.19%)
Mutual labels:  kaggle-competition
DCASE-models
Python library for rapid prototyping of environmental sound analysis systems
Stars: ✭ 35 (+66.67%)
Mutual labels:  audio-tagging
Machine Learning And Data Science
This is a repository which contains all my work related Machine Learning, AI and Data Science. This includes my graduate projects, machine learning competition codes, algorithm implementations and reading material.
Stars: ✭ 137 (+552.38%)
Mutual labels:  kaggle-competition
fer
Facial Expression Recognition
Stars: ✭ 32 (+52.38%)
Mutual labels:  kaggle-competition
Data-Science-Hackathon-And-Competition
Grandmaster in MachineHack (3rd Rank Best) | Top 70 in AnalyticsVidya & Zindi | Expert at Kaggle | Hack AI
Stars: ✭ 165 (+685.71%)
Mutual labels:  kaggle-competition
digit recognizer
CNN digit recognizer implemented in Keras Notebook, Kaggle/MNIST (0.995).
Stars: ✭ 27 (+28.57%)
Mutual labels:  kaggle-competition
Kaggle dstl submission
Code for a winning model (3 out of 419) in a Dstl Satellite Imagery Feature Detection challenge
Stars: ✭ 159 (+657.14%)
Mutual labels:  kaggle-competition
Detoxify
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ Pytorch Lightning and 🤗 Transformers.
Stars: ✭ 188 (+795.24%)
Mutual labels:  kaggle-competition
Quora QuestionPairs DL
Kaggle Competition: Using deep learning to solve quora's question pairs problem
Stars: ✭ 54 (+157.14%)
Mutual labels:  kaggle-competition
Machine Learning Workflow With Python
This is a comprehensive ML techniques with python: Define the Problem- Specify Inputs & Outputs- Data Collection- Exploratory data analysis -Data Preprocessing- Model Design- Training- Evaluation
Stars: ✭ 157 (+647.62%)
Mutual labels:  kaggle-competition
histopathologic cancer detector
CNN histopathologic tumor identifier.
Stars: ✭ 26 (+23.81%)
Mutual labels:  kaggle-competition
Data science bowl 2018
My 5th place (out of 816 teams) solution to The 2018 Data Science Bowl organized by Booz Allen Hamilton
Stars: ✭ 147 (+600%)
Mutual labels:  kaggle-competition
word2vec-movies
Bag of words meets bags of popcorn in Python 3 (Chinese tutorial)
Stars: ✭ 54 (+157.14%)
Mutual labels:  kaggle-competition
open-solution-ship-detection
Open solution to the Airbus Ship Detection Challenge
Stars: ✭ 54 (+157.14%)
Mutual labels:  kaggle-competition
Kaggle
Kaggle Kernels (Python, R, Jupyter Notebooks)
Stars: ✭ 26 (+23.81%)
Mutual labels:  kaggle-competition
kuzushiji-recognition
5th place solution for the Kaggle Kuzushiji Recognition Challenge
Stars: ✭ 41 (+95.24%)
Mutual labels:  kaggle-competition

6th place solution for Freesound Audio Tagging 2019 Competition

Description of the solution: https://link.medium.com/Kv5kyHjcIX

How to use

  • Install fastai and librosa:
conda install -c pytorch -c fastai fastai
conda install librosa
  • Clone the repository:
git clone https://github.com/mnpinto/audiotagging2019.git
  • Run the training script (a quick environment check is sketched below):
python run.py --n_epochs 1 --max_processors 8
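
Optionally, here is a quick sanity check that the environment is set up; this snippet is not part of the repository (see also the fastai version note below):

import fastai, librosa, torch
print('fastai:', fastai.__version__)    # the repository has not been tested on fastai versions after 1.0.51
print('librosa:', librosa.__version__)
print('CUDA available:', torch.cuda.is_available())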

If successful, the script will create train_curated_png and train_noisy_png folders containing the Mel spectrograms for all audio clips and train the model for 1 epoch using the default arguments. The max_processors argument sets how many processors to use for this preprocessing step. After training is complete, a models folder will be created and a weights file stage-1.pth will be saved there. Finally, a submission file will be generated with the default name submission.csv.
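
For reference, the preprocessing step amounts to converting each wav clip into a Mel spectrogram saved as a png. The snippet below is a minimal sketch of that conversion using librosa; the exact parameters used by run.py (number of Mel bands, hop length, normalization) may differ, and the file paths are placeholders:

import numpy as np
import librosa
from PIL import Image

def wav_to_mel_png(wav_path, png_path, sr=44100, n_mels=128):
    # Load the clip and compute a log-scaled Mel spectrogram.
    y, sr = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Scale to 0-255 and save as an 8-bit grayscale png.
    img = 255 * (mel_db - mel_db.min()) / (mel_db.max() - mel_db.min() + 1e-8)
    Image.fromarray(img.astype(np.uint8)).save(png_path)

# Illustrative call; the actual folder layout is handled by run.py.
wav_to_mel_png('data/train_curated/example.wav', 'train_curated_png/example.png')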

If you find any errors, please let me know by creating an Issue; the code has not yet been tested on fastai versions after 1.0.51.

Arguments

| name | type | default | description |
| --path | str | data | path to the data folder |
| --working_path | str | . | path to the working folder where model weights and outputs will be saved |
| --base_dim | int | 128 | size to crop the images to on the horizontal axis before rescaling with SZ |
| --SZ | int | 128 | images will be rescaled to SZ x SZ |
| --BS | int | 64 | batch size |
| --lr | float | 0.01 | maximum learning rate for one-cycle training |
| --n_epochs | int | 80 | number of epochs to train the model |
| --epoch_size | int | 1000 | number of episodes (with batch size BS each) in each epoch |
| --f2cl | int | 1 | train only on samples with an F2 score (at a threshold of 0.2) less than f2cl |
| --fold_number | int | 0 | KFold cross-validation fold number (0, 1, 2, 3, 4), or -1 to train with all data |
| --loss_name | str | BCELoss | loss function to use; options are BCELoss and FocalLoss (see the sketch after this table) |
| --csv_name | str | submission.csv | name of the csv file to save the test predictions to |
| --model | str | models.xresnet18 | a fastai model such as the default, or xresnet{18,34,50}ssa to use simple self-attention |
| --weights_file | str | stage-1 | name of the file to save the weights to |
| --load_weights | str |  | name of a weights file (e.g., stage-1) to load before training |
| --max_processors | int | 8 | number of CPU threads to use for converting wav files to png |
| --force | bool | False | if set to True, the pngs will be recomputed for the noisy and curated train datasets |
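
For context, FocalLoss refers to the focal loss of Lin et al. (2017), a binary cross-entropy re-weighted to down-weight easy examples. The snippet below is a generic multi-label focal loss in PyTorch, shown as an illustrative sketch rather than the exact implementation in run.py:

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # Per-element BCE, then down-weight well-classified examples by (1 - p_t)^gamma.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    p_t = torch.exp(-bce)  # probability the model assigns to the true label
    return ((1 - p_t) ** gamma * bce).mean()

Here gamma=2 follows the original paper; the repository may use a different value or an additional alpha weighting term.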

Replicating my top scoring solution

Important! This code has not yet been tested. I ran all experiments on Kaggle kernels and refactored the code to create this repository. After the final results of the competition are available, late submissions will be allowed, and I will then test the code to check that nothing is missing.

My top-scoring solution, with a score of 0.742 on the public LB and 0.75421 on the private LB (final results pending), is the average of the following 6 runs:

python run.py --model xresnet18ssa --base_dim 128 --SZ 256 --fold_number -1 \
              --n_epochs 80 --loss_name FocalLoss --weights_file model1 --csv_name submission1.csv
              
python run.py --model xresnet34ssa --base_dim 128 --SZ 256 --fold_number -1 \
              --n_epochs 60 --loss_name FocalLoss --weights_file model2 --csv_name submission2.csv
              
python run.py --model xresnet18ssa --base_dim 128 --SZ 256 --fold_number -1 \
              --n_epochs 80 --weights_file model3 --csv_name submission3.csv
              
python run.py --model xresnet34ssa --base_dim 128 --SZ 256 --fold_number -1 \
              --n_epochs 60 --weights_file model4 --csv_name submission4.csv

python run.py --model models.xresnet34 --base_dim 128 --SZ 256 --fold_number -1 \
              --n_epochs 90 --loss_name FocalLoss --weights_file model5 --csv_name submission5.csv
              
python run.py --model models.xresnet50 --base_dim 128 --SZ 256 --fold_number -1 \
              --n_epochs 65 --weights_file model6_0
              
python run.py --model models.xresnet50 --base_dim 128 --SZ 256 --fold_number -1 \
              --n_epochs 65 --load_weights model6_0 --weights_file model6 --csv_name submission6.csv             

The penultimate run, which generates the model6_0 weights, is not part of the ensemble; it only produces the weights that are loaded by the final, otherwise identical run. If you are running locally, try a single run with more epochs; the 2x65-epoch split is just to accommodate the 9-hour run-time limit of Kaggle kernels.
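
For reference, the final submission is the simple average of the six submission files produced above. A minimal pandas sketch of that averaging step, assuming each csv follows the competition's sample_submission format (an fname column followed by per-class probabilities), could look like this:

import pandas as pd

files = ['submission{}.csv'.format(i) for i in range(1, 7)]
# Sort by fname so the six files are aligned row by row.
dfs = [pd.read_csv(f).sort_values('fname').reset_index(drop=True) for f in files]

ensemble = dfs[0].copy()
label_cols = [c for c in ensemble.columns if c != 'fname']
ensemble[label_cols] = sum(df[label_cols].values for df in dfs) / len(dfs)
ensemble.to_csv('submission_ensemble.csv', index=False)  # output name is arbitrary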

Ablation study (in progress)

  • Fixed parameters: --base_dim 128 --SZ 256 --fold_number -1 --n_epochs 80 --loss_name FocalLoss

| Model | private LB scores |
| xresnet18ssa | [0.74211, 0.74695] |
| xresnet34ssa | [0.74545, 0.74875] |
| xresnet50 | [0.73966, 0.74062] |

  • Fixed parameters: --base_dim 128 --SZ 256 --fold_number -1 --n_epochs 80 --loss_name BCELoss

| Model | private LB scores |
| xresnet18ssa | [0.73824, 0.73956] |
| xresnet34ssa | [0.74378, 0.74567] |
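
The ssa variants in the tables above are xresnet backbones with a simple self-attention layer added. As a rough illustration only, and not necessarily the exact layer used in this repository, a simple self-attention block over a convolutional feature map can be written as:

import torch
import torch.nn as nn

class SimpleSelfAttention(nn.Module):
    # Illustrative simple self-attention over an (N, C, H, W) feature map.
    def __init__(self, n_channels):
        super().__init__()
        self.conv = nn.Conv1d(n_channels, n_channels, kernel_size=1, bias=False)
        self.gamma = nn.Parameter(torch.zeros(1))  # starts at zero, so the block begins as an identity

    def forward(self, x):
        n, c, h, w = x.shape
        flat = x.reshape(n, c, h * w)                 # (N, C, HW)
        attn = torch.bmm(flat, flat.transpose(1, 2))  # (N, C, C) channel affinities
        out = torch.bmm(attn, self.conv(flat))        # (N, C, HW)
        return (self.gamma * out + flat).reshape(n, c, h, w)

Because gamma starts at zero, the attention contribution is learned gradually during training.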

Citing this repository

@misc{mnpinto2019audio,
  author = {Pinto, M. M.},
  title = {6th place solution for Freesound Audio Tagging 2019 Competition},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/mnpinto/audiotagging2019}}
}