Getting Started in 'ML-Audio'
Suggestions for students.
About
Audio and acoustics students sometimes ask "How do I get started learning machine learning?" Not everyone gets their start in a major research environment, so this page is intended to serve as a series of suggestions for those who may find themselves "on their own" in their interest in this area. It was started by @drscotthawley and Ryan Miller, but is intended to serve and evolve with the community.
- This is a collaborative page. Please suggest additions, re-organizations, edits, updates, etc., either via Issues or Pull Requests. (In addition, @drscotthawley may gladly cede control of this content to whichever student or group wants to Wiki-fy it!)
Active Practitioners to Follow
Many of us learn about and contribute to news of new developments, papers, conferences, grants, and networking opportunities via Twitter.
- Audio ML Twitter list by Fabian-Robert Stöter (@faroit). <-- Follow these people!
Quick Quotes
- Justin Salamon: "Anyone working in ML, anyone, should be obliged to curate a dataset before they're allowed to train a single model. The lessons learnt in the process are invaluable, and the dangers of skipping said lessons are manifold (see what I did there?)"
Best Practices
"Tips for Publishing Research Code" courtesy of Papers with Code
General Reference Information
- Machine Learning Glossary - A reference resource for common ML math topics, definitions, concepts, etc.
- Notes on Music Information Retrieval
Online Training (ML+audio Specific)
- Valerio Velardo's "Deep Learning for Audio"
- Jordi Pons' "Deep neural networks for music" teaching materials
Online Training (More General, Courses)
- Rebecca Fiebrink's Machine Learning for Musicians and Artists on Kadenze -- No actual audio DSP, but great for concepts, interactive and fun (no math!)
- Advanced Digital Signal Processing series taught by Dr.-Ing Gerald Schuller of Fraunhofer IDMT, with videos and accompanying Jupyter notebooks by Renato Profeta
- Andrew Ng's ML Course on Coursera (Good all-around ML course)
- Fast.ai (Can get you up and running fast)
- Neural Network Programming - Deep Learning with PyTorch. Learn how to code an image-predictor neural network in PyTorch. Provides practical NN fundamentals.
- Foundations of Machine Learning taught by David Rosenberg
Tutorials
- Andrew Trask's "Anyone Can Learn To Code an LSTM-RNN in Python"
- Machine Learning & Deep Learning Fundamentals (Good high level intro to ML concepts and how neural networks operate)
Talks (at conferences)
Talks we found helpful/inspiring (and are hopefully still relevant). TODO: add more recent talks!
- Paris Smaragdis at SANE 2015: "NMF? Neural Nets? It’s all the same..."
- Ron Weiss at SANE 2015: "Training neural network acoustic models on waveforms"
- Jordi Pons at DLBCN 2018: "Training neural audio classifiers with few data"
- Sander Dieleman at ISMIR 2019: "Generating Music in the Waveform Domain"
Key Papers / Codes
(Let's try to list "representative" or "landmark" papers, not just our latest tweak, unless it includes a really good intro/review section. ;-) )
- Keunwoo Choi et al., "Automatic tagging using deep convolutional neural networks" (ISMIR 2016 Best Paper)
- SampleRNN
- WaveNet
- WaveRNN, i.e. "Efficient Neural Audio Synthesis"
- GANSynth
- Wave-U-Net
Demos
(Not sure if this only means "deployed models you can play with in your browser," or if other things should count as demos)
- Chris Donahue's WaveGAN Demo
- Scott Hawley's SignalTrain Demo
- Neil Zeghidour and David Grangier's Wavesplit
- David Samuel, Aditya Ganeshan, and Jason Naradowsky's Meta-TasNet
Packages & Libraries
- awesome-python-scientific-audio Curated list of python software and packages related to scientific research in audio
- Librosa Great package for various kinds of audio analysis and manipulation
- Audiomentations, data augmentation for audio
- tf.signal: signal processing for TensorFlow
- fastai_audio (and fastai2_audio), audio libraries for the Fast.ai library/MOOC. Fast.ai is primarily for image, text & tabular data processing, but there are efforts to add audio. (Work in progress.)
Tools / GUIs / Gists
- Jesse Engel's gist to plot "rainbowgrams"
Books
- Neural Networks and Deep Learning online book. How drscotthawley first started reading.
- Open-Source Tools & Data for Music Source Separation by Ethan Manilow, Prem Seetharaman, and Justin Salamon (2020). An online, interactive book with Python examples!
- List of Books Recommended by ML expert Juergen Schmidhuber for students entering his lab. (Probably pretty demanding material.)
Computer-Related Topics
Python:
- learnpython.org
- Python notebooks for fundamentals of music processing
Signal Processing Topics
- Advanced Digital Signal Processing series taught by Dr.-Ing Gerald Schuller of Fraunhofer IDMT, with videos and accompanying Jupyter notebooks by Renato Profeta
- An Interactive Introduction to Fourier Transforms by Jez Swanson. (so good!)
- Yuge Shi's "Gaussian Processes, Not Quite for Dummies" (GPs get used for much more than signal processing, but are also promising there; feel free to suggest a different category for this content)
Statistics / Math Topics
- Gradient Descent
- Principal Component Analysis: "PCA From Scratch" by @drscotthawley
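For readers who want to see the first topic above in code before following the links, here is a minimal sketch of gradient descent in plain Python. It is only an illustration, not taken from any of the linked resources: the function names, the toy objective f(x) = (x - 3)^2, the learning rate, and the step count are all assumptions chosen for the example.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0.

    grad  : function returning the derivative of the objective at x
    lr    : learning rate (step size); 0.1 is an arbitrary choice here
    steps : number of update steps to take
    """
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x


# Toy objective (hypothetical): f(x) = (x - 3)^2, with gradient 2 * (x - 3).
def grad_f(x):
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)  # should be very close to 3, the minimizer of f
```

With this step size, each update shrinks the distance to the minimum by a constant factor; a learning rate that is too large would instead make the iterates diverge, which is worth trying by hand.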
Datasets (raw audio)
One finds that many supposed "audio datasets" are really only features or even just metadata! Here are some "raw audio" datasets:
- NSynth Musical Instruments
- GTZAN Genre Collection (Note critique by Bob Sturm)
- Fraunhofer IDMT Guitar/Bass Effects
- Urban Sound Dataset
- FreeSound Annotator (formerly FreeSound Datasets)
- Birdvox-Full-Night
- SignalTrain LA2A
- Kaggle Heartbeat Sounds
- Search for other audio datasets at Kaggle (list)
- A collated list of MIR datasets can be found here, which is the source for audiocontentanalysis.org, but only some of them are raw audio
- Another list of "audio datasets" by Christopher Dossman
- ...your dataset here...
"Major" ML-Audio Research/Development Groups
Universities:
(or, "Where should I apply for grad school?")
- QMUL (London)
- UPF (Barcelona)
- CCRMA (Stanford, California)
- IRCAM (Paris)
- NYU (New York)
Industry:
("Where can I get an internship/job?")
- Google Magenta
- Google Perception (speech publications)
- Adobe
- Spotify
- Increasingly, everywhere. ;-)
Conferences
("Which conference(s) should I go to?" -- asked by a student on the day this doc began)
Audio-Specific
Long list of music-technology-specific conferences: https://conferences.smcnetwork.org/, which is referenced from https://github.com/MTG/conferences
- Audio Engineering Society (AES)
- ASA
- Digital Audio Effects (DAFx)
- ICASSP
- ISMIR
- SANE
- Web Audio Conference (WAC)
- SMC
- LVA/ICA
- Audio Mostly
- WIMP
- DCASE
- CSMC
- MuMe
- ICMC
- CMMR
- IBAC
- MLSP
- Interspeech
- FMA
General ML
- ICLR
- ICML
- NeurIPS
- IJCNN
Journals
("Where can I get published?")
In addition, in machine learning specifically, the tendency is for conference papers to be peer-reviewed and to "count" as journal publications.
Competitions / Benchmarks
Some are yearly, some may be defunct but still interesting.
- MIREX
- SiSEC (Signal Separation Evaluation Campaign)
- Kaggle Heartbeat Sounds
Contributors
If you want your name listed here, you may. ;-)