timsainb / Avgn

Licence: MIT
A generative network for animal vocalizations. For dimensionality reduction, sequencing, clustering, corpus-building, and generating novel 'stimulus spaces'. All with notebook examples using freely available datasets.

Animal Vocalization Generative Network (AVGN)

Tim Sainburg (PhD student, UCSD, Gentner Laboratory)

There are two separate repositories for AVGN: this one, which hosts an earlier, less feature-rich, but cleaner version of the code, and a second one at https://github.com/timsainb/avgn_paper. The second repository has more species and examples, but it is not as clean and may be harder to figure out. If you want to try out any of the features from our paper, I would recommend using the second repository.

This is a project for taking animal vocalization audio recordings and learning a generative model of segments of those vocalizations (e.g. syllables) using modern machine learning techniques. Specifically, this package takes in a dataset of WAV files, segments them into units (e.g. syllables of birdsong), and trains a generative model on those segments. The learned latent representations can be used to cluster syllables in an unsupervised manner, generate novel syllables, visualize sequences, or perform several other analyses.
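To make the segmentation step concrete, here is a minimal sketch of one common approach: splitting a waveform wherever the amplitude drops below a threshold. This is an illustration only, not the method used in the notebooks. The function name, the threshold, and the frame and silence durations are all assumptions, and it uses plain Python lists so that it runs without extra dependencies.

```python
import math


def segment_by_amplitude(samples, rate, threshold=0.1, frame_s=0.005, min_silence_s=0.01):
    """Return (start_s, end_s) pairs for stretches whose frame RMS stays above threshold.

    A quiet gap shorter than min_silence_s does not split a unit.
    All parameter values here are illustrative assumptions.
    """
    frame = max(1, round(rate * frame_s))  # samples per analysis frame
    min_gap = max(1, round(min_silence_s * rate / frame))  # silent frames that end a unit

    # RMS amplitude of each non-overlapping frame.
    rms = [
        math.sqrt(sum(x * x for x in samples[i:i + frame]) / frame)
        for i in range(0, len(samples) - frame + 1, frame)
    ]

    segments, start, silent = [], None, 0
    for idx, value in enumerate(rms):
        if value >= threshold:
            if start is None:
                start = idx
            silent = 0
        elif start is not None:
            silent += 1
            if silent >= min_gap:
                end = idx - silent + 1  # first silent frame after the unit
                segments.append((start * frame / rate, end * frame / rate))
                start, silent = None, 0
    if start is not None:  # the recording ended in the middle of a unit
        segments.append((start * frame / rate, (len(rms) - silent) * frame / rate))
    return segments


# Synthetic signal: silence, a 0.2 s "syllable", silence, a 0.1 s "syllable", silence.
rate = 1000
silence = [0.0] * 100
tone = [0.5, -0.5] * 100
samples = silence + tone + silence + tone[:100] + silence
segments = segment_by_amplitude(samples, rate)
```

On real recordings the threshold usually has to be tuned per dataset, which is one reason the example notebooks expose their parameters.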


Overview of the package

[Figure: overview of the package]

Latent space generative modelling of song

Below is an example of a variational autoencoder trained on birdsong. In each example, points are selected from a low-dimensional latent space and passed through a decoder, which turns them into syllable spectrograms. Other notebook examples show how to invert these spectrograms into waveforms (currently using Griffin-Lim inversion).
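The two sampling schemes described here, a regular grid over a 2D latent space and a straight-line interpolation between two latent points, can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_decoder` is a hypothetical stand-in for a trained decoder and is not part of AVGN, and the grid range and step counts are arbitrary.

```python
def latent_grid(low, high, steps):
    """Return a steps x steps grid of evenly spaced 2D latent points (steps >= 2)."""
    axis = [low + (high - low) * i / (steps - 1) for i in range(steps)]
    return [[(x, y) for x in axis] for y in axis]


def interpolate(z_start, z_end, steps):
    """Linearly interpolate between two latent vectors, endpoints included (steps >= 2)."""
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        path.append([a + t * (b - a) for a, b in zip(z_start, z_end)])
    return path


def toy_decoder(z):
    """Hypothetical decoder: maps a latent vector to a tiny 2 x 3 'spectrogram'."""
    return [[sum(z) * (row + col) for col in range(3)] for row in range(2)]


# Grid sampling from a 2D latent space.
grid = latent_grid(-2.0, 2.0, 5)
decoded_grid = [[toy_decoder(z) for z in row] for row in grid]

# Interpolation between two points in a 16D latent space.
path = interpolate([0.0] * 16, [1.0] * 16, 5)
decoded_path = [toy_decoder(z) for z in path]
```

With a real model, `toy_decoder` would be replaced by the decoder half of the trained autoencoder, and each decoded output would be a syllable spectrogram.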

An example grid sampling from Bengalese finch song (from a 2D Multidimensional Scaling Autoencoder)

An example interpolation of Bengalese finch song (from a 16D Variational Autoencoder)

An example interpolation of Cassin's vireo song (from a 16D Variational Autoencoder)


An example of transcribed Bengalese Finch song

Below is an example that combines the UMAP and HDBSCAN algorithms: UMAP is first used to reduce the dimensionality of the syllables, and HDBSCAN is then used to cluster them into discrete categories.

(left) Distribution of syllables in the UMAP projection, labelled using HDBSCAN. Each dot is a syllable, and all syllables come from the same finch. (right) The same plot, with the dots replaced by line segments that connect consecutive syllables and so represent syllable transitions.

The entire sequence dataset from Katahira et al., for the same Bengalese finch as above. Each vertical bar represents one song, and each color represents one syllable.
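Once each song is a sequence of syllable labels, the transitions drawn as line segments above can also be tabulated directly. The sketch below counts first-order transitions and converts them to probabilities. The label sequences are made-up placeholders, not data from Katahira et al., and the function names are assumptions.

```python
from collections import Counter


def transition_counts(songs):
    """Count (current, next) syllable label pairs across all songs."""
    counts = Counter()
    for labels in songs:
        counts.update(zip(labels, labels[1:]))
    return counts


def transition_probabilities(counts):
    """Convert pair counts to P(next | current)."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


# Placeholder label sequences: one list of cluster labels per song.
songs = [[0, 1, 2, 1, 2], [0, 1, 2, 2]]
counts = transition_counts(songs)
probs = transition_probabilities(counts)
```

Transitions are counted within songs only, so the last syllable of one song is never paired with the first syllable of the next.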

(top) Syllabic transcriptions of the same bird. (bottom) The same syllables, segmented, normalized, and padded.
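Normalizing and padding matter because a model trained on syllables generally needs inputs of one fixed shape. The sketch below shows one plausible way to do both. Min-max scaling and centered zero padding are assumptions chosen for illustration, not necessarily what the notebooks do, and spectrograms are represented as plain nested lists (rows are frequency bins, columns are time frames).

```python
def normalize(spec):
    """Min-max scale a spectrogram (list of rows) to the range [0, 1]."""
    flat = [v for row in spec for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant spectrogram
    return [[(v - lo) / span for v in row] for row in spec]


def pad_to_width(spec, width, fill=0.0):
    """Center a spectrogram in time by padding both sides with `fill`."""
    n_frames = len(spec[0])
    if n_frames > width:
        raise ValueError("spectrogram is wider than the target width")
    left = (width - n_frames) // 2
    right = width - n_frames - left
    return [[fill] * left + list(row) + [fill] * right for row in spec]


# Two toy syllables with different durations (2 and 4 time frames).
syllables = [
    [[1.0, 3.0], [2.0, 5.0]],
    [[0.0, 4.0, 8.0, 4.0], [2.0, 2.0, 6.0, 0.0]],
]
width = max(len(s[0]) for s in syllables)
batch = [pad_to_width(normalize(s), width) for s in syllables]
```

After this step every syllable in `batch` has the same number of time frames, so the syllables can be stacked into a single array for training.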

Documentation

Examples for different songbirds are located in the notebooks/birdsong folder. There is no explicit documentation, but we will work on adding better docstrings to different functions (as we clean them up), and adding more notes to the example notebooks.

Currently there are two example birds: Cassin's vireo and Bengalese finch. The Cassin's vireo example dataset compares hand-labelled syllables to syllable labels learned using our method, and thus uses the same segmentations as the manual method. The Bengalese finch is segmented automatically. I'm currently working on adding a few more species (both songbirds and other species).

To use these notebooks on your own dataset, clone this repo and copy the methods from one of the examples. You will need to change the parameters as well as parse date/time information in 1.0-segment-song-from-wavs.ipynb yourself.
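As a starting point for the date/time parsing, here is a minimal sketch that extracts a bird identifier and a recording timestamp from a file name. The naming scheme shown is purely hypothetical. Your recordings will almost certainly use a different one, so the format string and the split logic need to be adapted.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical naming scheme: <bird_id>_<YYYY-MM-DD>_<HH-MM-SS>.wav
TIMESTAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def parse_recording_name(path):
    """Return (bird_id, recording datetime) parsed from a WAV file name."""
    stem = Path(path).stem  # file name without directory or extension
    bird_id, timestamp = stem.split("_", 1)
    return bird_id, datetime.strptime(timestamp, TIMESTAMP_FORMAT)


bird, recorded_at = parse_recording_name("data/bf01_2017-03-14_06-45-10.wav")
```

`datetime.strptime` raises a `ValueError` when a name does not match the format, which is a convenient way to catch stray files early.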

The GAIA autoencoder is not currently implemented in AVGN. I have a GAIA-specific repo with that implementation, which will probably need some adjustments to work with AVGN. Feel free to try to pull them together and make a PR.

Some of these functions use a lot of RAM (for example, by loading your whole dataset into RAM). If RAM is an issue for you, try using the data_interator from https://github.com/timsainb/GAIA.
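The general idea behind such an iterator is to load data in small batches rather than all at once. The sketch below shows that pattern with a plain Python generator. It is not the GAIA implementation, and `iter_batches` and `fake_load` are hypothetical names introduced only for this example.

```python
from itertools import islice


def iter_batches(paths, load, batch_size):
    """Yield lists of loaded items, loading only one batch at a time."""
    it = iter(paths)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield [load(p) for p in chunk]


def fake_load(path):
    """Hypothetical loader; a real one would read a WAV or syllable file from disk."""
    return len(path)


paths = ["a.wav", "bb.wav", "ccc.wav", "dddd.wav", "eeeee.wav"]

# In a training loop you would consume one batch at a time:
#     for batch in iter_batches(paths, fake_load, batch_size=2): ...
# Collecting into a list here is only for demonstration.
batches = list(iter_batches(paths, fake_load, batch_size=2))
```

Because each batch is built only when the loop asks for it, peak memory depends on the batch size rather than on the size of the whole dataset.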

Installation

To install, run python setup.py install from the root of the cloned repository.


Data references

Hedley, Richard (2016): Data used in PLoS One article “Complexity, Predictability and Time Homogeneity of Syntax in the Songs of Cassin’s Vireo (Vireo cassini)” by Hedley (2016). figshare. https://doi.org/10.6084/m9.figshare.3081814.v1

Katahira K, Suzuki K, Kagawa H, Okanoya K (2013) A simple explanation for the evolution of complex song syntax in Bengalese finches. Biology Letters 9(6): 20130842. https://doi.org/10.1098/rsbl.2013.0842 https://datadryad.org//resource/doi:10.5061/dryad.6pt8g

Katahira K, Suzuki K, Kagawa H, Okanoya K (2013) Data from: A simple explanation for the evolution of complex song syntax in Bengalese finches. Dryad Digital Repository. https://doi.org/10.5061/dryad.6pt8g

Arriaga, J. G., Cody, M. L., Vallejo, E. E., & Taylor, C. E. (2015). Bird-DB: A database for annotated bird song sequences. Ecological informatics, 27, 21-25. http://taylor0.biology.ucla.edu/birdDBQuery/

TODO

  • rewrite functions and add docstrings
  • make less RAM heavy
  • add other animal vocalization datasets
  • ...

Project based on the cookiecutter data science project template. #cookiecutterdatascience
