shiba24 / birdsong-generation-project

Licence: other
Generating birdsong with WaveNet

Birdsong generation project

Generating birdsong with WaveNet!

Quick execution

Requirements

  • golang 1.7.3+
  • python 2.7.11
  • tensorflow-wavenet (cloned automatically by the preparation script)
  • GPU (recommended)

Command

# preparation
git clone https://github.com/shiba24/birdsong-generation-project.git
cd birdsong-generation-project
bash preparation.sh

# training
cd tensorflow-wavenet
python train.py --data_dir=../corpus

# generation (skip the cd if you are already inside tensorflow-wavenet)
cd tensorflow-wavenet
python generate.py --wav_out_path=generated.wav --samples 80000 logdir/train/{DATE_HERE}/model.ckpt-{XXX}

Generated song

Listen to natural song at soundcloud

Listen to generated song at soundcloud

Overview

Abstract in one sentence

Simulate bird song with WaveNet.

Background

What is a songbird?

Songbirds are among the best model animals for neuroscientific studies of human language, vocalization, and auditory processing. Many laboratories around the world, in fields including molecular biology, physiology, acoustics, and ethology, use songbirds to address two questions: why do only humans have language, and what is the neural mechanism of language?

Song structure

Bird song is considered to have syntax like human language, although it carries no semantics in itself. In most songbird species only males sing, while in a few species both sexes sing. One function of the song is thought to be attracting females. Below is the typical song structure of a songbird: a bout made up of several song elements (called syllables or notes).

You can listen to an example of a Java sparrow's (文鳥) song here. This is a visualization, or spectrogram, of a zebra finch's (錦華鳥) song. The letters on the spectrogram indicate the note types. Both the Java sparrow and the zebra finch are songbirds.

Interestingly, the song structure can be expressed as a finite-state automaton, which can be regarded as a high-order Markov process. The transitions between notes are probabilistic, so a song is described by a probabilistic finite-state transition diagram. This is considered to parallel human language (Berwick et al., 2011). Below is a song expressed as a finite-state transition diagram. Line thickness represents the probability of the transition from one note to the next.

(Both figures cited from Honda & Okanoya, 1999)
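To make the probabilistic finite-state description concrete, here is a minimal sketch that samples a bout of notes from a transition table. The note labels (`a`, `b`, `c`), the `START`/`END` states, and all probabilities are made-up assumptions for illustration, not values from Honda & Okanoya (1999). The sketch is first-order for simplicity; a higher-order model would condition on several preceding notes.

```python
import random

# Hypothetical transition probabilities between notes (assumed values).
# "START" and "END" are artificial states marking the bout boundaries.
TRANSITIONS = {
    "START": {"a": 1.0},
    "a": {"a": 0.3, "b": 0.7},
    "b": {"c": 0.8, "a": 0.2},
    "c": {"a": 0.5, "END": 0.5},
}


def sample_bout(transitions, rng, max_notes=50):
    """Sample one bout (a list of notes) by walking the transition table."""
    state = "START"
    notes = []
    while len(notes) < max_notes:
        candidates = list(transitions[state])
        weights = [transitions[state][n] for n in candidates]
        state = rng.choices(candidates, weights=weights, k=1)[0]
        if state == "END":
            break
        notes.append(state)
    return notes


rng = random.Random(0)
bout = sample_bout(TRANSITIONS, rng)
print("".join(bout))
```

Each run of `sample_bout` yields a different note string, but every adjacent pair of notes is one of the transitions allowed by the table, which is the sense in which the song has "syntax".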

Brain structure

Many researchers have investigated the neural mechanism that enables finite-state vocalization. One hypothesis is a Markov chain-like representation among neurons in the motor areas. The figure at right shows the neural pathway for vocalization (cited from Bolhuis et al., 2010). A more detailed brain circuit, including audition, can be seen here for example.

There is a brain region named HVC (a proper name), which is a pre-motor area. Many of its neurons show activity phase-locked to the song. HVC neurons project to RA (robust nucleus of the arcopallium), which is a motor area, and RA sends motor signals to the muscles of the vocal organ to generate each song element.

In the next figure, (a) is another representation of the finite-state transitions of a song, and (b) is a simple model of HVC and RA neurons. The hypothesis assumes that neurons in HVC fire in sequence, like a chain. (Cited from Katahira et al., 2007)

Many studies have modelled birdsong and its neural mechanism, some of them using neural networks.

WaveNet

WaveNet is a generative neural network model for raw audio. The original paper was published by the Google DeepMind team in 2016. It uses dilated convolutional neural networks to generate audio waveforms. (GIF image cited from the DeepMind blog post)

The inputs and outputs of the model are waveforms only. Hence the model itself does NOT assume that syllables follow a finite-state structure or a Markov chain.
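As a rough illustration of the dilated convolutions mentioned above, the following sketch stacks causal convolutions with doubling dilations and checks how far back in time one output sample can "see". The filter width of 2 and the dilations 1, 2, 4, 8 follow the illustration in the WaveNet paper; they are assumptions here and not necessarily the settings used in this project. The all-ones weights are placeholders, not trained values.

```python
import numpy as np


def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution: y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before the start of the signal are treated as zeros, so the
    output at time t never depends on future inputs.
    """
    y = np.zeros(len(x), dtype=float)
    for k, w in enumerate(weights):
        shift = k * dilation
        if shift == 0:
            y += w * x
        elif shift < len(x):
            y[shift:] += w * x[:-shift]
    return y


def receptive_field(dilations, filter_width=2):
    """Number of input samples that can influence one output sample."""
    return (filter_width - 1) * sum(dilations) + 1


dilations = [1, 2, 4, 8]  # assumed: doubling dilations
x = np.zeros(32)
x[0] = 1.0  # unit impulse at t = 0

h = x
for d in dilations:
    h = causal_dilated_conv(h, weights=[1.0, 1.0], dilation=d)

print(receptive_field(dilations))   # 16
print(int(np.count_nonzero(h)))     # 16: the impulse reaches t = 0..15 only
```

Four layers already cover 16 samples, and each extra doubling layer doubles that span, which is how the model reaches long contexts without needing very wide filters.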

This project

This is my own personal project (it does not belong to my supervisor). It combines recent machine learning results with neuroscientific knowledge about songbirds.

In one sentence: using WaveNet to simulate bird song.

As mentioned above, bird song is thought to have a Markov-model structure and syntax like human speech. However, the song itself has no semantics.

If the mechanism is similar in such birds and in humans, WaveNet (see the original blog post and the TensorFlow implementation) might succeed in simulating birdsong, because it already succeeds in generating completely meaningless but locally speech-like waveforms.

Why is this interesting?

WaveNet itself does not use the Markov property of song. It only uses the information in the raw waveform. Therefore, if WaveNet succeeds in generating birdsong:

  1. WaveNet may be able to capture the Markov property implicitly. This cannot be shown explicitly by generating human speech alone.

  2. The representation learned by the trained model (i.e. the activation patterns of neurons in the model) might be comparable with the neural representation in the actual brain of a songbird.

  3. The similarity between human speech and birdsong in terms of syntax could be further supported.

Model configuration

Data size: ??

Sampling rate and other settings will be listed here.

Result

Training epoch

After 2500 epochs, the loss is about 1.5 to 2.0.
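For a rough sense of scale, the sketch below converts these loss values into perplexities. It assumes the loss is a mean softmax cross-entropy in nats over 256 quantization levels per sample; that is an assumption, since this README does not state the loss definition or the number of levels.

```python
import math

NUM_CLASSES = 256  # assumed number of quantization levels per sample

# Cross-entropy of a model that guesses uniformly among all classes.
uniform_loss = math.log(NUM_CLASSES)

for loss in (1.5, 2.0):
    # exp(loss) is the perplexity: the effective number of equally likely
    # amplitude levels the model is choosing among at each sample.
    perplexity = math.exp(loss)
    print(f"loss={loss:.1f}  perplexity={perplexity:.2f}  "
          f"uniform baseline={uniform_loss:.2f}")
```

Under those assumptions, a loss between 1.5 and 2.0 corresponds to the model narrowing each sample down to roughly 4 to 7 plausible levels out of 256.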

Generated sound

Listen to natural song at soundcloud

Listen to generated song at soundcloud

It sounds like the original (natural) song! Below are visualizations, or spectrograms, of the songs. (They show that the generated waveform is a bit chattery, though.)

  • Simulated song spectrogram

  • Natural song spectrogram
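For readers who want to draw similar spectrograms, here is a minimal NumPy-only sketch of a short-time Fourier transform. The frame length, hop size, and 16 kHz sample rate are assumed values, and a synthetic 3 kHz tone stands in for a recording; this is not the code used to produce the figures above.

```python
import numpy as np


def spectrogram(signal, frame_len=512, hop=128):
    """Return a (frames, frame_len // 2 + 1) array of magnitudes in dB.

    The signal must contain at least frame_len samples.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop:i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + 1e-10)


# Synthetic stand-in for a recording: a 3 kHz tone at an assumed 16 kHz rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 3000 * t)

spec = spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=0)))
peak_hz = peak_bin * sample_rate / 512
print(spec.shape, peak_hz)
```

To use it on real audio, replace `tone` with the samples of the natural or generated WAV file and plot `spec.T` as an image, with time on the horizontal axis and frequency on the vertical axis.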

Discussion

The next steps of this project would be:

  1. Investigating whether the Markov-chain structure of the generated song is similar to that of natural song.

  2. Comparing the neural firing patterns known to date with the activation patterns of neurons in the model.
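The first step could be approached roughly as follows: estimate note-to-note transition probabilities from labelled note sequences of natural and generated songs, then measure how far apart the two tables are. This is only a sketch under strong assumptions: the note strings below are made up, the generated audio is assumed to have already been segmented and labelled by hand or by another tool, and total-variation distance is just one possible comparison measure.

```python
from collections import Counter, defaultdict


def transition_probs(sequences):
    """Estimate P(next note | current note) from labelled note sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return {
        cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for cur, nxts in counts.items()
    }


def mean_total_variation(p, q):
    """Average total-variation distance between the rows of two tables.

    0 means identical transition probabilities; larger values mean the
    two tables differ more.
    """
    states = set(p) | set(q)
    total = 0.0
    for s in states:
        row_p, row_q = p.get(s, {}), q.get(s, {})
        targets = set(row_p) | set(row_q)
        total += 0.5 * sum(abs(row_p.get(t, 0.0) - row_q.get(t, 0.0))
                           for t in targets)
    return total / len(states)


# Made-up note strings standing in for hand-labelled songs.
natural = ["aabcabc", "abcaabc", "abcabc"]
generated = ["aabcabc", "abcbabc", "abcabc"]

p_nat = transition_probs(natural)
p_gen = transition_probs(generated)
print(round(mean_total_variation(p_nat, p_gen), 3))
```

A distance near zero would suggest that the generated song reproduces the first-order note statistics of natural song; checking higher-order structure would require counting longer note contexts.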

TODOs

  • Generate songs of other species (e.g. finches, canaries, ...)

Notice

  • The dataset is not open-access yet. Hence you cannot reproduce this project.

  • This is my own personal project and does not belong to my supervisor. All mistakes and misunderstandings are my own.

  • If you are interested in this project (e.g. further questions, reproduction, comments, and/or feedback), feel free to contact me!

Copyright

The WaveNet implementation is by ibab.

All rights reserved Shintaro Shiba.

  • Started this project November 2016.
  • Updated January 2017.

Any questions or comments are welcome! Thank you.
