
HStuart18 / tfworldhackathon

Licence: other
GitHub repo for my Tensorflow World hackathon submission

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to tfworldhackathon

pcdarts-tf2
An unofficial TensorFlow 2.0+ implementation of PC-DARTS (Partial Channel Connections for Memory-Efficient Differentiable Architecture Search, published in ICLR 2020).
Stars: ✭ 25 (+47.06%)
Mutual labels:  tensorflow2
deep autoviml
Build TensorFlow Keras model pipelines in a single line of code. Now with MLflow tracking. Created by Ram Seshadri. Collaborators welcome; permission granted upon request.
Stars: ✭ 98 (+476.47%)
Mutual labels:  tensorflow2
Training-BatchNorm-and-Only-BatchNorm
Experiments with the ideas presented in https://arxiv.org/abs/2003.00152 by Frankle et al.
Stars: ✭ 23 (+35.29%)
Mutual labels:  tensorflow2
mae-scalable-vision-learners
A TensorFlow 2.x implementation of Masked Autoencoders Are Scalable Vision Learners
Stars: ✭ 54 (+217.65%)
Mutual labels:  tensorflow2
Tensorflow2-ObjectDetectionAPI-Colab-Hands-On
Hands-on materials for the Tensorflow2 Object Detection API
Stars: ✭ 33 (+94.12%)
Mutual labels:  tensorflow2
spectral normalization-tf2
🌈 Spectral Normalization implemented in TensorFlow 2
Stars: ✭ 36 (+111.76%)
Mutual labels:  tensorflow2
Awesome-Tensorflow2
Excellent extension packages and projects built on Tensorflow2
Stars: ✭ 45 (+164.71%)
Mutual labels:  tensorflow2
GradCAM and GuidedGradCAM tf2
Implementation of GradCAM & Guided GradCAM with Tensorflow 2.x
Stars: ✭ 16 (-5.88%)
Mutual labels:  tensorflow2
Tensorflow-YOLACT
Implementation of the paper "YOLACT Real-time Instance Segmentation" in Tensorflow 2
Stars: ✭ 97 (+470.59%)
Mutual labels:  tensorflow2
farm-animal-tracking
Farm Animal Tracking (FAT)
Stars: ✭ 19 (+11.76%)
Mutual labels:  tensorflow2
UnitBox
UnitBox: An Advanced Object Detection Network
Stars: ✭ 23 (+35.29%)
Mutual labels:  tensorflow2
transformer-tensorflow2.0
transformer in tensorflow 2.0
Stars: ✭ 53 (+211.76%)
Mutual labels:  tensorflow2
GrouProx
FedGroup, A Clustered Federated Learning framework based on Tensorflow
Stars: ✭ 20 (+17.65%)
Mutual labels:  tensorflow2
datascienv
datascienv is a package that sets up your data science environment in a single line of code, with all dependencies; it also includes pyforest, which provides single-line imports of all required ML libraries.
Stars: ✭ 53 (+211.76%)
Mutual labels:  tensorflow2
ntga
Code for "Neural Tangent Generalization Attacks" (ICML 2021)
Stars: ✭ 33 (+94.12%)
Mutual labels:  tensorflow2
WGAN-GP-tensorflow
Tensorflow Implementation of Paper "Improved Training of Wasserstein GANs"
Stars: ✭ 23 (+35.29%)
Mutual labels:  wgan-gp
WGAN-GP-TensorFlow
TensorFlow implementations of Wasserstein GAN with Gradient Penalty (WGAN-GP), Least Squares GAN (LSGAN), GANs with the hinge loss.
Stars: ✭ 42 (+147.06%)
Mutual labels:  wgan-gp
TTS tf
WIP Tensorflow implementation of https://github.com/mozilla/TTS
Stars: ✭ 14 (-17.65%)
Mutual labels:  tensorflow2
CRNN.tf2
Convolutional Recurrent Neural Network (CRNN) for End-to-End Text Recognition - TensorFlow 2
Stars: ✭ 131 (+670.59%)
Mutual labels:  tensorflow2
Autoregressive-models
Tensorflow 2.0 implementation of Deep Autoregressive Models
Stars: ✭ 18 (+5.88%)
Mutual labels:  tensorflow2

MusicGAN

MusicGAN creates one second of instrumental audio at 16 kHz.

Piano demo

Violin demo

To create your own MusicGAN, just clone this repo and run scripts/WGAN-GP.py after modifying DATA_DIR and INSTRUMENT. I used this data after converting the files to WAV. The violin data I used was scraped from YouTube.
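
The two settings are just variables near the top of the script; the values below are placeholders to show the idea, not what ships in the repo:

    # scripts/WGAN-GP.py (illustrative values only)
    DATA_DIR = "data/piano"   # directory of 16 kHz WAV files
    INSTRUMENT = "piano"      # label used for checkpoints and logs

Then launch training with python scripts/WGAN-GP.py.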

Training expects a GPU and will take several hours to achieve reasonable results. Due to the hackathon's time limits, I only trained for a few hours, but longer training should give better results.

Run tensorboard --logdir logs/train to view the generator and critic losses in TensorBoard.

Inspiration

Since the inception of generative adversarial networks, I have been fascinated by their capacity to perform tasks of unprecedented complexity. They are a prime example of machines learning in a way that resembles how humans learn, much as in reinforcement learning. I am also a huge fan of music and love to play the piano. So I thought: why not combine my love of machine learning with my passion for music?

Music generation has many different and exciting potential applications such as:

  • Providing melody inspiration to artists
  • Creating infinite, unique and free music without the need for audio file storage (for retail shops, restaurants, cafes, video games, radio stations etc.)

GANs are already well established in image processing, but far less so in NLP and audio processing, where the data is sequential. After some investigation, I learned about WaveGAN. So I set out to adapt WaveGAN for piano in TensorFlow 2.0, using WGAN-GP as my training mechanism (as the WaveGAN paper recommends).
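
For concreteness, the penalty that WGAN-GP adds to the critic loss pushes the critic's gradient norm towards 1 at points interpolated between real and generated samples. A minimal sketch, with function and variable names of my own choosing rather than the repo's:

    import tensorflow as tf

    def gradient_penalty(critic, real, fake):
        # Sample points on straight lines between real and fake audio batches,
        # assuming shape [batch, samples, channels].
        alpha = tf.random.uniform([tf.shape(real)[0], 1, 1], 0.0, 1.0)
        interp = alpha * real + (1.0 - alpha) * fake
        with tf.GradientTape() as tape:
            tape.watch(interp)
            scores = critic(interp, training=True)
        grads = tape.gradient(scores, interp)
        # Penalize deviation of the gradient norm from 1 (Gulrajani et al., 2017).
        norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2]) + 1e-12)
        return tf.reduce_mean((norms - 1.0) ** 2)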

What it does

MusicGAN generates approximately one second of music for a particular instrument (e.g. piano) from a random noise vector. Most existing systems generate MIDI files, which encode notes and tempo but contain no audio data; that approach loses the character and personality of music that can't simply be transcribed.
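
Sampling follows the usual GAN recipe: draw a latent vector, run it through the trained generator, and write the result out as audio. A sketch of that flow; the latent size of 100 and the generator.h5 file name are assumptions for illustration, not artifacts this repo is guaranteed to produce:

    import tensorflow as tf

    generator = tf.keras.models.load_model("generator.h5")  # hypothetical path
    z = tf.random.normal([1, 100])            # random noise vector
    audio = generator(z, training=False)      # ~1 s at 16 kHz, values in [-1, 1]
    wav = tf.audio.encode_wav(audio[0], sample_rate=16000)
    tf.io.write_file("sample.wav", wav)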

I have also created a JavaScript model so the generator can run in web pages down the track.
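
One common route from a Keras generator to a browser-ready model is the tensorflowjs converter; this is a sketch of that workflow under the assumption of a loaded Keras model (as in the sampling snippet above), not necessarily the exact steps used here:

    import tensorflowjs as tfjs

    # Writes model.json plus weight shards that TensorFlow.js can load in a page;
    # the output directory name is arbitrary.
    tfjs.converters.save_keras_model(generator, "web_model")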

How I built it

I adapted existing WGAN-GP code and created my own WaveGAN using TensorFlow-GPU 2.0. I tried to keep the script as transparent as possible, so that someone can look at it, change a few parameters, and get going.

I took a highly systematic and methodical approach, since much of the work was writing code based on research papers or porting it from TensorFlow 1.x.

First, I trained a regular GAN on the MNIST dataset using WGAN-GP, to confirm that I had implemented the training algorithm correctly. Next, I ran my architecture inside an old TensorFlow 1.x WaveGAN implementation, to be certain that my generator and critic models were correct. Then I inserted my generator and critic into my WGAN-GP infrastructure, replacing the MNIST GAN, and tested the script on the same audio datasets used in the WaveGAN paper to make sure everything was ready to go. Finally, I started running my script on piano audio, adjusting hyperparameters and the models' architecture while trying to avoid mode collapse and failure to converge.
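
To give a feel for the generator side: WaveGAN-style generators project the noise vector through a dense layer, then upsample with a stack of strided, wide-kernel transposed convolutions until they reach 16,384 samples (about one second at 16 kHz). The sketch below follows that pattern but is not the exact architecture in this repo; the layer sizes are illustrative, and layers.Conv1DTranspose only landed in TF 2.3 (earlier versions, like the TF 2.0 used here, typically emulate it with Conv2DTranspose):

    import tensorflow as tf
    from tensorflow.keras import layers

    def make_generator(latent_dim=100):
        # Illustrative WaveGAN-style generator: 16 * 4**5 = 16,384 output samples.
        return tf.keras.Sequential([
            layers.Dense(16 * 1024, input_shape=(latent_dim,)),
            layers.Reshape((16, 1024)),
            layers.ReLU(),
            layers.Conv1DTranspose(512, 25, strides=4, padding="same", activation="relu"),
            layers.Conv1DTranspose(256, 25, strides=4, padding="same", activation="relu"),
            layers.Conv1DTranspose(128, 25, strides=4, padding="same", activation="relu"),
            layers.Conv1DTranspose(64, 25, strides=4, padding="same", activation="relu"),
            layers.Conv1DTranspose(1, 25, strides=4, padding="same", activation="tanh"),
        ])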

Challenges I ran into

I spent quite a bit of time getting used to tf.GradientTape, watching tensors, and so on; this project was my first shot at TensorFlow 2.0. Most of the errors I faced were implementation or import mistakes, which I scoured GitHub to solve. In particular, finding elegant replacements for functions that lived in tensorflow.contrib proved challenging: annoyingly, many suggested solutions relied on tf.compat.v1, so I had to work around the problem some other way.
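
For anyone facing the same learning curve, the pattern that eventually clicked for me is the explicit train step, where tf.GradientTape records the forward pass and hands back the gradients. A minimal critic update in the WGAN-GP style, reusing the gradient_penalty sketch above; all names are illustrative, and gp_weight=10 follows the WGAN-GP paper:

    @tf.function
    def critic_step(generator, critic, opt, real, latent_dim=100, gp_weight=10.0):
        z = tf.random.normal([tf.shape(real)[0], latent_dim])
        with tf.GradientTape() as tape:
            fake = generator(z, training=True)
            # Wasserstein critic loss plus the gradient penalty term.
            loss = (tf.reduce_mean(critic(fake, training=True))
                    - tf.reduce_mean(critic(real, training=True))
                    + gp_weight * gradient_penalty(critic, real, fake))
        grads = tape.gradient(loss, critic.trainable_variables)
        opt.apply_gradients(zip(grads, critic.trainable_variables))
        return loss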

Additionally, I had to stay constantly mindful of my compute capacity. My PC has an Nvidia RTX 2060, but training still took many hours, and I had to use small batch sizes.

Accomplishments that I'm proud of

Given that I wasn't familiar with the new API, had never heard of WaveGAN or WGAN-GP, and was limited by my hardware, I am proud to say that I gave the project my best shot.

What I learned

I can now say that I can train a GAN in TensorFlow 2.0, and I have also sharpened accessory skills with NumPy, Matplotlib, and TensorBoard. My understanding of CNNs, ReLU, transposed convolutions, and general training-monitoring techniques has deepened as well.

What's next for this project

I am currently exploring the generation of other musical instrument sounds, such as the violin and saxophone. My next goal is to create a recurrent version of WaveGAN, using LSTMs and miniature WaveGANs to produce short segments of audio sequentially. This would allow audio of any duration to be created.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].