# Audio2Guitarist-GAN
A two-stage generative adversarial network that generates images of guitarists playing guitar from audio.
## Description
To be updated.
## Architecture
- Stage 1: audio to binary mask
- Stage 2: binary mask to color image
More information in this blog post.
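The two-stage pipeline above can be sketched in code. This is a minimal illustrative mock-up, not the repository's actual model: the function names, feature dimensions, and mask resolution are all assumptions, and random numbers stand in for the trained generators.

```python
import numpy as np

def audio_to_mask(audio_features: np.ndarray) -> np.ndarray:
    """Stage 1 (placeholder): map per-frame audio features to a binary
    mask of the guitarist. A random projection stands in for the
    trained stage-1 generator."""
    h, w = 64, 64                      # assumed mask resolution
    rng = np.random.default_rng(0)
    logits = rng.standard_normal((audio_features.shape[0], h, w))
    return (logits > 0).astype(np.uint8)

def mask_to_image(mask: np.ndarray) -> np.ndarray:
    """Stage 2 (placeholder): translate each binary mask into an RGB
    frame. Here the mask is simply broadcast to three channels."""
    return np.repeat(mask[..., None] * 255, 3, axis=-1).astype(np.uint8)

# One second of video at an assumed 25 fps, with an assumed
# 128-dimensional audio feature vector per frame.
audio_features = np.zeros((25, 128), dtype=np.float32)
masks = audio_to_mask(audio_features)   # (25, 64, 64) binary masks
frames = mask_to_image(masks)           # (25, 64, 64, 3) RGB frames
```

The key design point is that stage 1 only has to solve the hard audio-to-geometry mapping, while stage 2 is a simpler image-to-image translation from mask to photo-realistic frame.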
## Results
### 1. Video output
- Case 1: Test on the same guitarist, 南澤大介.
  - Video 1: トゥ・ビー・ウィズ・ユー (acoustic guitar solo)
    - Source YouTube video here.
  - Video 2: 愛はかげろうのように (acoustic guitar solo)
    - Source YouTube video here.
- Case 2: Test on a different guitarist, 伍々慧 (Satoshi Gogo), whose playing style and recording setup differ from the training data.
  - Video 3: Autumn Leaves (early version) / Satoshi Gogo
    - Source YouTube video here.
  - Video 4: I Got Rhythm / Satoshi Gogo
    - Source YouTube video here.
- Case 3: Test on different instruments and the human voice.
  - [TBU]
Here are the official websites of 南澤大介 and 伍々慧.

### 2. Conditional output
The following GIFs are images generated from audio that the model had never seen.
- Source video (audio): tupliのテーマ (acoustic guitar solo), composed and arranged by 南澤大介
- Top to bottom: audio visualization, stage-1 output, stage-2 output, ground truth.
### 3. Pose-guided generation
The following GIFs show outputs of the stage-2 model given conditional poses.
- Source video (audio): John Pizzarelli - "I Got Rhythm" (solo) at the Fretboard Journal
- Top: reference video; middle: conditional hand input; bottom: stage-2 output.
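Pose-guided generation can be sketched as follows: instead of feeding the stage-2 generator a mask produced by stage 1, a hand-pose mask extracted from a reference video is used as the conditioning input. This is an illustrative mock-up under assumed names and shapes, not the repository's actual code; a placeholder stands in for the trained stage-2 model.

```python
import numpy as np

def rasterize_hand_pose(keypoints: np.ndarray, size=(64, 64)) -> np.ndarray:
    """Draw normalized 2-D hand keypoints into a binary conditioning
    mask (assumed input format: values in [0, 1])."""
    mask = np.zeros(size, dtype=np.uint8)
    for x, y in keypoints:
        xi = int(np.clip(x * (size[1] - 1), 0, size[1] - 1))
        yi = int(np.clip(y * (size[0] - 1), 0, size[0] - 1))
        mask[yi, xi] = 1
    return mask

def stage2_generator(pose_mask: np.ndarray) -> np.ndarray:
    """Placeholder for the trained stage-2 model: conditioning mask in,
    RGB frame out. Here the mask is just broadcast to three channels."""
    return np.repeat(pose_mask[..., None] * 255, 3, axis=-1).astype(np.uint8)

# 21 hand keypoints (a common hand-skeleton convention) extracted from
# one frame of the reference video.
keypoints = np.random.default_rng(1).random((21, 2))
pose_mask = rasterize_hand_pose(keypoints)  # (64, 64) conditioning mask
frame = stage2_generator(pose_mask)         # (64, 64, 3) RGB frame
```

Because stage 2 only sees the conditioning mask, swapping the stage-1 output for an externally supplied pose lets the same generator animate the guitarist from any pose source.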
## References
- Audio to Body Dynamics
- Pose Guided Person Image Generation
- Deep Video Generation, Prediction and Completion of Human Action Sequences
- Deformable GANs for Pose-Based Human Image Generation
- Skeleton-Aided Articulated Motion Generation
- Assessment of Student Music Performances Using Deep Neural Networks
- Dance Dance Convolution