xiaoyiT / Course-Project---Speech-Driven-Facial-Animation

Licence: other
ECE 535 - Course Project, Deep Learning Framework

Programming Languages

python

Speech-Driven-Facial-Animation

Motivation

Nonverbal behaviour signals, such as facial expressions, provide key information about how we think, act, and react. Studying these signals is attractive but challenging, because they are often hidden and vary from person to person. In this project, we use machine learning methods to model human facial expressions. The goal is a framework that predicts the facial expressions of a previously unseen person from that person's speech alone.

Technical details

Datasets:

RAVDESS

Framework Architecture:

Our project consists of three main parts:

  • feature extraction for audio and video (Fast Fourier Transform (FFT) for audio, landmark transformation for video)
  • mapping from speech features to facial features (CNN + RNN)
  • translation of facial features into images (Deep Convolutional Generative Adversarial Network (DCGAN))
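
To make the audio side of the first part concrete, here is a minimal sketch of FFT-based feature extraction: the signal is cut into overlapping frames, each frame is windowed, and the magnitude spectrum is kept as the feature vector. The frame length, hop size, Hann window, and function names are assumptions for illustration; the project's preprocess scripts may use different settings or a library FFT.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frame_spectra(samples, frame_len=64, hop=32):
    """Split samples into overlapping frames; return one magnitude spectrum per frame."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    spectra = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        # Keep only the non-negative frequency bins of the real-valued signal.
        spectra.append([abs(c) for c in fft(frame)[: frame_len // 2 + 1]])
    return spectra


# A pure tone that falls exactly on bin 4 of a 64-sample frame.
tone = [math.sin(2 * math.pi * 4 * i / 64) for i in range(128)]
features = frame_spectra(tone)
peak_bin = max(range(len(features[0])), key=features[0].__getitem__)
print(len(features), "frames, peak at bin", peak_bin)
```

In practice a library routine such as a NumPy or SciPy FFT would replace the hand-written `fft`; the pure-Python version is used here only to keep the sketch dependency-free.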

Instruction

Preprocess Dataset:

  • Put all speech files in the ./speech folder
  • Put the training video files in the ./train/video folder
  • Put the test video files in the ./test/video folder
  • Run preprocess_train.py and preprocess_test.py with python3 to produce the feature arrays, for example: python3 preprocess_train.py or python3 preprocess_test.py
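
Before running the preprocess scripts, it can help to check that the input folders exist and are populated. The helper below is a hypothetical convenience and not part of the repository; only the folder names come from the steps above.

```python
import tempfile
from pathlib import Path

# Folder names are taken from the instructions; the helper itself is hypothetical.
INPUT_DIRS = ("speech", "train/video", "test/video")


def prepare_layout(root="."):
    """Create the expected input folders and return those that are still empty."""
    root = Path(root)
    empty = []
    for rel in INPUT_DIRS:
        folder = root / rel
        folder.mkdir(parents=True, exist_ok=True)
        if not any(folder.iterdir()):
            empty.append(rel)
    return empty


# Demonstration in a throwaway directory, so nothing is created in the project.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "speech").mkdir()
    (Path(tmp) / "speech" / "03-01-01-01-01-01-01.wav").touch()
    missing = prepare_layout(tmp)
    print("still empty:", missing)
```

Calling `prepare_layout(".")` from the project root would create the three folders in place and list the ones that still need files.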

Train CNN:

  • Create a folder called CNN_record in the project root (./)
  • Run CNN_train.py with python3, for example: python3 CNN_train.py

Evaluate CNN:

  • Run CNN_eval.py with python3, for example: python3 CNN_eval.py

Predict landmarks:

  • Create a folder called predict_landmark
  • Run predict.py with python3, passing a speech file ending in .wav as the argument, for example: python3 predict.py ./speech/03-01-01-01-01-01-01.wav
  • The output is a set of images and a GIF file. The images are saved in the predict_landmark folder; the GIF is saved in the current directory as landmarks.gif
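
The example file name above follows the RAVDESS naming scheme, in which seven two-digit fields encode modality, vocal channel, emotion, intensity, statement, repetition, and actor. The sketch below decodes such a name; the function name is hypothetical, and the field order and emotion codes should be checked against the dataset's own documentation.

```python
from pathlib import Path

# Field order as described in the RAVDESS file-naming convention (assumed here).
FIELDS = ("modality", "vocal_channel", "emotion", "intensity",
          "statement", "repetition", "actor")
EMOTIONS = {1: "neutral", 2: "calm", 3: "happy", 4: "sad",
            5: "angry", 6: "fearful", 7: "disgust", 8: "surprised"}


def parse_ravdess_name(path):
    """Decode a RAVDESS-style file name into a dict of integer fields."""
    parts = Path(path).stem.split("-")
    if len(parts) != len(FIELDS) or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a RAVDESS-style file name: {path}")
    info = dict(zip(FIELDS, map(int, parts)))
    info["emotion_label"] = EMOTIONS.get(info["emotion"], "unknown")
    return info


info = parse_ravdess_name("./speech/03-01-01-01-01-01-01.wav")
print(info["emotion_label"], info["actor"])
```

Decoding the name this way makes it easy to select speech files by emotion or actor before passing them to predict.py.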

DCGAN:

  • Download the videos from the RAVDESS database.
  • Put the videos in the ./video folder.
  • Create the output folder: mkdir -p data/fr
  • Run "python prep_gan.py" to extract facial images from the videos. The output images are saved in the ./data/fr/ folder and the landmarks are saved in ./data/landmark.npy.
  • Run "python main.py --train --epoch 25" to train the GAN.
  • Run "python main.py" to generate sample images.
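
The two main.py invocations differ only in the --train flag and the --epoch value. The repository's main.py is not shown here, so the following is only a sketch of how a script could expose these two options with argparse; the default epoch count and help texts are assumptions.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the two flags used in the commands above."""
    parser = argparse.ArgumentParser(
        description="DCGAN training / sampling (illustrative)")
    parser.add_argument("--train", action="store_true",
                        help="train the network; without it, only generate samples")
    parser.add_argument("--epoch", type=int, default=25,
                        help="number of training epochs (default is an assumption)")
    return parser


# Equivalent to: python main.py --train --epoch 25
args = build_parser().parse_args(["--train", "--epoch", "25"])
mode = "train" if args.train else "generate"
print(mode, args.epoch)
```

With no arguments, `args.train` is False, which corresponds to the sample-generation command.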