
hassanhub / Lipreading

License: MIT

Projects that are alternatives of or similar to Lipreading

Concise Ipython Notebooks For Deep Learning
Ipython Notebooks for solving problems like classification, segmentation, generation using latest Deep learning algorithms on different publicly available text and image data-sets.
Stars: ✭ 23 (-53.06%)
Mutual labels:  jupyter-notebook, deep-neural-networks, autoencoder
Csc deeplearning
3-day dive into deep learning at csc
Stars: ✭ 22 (-55.1%)
Mutual labels:  jupyter-notebook, deep-neural-networks
All Classifiers 2019
A collection of computer vision projects for Acute Lymphoblastic Leukemia classification/early detection.
Stars: ✭ 22 (-55.1%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Tf Keras Surgeon
Pruning and other network surgery for trained TF.Keras models.
Stars: ✭ 25 (-48.98%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Gans In Action
Companion repository to GANs in Action: Deep learning with Generative Adversarial Networks
Stars: ✭ 748 (+1426.53%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Deep Learning Time Series
List of papers, code and experiments using deep learning for time series forecasting
Stars: ✭ 796 (+1524.49%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Deepfake Detection
DeepFake Detection: Detect whether a video is fake or not using InceptionResNetV2.
Stars: ✭ 23 (-53.06%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Speech Emotion Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
Stars: ✭ 633 (+1191.84%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Servenet
Service Classification based on Service Description
Stars: ✭ 21 (-57.14%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Pytorch Mnist Vae
Stars: ✭ 32 (-34.69%)
Mutual labels:  jupyter-notebook, autoencoder
Densedepth
High Quality Monocular Depth Estimation via Transfer Learning
Stars: ✭ 963 (+1865.31%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Keras Idiomatic Programmer
Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework
Stars: ✭ 720 (+1369.39%)
Mutual labels:  jupyter-notebook, autoencoder
Pytorch Multi Style Transfer
Neural Style and MSG-Net
Stars: ✭ 687 (+1302.04%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Deep Embedded Memory Networks
https://arxiv.org/abs/1707.00836
Stars: ✭ 19 (-61.22%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Saliency
TensorFlow implementation for SmoothGrad, Grad-CAM, Guided backprop, Integrated Gradients and other saliency techniques
Stars: ✭ 648 (+1222.45%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Bipropagation
Stars: ✭ 41 (-16.33%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Tensorflow Book
Accompanying source code for Machine Learning with TensorFlow. Refer to the book for step-by-step explanations.
Stars: ✭ 4,448 (+8977.55%)
Mutual labels:  jupyter-notebook, autoencoder
Stock Analysis Engine
Backtest 1000s of minute-by-minute trading algorithms for training AI with automated pricing data from: IEX, Tradier and FinViz. Datasets and trading performance automatically published to S3 for building AI training datasets for teaching DNNs how to trade. Runs on Kubernetes and docker-compose. >150 million trading history rows generated from +5000 algorithms. Heads up: Yahoo's Finance API was disabled on 2019-01-03 https://developer.yahoo.com/yql/
Stars: ✭ 605 (+1134.69%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Kalasalingam
IEEE "Invited Talk on Deep Learning" 03/02/2018
Stars: ✭ 13 (-73.47%)
Mutual labels:  jupyter-notebook, deep-neural-networks
Dl Colab Notebooks
Try out deep learning models online on Google Colab
Stars: ✭ 969 (+1877.55%)
Mutual labels:  jupyter-notebook, deep-neural-networks

LipReading

This is the Keras implementation of Lip2AudSpec: Speech reconstruction from silent lip movements video.

Main Network

Abstract

In this study, we propose a deep neural network for reconstructing intelligible speech from silent lip-movement videos. We use the auditory spectrogram as the spectral representation of speech, together with its corresponding sound-generation method, which results in more natural-sounding reconstructed speech. Our proposed network consists of an autoencoder that extracts bottleneck features from the auditory spectrogram; these features are then used as the target for our main lip-reading network, which comprises CNN, LSTM, and fully connected layers. Our experiments show that the autoencoder reconstructs the original auditory spectrogram with 98% correlation and also improves the quality of the speech reconstructed by the main lip-reading network. Our model, trained jointly on different speakers, is able to extract individual speaker characteristics and gives promising results in reconstructing intelligible speech with superior word recognition accuracy.

The full paper for this work can be found here (arXiv:1710.09798).
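As a rough illustration of the architecture described in the abstract, below is a minimal Keras sketch of the two networks: an autoencoder over auditory-spectrogram frames, and a CNN + LSTM + fully connected lip-reading network that predicts the autoencoder's bottleneck features. All layer sizes, window lengths, and spectrogram dimensions are illustrative assumptions, not the values used in the paper or in this repository.

# Minimal sketch of the two networks; all sizes below are assumptions.
from keras.models import Model
from keras.layers import (Input, Dense, Conv3D, MaxPooling3D, Flatten,
                          TimeDistributed, LSTM)

SPEC_DIM = 128            # assumed frequency bins per spectrogram frame
BOTTLENECK = 32           # assumed bottleneck size
FRAMES, H, W = 9, 48, 48  # assumed video window: frames x height x width

# Autoencoder: spectrogram frame -> bottleneck -> spectrogram frame.
spec_in = Input(shape=(SPEC_DIM,))
enc = Dense(256, activation='relu')(spec_in)
bottleneck = Dense(BOTTLENECK, activation='relu', name='bottleneck')(enc)
dec = Dense(256, activation='relu')(bottleneck)
spec_out = Dense(SPEC_DIM, activation='linear')(dec)
autoencoder = Model(spec_in, spec_out)
encoder = Model(spec_in, bottleneck)  # produces targets for the main network

# Main lip-reading network: CNN + LSTM + fully connected layers.
vid_in = Input(shape=(FRAMES, H, W, 1))
x = Conv3D(32, (3, 3, 3), activation='relu', padding='same')(vid_in)
x = MaxPooling3D((1, 2, 2))(x)
x = TimeDistributed(Flatten())(x)
x = LSTM(256)(x)
x = Dense(256, activation='relu')(x)
feat_out = Dense(BOTTLENECK, activation='linear')(x)  # predicted bottleneck features
main_net = Model(vid_in, feat_out)

autoencoder.compile(optimizer='adam', loss='mse')
main_net.compile(optimizer='adam', loss='mse')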

Requirements

We implemented the code in Python 2 using TensorFlow, Keras, SciPy, NumPy, OpenCV (cv2), scikit-learn, IPython, and fnmatch. These libraries should be installed before running the code; all of them can be installed easily with pip:

pip install tensorflow-gpu keras scipy opencv-python sklearn

The Keras backend can be changed easily if needed, e.g. by editing the "backend" field in ~/.keras/keras.json or by setting the KERAS_BACKEND environment variable.

Data preparation

This study is based on the GRID corpus (http://spandh.dcs.shef.ac.uk/gridcorpus/). To run the code, you first need to download and preprocess both the videos and the audio.

Running prepare_crop_files.py downloads the data and crops the frames with a manual mask. To generate the auditory spectrograms, the audio must be processed with NSLTools (http://www.isr.umd.edu/Labs/NSL/Software.htm) using the wav2aud function in MATLAB. A sketch of the cropping idea is given below.
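Since the actual cropping logic lives in prepare_crop_files.py (and the spectrogram step requires MATLAB), the snippet below only illustrates the general idea of applying a fixed, manually chosen crop window to every frame with OpenCV; the crop coordinates are placeholders, not the mask used by the script.

# Illustrative fixed-window crop of the mouth region with OpenCV.
import cv2

# Hypothetical crop window (rows Y0:Y1, columns X0:X1) for the mouth region.
Y0, Y1, X0, X1 = 190, 250, 110, 210

def crop_video_frames(video_path):
    """Read a GRID video and return its cropped, grayscale mouth frames."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frames.append(gray[Y0:Y1, X0:X1])
    cap.release()
    return frames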

Since some of the frames in the dataset are corrupted, we generate a list of paths to valid data with create_path.py. The last step before training the network is windowing and integrating all the data into .mat files, which is done by running data_integration.py (see the sketch below).
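The sketch below shows what such a windowing and integration step might look like: fixed-length video windows paired with aligned spectrogram features, saved to a .mat file with SciPy. The window length, alignment, array layout, and variable names are assumptions, not necessarily what data_integration.py produces.

# Assumed windowing/integration step; layout and names are illustrative.
import numpy as np
from scipy.io import savemat

WIN = 9  # assumed number of video frames per training window

def window_sample(frames, spec_features):
    """Slice one video and its spectrogram features into aligned windows."""
    vids, targets = [], []
    n = min(len(frames), len(spec_features)) - WIN + 1
    for t in range(n):
        vids.append(np.stack(frames[t:t + WIN]))
        targets.append(spec_features[t + WIN - 1])  # feature aligned to window end
    return np.array(vids), np.array(targets)

# After accumulating windows from every entry in the valid-path list:
# savemat('train_data.mat', {'video': all_windows, 'spec': all_targets})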

Training the models

Once the data preparation steps are done, the autoencoder can be trained on the auditory spectrograms corresponding to the valid videos using train_autoencoder.py. The main network can then be trained with train_main.py. A rough outline of this two-stage procedure is sketched below.
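Reusing the hypothetical models and the assumed train_data.mat layout from the sketches above, the training procedure might look roughly like this; the hyperparameters are illustrative only, not those used by the training scripts.

# Two-stage training sketch: autoencoder first, then the lip-reading network.
from scipy.io import loadmat

data = loadmat('train_data.mat')           # assumed output of data_integration.py
video, spec = data['video'], data['spec']  # assumed variable names

# 1) Train the autoencoder on auditory-spectrogram frames.
autoencoder.fit(spec, spec, epochs=50, batch_size=128, validation_split=0.1)

# 2) Extract bottleneck features with the trained encoder to use as targets.
targets = encoder.predict(spec)

# 3) Train the lip-reading network to predict those features from video windows.
video = video[..., None]                   # add the channel axis expected by Conv3D
main_net.fit(video, targets, epochs=50, batch_size=32, validation_split=0.1)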

Demo

You can find all demo files here.

A few samples of the network output are given below:

Speaker 1

Sample1

Speaker 29

Sample2

Cite

If you found this work/code helpful, please cite:

@article{akbari2017lip2audspec,
  title={Lip2AudSpec: Speech reconstruction from silent lip movements video},
  author={Akbari, Hassan and Arora, Himani and Cao, Liangliang and Mesgarani, Nima},
  journal={arXiv preprint arXiv:1710.09798},
  year={2017}
}