lucko515 / Speech Recognition Neural Network

An end-to-end speech recognition neural network implemented in Keras. This was my final project for the Artificial Intelligence Nanodegree at Udacity.

Labels

html deep-learning speech-recognition recurrent-neural-networks lstm-neural-networks gru

Projects that are alternatives of or similar to Speech Recognition Neural Network

Pytorch Kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Stars: ✭ 2,097 (+1316.89%)

Mutual labels: speech-recognition, recurrent-neural-networks, lstm-neural-networks, gru

Rnn ctc

Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.

Stars: ✭ 220 (+48.65%)

Mutual labels: speech-recognition, recurrent-neural-networks, gru

bitcoin-prediction

bitcoin prediction algorithms

Stars: ✭ 21 (-85.81%)

Mutual labels: recurrent-neural-networks, gru, lstm-neural-networks

Deep Learning Time Series

List of papers, code and experiments using deep learning for time series forecasting

Stars: ✭ 796 (+437.84%)

Mutual labels: recurrent-neural-networks, lstm-neural-networks

Da Rnn

📃 **Unofficial** PyTorch Implementation of DA-RNN (arXiv:1704.02971)

Stars: ✭ 256 (+72.97%)

Mutual labels: recurrent-neural-networks, lstm-neural-networks

Ctcwordbeamsearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model for TensorFlow.

Stars: ✭ 398 (+168.92%)

Mutual labels: speech-recognition, recurrent-neural-networks

Conversational-AI-Chatbot-using-Practical-Seq2Seq

A simple open domain generative based chatbot based on Recurrent Neural Networks

Stars: ✭ 17 (-88.51%)

Mutual labels: recurrent-neural-networks, lstm-neural-networks

Tensorflow Sentiment Analysis On Amazon Reviews Data

Implementing different RNN models (LSTM,GRU) & Convolution models (Conv1D, Conv2D) on a subset of Amazon Reviews data with TensorFlow on Python 3. A sentiment analysis project.

Stars: ✭ 34 (-77.03%)

Mutual labels: lstm-neural-networks, gru

Rnn lstm gesture recog

For recognising hand gestures using RNN and LSTM... Implementation in TensorFlow

Stars: ✭ 14 (-90.54%)

Mutual labels: recurrent-neural-networks, lstm-neural-networks

Tensorflow Lstm Sin

TensorFlow 1.3 experiment with LSTM (and GRU) RNNs for sine prediction

Stars: ✭ 52 (-64.86%)

Mutual labels: recurrent-neural-networks, gru

Bitcoin Price Prediction Using Lstm

Bitcoin price Prediction ( Time Series ) using LSTM Recurrent neural network

Stars: ✭ 67 (-54.73%)

Mutual labels: recurrent-neural-networks, lstm-neural-networks

ms-convSTAR

[RSE21] Pytorch code for hierarchical time series classification with multi-stage convolutional RNN

Stars: ✭ 17 (-88.51%)

Mutual labels: gru, lstm-neural-networks

dts

A Keras library for multi-step time-series forecasting.

Stars: ✭ 130 (-12.16%)

Mutual labels: recurrent-neural-networks, gru

Ctcdecoder

Connectionist Temporal Classification (CTC) decoding algorithms: best path, prefix search, beam search and token passing. Implemented in Python.

Stars: ✭ 529 (+257.43%)

Mutual labels: speech-recognition, recurrent-neural-networks

Deep-Learning

This repo provides projects on deep-learning mainly using Tensorflow 2.0

Stars: ✭ 22 (-85.14%)

Mutual labels: recurrent-neural-networks, lstm-neural-networks

Theano Kaldi Rnn

THEANO-KALDI-RNNs is a project implementing various Recurrent Neural Networks (RNNs) for RNN-HMM speech recognition. The Theano Code is coupled with the Kaldi decoder.

Stars: ✭ 31 (-79.05%)

Mutual labels: recurrent-neural-networks, gru

Gdax Orderbook Ml

Application of machine learning to the Coinbase (GDAX) orderbook

Stars: ✭ 60 (-59.46%)

Mutual labels: recurrent-neural-networks, gru

Gru Svm

[ICMLC 2018] A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection

Stars: ✭ 76 (-48.65%)

Mutual labels: recurrent-neural-networks, gru

Rnn Text Classification Tf

Tensorflow Implementation of Recurrent Neural Network (Vanilla, LSTM, GRU) for Text Classification

Stars: ✭ 114 (-22.97%)

Mutual labels: recurrent-neural-networks, gru

Sequence-to-Sequence-Learning-of-Financial-Time-Series-in-Algorithmic-Trading

My bachelor's thesis—analyzing the application of LSTM-based RNNs on financial markets. 🤓

Stars: ✭ 64 (-56.76%)

Mutual labels: recurrent-neural-networks, lstm-neural-networks

Project Overview

In this notebook, you will build a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline!

We begin by investigating the LibriSpeech dataset that will be used to train and evaluate your models. Your algorithm will first convert any raw audio to feature representations that are commonly used for ASR. You will then move on to building neural networks that can map these audio features to transcribed text. After learning about the basic types of layers that are often used for deep learning-based approaches to ASR, you will engage in your own investigations by creating and testing your own state-of-the-art models. Throughout the notebook, we provide recommended research papers for additional reading and links to GitHub repositories with interesting implementations.

Project Instructions

Getting Started

Clone the repository, and navigate to the downloaded folder.

git clone https://github.com/udacity/AIND-VUI-Capstone.git
cd AIND-VUI-Capstone

Create (and activate) a new environment with Python 3.5 and the numpy package.

Linux or Mac:

conda create --name aind-vui python=3.5 numpy
source activate aind-vui

Windows:

conda create --name aind-vui python=3.5 numpy scipy
activate aind-vui

Install TensorFlow.
- Option 1: To install TensorFlow with GPU support, follow the guide to install the necessary NVIDIA software on your system. If you are using the Udacity AMI, you can skip this step and only need to install the tensorflow-gpu package:
```
pip install tensorflow-gpu==1.1.0
```
- Option 2: To install TensorFlow with CPU support only:
```
pip install tensorflow==1.1.0
```
Install a few pip packages.

pip install -r requirements.txt

Switch Keras backend to TensorFlow.

Linux or Mac:

KERAS_BACKEND=tensorflow python -c "from keras import backend"

Windows:

set KERAS_BACKEND=tensorflow
python -c "from keras import backend"
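
Multi-backend Keras releases of this era can also read the backend from a JSON config file instead of the `KERAS_BACKEND` variable. The sketch below updates the `"backend"` key in such a file while preserving any other settings. The helper name `set_keras_backend` is hypothetical, and the file location (commonly `~/.keras/keras.json`) is an assumption you should confirm for your install.

```python
import json
from pathlib import Path


def set_keras_backend(config_path, backend="tensorflow"):
    """Write `backend` into a Keras JSON config file, keeping other keys.

    `config_path` is assumed to point at the Keras config file
    (commonly ~/.keras/keras.json); this helper is illustrative only.
    """
    path = Path(str(config_path))
    config = {}
    if path.exists():
        # Preserve existing settings such as "floatx" or "epsilon".
        config = json.loads(path.read_text())
    config["backend"] = backend
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4))
    return config
```

A call such as `set_keras_backend(Path.home() / ".keras" / "keras.json")` would make the choice persistent across shells, which the one-off environment variable does not.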

Obtain the libav package.
- Linux: sudo apt-get install libav-tools
- Mac: brew install libav
- Windows: Browse to the Libav website
  - Scroll down to "Windows Nightly and Release Builds" and click on the appropriate link for your system (32-bit or 64-bit).
  - Click nightly-gpl.
  - Download most recent archive file.
  - Extract the file. Move the usr directory to your C: drive.
  - Go back to your terminal window from above.
```
rename C:\usr avconv
set PATH=C:\avconv\bin;%PATH%
```
Obtain the appropriate subsets of the LibriSpeech dataset, and convert all flac files to wav format.
- Linux or Mac:
```
wget http://www.openslr.org/resources/12/dev-clean.tar.gz
tar -xzvf dev-clean.tar.gz
wget http://www.openslr.org/resources/12/test-clean.tar.gz
tar -xzvf test-clean.tar.gz
mv flac_to_wav.sh LibriSpeech
cd LibriSpeech
./flac_to_wav.sh
```
- Windows: Download the two archives (dev-clean.tar.gz and test-clean.tar.gz, at the URLs used in the Linux or Mac commands) via browser and save them in the AIND-VUI-Capstone directory. Extract them with an application that handles tar and gz files, such as 7-zip or WinZip. Then convert the files from your terminal window.
```
move flac_to_wav.sh LibriSpeech
cd LibriSpeech
powershell ./flac_to_wav.sh
```
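
If the shell script is awkward to run on your platform, the conversion step can be approximated from Python. This is a minimal sketch, not the contents of `flac_to_wav.sh`: the function name `build_conversion_commands` is hypothetical, and the mono, 16 kHz settings are assumptions. It only builds the `avconv` command lines, so you can inspect them before running anything.

```python
from pathlib import Path


def build_conversion_commands(root, converter="avconv", sample_rate=16000):
    """Return one converter command (as an argument list) per .flac file under `root`."""
    commands = []
    for flac_path in sorted(Path(str(root)).rglob("*.flac")):
        wav_path = flac_path.with_suffix(".wav")
        commands.append([
            converter, "-y",           # overwrite an existing output file
            "-i", str(flac_path),      # input file
            "-ac", "1",                # assumed: mix down to mono
            "-ar", str(sample_rate),   # assumed: resample to 16 kHz
            str(wav_path),             # output next to the source file
        ])
    return commands


# To actually convert (requires avconv on PATH):
#   import subprocess
#   for cmd in build_conversion_commands("LibriSpeech"):
#       subprocess.run(cmd, check=True)
```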
Create JSON files corresponding to the train and validation datasets.

cd ..
python create_desc_json.py LibriSpeech/dev-clean/ train_corpus.json
python create_desc_json.py LibriSpeech/test-clean/ valid_corpus.json
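
Before training, it can help to sanity-check the generated description files. The sketch below assumes each line of `train_corpus.json` is a standalone JSON object with `key`, `duration`, and `text` fields; that layout is an assumption about what `create_desc_json.py` writes, so inspect your file first. The helper names `load_corpus` and `total_hours` are hypothetical.

```python
import json


def load_corpus(desc_path, max_duration=10.0):
    """Read a JSON-lines corpus description, skipping clips longer than `max_duration` seconds."""
    entries = []
    with open(str(desc_path), encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)
            if float(entry["duration"]) <= max_duration:
                entries.append(entry)
    return entries


def total_hours(entries):
    """Sum the clip durations (in seconds) and convert to hours."""
    return sum(float(e["duration"]) for e in entries) / 3600.0
```

Calling `total_hours(load_corpus("train_corpus.json"))` gives a quick check that the conversion and description steps picked up the audio you expected.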

Create an IPython kernel for the aind-vui environment. Open the notebook.

python -m ipykernel install --user --name aind-vui --display-name "aind-vui"
jupyter notebook vui_notebook.ipynb

Before running code, change the kernel to match the aind-vui environment by using the drop-down menu. Then, follow the instructions in the notebook.

NOTE: While some code has already been implemented to get you started, you will need to implement additional functionality to successfully answer all of the questions included in the notebook. Unless requested, do not modify code that has already been included.

Amazon Web Services

If you do not have access to a local GPU, you could use Amazon Web Services to launch an EC2 GPU instance. Please refer to the Udacity instructions for setting up a GPU instance for this project.

Evaluation

Your project will be reviewed by a Udacity reviewer against the project rubric below. Review this rubric thoroughly, and self-evaluate your project before submission. Every criterion in the rubric must meet specifications for you to pass.

Project Submission

When you are ready to submit your project, collect the following files and compress them into a single archive for upload:

- The vui_notebook.ipynb file with fully functional code, all code cells executed and displaying output, and all questions answered.
- An HTML or PDF export of the project notebook with the name report.html or report.pdf.
- The sample_models.py file with all model architectures that were trained in the project Jupyter notebook.
- The results/ folder containing all HDF5 and pickle files corresponding to trained models.

Alternatively, your submission could consist of the GitHub link to your repository.
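
If you submit an archive rather than a link, a short script can check that the required files exist and bundle them. This is a sketch under assumptions: the function name `build_submission` and the archive name `submission.zip` are hypothetical, and the file names are taken from the list above.

```python
import zipfile
from pathlib import Path

REQUIRED = ["vui_notebook.ipynb", "sample_models.py"]
REPORTS = ["report.html", "report.pdf"]  # at least one must exist


def build_submission(project_dir, archive_name="submission.zip"):
    """Verify the required files exist, then zip them together with results/."""
    project = Path(str(project_dir))
    missing = [name for name in REQUIRED if not (project / name).is_file()]
    reports = [name for name in REPORTS if (project / name).is_file()]
    if not reports:
        missing.append("report.html or report.pdf")
    if not (project / "results").is_dir():
        missing.append("results/")
    if missing:
        raise FileNotFoundError("missing: " + ", ".join(missing))

    archive_path = project / archive_name
    with zipfile.ZipFile(str(archive_path), "w", zipfile.ZIP_DEFLATED) as archive:
        for name in REQUIRED + reports:
            archive.write(str(project / name), arcname=name)
        # Include every file under results/ (HDF5 weights and pickled histories).
        for item in sorted((project / "results").rglob("*")):
            if item.is_file():
                archive.write(str(item), arcname=item.relative_to(project).as_posix())
    return archive_path
```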

Project Rubric

Files Submitted

| Criteria | Meets Specifications |
| --- | --- |
| Submission Files | The submission includes all required files. |

STEP 2: Model 0: RNN

| Criteria | Meets Specifications |
| --- | --- |
| Trained Model 0 | The submission trained the model for at least 20 epochs, and none of the loss values in `model_0.pickle` are undefined. The trained weights for the model specified in `simple_rnn_model` are stored in `model_0.h5`. |

STEP 2: Model 1: RNN + TimeDistributed Dense

| Criteria | Meets Specifications |
| --- | --- |
| Completed `rnn_model` Module | The submission includes a `sample_models.py` file with a completed `rnn_model` module containing the correct architecture. |
| Trained Model 1 | The submission trained the model for at least 20 epochs, and none of the loss values in `model_1.pickle` are undefined. The trained weights for the model specified in `rnn_model` are stored in `model_1.h5`. |

STEP 2: Model 2: CNN + RNN + TimeDistributed Dense

| Criteria | Meets Specifications |
| --- | --- |
| Completed `cnn_rnn_model` Module | The submission includes a `sample_models.py` file with a completed `cnn_rnn_model` module containing the correct architecture. |
| Trained Model 2 | The submission trained the model for at least 20 epochs, and none of the loss values in `model_2.pickle` are undefined. The trained weights for the model specified in `cnn_rnn_model` are stored in `model_2.h5`. |

STEP 2: Model 3: Deeper RNN + TimeDistributed Dense

| Criteria | Meets Specifications |
| --- | --- |
| Completed `deep_rnn_model` Module | The submission includes a `sample_models.py` file with a completed `deep_rnn_model` module containing the correct architecture. |
| Trained Model 3 | The submission trained the model for at least 20 epochs, and none of the loss values in `model_3.pickle` are undefined. The trained weights for the model specified in `deep_rnn_model` are stored in `model_3.h5`. |

STEP 2: Model 4: Bidirectional RNN + TimeDistributed Dense

| Criteria | Meets Specifications |
| --- | --- |
| Completed `bidirectional_rnn_model` Module | The submission includes a `sample_models.py` file with a completed `bidirectional_rnn_model` module containing the correct architecture. |
| Trained Model 4 | The submission trained the model for at least 20 epochs, and none of the loss values in `model_4.pickle` are undefined. The trained weights for the model specified in `bidirectional_rnn_model` are stored in `model_4.h5`. |

STEP 2: Compare the Models

| Criteria | Meets Specifications |
| --- | --- |
| Question 1 | The submission includes a detailed analysis of why different models might perform better than others. |

STEP 2: Final Model

| Criteria | Meets Specifications |
| --- | --- |
| Completed `final_model` Module | The submission includes a `sample_models.py` file with a completed `final_model` module containing a final architecture that is not identical to any of the previous architectures. |
| Trained Final Model | The submission trained the model for at least 20 epochs, and none of the loss values in `model_end.pickle` are undefined. The trained weights for the model specified in `final_model` are stored in `model_end.h5`. |
| Question 2 | The submission includes a detailed description of how the final model architecture was designed. |
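
The two recurring rubric conditions (at least 20 epochs, no undefined loss values) can be self-checked before submission. The sketch below assumes each `.pickle` file holds a dictionary mapping metric names such as `loss` and `val_loss` to one value per epoch, as in a Keras `History.history` object; that layout is an assumption, and the function name `check_training_history` is hypothetical.

```python
import math
import pickle


def check_training_history(pickle_path, min_epochs=20):
    """Return a list of rubric problems found in a pickled training history."""
    with open(str(pickle_path), "rb") as handle:
        history = pickle.load(handle)  # assumed: {"loss": [...], "val_loss": [...]}

    problems = []
    for name, values in sorted(history.items()):
        if len(values) < min_epochs:
            problems.append("{}: only {} epochs".format(name, len(values)))
        # "Undefined" is read here as None, NaN, or infinite.
        if any(v is None or not math.isfinite(v) for v in values):
            problems.append("{}: contains undefined values".format(name))
    return problems
```

An empty list from `check_training_history("results/model_0.pickle")` would suggest the history satisfies both conditions; it says nothing about whether the architecture itself is correct.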

Suggestions to Make your Project Stand Out!

(1) Add a Language Model to the Decoder

The performance of the decoding step can be greatly enhanced by incorporating a language model. Build your own language model from scratch, or leverage a repository or toolkit that you find online to improve your predictions.
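
One simple way to incorporate a language model is to rescore candidate transcripts: combine each candidate's acoustic log-score with a weighted language-model log-probability and keep the best. The sketch below uses a toy word-bigram model with add-one smoothing. The function names, the `lm_weight` parameter, and the training sentences are all illustrative assumptions; producing the candidate list from the network output is outside its scope.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Return a function giving an add-one-smoothed bigram log-probability of a sentence."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(words[:-1])                 # context counts
        bigrams.update(zip(words[:-1], words[1:]))  # (previous, current) counts
    vocab_size = len(set(unigrams) | {"</s>"})

    def log_prob(sentence):
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        total = 0.0
        for prev, word in zip(words[:-1], words[1:]):
            total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
        return total

    return log_prob


def rescore(candidates, lm_log_prob, lm_weight=0.5):
    """Pick the transcript maximizing acoustic score + lm_weight * LM score.

    `candidates` is a list of (transcript, acoustic_log_score) pairs.
    """
    best = max(candidates, key=lambda c: c[1] + lm_weight * lm_log_prob(c[0]))
    return best[0]
```

With `lm_weight=0` the acoustic score alone decides; raising the weight lets the language model override acoustically plausible but unlikely word sequences.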

(2) Train on Bigger Data

In the project, you used some of the smaller downloads from the LibriSpeech corpus. Try training your model on some larger datasets - instead of using dev-clean.tar.gz, download one of the larger training sets on the website.

(3) Try out Different Audio Features

In this project, you had the choice to use either spectrogram or MFCC features. Take the time to test the performance of both of these features. For a special challenge, train a network that uses raw audio waveforms!
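
To make the spectrogram option concrete, here is a minimal NumPy sketch of a log power spectrogram computed from a raw waveform. It is not the project's own feature code: the function name `log_spectrogram`, the 20 ms window, the 10 ms step, and the Hann taper are assumptions chosen for illustration.

```python
import numpy as np


def log_spectrogram(samples, sample_rate, window_ms=20, step_ms=10, eps=1e-10):
    """Return (log power spectrogram [frames x bins], bin frequencies in Hz)."""
    samples = np.asarray(samples, dtype=np.float64)
    window = int(sample_rate * window_ms / 1000)
    step = int(sample_rate * step_ms / 1000)
    if len(samples) < window:
        raise ValueError("signal is shorter than one analysis window")

    n_frames = 1 + (len(samples) - window) // step
    # Slice the signal into overlapping frames and taper each with a Hann window.
    frames = np.stack([samples[i * step:i * step + window] for i in range(n_frames)])
    frames = frames * np.hanning(window)

    # Power of the one-sided FFT of each frame; eps avoids log(0).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(window, d=1.0 / sample_rate)
    return np.log(power + eps), freqs
```

Each row of the result is one time step of input features for the network; MFCCs apply further mel filtering and a cosine transform on top of a representation like this one.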

Special Thanks

We have borrowed the create_desc_json.py and flac_to_wav.sh files from the ba-dls-deepspeech repository, along with some functions used to generate spectrograms.

Stars: ✭ 148
