udacity / Aind Vui Capstone

License: MIT
AIND Term 2 -- VUI Capstone Project

Project Overview

In this notebook, you will build a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline!

ASR Pipeline

We begin by investigating the LibriSpeech dataset that will be used to train and evaluate your models. Your algorithm will first convert any raw audio to feature representations that are commonly used for ASR. You will then move on to building neural networks that can map these audio features to transcribed text. After learning about the basic types of layers that are often used for deep learning-based approaches to ASR, you will engage in your own investigations by creating and testing your own state-of-the-art models. Throughout the notebook, we provide recommended research papers for additional reading and links to GitHub repositories with interesting implementations.

Project Instructions

Amazon Web Services

This project requires GPU acceleration to run efficiently. Please refer to the Udacity instructions for setting up a GPU instance for this project, and refer to the project instructions in the classroom for setup.

  1. Follow the Cloud Computing Setup instructions lesson to create an EC2 instance. (The lesson includes all the required package and library installation instructions.)

  2. Obtain the appropriate subsets of the LibriSpeech dataset, and convert all flac files to wav format.

wget http://www.openslr.org/resources/12/dev-clean.tar.gz
tar -xzvf dev-clean.tar.gz
wget http://www.openslr.org/resources/12/test-clean.tar.gz
tar -xzvf test-clean.tar.gz
mv flac_to_wav.sh LibriSpeech
cd LibriSpeech
./flac_to_wav.sh
  3. Create JSON files corresponding to the train and validation datasets.
cd ..
python create_desc_json.py LibriSpeech/dev-clean/ train_corpus.json
python create_desc_json.py LibriSpeech/test-clean/ valid_corpus.json
  4. Start Jupyter:
jupyter notebook --ip=0.0.0.0 --no-browser
  5. Look at the output in the window, and find the line that looks like: http://0.0.0.0:8888/?token=3156e... Copy and paste the complete URL into the address bar of a web browser (Firefox, Safari, Chrome, etc). Before navigating to the URL, replace 0.0.0.0 in the URL with the "IPv4 Public IP" address from the EC2 Dashboard.
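
The host substitution in the last step can also be scripted. The following is a minimal sketch, not part of the project code: the function name public_notebook_url is hypothetical, and the token and IP address in the usage lines are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def public_notebook_url(printed_url, public_ip):
    """Swap the 0.0.0.0 host in the URL printed by Jupyter for the public IP."""
    parts = urlsplit(printed_url)
    if parts.hostname != "0.0.0.0":
        raise ValueError("expected a URL whose host is 0.0.0.0")
    # Keep the port (8888 in the example above) and the ?token=... query untouched.
    if parts.port is None:
        netloc = public_ip
    else:
        netloc = "{}:{}".format(public_ip, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Placeholder values: a made-up token and a documentation-only IP address.
    print(public_notebook_url("http://0.0.0.0:8888/?token=abc123", "203.0.113.10"))
```

Paste the printed URL into your browser instead of editing the address by hand.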

Local Environment Setup

You should run this project with GPU acceleration for best performance.

  1. Clone the repository, and navigate to the downloaded folder.
git clone https://github.com/udacity/AIND-VUI-Capstone.git
cd AIND-VUI-Capstone
  2. Create (and activate) a new environment with Python 3.5 and the numpy package.

    • Linux or Mac:
    conda create --name aind-vui python=3.5 numpy
    source activate aind-vui
    
    • Windows:
    conda create --name aind-vui python=3.5 numpy scipy
    activate aind-vui
    
  3. Install TensorFlow.

    • Option 1: To install TensorFlow with GPU support, follow the guide to install the necessary NVIDIA software on your system. If you are using an EC2 GPU instance, you can skip this step and only need to install the tensorflow-gpu package:
    pip install tensorflow-gpu==1.1.0
    
    • Option 2: To install TensorFlow with CPU support only:
    pip install tensorflow==1.1.0
    
  4. Install a few pip packages.

pip install -r requirements.txt
  5. Switch Keras backend to TensorFlow.

    • Linux or Mac:
    KERAS_BACKEND=tensorflow python -c "from keras import backend"
    
    • Windows:
    set KERAS_BACKEND=tensorflow
    python -c "from keras import backend"
    
    • NOTE: a Keras/Windows bug may give this error after the first epoch of training model 0: 'rawunicodeescape' codec can't decode bytes in position 54-55: truncated \uXXXX. To fix it:
      • Find the file keras/utils/generic_utils.py that you are using for the capstone project. It should be in your environment under Lib/site-packages. This may vary, but if using miniconda, for example, it might be located at C:/Users/username/Miniconda3/envs/aind-vui/Lib/site-packages/keras/utils.
      • Copy generic_utils.py to OLDgeneric_utils.py just in case you need to restore it.
      • Open the generic_utils.py file and change this code line:
        marshal.dumps(func.__code__).decode('raw_unicode_escape')
        to this code line:
        marshal.dumps(func.__code__).replace(b'\\',b'/').decode('raw_unicode_escape')
  6. Obtain the libav package.

    • Linux: sudo apt-get install libav-tools
    • Mac: brew install libav
    • Windows: Browse to the Libav website
      • Scroll down to "Windows Nightly and Release Builds" and click on the appropriate link for your system (32-bit or 64-bit).
      • Click nightly-gpl.
      • Download the most recent archive file.
      • Extract the file. Move the usr directory to your C: drive.
      • Go back to your terminal window from above.
    rename C:\usr avconv
    set PATH=C:\avconv\bin;%PATH%
    
  7. Obtain the appropriate subsets of the LibriSpeech dataset, and convert all flac files to wav format.

    • Linux or Mac:
    wget http://www.openslr.org/resources/12/dev-clean.tar.gz
    tar -xzvf dev-clean.tar.gz
    wget http://www.openslr.org/resources/12/test-clean.tar.gz
    tar -xzvf test-clean.tar.gz
    mv flac_to_wav.sh LibriSpeech
    cd LibriSpeech
    ./flac_to_wav.sh
    
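    • Windows: Download the two archives used in the Linux or Mac commands above (http://www.openslr.org/resources/12/dev-clean.tar.gz and http://www.openslr.org/resources/12/test-clean.tar.gz) via browser and save them in the AIND-VUI-Capstone directory. Extract them with an application that is compatible with tar and gz, such as 7-zip or WinZip. Convert the files from your terminal window.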
    move flac_to_wav.sh LibriSpeech
    cd LibriSpeech
    powershell ./flac_to_wav.sh
    
  8. Create JSON files corresponding to the train and validation datasets.

cd ..
python create_desc_json.py LibriSpeech/dev-clean/ train_corpus.json
python create_desc_json.py LibriSpeech/test-clean/ valid_corpus.json
  9. Create an IPython kernel for the aind-vui environment. Open the notebook.
python -m ipykernel install --user --name aind-vui --display-name "aind-vui"
jupyter notebook vui_notebook.ipynb
  10. Before running code, change the kernel to match the aind-vui environment by using the drop-down menu. Then, follow the instructions in the notebook.

select aind-vui kernel
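
Before opening the notebook, it can help to sanity-check the corpus files produced by create_desc_json.py. The sketch below assumes each line of train_corpus.json and valid_corpus.json is a standalone JSON object with "key" (audio path), "duration" (seconds) and "text" (transcript) fields. That layout is an assumption about the script's output, so adjust the field names if your files differ. The helper names load_corpus and summarize are hypothetical.

```python
import json


def load_corpus(path):
    """Read one JSON object per line; return lists of audio paths, durations, texts."""
    keys, durations, texts = [], [], []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                entry = json.loads(line)
                # Assumed field names: "key", "duration", "text".
                keys.append(entry["key"])
                durations.append(float(entry["duration"]))
                texts.append(entry["text"])
            except (ValueError, KeyError) as err:
                raise ValueError("bad entry on line {}: {}".format(line_number, err))
    return keys, durations, texts


def summarize(durations):
    """Return (utterance count, total hours, longest utterance in seconds)."""
    if not durations:
        return 0, 0.0, 0.0
    return len(durations), sum(durations) / 3600.0, max(durations)


if __name__ == "__main__":
    _, train_durations, _ = load_corpus("train_corpus.json")
    print("utterances: %d, hours: %.2f, longest: %.1fs" % summarize(train_durations))
```

An empty or unexpectedly short corpus usually means the flac to wav conversion or the JSON step did not finish.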

NOTE: While some code has already been implemented to get you started, you will need to implement additional functionality to successfully answer all of the questions included in the notebook. Unless requested, do not modify code that has already been included.

Evaluation

Your project will be reviewed by a Udacity reviewer against the project rubric. Review this rubric thoroughly, and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Project Submission

When you are ready to submit your project, collect the following files and compress them into a single archive for upload:

  • The vui_notebook.ipynb file with fully functional code, all code cells executed and displaying output, and all questions answered.
  • An HTML or PDF export of the project notebook with the name report.html or report.pdf.
  • The sample_models.py file with all model architectures that were trained in the project Jupyter notebook.
  • The results/ folder containing all HDF5 and pickle files corresponding to trained models.

Alternatively, your submission could consist of the GitHub link to your repository.
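
If you go the archive route, the files listed above can be collected with a short script. This is a minimal sketch rather than an official tool: the function name build_archive and the output name submission.zip are hypothetical, and it assumes you run it against the project directory.

```python
import os
import zipfile

# File names taken from the submission list; report.html may be report.pdf instead.
REQUIRED = ["vui_notebook.ipynb", "report.html", "sample_models.py"]
RESULTS_DIR = "results"


def build_archive(project_dir, archive_path, report_name="report.html"):
    """Zip the required files plus everything under results/; return member names."""
    required = [report_name if name == "report.html" else name for name in REQUIRED]
    missing = [name for name in required
               if not os.path.isfile(os.path.join(project_dir, name))]
    if not os.path.isdir(os.path.join(project_dir, RESULTS_DIR)):
        missing.append(RESULTS_DIR + "/")
    if missing:
        raise FileNotFoundError("missing from submission: " + ", ".join(missing))

    members = list(required)
    for root, _dirs, files in os.walk(os.path.join(project_dir, RESULTS_DIR)):
        for name in sorted(files):
            full = os.path.join(root, name)
            # Store forward-slash paths so the archive looks the same on Windows.
            members.append(os.path.relpath(full, project_dir).replace(os.sep, "/"))

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for member in members:
            archive.write(os.path.join(project_dir, member), arcname=member)
    return members


if __name__ == "__main__":
    # Write the archive outside results/ so it is not picked up by the walk.
    print("\n".join(build_archive(".", "submission.zip")))
```

Pass report_name="report.pdf" if you exported the notebook as a PDF.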

Project Rubric

Files Submitted

Criteria | Meets Specifications
Submission Files | The submission includes all required files.

STEP 2: Model 0: RNN

Criteria | Meets Specifications
Trained Model 0 | The submission trained the model for at least 20 epochs, and none of the loss values in model_0.pickle are undefined. The trained weights for the model specified in simple_rnn_model are stored in model_0.h5.

STEP 2: Model 1: RNN + TimeDistributed Dense

Criteria | Meets Specifications
Completed rnn_model Module | The submission includes a sample_models.py file with a completed rnn_model module containing the correct architecture.
Trained Model 1 | The submission trained the model for at least 20 epochs, and none of the loss values in model_1.pickle are undefined. The trained weights for the model specified in rnn_model are stored in model_1.h5.

STEP 2: Model 2: CNN + RNN + TimeDistributed Dense

Criteria | Meets Specifications
Completed cnn_rnn_model Module | The submission includes a sample_models.py file with a completed cnn_rnn_model module containing the correct architecture.
Trained Model 2 | The submission trained the model for at least 20 epochs, and none of the loss values in model_2.pickle are undefined. The trained weights for the model specified in cnn_rnn_model are stored in model_2.h5.

STEP 2: Model 3: Deeper RNN + TimeDistributed Dense

Criteria | Meets Specifications
Completed deep_rnn_model Module | The submission includes a sample_models.py file with a completed deep_rnn_model module containing the correct architecture.
Trained Model 3 | The submission trained the model for at least 20 epochs, and none of the loss values in model_3.pickle are undefined. The trained weights for the model specified in deep_rnn_model are stored in model_3.h5.

STEP 2: Model 4: Bidirectional RNN + TimeDistributed Dense

Criteria | Meets Specifications
Completed bidirectional_rnn_model Module | The submission includes a sample_models.py file with a completed bidirectional_rnn_model module containing the correct architecture.
Trained Model 4 | The submission trained the model for at least 20 epochs, and none of the loss values in model_4.pickle are undefined. The trained weights for the model specified in bidirectional_rnn_model are stored in model_4.h5.

STEP 2: Compare the Models

Criteria | Meets Specifications
Question 1 | The submission includes a detailed analysis of why different models might perform better than others.

STEP 2: Final Model

Criteria | Meets Specifications
Completed final_model Module | The submission includes a sample_models.py file with a completed final_model module containing a final architecture that is not identical to any of the previous architectures.
Trained Final Model | The submission trained the model for at least 20 epochs, and none of the loss values in model_end.pickle are undefined. The trained weights for the model specified in final_model are stored in model_end.h5.
Question 2 | The submission includes a detailed description of how the final model architecture was designed.
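
The "at least 20 epochs" and "no undefined loss values" criteria can be self-checked before you submit. The sketch below assumes each pickle file holds a dictionary with "loss" and "val_loss" lists, one value per epoch. That layout is an assumption, so inspect one of your files first. It also treats NaN and infinite values as "undefined", and the helper names are hypothetical.

```python
import math
import pickle

MIN_EPOCHS = 20  # from the rubric


def check_history(history, min_epochs=MIN_EPOCHS):
    """Return a list of problems found in one training history (empty if none)."""
    problems = []
    for name in ("loss", "val_loss"):  # assumed keys
        values = history.get(name)
        if values is None:
            problems.append("no '{}' entry".format(name))
            continue
        if len(values) < min_epochs:
            problems.append("'{}' has {} epochs, need {}".format(
                name, len(values), min_epochs))
        if any(math.isnan(v) or math.isinf(v) for v in values):
            problems.append("'{}' contains undefined values".format(name))
    return problems


def check_pickle(path, min_epochs=MIN_EPOCHS):
    """Load a pickled history dictionary from disk and check it."""
    with open(path, "rb") as handle:
        return check_history(pickle.load(handle), min_epochs)


if __name__ == "__main__":
    for stem in ["model_0", "model_1", "model_2", "model_3", "model_4", "model_end"]:
        path = "results/{}.pickle".format(stem)
        print(path, check_pickle(path) or "OK")
```

An empty problem list for every file is necessary for these criteria but does not replace a review of the full rubric.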

Suggestions to Make your Project Stand Out!

(1) Add a Language Model to the Decoder

The performance of the decoding step can be greatly enhanced by incorporating a language model. Build your own language model from scratch, or leverage a repository or toolkit that you find online to improve your predictions.
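
As a starting point, one simple way to use a language model is to rescore a handful of candidate transcriptions. The sketch below is a toy illustration and not the project's decoder: it trains a word-level bigram model with add-one smoothing on a few sentences and combines its log-probability with a per-candidate acoustic score. The class BigramLM, the function rescore, the lm_weight parameter and the candidate scores are all hypothetical.

```python
import math
from collections import Counter


class BigramLM:
    """Word-level bigram model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.unigrams = Counter()  # counts of words used as a context
        self.bigrams = Counter()   # counts of (previous word, word) pairs
        for sentence in sentences:
            words = ["<s>"] + sentence.split() + ["</s>"]
            self.unigrams.update(words[:-1])
            self.bigrams.update(zip(words[:-1], words[1:]))
        self.vocab_size = len(set(self.unigrams) | {"</s>"})

    def log_prob(self, sentence):
        """Sum of smoothed log P(word | previous word) over the sentence."""
        words = ["<s>"] + sentence.split() + ["</s>"]
        total = 0.0
        for prev, word in zip(words[:-1], words[1:]):
            numerator = self.bigrams[(prev, word)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def rescore(candidates, lm, lm_weight=0.5):
    """Return the text maximizing acoustic log-score + lm_weight * LM log-prob.

    `candidates` is a list of (text, acoustic_log_score) pairs.
    """
    return max(candidates, key=lambda c: c[1] + lm_weight * lm.log_prob(c[0]))[0]


if __name__ == "__main__":
    lm = BigramLM(["the cat sat", "the cat ran", "a dog sat"])
    # Made-up acoustic scores: the misspelled candidate is slightly preferred.
    print(rescore([("the cat sad", -1.0), ("the cat sat", -1.1)], lm, lm_weight=1.0))
```

A real system would train on a much larger text corpus and would typically apply the language model inside a beam search rather than only to final candidates.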

(2) Train on Bigger Data

In the project, you used some of the smaller downloads from the LibriSpeech corpus. Try training your model on some larger datasets - instead of using dev-clean.tar.gz, download one of the larger training sets on the website.

(3) Try out Different Audio Features

In this project, you had the choice to use either spectrogram or MFCC features. Take the time to test the performance of both of these features. For a special challenge, train a network that uses raw audio waveforms!
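
To make the spectrogram option concrete, here is a minimal sketch of a log power spectrogram: split the waveform into overlapping frames, apply a Hann window, and take the squared magnitude of each frame's discrete Fourier transform. It is written with only the standard library so it runs anywhere, which means it uses a slow direct DFT. It is not the notebook's feature code, and the function names and default frame sizes are hypothetical. For real audio you would use an FFT routine such as the one in numpy.

```python
import cmath
import math


def frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames (drops the ragged tail)."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Hann-window one frame; return |DFT|^2 for the non-negative frequencies.

    The frame must contain at least two samples.
    """
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def log_spectrogram(signal, frame_len=64, hop=32, eps=1e-10):
    """Return one list of log power values (one per frequency bin) per frame."""
    return [[math.log(p + eps) for p in power_spectrum(f)]
            for f in frames(signal, frame_len, hop)]


if __name__ == "__main__":
    # A pure tone that completes 8 cycles per 64-sample frame.
    tone = [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
    spec = log_spectrogram(tone)
    print(len(spec), "frames x", len(spec[0]), "frequency bins")
```

MFCC features add further steps on top of a spectrum like this one (a mel filterbank and a cosine transform), which is one reason the two feature types can behave differently in training.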

Special Thanks

We have borrowed the create_desc_json.py and flac_to_wav.sh files from the ba-dls-deepspeech repository, along with some functions used to generate spectrograms.
