
NervanaSystems / Deepspeech

Licence: apache-2.0
DeepSpeech neon implementation

Projects that are alternatives of or similar to Deepspeech

Pytorch Spynet
a reimplementation of Optical Flow Estimation using a Spatial Pyramid Network in PyTorch
Stars: ✭ 190 (-14.8%)
Mutual labels:  cuda
Cunn
Stars: ✭ 205 (-8.07%)
Mutual labels:  cuda
Tigre
TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox
Stars: ✭ 215 (-3.59%)
Mutual labels:  cuda
Timemory
Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework to creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.
Stars: ✭ 192 (-13.9%)
Mutual labels:  cuda
Pine
🌲 Aimbot powered by real-time object detection with neural networks, GPU accelerated with Nvidia. Optimized for use with CS:GO.
Stars: ✭ 202 (-9.42%)
Mutual labels:  cuda
Hip
HIP: C++ Heterogeneous-Compute Interface for Portability
Stars: ✭ 2,609 (+1069.96%)
Mutual labels:  cuda
Nvidia Docker
Build and run Docker containers leveraging NVIDIA GPUs
Stars: ✭ 13,961 (+6160.54%)
Mutual labels:  cuda
Softmax Splatting
an implementation of softmax splatting for differentiable forward warping using PyTorch
Stars: ✭ 218 (-2.24%)
Mutual labels:  cuda
Oneflow
OneFlow is a performance-centered and open-source deep learning framework.
Stars: ✭ 2,868 (+1186.1%)
Mutual labels:  cuda
Genomeworks
SDK for GPU accelerated genome assembly and analysis
Stars: ✭ 215 (-3.59%)
Mutual labels:  cuda
Viseron
Self-hosted NVR with object detection
Stars: ✭ 192 (-13.9%)
Mutual labels:  cuda
Simplegpuhashtable
A simple GPU hash table implemented in CUDA using lock free techniques
Stars: ✭ 198 (-11.21%)
Mutual labels:  cuda
Bohrium
Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
Stars: ✭ 209 (-6.28%)
Mutual labels:  cuda
Ck Caffe
Collective Knowledge workflow for Caffe to automate installation across diverse platforms and to collaboratively evaluate and optimize Caffe-based workloads across diverse hardware, software and data sets (compilers, libraries, tools, models, inputs):
Stars: ✭ 192 (-13.9%)
Mutual labels:  cuda
Nicehashquickminer
Super simple & easy Windows 10 cryptocurrency miner made by NiceHash.
Stars: ✭ 211 (-5.38%)
Mutual labels:  cuda
Macos Egpu Cuda Guide
Set up CUDA for machine learning (and gaming) on macOS using a NVIDIA eGPU
Stars: ✭ 187 (-16.14%)
Mutual labels:  cuda
Amgx
Distributed multigrid linear solver library on GPU
Stars: ✭ 207 (-7.17%)
Mutual labels:  cuda
Pedestrian alignment
TCSVT2018 Pedestrian Alignment Network for Large-scale Person Re-identification
Stars: ✭ 223 (+0%)
Mutual labels:  cuda
Relion
Image-processing software for cryo-electron microscopy
Stars: ✭ 219 (-1.79%)
Mutual labels:  cuda
Haste
Haste: a fast, simple, and open RNN library
Stars: ✭ 214 (-4.04%)
Mutual labels:  cuda

Implementation of Deep Speech 2 in neon

This repository contains an implementation of Baidu SVAIL's Deep Speech 2 model in neon. Much of the model is readily available in mainline neon; to also support the CTC cost function, we have included a neon-compatible wrapper for Baidu's Warp-CTC.

Deep Speech 2 models are computationally intensive and can require long training times. Even with near-perfect GPU utilization, training on a dataset large enough to yield respectable performance can take up to a week. Please keep this in mind when exploring this repo.

We have used this code to train models on both the Wall Street Journal (81 hours) and Librispeech (1000 hours) datasets. The WSJ dataset is available only through the LDC; Librispeech, however, can be acquired freely from the LibriSpeech corpus (http://www.openslr.org/12).

The model presented here uses a basic argmax-based decoder:

  • Choose the most probable character in each frame
  • Collapse the resulting string according to CTC's rules: first collapse repeated characters, then remove blank characters (see the sketch below).
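
For concreteness, here is a minimal Python sketch of such a greedy decoder. It is a standalone illustration only; the index_to_char mapping and blank_index argument are assumptions for the example, not the repo's actual alphabet handling.

import numpy as np

def greedy_ctc_decode(probs, index_to_char, blank_index=0):
    """Greedy (argmax) CTC decoding of a (num_frames, num_symbols) matrix
    of per-frame symbol probabilities."""
    best_path = np.argmax(probs, axis=1)   # most probable symbol in each frame
    # Collapse consecutive repeats first...
    collapsed = [best_path[0]]
    for sym in best_path[1:]:
        if sym != collapsed[-1]:
            collapsed.append(sym)
    # ...then drop the blank symbol.
    return "".join(index_to_char[s] for s in collapsed if s != blank_index)

# Hypothetical usage: index 0 is the CTC blank, the rest are characters.
# index_to_char = {0: "", 1: " ", 2: "a", 3: "b", ...}
# transcript = greedy_ctc_decode(frame_probs, index_to_char, blank_index=0)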

After decoding, you might expect outputs like this when trained on WSJ data:

Ground truth:  united presidential is a life insurance company
Model output:  younited presidentiol is a lefe in surance company

Ground truth:  that was certainly true last week
Model output:  that was sertainly true last week

Ground truth:  we're not ready to say we're in technical default a spokesman said
Model output:  we're now ready to say we're intechnical default a spokesman said

Or outputs like this when trained on Librispeech (see "Decoding and evaluating a trained model"):

Ground truth:  this had some effect in calming him
Model output:  this had some offectind calming him

Ground truth:  he went in and examined his letters but there was nothing from carrie
Model output:  he went in an examined his letters but there was nothing from carry

Ground truth:  the design was different but the thing was clearly the same
Model output:  the design was differampat that thing was clarly the same

Getting Started

  1. neon 2.3.0 and the aeon dataloader (v1.0.0) must both be installed.

  2. Clone the repo: git clone https://github.com/NervanaSystems/deepspeech.git && cd deepspeech.

  3. Within a neon virtualenv, run pip install -r requirements.txt.

  4. Run make to build warp-ctc.

Training a model

1. Prepare a manifest file for your dataset.

How to do this depends on the specifics of the dataset.

Example: Librispeech recipe

A recipe for ingesting Librispeech data is provided in data/ingest_librispeech.py. Note that Librispeech provides distinct datasets for training and validation, and each set must be ingested separately. We also have to work around the quirky way the Librispeech data is distributed: after unpacking the archives, we need to arrange their contents in a consistent directory layout (a rough sketch of what the ingest produces appears at the end of this step).

To be more precise, Librispeech data is distributed in zipped tar files, e.g. train-clean-100.tar.gz for training and dev-clean.tar.gz for validation. Upon unpacking, each archive creates a directory named LibriSpeech, so trying to unpack both files together in the same directory is a bad idea. To get around this, try something like:

$ mkdir librispeech && cd librispeech
$ wget http://www.openslr.org/resources/12/train-clean-100.tar.gz
$ wget http://www.openslr.org/resources/12/dev-clean.tar.gz
$ tar xvzf dev-clean.tar.gz LibriSpeech/dev-clean  --strip-components=1
$ tar xvzf train-clean-100.tar.gz LibriSpeech/train-clean-100  --strip-components=1

Follow the above prescription and you will have the training data in the subdirectory librispeech/train-clean-100 and the validation data in the subdirectory librispeech/dev-clean. To ingest the data, run the python script on the directory where you've unpacked the clean training data, followed by the paths where you want the script to write the transcripts and the training manifest for that dataset:

$ python data/ingest_librispeech.py <absolute path to train-clean-100 directory> <absolute path to directory to write transcripts to> <absolute path to where to write training manifest to>

For example, if the train-clean-100 directory is located at /usr/local/data/librispeech/train-clean-100, run:

$ python data/ingest_librispeech.py  /usr/local/data/librispeech/train-clean-100  /usr/local/data/librispeech/train-clean-100/transcripts_dir  /usr/local/data/librispeech/train-clean-100/train-manifest.csv

which would create a training manifest file named train-manifest.csv. Similarly, if the dev-clean directory is located at /usr/local/data/librispeech/dev-clean, run:

$ python data/ingest_librispeech.py  /usr/local/data/librispeech/dev-clean  /usr/local/data/librispeech/dev-clean/transcripts_dir  /usr/local/data/librispeech/train-clean-100/val-manifest.csv

To train on the full 1000 hours, execute the same commands for the 360 hour and 500 hour training datasets as well. The manifest files can then be concatenated with a simple:

$ cat /path/to/100_hour_manifest.csv /path/to/360_hour_manifest.csv /path/to/500_hour_manifest.csv > /path/to/1000_hour_manifest.csv
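
For orientation, here is a hedged Python sketch of roughly what the ingest does: split each LibriSpeech *.trans.txt file into per-utterance transcript files and record audio/transcript path pairs in a manifest CSV. The directory layout and the two-column manifest format are assumptions made for this illustration; data/ingest_librispeech.py is the authoritative reference.

import csv
import glob
import os

def ingest_librispeech_sketch(data_dir, transcript_dir, manifest_path):
    """Rough sketch only; see data/ingest_librispeech.py for the real logic."""
    os.makedirs(transcript_dir, exist_ok=True)
    rows = []
    # Each chapter directory ships one *.trans.txt file containing
    # "<utterance-id> <TRANSCRIPT>" lines, one per utterance.
    for trans_file in glob.glob(os.path.join(data_dir, "*", "*", "*.trans.txt")):
        chapter_dir = os.path.dirname(trans_file)
        with open(trans_file) as f:
            for line in f:
                utt_id, text = line.strip().split(" ", 1)
                audio_path = os.path.join(chapter_dir, utt_id + ".flac")
                txt_path = os.path.join(transcript_dir, utt_id + ".txt")
                with open(txt_path, "w") as out:
                    out.write(text.lower() + "\n")
                rows.append((audio_path, txt_path))
    # Assumed manifest layout: one "audio path, transcript path" row per utterance.
    with open(manifest_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)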

2a. Train a new model

$ python train.py --manifest train:<training manifest> --manifest val:<validation manifest> -e <num_epochs> -z <batch_size> -s </path/to/model_output.pkl> [-b <backend>] 

where <training manifest> is the path to the training manifest file produced by the ingest (for the example above, /usr/local/data/librispeech/train-clean-100/train-manifest.csv) and <validation manifest> is the path to the validation manifest file.

2b. Continue training after pause on a previous model

For a previously-trained model that wasn't trained for the full time needed, it's possible to resume training by passing the --model_file </path/to/pre-trained_model> argument to train.py. For example, you could continue training a pre-trained model from our Model Zoo. That particular model was trained for 16 epochs on 1000 hours of speech data from the Librispeech corpus, attaining a Character Error Rate (CER) of 14% without using a language model. You could continue training it for, say, an additional 4 epochs by calling:

$ python train.py --manifest train:<training manifest> --manifest val:<validation manifest> -e20  -z <batch_size> -s </path/to/model_output.prm> --model_file </path/to/pre-trained_model> [-b <backend>] 

which will save a new model to model_output.prm.

Decoding and evaluating a trained model

After you have a trained model, it's easy to evaluate its performance on any given dataset. Simply create a manifest file and then call:

$ python evaluate.py --manifest val:/path/to/manifest.csv --model_file /path/to/saved_model.prm

replacing the file paths as needed. The script prints the Character Error Rate (CER) by default. To print the Word Error Rate (WER) instead, include the argument --use_wer.
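
For reference, CER is the edit (Levenshtein) distance between the predicted and ground-truth character sequences divided by the length of the ground truth; WER is the same computation over words. The following is a minimal, self-contained sketch of that metric, not the repo's own scoring code.

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def cer(reference, hypothesis):
    return float(edit_distance(reference, hypothesis)) / len(reference)

def wer(reference, hypothesis):
    return float(edit_distance(reference.split(), hypothesis.split())) / len(reference.split())

# e.g. cer("that was certainly true", "that was sertainly true") -> 1/23, about 0.04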

For example, to evaluate our pre-trained model from the Model Zoo, follow these steps:

  1. Download some test data from the Librispeech ASR corpus and prepare a manifest file for it, following the prescription provided above.

  2. Download the pre-trained DS2 model from our Model Zoo.

  3. Subject the pre-trained model and the manifest file for the test data to the evaluate.py script, as described above.

  4. Optionally, inspect the transcripts produced by the trained model by appending the argument --inference_file <name_of_file_to_save_results_to.pkl> to the evaluate.py command. This dumps the model's transcripts, together with the corresponding "ground truth" transcripts, to a pickle file (a sketch of reading it follows this list).
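
A hedged sketch of inspecting that pickle file is shown below. The assumption that it holds paired (model transcript, ground-truth transcript) entries is ours for illustration only; check evaluate.py for the structure it actually dumps.

import pickle

# Hypothetical file name and structure; adjust to what evaluate.py writes.
with open("inference_results.pkl", "rb") as f:
    results = pickle.load(f)

for model_transcript, ground_truth in results:
    print("model:", model_transcript)
    print("truth:", ground_truth)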
