HawkAaron / E2e Asr

PyTorch Implementations for End-to-End Automatic Speech Recognition

Programming Languages

python

Labels

pytorch speech-recognition asr end-to-end

Projects that are alternatives to or similar to E2e Asr

End2end Asr Pytorch

End-to-End Automatic Speech Recognition on PyTorch

Stars: ✭ 175 (+65.09%)

Mutual labels: speech-recognition, asr, end-to-end

Espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Stars: ✭ 808 (+662.26%)

Mutual labels: speech-recognition, asr, end-to-end

Rnn Transducer

MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks

Stars: ✭ 114 (+7.55%)

Mutual labels: speech-recognition, asr, end-to-end

Kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.

Stars: ✭ 190 (+79.25%)

Mutual labels: speech-recognition, asr, end-to-end

kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

Stars: ✭ 456 (+330.19%)

Mutual labels: end-to-end, speech-recognition, asr

kosr

Korean speech recognition based on the Transformer

Stars: ✭ 25 (-76.42%)

Mutual labels: end-to-end, speech-recognition, asr

End-to-End-Mandarin-ASR

End-to-end speech recognition on AISHELL dataset.

Stars: ✭ 20 (-81.13%)

Mutual labels: end-to-end, speech-recognition, asr

Tensorflow end2end speech recognition

End-to-end speech recognition implementation based on TensorFlow (CTC, attention, and MTL training)

Stars: ✭ 305 (+187.74%)

Mutual labels: speech-recognition, asr, end-to-end

Wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

Stars: ✭ 5,907 (+5472.64%)

Mutual labels: speech-recognition, end-to-end

Eesen

The official repository of the Eesen project

Stars: ✭ 738 (+596.23%)

Mutual labels: speech-recognition, asr

Pykaldi

A Python wrapper for Kaldi

Stars: ✭ 756 (+613.21%)

Mutual labels: speech-recognition, asr

Mongolian Speech Recognition

Mongolian speech recognition with PyTorch

Stars: ✭ 97 (-8.49%)

Mutual labels: speech-recognition, asr

Libreasr

💬 An On-Premises, Streaming Speech Recognition System

Stars: ✭ 633 (+497.17%)

Mutual labels: speech-recognition, asr

Ktspeechcrawler

Automatically constructing corpus for automatic speech recognition from YouTube videos

Stars: ✭ 92 (-13.21%)

Mutual labels: speech-recognition, asr

Wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Stars: ✭ 617 (+482.08%)

Mutual labels: speech-recognition, asr

Bigcidian

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

Stars: ✭ 99 (-6.6%)

Mutual labels: speech-recognition, asr

Speech Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

Stars: ✭ 565 (+433.02%)

Mutual labels: asr, end-to-end

Sincnet

SincNet is a neural architecture for efficiently processing raw audio samples.

Stars: ✭ 764 (+620.75%)

Mutual labels: speech-recognition, asr

Delta

DELTA is a deep learning based natural language and speech processing platform.

Stars: ✭ 1,479 (+1295.28%)

Mutual labels: speech-recognition, asr

Vosk Api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Stars: ✭ 1,357 (+1180.19%)

Mutual labels: speech-recognition, asr

Graves 2013 experiments

File description

model.py: RNN-T joint model
model2012.py: Graves 2012 model
train_rnnt.py: RNN-T training script
train_ctc.py: CTC acoustic model training script
eval.py: RNN-T and CTC decoding
DataLoader.py: Kaldi feature loader
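
To make "joint model" concrete, here is a minimal NumPy sketch of the joint network in the form Graves 2013 describes it: the transcription (encoder) and prediction network outputs are each projected, summed over the (t, u) grid, passed through tanh, and mapped to label log-probabilities. This is not the code in model.py. The function name, weight names, and layer sizes are all hypothetical, and the weights are random stand-ins.

```python
import numpy as np

def joint_log_probs(enc, pred, W_enc, W_pred, b_h, W_out, b_out):
    """enc: (T, H_enc), pred: (U + 1, H_pred) -> (T, U + 1, K) log-probabilities."""
    # Project both streams to a shared size, then broadcast-add over the (t, u) grid.
    h = np.tanh((enc @ W_enc)[:, None, :] + (pred @ W_pred)[None, :, :] + b_h)
    logits = h @ W_out + b_out
    # Numerically stable log-softmax over the label axis (blank included in K).
    logits = logits - logits.max(axis=-1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))

# Hypothetical sizes: 5 frames, 3 labels (+1 start step), 10 output symbols.
rng = np.random.default_rng(0)
T, U1, H_enc, H_pred, H_joint, K = 5, 4, 6, 3, 8, 10
lattice = joint_log_probs(
    rng.standard_normal((T, H_enc)),
    rng.standard_normal((U1, H_pred)),
    rng.standard_normal((H_enc, H_joint)),
    rng.standard_normal((H_pred, H_joint)),
    np.zeros(H_joint),
    rng.standard_normal((H_joint, K)),
    np.zeros(K),
)
print(lattice.shape)  # (5, 4, 10)
```

A transducer loss such as the one provided by warp-transducer consumes a lattice of this shape, one distribution per (frame, label position) pair.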

Run

Extract features

Link the Kaldi TIMIT example dirs (local, steps, utils).
Execute run.sh to extract 40-dimensional fbank features.
Run feature_transform.sh to get the 123-dimensional features described in Graves 2013.
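
The 123 dimensions follow from the Graves 2013 setup: 40 filterbank coefficients plus energy give 41 static values, and appending first and second temporal derivatives gives 3 x 41 = 123. The sketch below illustrates that arithmetic in NumPy. It is not what feature_transform.sh runs; the function names, the regression window of 2, and the edge padding are assumptions made for illustration.

```python
import numpy as np

def delta(x, window=2):
    """Regression-based temporal derivative of a (T, D) feature matrix."""
    # Repeat the first and last frames so every frame has a full context window.
    padded = np.pad(x, ((window, window), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, window + 1))
    num_frames = x.shape[0]
    out = np.zeros_like(x, dtype=float)
    for n in range(1, window + 1):
        ahead = padded[window + n : window + n + num_frames]   # x[t + n]
        behind = padded[window - n : window - n + num_frames]  # x[t - n]
        out += n * (ahead - behind)
    return out / denom

def add_deltas(feats, window=2):
    """Append first and second derivatives: (T, D) -> (T, 3 * D)."""
    d1 = delta(feats, window)
    d2 = delta(d1, window)
    return np.concatenate([feats, d1, d2], axis=1)

# 20 frames of 40 fbank coefficients plus 1 energy term (random stand-in data).
static = np.random.default_rng(0).standard_normal((20, 41))
full = add_deltas(static)
print(full.shape)  # (20, 123)
```
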
Train CTC acoustic model

python train_ctc.py --lr 1e-3 --bi --dropout 0.5 --out exp/ctc_bi_lr1e-3 --schedule

Train RNNT joint model

python train_rnnt.py --lr 4e-4 --bi --dropout 0.5 --out exp/rnnt_bi_lr4e-4 --schedule

Decode

python eval.py <path to best model> [--ctc] --bi
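
For readers unfamiliar with CTC decoding, the simplest variant is greedy (best-path) decoding: take the most likely symbol in each frame, merge consecutive repeats, then drop blanks. The sketch below shows only that idea. Whether eval.py decodes greedily or with a beam search is not stated here, and the function name and the choice of index 0 for blank are assumptions.

```python
from itertools import groupby

def ctc_greedy_decode(frame_scores, blank=0):
    """frame_scores: per-frame lists of label scores -> decoded label indices."""
    # Most likely label index in each frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    # Merge runs of the same label, then remove the blank symbol.
    collapsed = [label for label, _ in groupby(best)]
    return [label for label in collapsed if label != blank]

# Per-frame argmax is 0 1 1 0 1 2 2 0; the blank between the two 1s keeps both.
scores = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
    [0.9, 0.05, 0.05],
]
print(ctc_greedy_decode(scores))  # [1, 1, 2]
```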

Results

Model	PER (%)
CTC	21.38
RNN-T	20.59
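
PER is the phoneme error rate: the total edit distance (substitutions, insertions, deletions) between hypothesis and reference phoneme sequences, divided by the total number of reference phonemes. The sketch below shows one common way to compute it. It is not the scoring code behind the table above, and the function names and the toy phoneme strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using one rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]

def phoneme_error_rate(refs, hyps):
    """Corpus-level PER in percent over paired reference/hypothesis sequences."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total

refs = [["sil", "k", "ae", "t", "sil"]]
hyps = [["sil", "k", "ah", "t"]]  # one substitution, one deletion
print(phoneme_error_rate(refs, hyps))  # 40.0
```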

Requirements

Python 3.6
PyTorch >= 0.4
numpy 1.14
warp-transducer

Reference

RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
RNNT joint (Graves 2013): Speech Recognition with Deep Recurrent Neural Networks
PyTorch End-to-End Models for ASR: https://github.com/awni/speech
A Fast Sequence Transducer GPU Implementation with PyTorch Bindings: https://github.com/HawkAaron/warp-transducer/tree/add_network_accelerate

Stars: ✭ 106
