iamjanvijay / rnnt_decoder_cuda

Licence: MIT

An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.

Labels

cuda speech-recognition beam-search speech-to-text transducer handwriting-recognition prefix-search rnnt

RNN-Transducer Prefix Beam Search

This repository provides an optimised implementation of prefix beam search for models trained with the RNN-Transducer loss function (as described in the paper "Sequence Transduction with Recurrent Neural Networks"). Decoding a speech segment of ~5 seconds with a beam size of 10 takes ~100 milliseconds (a beam size of 10 is adequate for production-level error rates).
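To make the idea concrete, here is a minimal CPU-side sketch of two building blocks that prefix beam search generally relies on: adding probabilities in the log domain (needed when several paths lead to the same prefix and their mass is merged) and pruning the candidate set down to the beam size. It is not taken from this repository. The `Hypothesis` struct and the `log_add` and `prune` functions are hypothetical names chosen for illustration, and the actual CUDA implementation may organise this work quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical record for one beam entry (illustration only).
struct Hypothesis {
    std::vector<int> prefix;  // label ids emitted so far, blank excluded
    double log_prob;          // natural-log probability of this prefix
};

// Numerically stable log(exp(a) + exp(b)), used to merge the
// probability mass of two paths that produce the same prefix.
double log_add(double a, double b) {
    const double neg_inf = -std::numeric_limits<double>::infinity();
    if (a == neg_inf) return b;
    if (b == neg_inf) return a;
    const double hi = std::max(a, b);
    const double lo = std::min(a, b);
    return hi + std::log1p(std::exp(lo - hi));
}

// Keep only the beam_size most probable hypotheses, best first.
void prune(std::vector<Hypothesis>& beam, std::size_t beam_size) {
    assert(beam_size > 0);
    std::sort(beam.begin(), beam.end(),
              [](const Hypothesis& x, const Hypothesis& y) {
                  return x.log_prob > y.log_prob;
              });
    if (beam.size() > beam_size) {
        beam.resize(beam_size);
    }
}
```

With a beam size of 10, as in the sample run, `prune(beam, 10)` would be called once per expansion step so that the candidate set never grows beyond ten prefixes.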

Sample Run

To try a sample run of prefix beam search on your machine, follow these steps:

Clone this repository.

git clone https://github.com/iamjanvijay/rnnt_decoder_cuda.git;

Clean the output folder.

rm rnnt_decoder_cuda/data/outputs/*;

Build the decoder object file.

cd rnnt_decoder_cuda/decoder;
make clean;
make;

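Run the decoder. The decoded beams will be saved to the data/outputs folder. The first command below is a concrete example and the commented line shows the general form of the arguments.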

CUDA_VISIBLE_DEVICES=0 ./decoder ../data/inputs/metadata.txt 0 9 10 5001;
# General form:
# CUDA_VISIBLE_DEVICES=$GPU_ID$ ./decoder ../data/inputs/metadata.txt $index_of_first_file_to_read_from_metadata$ $index_of_last_file_to_read_from_metadata$ $beam_size$ $vocabulary_size_excluding_blank$;
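The positional arguments can be easier to follow when written out as a struct. The sketch below shows one way a program could parse and validate the same five arguments. `DecoderArgs`, `parse_args`, and `file_count` are hypothetical names, and the repository's own argument handling may differ. It also assumes the first and last file indices are both inclusive, which the README does not state explicitly.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical container mirroring the decoder's positional arguments.
struct DecoderArgs {
    std::string metadata_path;  // e.g. ../data/inputs/metadata.txt
    int first_index;            // index of first file to read from metadata
    int last_index;             // index of last file to read from metadata
    int beam_size;              // number of prefixes kept per step
    int vocab_size;             // vocabulary size excluding blank
};

// Parse argv laid out as:
//   ./decoder <metadata> <first> <last> <beam_size> <vocab_size>
DecoderArgs parse_args(int argc, const char* const argv[]) {
    if (argc != 6) {
        throw std::invalid_argument("expected exactly 5 arguments");
    }
    DecoderArgs a;
    a.metadata_path = argv[1];
    a.first_index = std::stoi(argv[2]);
    a.last_index = std::stoi(argv[3]);
    a.beam_size = std::stoi(argv[4]);
    a.vocab_size = std::stoi(argv[5]);
    if (a.first_index < 0 || a.last_index < a.first_index ||
        a.beam_size <= 0 || a.vocab_size <= 0) {
        throw std::invalid_argument("argument out of range");
    }
    return a;
}

// Number of files to decode, ASSUMING both indices are inclusive.
int file_count(const DecoderArgs& a) {
    assert(a.last_index >= a.first_index);
    return a.last_index - a.first_index + 1;
}
```

Under that inclusive-range assumption, the sample command (`0 9 10 5001`) would decode ten files with a beam size of 10 and a vocabulary of 5001 labels, not counting blank.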

Contributing

Contributions are welcomed and greatly appreciated.
