
pannous / Tensorflow Speech Recognition

Licence: other

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks


Labels

deep-learning tensorflow neural-network speech-recognition speech-to-text stt

Projects that are alternatives of or similar to Tensorflow Speech Recognition

speech-recognition-evaluation

Evaluate results from ASR/Speech-to-Text quickly

Stars: ✭ 25 (-98.82%)

Mutual labels: speech-recognition, speech-to-text, stt

On-device speech-to-text engine powered by deep learning

Stars: ✭ 354 (-83.29%)

Mutual labels: speech-recognition, speech-to-text, stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (-95.8%)

Mutual labels: speech-recognition, speech-to-text, stt

Speech to text bot for Discord using Mozilla's DeepSpeech

Stars: ✭ 14 (-99.34%)

Mutual labels: speech-recognition, speech-to-text, stt

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Stars: ✭ 841 (-60.29%)

Mutual labels: speech-recognition, speech-to-text, stt

SOVA ASR (Automatic Speech Recognition)

Stars: ✭ 123 (-94.19%)

Mutual labels: speech-recognition, speech-to-text, stt

deepspeech.mxnet

A MXNet implementation of Baidu's DeepSpeech architecture

Stars: ✭ 82 (-96.13%)

Mutual labels: speech-recognition, speech-to-text, stt

Vietnamese Speech Recognition

Stars: ✭ 22 (-98.96%)

Mutual labels: speech-recognition, speech-to-text, stt

Self Supervised Speech Recognition

speech to text with self-supervised learning based on wav2vec 2.0 framework

Stars: ✭ 106 (-95%)

Mutual labels: speech-recognition, speech-to-text

Kalliope is a framework that will help you to create your own personal assistant.

Stars: ✭ 1,509 (-28.75%)

Mutual labels: speech-recognition, speech-to-text

Tensorflow Ctc Speech Recognition

Application of Connectionist Temporal Classification (CTC) for Speech Recognition (Tensorflow 1.0 but compatible with 2.0).

Stars: ✭ 127 (-94%)

Mutual labels: speech-recognition, speech-to-text

Wav2letter.pytorch

A fully convolution-network for speech-to-text, built on pytorch.

Stars: ✭ 104 (-95.09%)

Mutual labels: speech-recognition, speech-to-text

Spokestack Python

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.

Stars: ✭ 103 (-95.14%)

Mutual labels: speech-recognition, speech-to-text

kaldi-asr/kaldi is the official location of the Kaldi project.

Stars: ✭ 11,151 (+426.49%)

Mutual labels: speech-recognition, speech-to-text

Speech And Text

Speech to text (PocketSphinx, Iflytex API, Baidu API) and text to speech (pyttsx3) | 语音转文字（PocketSphinx、百度 API、科大讯飞 API）和文字转语音（pyttsx3）

Stars: ✭ 102 (-95.18%)

Mutual labels: speech-recognition, speech-to-text

Awesome Ai Services

An overview of the AI-as-a-service landscape

Stars: ✭ 133 (-93.72%)

Mutual labels: speech-recognition, speech-to-text

Go Astideepspeech

Golang bindings for Mozilla's DeepSpeech speech-to-text library

Stars: ✭ 137 (-93.53%)

Mutual labels: speech-recognition, speech-to-text

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Stars: ✭ 1,378 (-34.94%)

Mutual labels: speech-recognition, speech-to-text

Asr audio data links

A list of publically available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 128 (-93.96%)

Mutual labels: speech-recognition, speech-to-text

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Stars: ✭ 171 (-91.93%)

Mutual labels: speech-recognition, speech-to-text


Tensorflow Speech Recognition

Speech recognition using Google's TensorFlow deep learning framework and sequence-to-sequence neural networks.

Replaces caffe-speech-recognition, see there for some background.

Update: Mozilla released DeepSpeech

They achieve good error rates. Free speech is in good hands; go there if you are an end user. For now this project is maintained only for educational purposes.

Ultimate goal

Create a decent standalone speech recognizer for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (100GB here and 21GB here on openslr.org, synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.); we just need a simple yet powerful model. It's only a question of time...

Sample spectrogram: Karen uttering 'zero' at 160 words per minute.
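For readers who have not worked with spectrograms before, the following is a minimal sketch of how one can be computed from raw audio with NumPy. It is not taken from this repository: the function name `spectrogram`, the 25 ms frame length, the 10 ms hop, and the 16 kHz sample rate are all assumptions chosen for illustration, and a synthetic 1 kHz tone stands in for a real recording.

```python
import numpy as np

def spectrogram(signal, frame_len=400, hop=160):
    """Return a (frames, frame_len // 2 + 1) array of log-magnitude STFT values.

    frame_len and hop are illustrative defaults (25 ms / 10 ms at 16 kHz),
    not values used by this project.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT per frame; the small constant avoids log(0).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-10)

# Synthetic stand-in for a recording: one second of a 1 kHz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
# Frequency bin with the most energy, averaged over all frames.
peak_bin = int(spec.mean(axis=0).argmax())
print(spec.shape, peak_bin * sr / 400)
```

Each row of `spec` is one time step and each column one frequency bin, which is the kind of 2D input a sequence model can consume frame by frame.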

Installation

clone code

git clone https://github.com/pannous/tensorflow-speech-recognition
cd tensorflow-speech-recognition
git clone https://github.com/pannous/layer.git
git clone https://github.com/pannous/tensorpeers.git

pyaudio

Requires PortAudio from http://www.portaudio.com/

git clone https://git.assembla.com/portaudio.git
cd portaudio
./configure --prefix=/path/to/your/local
make
make install
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/your/local/lib
export LIBRARY_PATH=$LIBRARY_PATH:/path/to/your/local/lib
export CPATH=$CPATH:/path/to/your/local/include
source ~/.bashrc

install pyaudio

pip install pyaudio
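To confirm the install before moving on, a small check like the one below can help. It is only a sketch: the helper name `pyaudio_available` is hypothetical, and it merely tests whether the `pyaudio` module can be found by the current Python interpreter, not whether PortAudio can open an audio device.

```python
import importlib.util

def pyaudio_available():
    """Return True if the pyaudio module can be located by this interpreter."""
    return importlib.util.find_spec("pyaudio") is not None

print("pyaudio importable:", pyaudio_available())
```

If this prints False after a seemingly successful install, pip and python are probably pointing at different environments.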

Getting started

Toy examples: ./number_classifier_tflearn.py ./speaker_classifier_tflearn.py

Some less trivial architectures: ./densenet_layer.py

Later: ./train.sh ./record.py

Update: Nervana demonstrated that it is possible for 'independents' to build speech recognizers that are state of the art.

Fun tasks for newcomers

Watch the video: https://www.youtube.com/watch?v=u9FPqkuoEJ8
Understand and correct the corresponding code: lstm-tflearn.py
Data augmentation: create on-the-fly modulation of the data, e.g. increase the speech frequency, add background noise, alter the pitch, etc.
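As a starting point for the data augmentation task, here is a minimal NumPy sketch of on-the-fly modulation. It is not code from this repository: the function names `add_noise`, `change_speed`, and `augment`, as well as the speed range (0.9 to 1.1) and SNR range (10 to 30 dB), are assumptions made for illustration. Note that the simple resampling used here changes tempo and pitch together; shifting pitch independently would need a different technique.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Mix white Gaussian noise into `signal` at the given signal-to-noise ratio (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def change_speed(signal, rate):
    """Resample by linear interpolation.

    rate > 1 shortens the clip, which raises both tempo and pitch.
    """
    n_out = int(round(len(signal) / rate))
    old_idx = np.arange(len(signal))
    new_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_idx, old_idx, signal)

def augment(signal, rng):
    """Apply a random speed change followed by random background noise."""
    rate = rng.uniform(0.9, 1.1)    # assumed range, tune for your data
    snr_db = rng.uniform(10, 30)    # assumed range, tune for your data
    return add_noise(change_speed(signal, rate), snr_db, rng)

# Synthetic stand-in for a recording: one second of a 440 Hz tone at 16 kHz.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clip = 0.5 * np.sin(2 * np.pi * 440 * t)

augmented = augment(clip, rng)
print(len(clip), len(augmented))
```

Calling `augment` inside the batch generator, rather than writing augmented files to disk, gives the model a slightly different version of each clip on every epoch.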

Extensions

Extensions to current tensorflow which are probably needed:

WarpCTC on the GPU (see issue)
Incremental collaborative snapshots ('P2P learning')
Modular graphs/models + persistence

Even though this project is far from finished, we hope it gives you some starting points.

Looking for a tensorflow collaboration / consultant / deep learning contractor? Reach out to [email protected]


Stars: ✭ 2,118
