soheil-mpg / Speech-Recognition

Licence: other
End-to-End Speech Recognition using Neural Networks.

Programming Languages

Jupyter Notebook
11667 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Speech-Recognition

kaldi-long-audio-alignment
Long audio alignment using Kaldi
Stars: ✭ 21 (-32.26%)
Mutual labels:  automatic-speech-recognition, asr
leopard
On-device speech-to-text engine powered by deep learning
Stars: ✭ 354 (+1041.94%)
Mutual labels:  automatic-speech-recognition, asr
sova-asr
SOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (+296.77%)
Mutual labels:  automatic-speech-recognition, asr
wave2vec-recognize-docker
Wave2vec 2.0 Recognize pipeline
Stars: ✭ 30 (-3.23%)
Mutual labels:  automatic-speech-recognition, asr
demo vietasr
Vietnamese Speech Recognition
Stars: ✭ 22 (-29.03%)
Mutual labels:  automatic-speech-recognition, asr
wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Stars: ✭ 2,384 (+7590.32%)
Mutual labels:  automatic-speech-recognition, asr
kaldi helpers
🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-58.06%)
Mutual labels:  automatic-speech-recognition
lightning-asr
Modular and extensible speech recognition library leveraging pytorch-lightning and hydra.
Stars: ✭ 36 (+16.13%)
Mutual labels:  asr
torchain
WIP: pytorch FFI wrapper for Kaldi chain loss (a.k.a. Lattice Free MMI)
Stars: ✭ 20 (-35.48%)
Mutual labels:  asr
AESRC2020
a deep accent recognition network
Stars: ✭ 35 (+12.9%)
Mutual labels:  asr
spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (+67.74%)
Mutual labels:  asr
speech-recognition
SDKs and docs for Skit's speech to text service
Stars: ✭ 20 (-35.48%)
Mutual labels:  asr
opensnips
Open source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (+61.29%)
Mutual labels:  asr
spokestack-ios
Spokestack: give your iOS app a voice interface!
Stars: ✭ 27 (-12.9%)
Mutual labels:  asr
kaldi-alligner
scripts to align a given wave to its transcription using trained models by Kaldi
Stars: ✭ 24 (-22.58%)
Mutual labels:  asr
commonvoice-utils
Linguistic processing for Common Voice
Stars: ✭ 32 (+3.23%)
Mutual labels:  asr
syn-speech-samples
An application that demonstrates the usage of the Syn.Speech library for Speech Recognition
Stars: ✭ 24 (-22.58%)
Mutual labels:  asr
vosk-asterisk
Speech Recognition in Asterisk with Vosk Server
Stars: ✭ 52 (+67.74%)
Mutual labels:  asr
IR-GAN
Augmenting Room Impulse Response
Stars: ✭ 21 (-32.26%)
Mutual labels:  automatic-speech-recognition
soxan
Wav2Vec for speech recognition, classification, and audio classification
Stars: ✭ 113 (+264.52%)
Mutual labels:  automatic-speech-recognition

Automatic Speech Recognition (ASR)

Project Overview

We will build a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline! The completed pipeline will accept raw audio as input and return a predicted transcription of the spoken language. The full pipeline is summarized in the figure below.

  • STEP 1 is a pre-processing step that converts raw audio to one of two feature representations that are commonly used for ASR.
  • STEP 2 is an acoustic model which accepts audio features as input and returns a probability distribution over all potential transcriptions. After learning about the basic types of neural networks that are often used for acoustic modeling, we will carry out our own investigations and design our own acoustic model!
  • STEP 3 in the pipeline takes the output from the acoustic model and returns a predicted transcription.
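
To make STEP 3 concrete, here is a minimal sketch of one common way to turn acoustic model output into text. It assumes the model emits one probability distribution per audio frame over a character alphabet that includes a "blank" symbol (a CTC-style output); the project description does not state this, so the alphabet, the blank index, and the function name `greedy_decode` are all hypothetical.

```python
# Hypothetical alphabet: index 0 is an assumed CTC-style "blank" symbol.
BLANK = 0
ALPHABET = ["<blank>", " "] + list("abcdefghijklmnopqrstuvwxyz'")


def greedy_decode(frame_probs, alphabet=ALPHABET, blank=BLANK):
    """Pick the most likely symbol per frame, merge repeats, drop blanks.

    frame_probs: a sequence of rows, one per audio frame, where each row
    holds one probability per entry in `alphabet`.
    """
    chars = []
    prev = None
    for row in frame_probs:
        # Index of the most probable symbol in this frame.
        idx = max(range(len(row)), key=lambda i: row[i])
        # Emit a character only when the symbol changes and is not blank.
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)


def one_hot(symbol):
    """Build a toy distribution that puts all probability on one symbol."""
    return [1.0 if s == symbol else 0.0 for s in ALPHABET]


# Toy usage: five frames that spell "hi" with a repeat and a blank.
toy_output = [one_hot(s) for s in ["h", "h", "<blank>", "i", "i"]]
print(greedy_decode(toy_output))
```

Greedy decoding is the simplest choice; a real pipeline might instead use a beam search, possibly combined with a language model.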

Dataset

We begin by investigating the LibriSpeech dataset, which will be used to train and evaluate our models. The pipeline first converts raw audio to feature representations that are commonly used for ASR. We will then move on to building neural networks that can map these audio features to transcribed text.
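
As an illustration of the conversion from raw audio to features, the sketch below computes a log-power spectrogram using only the Python standard library. The text does not name the feature representations it uses, so treating a spectrogram as one of them is an assumption, and the function name and the `frame_len` and `hop` parameters are hypothetical. The naive DFT is here for readability; in practice an FFT routine from a numerical library would be far faster.

```python
import cmath
import math


def log_spectrogram(samples, frame_len=64, hop=32):
    """Return one list of log-power values per frame.

    Each frame covers `frame_len` samples, frames start every `hop`
    samples, and each frame yields frame_len // 2 + 1 frequency bins.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            # Naive discrete Fourier transform for frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            # A small constant avoids taking the log of zero.
            bins.append(math.log(abs(acc) ** 2 + 1e-10))
        frames.append(bins)
    return frames


# Toy usage: a pure tone that completes 8 cycles per 64-sample frame.
signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
features = log_spectrogram(signal)
print(len(features), "frames x", len(features[0]), "bins")
```

Each row of the result is the feature vector for one short slice of audio, which is the kind of input an acoustic model would consume frame by frame.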
