Kyubyong / Tacotron_asr

Licence: apache-2.0
Speech Recognition Using Tacotron

Motivation

Tacotron is an end-to-end speech generation model first introduced in Tacotron: Towards End-to-End Speech Synthesis. It takes character-level text as input and targets mel filterbanks and the linear spectrogram. Although it is a generation model, I wanted to test how well it can be applied to the speech recognition task.

Requirements

  • NumPy >= 1.11.1
  • TensorFlow == 1.1
  • librosa

Model description

Tacotron: Speech Synthesis Model (from Figure 1 in Tacotron: Towards End-to-End Speech Synthesis)

Modified architecture for speech recognition
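Since Tacotron works with text at the character level, using it for recognition presumably means the decoder emits character ids instead of consuming them. The sketch below shows only that text side: mapping a transcript to a fixed-length id sequence and back. The vocabulary, the `PAD`/`EOS` symbols, and the `encode`/`decode` helpers are hypothetical names chosen for illustration, not taken from the repository.

```python
# Hypothetical vocabulary: padding, end-of-sentence, space, apostrophe, a-z.
# The symbol set actually used by the repository may differ.
PAD, EOS = "_", "~"
VOCAB = PAD + EOS + " '" + "abcdefghijklmnopqrstuvwxyz"
CHAR2IDX = {ch: i for i, ch in enumerate(VOCAB)}
IDX2CHAR = {i: ch for i, ch in enumerate(VOCAB)}


def encode(text, max_len):
    """Map a transcript to a list of exactly max_len character ids."""
    ids = [
        CHAR2IDX[ch]
        for ch in text.lower()
        if ch in CHAR2IDX and ch not in (PAD, EOS)  # drop unknown symbols
    ]
    ids = ids[: max_len - 1] + [CHAR2IDX[EOS]]      # truncate, then mark the end
    return ids + [CHAR2IDX[PAD]] * (max_len - len(ids))


def decode(ids):
    """Turn predicted ids back into text, stopping at the first EOS."""
    chars = []
    for i in ids:
        ch = IDX2CHAR[i]
        if ch == EOS:
            break
        if ch != PAD:
            chars.append(ch)
    return "".join(chars)


if __name__ == "__main__":
    ids = encode("he is", max_len=8)
    print(ids)
    print(decode(ids))
```

Padding to a fixed length keeps every target in a mini-batch the same shape, and the end marker tells the decoding loop where the transcript stops.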

Data

The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its text and audio recordings are freely available here. Unfortunately, each audio file corresponds to a chapter, not a verse, so it is too long for many machine learning tasks. I had someone slice them by verse manually. You can download the audio data and its text from my dropbox.

File description

  • hyperparams.py includes all hyperparameters.
  • prepro.py creates training and evaluation data in the data/ folder.
  • data_load.py loads the data and puts it in queues so that multiple mini-batches are generated in parallel.
  • utils.py has some operational functions.
  • modules.py contains building blocks for encoding and decoding networks.
  • networks.py defines encoding and decoding networks.
  • train.py executes training.
  • eval.py executes evaluation.
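The queue idea behind data_load.py can be illustrated without TensorFlow. This is a minimal standard-library sketch, not the repository's code: a background thread groups items into mini-batches and pushes them onto a bounded queue while the consumer works on the previous batch. The `producer` and `batches` functions and the `capacity` parameter are hypothetical names.

```python
import queue
import threading


def producer(items, batch_size, out_queue):
    """Group items into mini-batches and push them onto a bounded queue."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            out_queue.put(batch)  # blocks while the queue is full
            batch = []
    if batch:                     # final, possibly smaller, batch
        out_queue.put(batch)
    out_queue.put(None)           # sentinel: no more data


def batches(items, batch_size, capacity=4):
    """Yield mini-batches prepared by a background thread."""
    q = queue.Queue(maxsize=capacity)
    worker = threading.Thread(
        target=producer, args=(items, batch_size, q), daemon=True
    )
    worker.start()
    while True:
        batch = q.get()
        if batch is None:
            break
        yield batch
    worker.join()


if __name__ == "__main__":
    for batch in batches(range(5), batch_size=2):
        print(batch)
```

The bounded queue limits how far the producer can run ahead, so memory use stays fixed even when loading is faster than training.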

Training

  • STEP 1. Adjust hyperparameters in hyperparams.py if necessary.
  • STEP 2. Download and extract the audio data and its text.
  • STEP 3. Run train.py, or download my pretrained file.

Evaluation

  • Run eval.py to get speech recognition results for the test set.

Results

The training curve looks like

Sample results are

Expected: the third poured out his bowl into the rivers and springs of water and they became blood
Got : the first will lie down to the rivers and springs of waters and it became blood

Expected: i heard the altar saying yes lord god the almighty true and righteous are your judgments
Got : i heard the altar saying yes were like your own like you tree in righteousness for your judgments

Expected: the fourth poured out his bowl on the sun and it was given to him to scorch men with fire
Got : the foolish very armed were on the sun and was given to him to spoke to him with fire

Expected: he gathered them together into the place which is called in hebrew megiddo
Got : he gathered them together into the place which is called and he weep and at every

Expected: every island fled away and the mountains were not found
Got : hadad and kedemoth aroen and another and spread out them

Expected: here is the mind that has wisdom the seven heads are seven mountains on which the woman sits
Got : he is the mighty have wisdom the seven heads of seven rountains are with the wind sixter

Expected: these have one mind and they give their power and authority to the beast
Got : these are those who are mine and they give holl of a fool in the deeps

Expected: the woman whom you saw is the great city which reigns over the kings of the earth
Got : the woman whom he saw it his degrection which ran and to advening to be ear

Expected: for her sins have reached to the sky and god has remembered her iniquities
Got : for he sends a least in the sky and god has remembered her iniquities

Expected: the merchants of the earth weep and mourn over her for no one buys their merchandise any more
Got : the mittites of the earth weeps in your own are before from knowing babylon busine backsliding all t

Expected: and cried out as they looked at the smoke of her burning saying 'what is like the great city'
Got : and cried all the wicked beside of a good one and saying when is like the great sight

Expected: in her was found the blood of prophets and of saints and of all who have been slain on the earth
Got : and her with stones a dwellified confidence and all who have been slain on the earth

Expected: a second said hallelujah her smoke goes up forever and ever
Got : as set him said how many men utter for smoke go down for every male it

Expected: he is clothed in a garment sprinkled with blood his name is called the word of god
Got : he is close in a garment speaking in the blood his name is called 'the word of god'

Expected: the armies which are in heaven followed him on white horses clothed in white pure fine linen
Got : the army which are in heaven falls on the mighty one horses clothes driven on the affliction

Expected: he has on his garment and on his thigh a name written king of kings and lord of lords
Got : he has understandings on his folly among widow the king of kings and yahweh of armies

Expected: i saw an angel coming down out of heaven having the key of the abyss and a great chain in his hand
Got : i saw an even become young lion having you trust of the ages and a great chamber is hand

Expected: and after the thousand years satan will be released from his prison
Got : and after the palace and mizpah and eleven eleenth were the twentieth

Expected: death and hades were thrown into the lake of fire this is the second death the lake of fire
Got : let them hate with one and to wait for fire this is the second death and lead a time

Expected: if anyone was not found written in the book of life he was cast into the lake of fire
Got : the ten man will not think within your demon as with a blood he will cast him to ram for fire

Expected: he who overcomes i will give him these things i will be his god and he will be my son
Got : he who recompenses i will give him be stings i will be his god and he will be my son

Expected: its wall is one hundred fortyfour cubits by the measure of a man that is of an angel
Got : is through all his womb home before you for accusation that we may know him by these are in egypt

Expected: the construction of its wall was jasper the city was pure gold like pure glass
Got : if he struck him of his wallor is not speaking with torment hold on her grass

Expected: i saw no temple in it for the lord god the almighty and the lamb are its temple
Got : i saw in a tenth wind for we will dry up you among the linen ox skillful

Expected: its gates will in no way be shut by day for there will be no night there
Got : his greech wind more redeems shameful the redeemer man don't know

Expected: and they shall bring the glory and the honor of the nations into it so that they may enter
Got : and they shall bring the glory in the high mountains and the egyptian into the midst of the needy

Expected: they will see his face and his name will be on their foreheads
Got : they will see his face and his name on their follows

Expected: behold i come quickly blessed is he who keeps the words of the prophecy of this book
Got : behold i happened with me when i could see me to still it is a prophet his bueld

Expected: he said to me don't seal up the words of the prophecy of this book for the time is at hand
Got : he said to him why sil with the words of the prophets it is book for the times and her

Expected: behold i come quickly my reward is with me to repay to each man according to his work
Got : behold i come perfect i yahweh is with me to repent to be shamed according to his work

Expected: i am the alpha and the omega the first and the last the beginning and the end
Got : i have you hope from you and you and the first from aloes of the dew and the enemy

Expected: he who testifies these things says yes i come quickly amen yes come lord jesus
Got : he who testifies these things says yes i come proclaim i man listen will jesus
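No aggregate score is given for these samples. One common way to quantify Expected/Got pairs like the ones above is word error rate (WER): the word-level edit distance divided by the number of reference words. The sketch below is an illustration only and is not part of eval.py; `edit_distance` and `word_error_rate` are hypothetical helper names.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(expected, got):
    """Word-level edit distance normalised by the reference length."""
    ref, hyp = expected.split(), got.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    expected = "they will see his face and his name will be on their foreheads"
    got = "they will see his face and his name on their follows"
    print(round(word_error_rate(expected, got), 2))
```

For the pair used here, two words are dropped and one is substituted, giving 3 errors over 13 reference words, or a WER of about 0.23.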
