
oliverguhr / wav2vec2-live

Licence: MIT license

Live speech recognition using Facebook's wav2vec 2.0 model.


Labels

pyaudio speech speech-recognition speech-to-text asr wav2vec wav2vec2


automatic speech recognition with wav2vec2

Use any wav2vec model with a microphone.

Setup

I recommend installing this project in a virtual environment.

python3 -m venv ./venv
source ./venv/bin/activate
pip install -r requirements.txt

Depending on your Linux distribution, you might encounter an error that portaudio was not found when installing pyaudio. On Ubuntu, you can solve this by installing the "portaudio19-dev" package.

sudo apt install portaudio19-dev
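After installing the system package and re-running pip, a quick import check can confirm that pyaudio is now usable. The snippet below is a minimal sketch; the helper name `can_import` is hypothetical and not part of this project.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if the module can be imported in the current environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Raised when the package is missing or its native library fails to load.
        return False
    return True


if __name__ == "__main__":
    if can_import("pyaudio"):
        print("pyaudio is available")
    else:
        print("pyaudio could not be imported; check the portaudio installation")
```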

Finally, you can test the speech recognition:

python live_asr.py

Possible Issues:

The code uses the system's default audio device. Please make sure that your system's default audio device is set correctly.
"attempt to connect to server failed": you can safely ignore this message from pyaudio. It only means that pyaudio can't connect to the JACK audio server.
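If it is unclear which audio devices pyaudio can see, listing them may help. The following is a rough sketch that uses pyaudio's device query calls; the helper names `input_devices` and `query_devices` are hypothetical, and whether a printed name can be passed on to this project as a device name is an assumption that should be checked against live_asr.py.

```python
def input_devices(device_infos):
    """Keep only devices that can record (at least one input channel)."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def query_devices():
    """Ask pyaudio for the info dictionaries of all known audio devices."""
    import pyaudio  # imported lazily so input_devices() works without pyaudio

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()


if __name__ == "__main__":
    for index, name in input_devices(query_devices()):
        print(f"{index}\t{name}")
```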

Usage

You can use any wav2vec2 model from the Hugging Face model hub. Just set the model name; all files will be downloaded on first execution.

from live_asr import LiveWav2Vec2

# Any wav2vec2 model name from the Hugging Face model hub can be used here.
english_model = "facebook/wav2vec2-large-960h-lv60-self"
german_model = "maxidl/wav2vec2-large-xlsr-german"

asr = LiveWav2Vec2(german_model, device_name="default")
asr.start()

try:
    while True:
        # Fetch the latest transcript with its audio length and inference time.
        text, sample_length, inference_time = asr.get_last_text()
        print(f"{sample_length:.3f}s\t{inference_time:.3f}s\t{text}")
except KeyboardInterrupt:
    # Ctrl+C stops the recognition.
    asr.stop()
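The two numbers printed by the loop above can be combined into a real-time factor (inference time divided by audio length), which shows whether the model keeps up with the microphone. This is a small sketch, assuming that both values are given in seconds as the printed "s" suffix suggests; `real_time_factor` and `format_result` are hypothetical helpers, not part of live_asr.py.

```python
def real_time_factor(sample_length: float, inference_time: float) -> float:
    """Inference time divided by audio length; below 1.0 is faster than real time."""
    if sample_length <= 0:
        raise ValueError("sample_length must be positive")
    return inference_time / sample_length


def format_result(text: str, sample_length: float, inference_time: float) -> str:
    """Build one tab-separated output line that also includes the real-time factor."""
    rtf = real_time_factor(sample_length, inference_time)
    return f"{sample_length:.3f}s\t{inference_time:.3f}s\tRTF {rtf:.2f}\t{text}"


if __name__ == "__main__":
    # Example values only; in practice they would come from asr.get_last_text().
    print(format_result("hello world", 2.0, 0.5))
```

A value well below 1.0 suggests that transcription finishes faster than the audio arrives.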
