wav2vec2-liveA live speech recognition using Facebooks wav2vec 2.0 model.
Stars: ✭ 205 (-8.48%)
simple-obs-sttSpeech-to-text and keyboard input captions for OBS.
Stars: ✭ 89 (-60.27%)
Annyang💬 Speech recognition for your site
Stars: ✭ 6,216 (+2675%)
PnccA implementation of Power Normalized Cepstral Coefficients: PNCC
Stars: ✭ 40 (-82.14%)
Java Speech ApiThe J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (+118.75%)
JuliusOpen-Source Large Vocabulary Continuous Speech Recognition Engine
Stars: ✭ 1,258 (+461.61%)
bobBob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland. - Mirrored from https://gitlab.idiap.ch/bob/bob
Stars: ✭ 38 (-83.04%)
ASR-Audio-Data-LinksA list of publically available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 179 (-20.09%)
Pytorch Kaldipytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+836.16%)
AllosaurusAllosaurus is a pretrained universal phone recognizer for more than 2000 languages
Stars: ✭ 135 (-39.73%)
EdgedictWorking online speech recognition based on RNN Transducer. ( Trained model release available in release )
Stars: ✭ 205 (-8.48%)
QuantumSpeech-QCNNIEEE ICASSP 21 - Quantum Convolution Neural Networks for Speech Processing and Automatic Speech Recognition
Stars: ✭ 71 (-68.3%)
TF-Speech-Recognition-Challenge-SolutionSource code of the model used in Tensorflow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in top 5% in private leaderboard.
Stars: ✭ 58 (-74.11%)
UHV-OTS-SpeechA data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
Stars: ✭ 94 (-58.04%)
Neural spEnd-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+82.14%)
anycontrolVoice control for your websites and applications
Stars: ✭ 53 (-76.34%)
torchsubbandPytorch implementation of subband decomposition
Stars: ✭ 63 (-71.87%)
awesome-speech-enhancementA curated list of awesome Speech Enhancement papers, libraries, datasets, and other resources.
Stars: ✭ 48 (-78.57%)
PhomemeSimple sentence mixing tool (work in progress)
Stars: ✭ 18 (-91.96%)
react-clientAn React client library for Speechly API
Stars: ✭ 71 (-68.3%)
VoiceBridgeVoiceBridge - an AI-TOOLKIT Open Source C++ Speech Recognition Toolkit
Stars: ✭ 17 (-92.41%)
VAD-LTSDEfficient voice activity detection algorithm using long-term speech information
Stars: ✭ 37 (-83.48%)
PCPMPresenting Collection of Pretrained Models. Links to pretrained models in NLP and voice.
Stars: ✭ 21 (-90.62%)
InimesedAn Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.
Stars: ✭ 65 (-70.98%)
ml-with-audioHF's ML for Audio study group
Stars: ✭ 104 (-53.57%)
DeepSpeech-APIThe code enables users to use Mozilla's Deep Speech model over the Web Browser.
Stars: ✭ 31 (-86.16%)
gtranscribeSoftware for interview transcription
Stars: ✭ 12 (-94.64%)
speaker extractiontarget speaker extraction and verification for multi-talker speech
Stars: ✭ 85 (-62.05%)
AmazonSpeechTranslatorEnd-to-end Solution for Speech Recognition, Text Translation, and Text-to-Speech for iOS using Amazon Translate and Amazon Polly as AWS Machine Learning managed services.
Stars: ✭ 50 (-77.68%)
speechlessSpeech-to-text based on wav2letter built for transfer learning
Stars: ✭ 92 (-58.93%)
apiSpeechly public API definitions and generated code
Stars: ✭ 15 (-93.3%)
2018-dlslUPC Deep Learning for Speech and Language 2018
Stars: ✭ 18 (-91.96%)
A chronology of deep learningTracing back and exposing in chronological order the main ideas in the field of deep learning, to help everyone better understand the current intense research in AI.
Stars: ✭ 47 (-79.02%)
Unity live captionUse Google Speech-to-Text API to do real-time live stream caption on Unity! Best when combined with your virtual character!
Stars: ✭ 26 (-88.39%)
rnnt decoder cudaAn efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.
Stars: ✭ 60 (-73.21%)
mongolian-nlpUseful resources for Mongolian NLP
Stars: ✭ 119 (-46.87%)
salutejsSmartApp Framework для создания навыков семейства Виртуальных Ассистентов "Салют" на языке JavaScript
Stars: ✭ 35 (-84.37%)
Android-TTS-STTOne line solution for Android Text to speech(TTS) & Speech to Text(STT) translation problem
Stars: ✭ 77 (-65.62%)
Speech Feature ExtractionFeature extraction of speech signal is the initial stage of any speech recognition system.
Stars: ✭ 78 (-65.18%)
BookLibraryBook Library of P&W Studio
Stars: ✭ 13 (-94.2%)
melganMelGAN implementation with Multi-Band and Full Band supports...
Stars: ✭ 54 (-75.89%)
Zero-Shot-TTSUnofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (-85.27%)
CNN-VADA Convolutional Neural Network based Voice Activity Detector for Smartphones
Stars: ✭ 60 (-73.21%)
telltimeiOS application to tell the time in the British way 🇬🇧⏰
Stars: ✭ 49 (-78.12%)
datasets🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+6091.96%)
StyleSpeechOfficial implementation of Meta-StyleSpeech and StyleSpeech
Stars: ✭ 161 (-28.12%)
KhronosThe open source intelligent personal assistant
Stars: ✭ 25 (-88.84%)
web-speech-cognitive-servicesPolyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech service.
Stars: ✭ 35 (-84.37%)
ctc-asrEnd-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
Stars: ✭ 112 (-50%)