Kaldikaldi-asr/kaldi is the official location of the Kaldi project.
Stars: ✭ 11,151 (+6308.62%)
Android SpeechAndroid speech recognition and text to speech made easy
Stars: ✭ 310 (+78.16%)
OpenasrA pytorch based end2end speech recognition system.
Stars: ✭ 69 (-60.34%)
Pocketsphinx PythonPython interface to CMU Sphinxbase and Pocketsphinx libraries
Stars: ✭ 298 (+71.26%)
WavegradA fast, high-quality neural vocoder.
Stars: ✭ 138 (-20.69%)
Sednndeep learning based speech enhancement using keras or pytorch, make it easy to use
Stars: ✭ 288 (+65.52%)
WatbotAn Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Stars: ✭ 64 (-63.22%)
Speech Alignerspeech-aligner,是一个从“人声语音”及其“语言文本”,产生音素级别时间对齐标注的工具。speech-aligner, is a tool that generate phoneme-level alignment between human speech and its transcription
Stars: ✭ 259 (+48.85%)
Code Switching PapersA curated list of research papers and resources on code-switching
Stars: ✭ 122 (-29.89%)
Noise2Noise-audio denoising without clean training dataSource code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-71.84%)
Syn SpeechSyn.Speech is a flexible speaker independent continuous speech recognition engine for Mono and .NET framework
Stars: ✭ 57 (-67.24%)
minutes🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-85.63%)
Tts Papers🐸 collection of TTS papers
Stars: ✭ 160 (-8.05%)
sova-asrSOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (-29.31%)
StlThe ITU-T Software Tool Library (G.191)
Stars: ✭ 44 (-74.71%)
Speech256An FPGA implementation of a classic 80ies speech synthesizer. Done for the Retro Challenge 2017/10.
Stars: ✭ 51 (-70.69%)
ser-with-w2v2Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
Stars: ✭ 40 (-77.01%)
Dialectid e2eEnd to End Dialect Identification using Convolutional Neural Network
Stars: ✭ 40 (-77.01%)
torch-asgAuto Segmentation Criterion (ASG) implemented in pytorch
Stars: ✭ 42 (-75.86%)
AllosaurusAllosaurus is a pretrained universal phone recognizer for more than 2000 languages
Stars: ✭ 135 (-22.41%)
Fre-GAN-pytorchFre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-58.05%)
WsayWindows "say"
Stars: ✭ 36 (-79.31%)
SER-datasetsA collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (-57.47%)
HolobotHoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.
Stars: ✭ 114 (-34.48%)
speech recognition ctcUse ctc to do chinese speech recognition by keras / 通过keras和ctc实现中文语音识别
Stars: ✭ 40 (-77.01%)
PykaldiA Python wrapper for Kaldi
Stars: ✭ 756 (+334.48%)
ttslearnttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (-9.2%)
Pytorch Kaldipytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+1105.17%)
MelNet-SpeechGenerationImplementation of MelNet in PyTorch to generate high-fidelity audio samples
Stars: ✭ 19 (-89.08%)
PraatPraat: Doing Phonetics By Computer
Stars: ✭ 675 (+287.93%)
HTKThe Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.
Stars: ✭ 23 (-86.78%)
opensnipsOpen source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (-71.26%)
Speech Emotion AnalyzerThe neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
Stars: ✭ 633 (+263.79%)
nlp-classA Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (-45.4%)
Avpian open source voice command macro software
Stars: ✭ 130 (-25.29%)
Voice2MeshCVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (-61.49%)
Nodejs SpeechNode.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
Stars: ✭ 545 (+213.22%)
UniSpeechUniSpeech - Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (+28.74%)
AudiomatePython library for handling audio datasets.
Stars: ✭ 99 (-43.1%)
gtranscribeSoftware for interview transcription
Stars: ✭ 12 (-93.1%)
linear16Converts an audio file to LINEAR16 Google-speech compatible file.
Stars: ✭ 14 (-91.95%)
DeepSegmentorSequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)
Stars: ✭ 17 (-90.23%)
Java Speech ApiThe J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (+181.61%)
datasets🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+7871.26%)
GttsPython library and CLI tool to interface with Google Translate's text-to-speech API
Stars: ✭ 1,303 (+648.85%)
deepspeech.mxnetA MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (-52.87%)
CboardAAC communication system with text-to-speech for the browser
Stars: ✭ 437 (+151.15%)
kaldi helpers🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-92.53%)
Asr audio data linksA list of publically available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 128 (-26.44%)
Neural spEnd-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+134.48%)
Chatbot Watson AndroidAn Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Stars: ✭ 169 (-2.87%)
Tacotron asrSpeech Recognition Using Tacotron
Stars: ✭ 165 (-5.17%)
TacotronA TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Stars: ✭ 1,756 (+909.2%)
JuliusOpen-Source Large Vocabulary Continuous Speech Recognition Engine
Stars: ✭ 1,258 (+622.99%)
Tts🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (+75.29%)