wikipronMassively multilingual pronunciation mining
Stars: ✭ 167 (-75.26%)
fadeA Simulation Framework for Auditory Discrimination Experiments
Stars: ✭ 12 (-98.22%)
PysptkA python wrapper for Speech Signal Processing Toolkit (SPTK).
Stars: ✭ 297 (-56%)
edittsOfficial implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (-89.04%)
TASNETTime-domain Audio Separation Network (IN PYTORCH)
Stars: ✭ 18 (-97.33%)
Ios 10 SamplerCode examples for new APIs of iOS 10.
Stars: ✭ 3,341 (+394.96%)
LIUMScripts for LIUM SpkDiarization tools
Stars: ✭ 28 (-95.85%)
SpecaugmentA Implementation of SpecAugment with Tensorflow & Pytorch, introduced by Google Brain
Stars: ✭ 408 (-39.56%)
KARENKAREN: Unifying Hatespeech Detection and Benchmarking
Stars: ✭ 18 (-97.33%)
Amazing Python Scripts🚀 Curated collection of Amazing Python scripts from Basics to Advance with automation task scripts.
Stars: ✭ 229 (-66.07%)
tt-vae-ganTimbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.
Stars: ✭ 37 (-94.52%)
web-speech-demoLearn how to build a simple text-to-speech voice app for the web using the Web Speech API.
Stars: ✭ 19 (-97.19%)
InaspeechsegmenterCNN-based audio segmentation toolkit. Allows to detect speech, music and speaker gender. Has been designed for large scale gender equality studies based on speech time per gender.
Stars: ✭ 352 (-47.85%)
jarvisJarvis Home Automation
Stars: ✭ 81 (-88%)
Xr3player🎧 🎼 Advanced JavaFX Media Player
Stars: ✭ 472 (-30.07%)
spokestack-androidExtensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (-92.3%)
Css10CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
Stars: ✭ 302 (-55.26%)
jackpairp2p speech encrypting device with analog audio interface suitable for GSM phones
Stars: ✭ 26 (-96.15%)
Sonus💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
Stars: ✭ 532 (-21.19%)
nabaztag-phpa simple php implementation of a Nabaztag server
Stars: ✭ 14 (-97.93%)
speech to texthow to use the Google Cloud Speech API to transcribe audio/video files.
Stars: ✭ 35 (-94.81%)
Awesome KaldiThis is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
Stars: ✭ 393 (-41.78%)
hifigan-denoiserHiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Stars: ✭ 88 (-86.96%)
sova-asrSOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (-81.78%)
gtranscribeSoftware for interview transcription
Stars: ✭ 12 (-98.22%)
Voice BuilderAn opensource text-to-speech (TTS) voice building tool
Stars: ✭ 362 (-46.37%)
Speech256An FPGA implementation of a classic 80ies speech synthesizer. Done for the Retro Challenge 2017/10.
Stars: ✭ 51 (-92.44%)
Java Speech ApiThe J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (-27.41%)
ser-with-w2v2Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
Stars: ✭ 40 (-94.07%)
Tts🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+704%)
torch-asgAuto Segmentation Criterion (ASG) implemented in pytorch
Stars: ✭ 42 (-93.78%)
Nodejs SpeechNode.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
Stars: ✭ 545 (-19.26%)
Fre-GAN-pytorchFre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-89.19%)
Android SpeechAndroid speech recognition and text to speech made easy
Stars: ✭ 310 (-54.07%)
SER-datasetsA collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (-89.04%)
CboardAAC communication system with text-to-speech for the browser
Stars: ✭ 437 (-35.26%)
speech recognition ctcUse ctc to do chinese speech recognition by keras / 通过keras和ctc实现中文语音识别
Stars: ✭ 40 (-94.07%)
Pocketsphinx PythonPython interface to CMU Sphinxbase and Pocketsphinx libraries
Stars: ✭ 298 (-55.85%)
ttslearnttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (-76.59%)
Speech Emotion AnalyzerThe neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
Stars: ✭ 633 (-6.22%)
MelNet-SpeechGenerationImplementation of MelNet in PyTorch to generate high-fidelity audio samples
Stars: ✭ 19 (-97.19%)
Sednndeep learning based speech enhancement using keras or pytorch, make it easy to use
Stars: ✭ 288 (-57.33%)
HTKThe Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.
Stars: ✭ 23 (-96.59%)
Neural spEnd-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (-39.56%)
opensnipsOpen source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (-92.59%)
Speech Alignerspeech-aligner,是一个从“人声语音”及其“语言文本”,产生音素级别时间对齐标注的工具。speech-aligner, is a tool that generate phoneme-level alignment between human speech and its transcription
Stars: ✭ 259 (-61.63%)
nlp-classA Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (-85.93%)
Voice2MeshCVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (-90.07%)
Noise2Noise-audio denoising without clean training dataSource code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-92.74%)
UniSpeechUniSpeech - Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (-66.81%)
minutes🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-96.3%)
SeganSpeech Enhancement Generative Adversarial Network in TensorFlow
Stars: ✭ 661 (-2.07%)
VadVoice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Stars: ✭ 622 (-7.85%)
TacotronAudio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
Stars: ✭ 493 (-26.96%)
Tts🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (-54.81%)
flite-goGo bindings for Flite (festival-lite)
Stars: ✭ 14 (-97.93%)