Specaugment: An implementation of SpecAugment (introduced by Google Brain) with TensorFlow and PyTorch
Stars: ✭ 408 (+248.72%)
Ios 10 Sampler: Code examples for new APIs of iOS 10.
Stars: ✭ 3,341 (+2755.56%)
Discordspeechbot: A speech-to-text bot for Discord with music commands and more, built with NodeJS. Ideal for controlling your Discord server with voice commands; can also be useful for hearing-impaired people.
Stars: ✭ 35 (-70.09%)
Tacotron: Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
Stars: ✭ 493 (+321.37%)
Amazing Python Scripts: 🚀 Curated collection of amazing Python scripts, from basic to advanced, including automation task scripts.
Stars: ✭ 229 (+95.73%)
Soloud: Free, easy, portable audio engine for games
Stars: ✭ 1,048 (+795.73%)
Tts: 🐸💬 A deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (+160.68%)
Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine
Stars: ✭ 1,258 (+975.21%)
Pysptk: A Python wrapper for the Speech Signal Processing Toolkit (SPTK).
Stars: ✭ 297 (+153.85%)
Annyang: 💬 Speech recognition for your site
Stars: ✭ 6,216 (+5212.82%)
Sonus: 💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
Stars: ✭ 532 (+354.7%)
flite-go: Go bindings for Flite (festival-lite)
Stars: ✭ 14 (-88.03%)
Xr3player: 🎧 🎼 Advanced JavaFX Media Player
Stars: ✭ 472 (+303.42%)
Awesome Kaldi: A list of features, scripts, blogs and resources for making better use of Kaldi (http://kaldi-asr.org/)
Stars: ✭ 393 (+235.9%)
Dc tts: A TensorFlow implementation of DC-TTS, yet another text-to-speech model
Stars: ✭ 1,017 (+769.23%)
Inaspeechsegmenter: CNN-based audio segmentation toolkit. Detects speech, music and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Stars: ✭ 352 (+200.85%)
Delta: DELTA is a deep-learning-based natural language and speech processing platform.
Stars: ✭ 1,479 (+1164.1%)
Css10: A Collection of Single Speaker Speech Datasets for 10 Languages
Stars: ✭ 302 (+158.12%)
Lightspeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Stars: ✭ 31 (-73.5%)
Deepspeech: A PaddlePaddle implementation of ASR.
Stars: ✭ 1,219 (+941.88%)
hifigan-denoiser: HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Stars: ✭ 88 (-24.79%)
Segan: Speech Enhancement Generative Adversarial Network in TensorFlow
Stars: ✭ 661 (+464.96%)
Nodejs Speech: Node.js client for Google Cloud Speech, providing speech-to-text conversion powered by machine learning.
Stars: ✭ 545 (+365.81%)
sova-asr: SOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (+5.13%)
Watbot: An Android chatbot powered by IBM Watson services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Stars: ✭ 64 (-45.3%)
Gtts: Python library and CLI tool to interface with Google Translate's text-to-speech API
Stars: ✭ 1,303 (+1013.68%)
Java Speech Api: The J.A.R.V.I.S. Speech API is designed to be simple and efficient. It is written in Java and includes a recognizer, a synthesizer, and a microphone capture utility. The recognizer and synthesizer rely on Google's speech services, so an Internet connection is required; in return, the project provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (+318.8%)
Syn Speech: Syn.Speech is a flexible, speaker-independent continuous speech recognition engine for Mono and the .NET Framework
Stars: ✭ 57 (-51.28%)
Cboard: AAC communication system with text-to-speech for the browser
Stars: ✭ 437 (+273.5%)
Neural sp: End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+248.72%)
Stl: The ITU-T Software Tool Library (G.191)
Stars: ✭ 44 (-62.39%)
Audio: Data manipulation and transformation for audio signal processing, powered by PyTorch
Stars: ✭ 1,262 (+978.63%)
Voice Builder: An open-source text-to-speech (TTS) voice building tool
Stars: ✭ 362 (+209.4%)
Dialectid e2e: End-to-end dialect identification using a convolutional neural network
Stars: ✭ 40 (-65.81%)
Tts: 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+4538.46%)
Holobot: HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.
Stars: ✭ 114 (-2.56%)
Android Speech: Android speech recognition and text to speech made easy
Stars: ✭ 310 (+164.96%)
Wsay: Windows "say"
Stars: ✭ 36 (-69.23%)
Pocketsphinx Python: Python interface to the CMU Sphinxbase and Pocketsphinx libraries
Stars: ✭ 298 (+154.7%)
Tts: Tools to convert text to speech 📚💬
Stars: ✭ 84 (-28.21%)
Sednn: Deep-learning-based speech enhancement using Keras or PyTorch, made easy to use
Stars: ✭ 288 (+146.15%)
Pykaldi: A Python wrapper for Kaldi
Stars: ✭ 756 (+546.15%)
Speech Aligner: speech-aligner is a tool that generates phoneme-level time alignment annotations between human speech and its transcription
Stars: ✭ 259 (+121.37%)
Audiomate: Python library for handling audio datasets.
Stars: ✭ 99 (-15.38%)
Noise2Noise-audio denoising without clean training data: Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper tackles the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-58.12%)
Praat: Doing Phonetics By Computer
Stars: ✭ 675 (+476.92%)
minutes: 🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-78.63%)
Openasr: A PyTorch-based end-to-end speech recognition system.
Stars: ✭ 69 (-41.03%)
Speech Emotion Analyzer: A neural network model capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)
Stars: ✭ 633 (+441.03%)
Durian: Implementation of the paper "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf).
Stars: ✭ 111 (-5.13%)
Wikipron: Massively multilingual pronunciation mining
Stars: ✭ 99 (-15.38%)
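Nlp Paper: A collection of papers (with reading notes) on dialogue and speech within the field of natural language processing, along with reproduced models, data processing, and more (code provided in both TensorFlow and PyTorch versions)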
Stars: ✭ 67 (-42.74%)
Vad: Voice activity detection (VAD) toolkit including DNN-, bDNN-, LSTM- and ACAM-based VAD. We also provide our directly recorded dataset.
Stars: ✭ 622 (+431.62%)