Top 184 speech open source projects

Nlp Paper
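Covers dialogue and speech within natural language processing: a collection of related papers (with reading notes), model reproductions, data processing, and more (code in both TensorFlow and PyTorch versions).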
Watbot
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Sound Source Localization Algorithm doa estimation
Some traditional algorithms used for sound source localization and DOA (direction of arrival) estimation of speech signals.
Syn Speech
Syn.Speech is a flexible, speaker-independent continuous speech recognition engine for Mono and the .NET Framework
Stl
The ITU-T Software Tool Library (G.191)
Dc tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Dialectid e2e
End to End Dialect Identification using Convolutional Neural Network
Discordspeechbot
A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server with voice commands; it can also be useful for hearing-impaired people.
Lightspeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Praat
Praat: Doing Phonetics By Computer
cspeech
Speech Emotion Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
Vad
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Nodejs Speech
Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
Sonus
💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
Tacotron
Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
Java Speech Api
The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Cboard
AAC communication system with text-to-speech for the browser
Specaugment
An implementation of SpecAugment (introduced by Google Brain) in TensorFlow and PyTorch
Awesome Kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
Voice Converter Cyclegan
Voice Converter Using CycleGAN and Non-Parallel Data
Tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Voice Builder
An open-source text-to-speech (TTS) voice building tool
Inaspeechsegmenter
CNN-based audio segmentation toolkit. Detects speech, music, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Android Speech
Android speech recognition and text to speech made easy
Css10
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
Pocketsphinx Python
Python interface to CMU Sphinxbase and Pocketsphinx libraries
Pysptk
A Python wrapper for the Speech Signal Processing Toolkit (SPTK).
Sednn
Deep-learning-based speech enhancement using Keras or PyTorch, designed to be easy to use
Speech Vad Demo
Integrates WebRTC's VAD to split audio files
Speech Aligner
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.
Amazing Python Scripts
🚀 Curated collection of amazing Python scripts, from basic to advanced, including task automation scripts.
Noise2Noise-audio denoising without clean training data
Source code for the paper "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper tackles the heavy dependence of deep learning based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…
hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
flite-go
Go bindings for Flite (festival-lite)
tt-vae-gan
Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.
Speech256
An FPGA implementation of a classic 1980s speech synthesizer. Done for the Retro Challenge 2017/10.
editts
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
ser-with-w2v2
Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
torch-asg
Auto Segmentation Criterion (ASG) implemented in PyTorch
Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
LIUM
Scripts for LIUM SpkDiarization tools
speech recognition ctc
Chinese speech recognition using CTC with Keras
jackpair
p2p speech encrypting device with analog audio interface suitable for GSM phones