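Nlp Paper - Dialogue and speech within natural language processing: a collection of related papers (with reading notes), model reproductions, data processing, and more (code in both TensorFlow and PyTorch versions)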
Watbot - An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Syn Speech - Syn.Speech is a flexible, speaker-independent continuous speech recognition engine for Mono and the .NET framework
Soloud - Free, easy, portable audio engine for games
Stl - The ITU-T Software Tool Library (G.191)
Dc tts - A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
Dialectid e2e - End to End Dialect Identification using Convolutional Neural Network
Discordspeechbot - A speech-to-text bot for Discord with music commands and more, using NodeJS. Ideal for controlling your Discord server with voice commands; can also be useful for hearing-impaired people.
Lightspeech - LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Annyang - 💬 Speech recognition for your site
Praat - Praat: Doing Phonetics By Computer
Segan - Speech Enhancement Generative Adversarial Network in TensorFlow
Speech Emotion Analyzer - A neural network model that detects five different male/female emotions from speech audio. (Deep Learning, NLP, Python)
Vad - Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Nodejs Speech - Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
Sonus - 💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
Tacotron - Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
Java Speech Api - The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Cboard - AAC communication system with text-to-speech for the browser
Specaugment - An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Neural sp - End-to-end ASR/LM implementation with PyTorch
Awesome Kaldi - A list of features, scripts, blogs and resources for making better use of Kaldi (http://kaldi-asr.org/)
Tts - 🐸💬 a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Voice Builder - An open-source text-to-speech (TTS) voice building tool
Inaspeechsegmenter - CNN-based audio segmentation toolkit. Detects speech, music, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Tts - 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Css10 - CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
Pysptk - A Python wrapper for Speech Signal Processing Toolkit (SPTK).
Sednn - Deep learning based speech enhancement using Keras or PyTorch, designed to be easy to use
Speech Aligner - speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription
Amazing Python Scripts - 🚀 Curated collection of amazing Python scripts, from basic to advanced, including automation task scripts.
Noise2Noise-audio denoising without clean training data - Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the heavy dependence of deep learning based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…
hifigan-denoiser - HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
minutes - 🔭 Speaker diarization via transfer learning
flite-go - Go bindings for Flite (festival-lite)
sova-asr - SOVA ASR (Automatic Speech Recognition)
tt-vae-gan - Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.
Speech256 - An FPGA implementation of a classic '80s speech synthesizer. Done for the Retro Challenge 2017/10.
editts - Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
ser-with-w2v2 - Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
torch-asg - Auto Segmentation Criterion (ASG) implemented in pytorch
wikipron - Massively multilingual pronunciation mining
spokestack-android - Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
SER-datasets - A collection of datasets for the purpose of emotion recognition/detection in speech.
LIUM - Scripts for LIUM SpkDiarization tools
jackpair - P2P speech-encrypting device with an analog audio interface, suitable for GSM phones
ttslearn - ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)