AdaSpeechAdaSpeech: Adaptive Text to Speech for Custom Voice
Stars: ✭ 108 (+468.42%)
DiffwaveDiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Stars: ✭ 139 (+631.58%)
Voice2MeshCVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (+252.63%)
Tacotron pytorchPyTorch implementation of Tacotron speech synthesis model.
Stars: ✭ 242 (+1173.68%)
LightspeechLightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Stars: ✭ 31 (+63.16%)
LingvoLingvo
Stars: ✭ 2,361 (+12326.32%)
edittsOfficial implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (+289.47%)
DurianImplementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: ✭ 111 (+484.21%)
PysptkA python wrapper for Speech Signal Processing Toolkit (SPTK).
Stars: ✭ 297 (+1463.16%)
WavegradA fast, high-quality neural vocoder.
Stars: ✭ 138 (+626.32%)
StyleSpeechOfficial implementation of Meta-StyleSpeech and StyleSpeech
Stars: ✭ 161 (+747.37%)
idear🎙️ Handsfree Audio Development Interface
Stars: ✭ 84 (+342.11%)
WavegradImplementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+1189.47%)
IMS-ToucanText-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
Stars: ✭ 295 (+1452.63%)
melganMelGAN implementation with Multi-Band and Full Band supports...
Stars: ✭ 54 (+184.21%)
ttslearnttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+731.58%)
Java Speech ApiThe J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (+2478.95%)
Voice BuilderAn opensource text-to-speech (TTS) voice building tool
Stars: ✭ 362 (+1805.26%)
spokestack-androidExtensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (+173.68%)
Fre-GAN-pytorchFre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (+284.21%)
WsayWindows "say"
Stars: ✭ 36 (+89.47%)
TFGANTFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis
Stars: ✭ 65 (+242.11%)
Zero-Shot-TTSUnofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (+73.68%)
linear16Converts an audio file to LINEAR16 Google-speech compatible file.
Stars: ✭ 14 (-26.32%)
Speech Feature ExtractionFeature extraction of speech signal is the initial stage of any speech recognition system.
Stars: ✭ 78 (+310.53%)
opensnipsOpen source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (+163.16%)
speech-transformerTransformer implementation speciaized in speech recognition tasks using Pytorch.
Stars: ✭ 40 (+110.53%)
nlp-classA Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (+400%)
DeepSegmentorSequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)
Stars: ✭ 17 (-10.53%)
VAD-LTSDEfficient voice activity detection algorithm using long-term speech information
Stars: ✭ 37 (+94.74%)
TASNETTime-domain Audio Separation Network (IN PYTORCH)
Stars: ✭ 18 (-5.26%)
datasets🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+72900%)
Relation-Network-PyTorchImplementation of Relation Network and Recurrent Relational Network using PyTorch v1.3. Original papers: (RN) https://arxiv.org/abs/1706.01427 (RRN): https://arxiv.org/abs/1711.08028
Stars: ✭ 17 (-10.53%)
Parallel-Tacotron2PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Stars: ✭ 149 (+684.21%)
JD-NMFJoint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)
Stars: ✭ 20 (+5.26%)
deepspeech.mxnetA MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (+331.58%)
TinyCogSmall Robot, Toy Robot platform
Stars: ✭ 29 (+52.63%)
ConvLSTM-PyTorchConvLSTM/ConvGRU (Encoder-Decoder) with PyTorch on Moving-MNIST
Stars: ✭ 202 (+963.16%)
spokestack-iosSpokestack: give your iOS app a voice interface!
Stars: ✭ 27 (+42.11%)
D-TDNNPyTorch implementation of Densely Connected Time Delay Neural Network
Stars: ✭ 60 (+215.79%)
SelfOrganizingMap-SOMPytorch implementation of Self-Organizing Map(SOM). Use MNIST dataset as a demo.
Stars: ✭ 33 (+73.68%)
kaldi helpers🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-31.58%)
WaveGrad2PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Stars: ✭ 55 (+189.47%)
DAF3DDeep Attentive Features for Prostate Segmentation in 3D Transrectal Ultrasound
Stars: ✭ 60 (+215.79%)
Daft-ExprtPyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
Stars: ✭ 41 (+115.79%)
nabaztag-phpa simple php implementation of a Nabaztag server
Stars: ✭ 14 (-26.32%)
HTKThe Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.
Stars: ✭ 23 (+21.05%)
speech to texthow to use the Google Cloud Speech API to transcribe audio/video files.
Stars: ✭ 35 (+84.21%)
ExtensibleTTS-PyTorchAn extensible speech synthesis system, build with PyTorch and the original code is from r9y9's https://github.com/r9y9/nnmnkwii_gallery
Stars: ✭ 25 (+31.58%)
tldrTLDR is an unsupervised dimensionality reduction method that combines neighborhood embedding learning with the simplicity and effectiveness of recent self-supervised learning losses
Stars: ✭ 95 (+400%)
Speech-BackbonesThis is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Stars: ✭ 205 (+978.95%)
UniSpeechUniSpeech - Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (+1078.95%)
open-speech-corpora💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: ✭ 841 (+4326.32%)
nlp classificationImplementing nlp papers relevant to classification with PyTorch, gluonnlp
Stars: ✭ 224 (+1078.95%)
klatt-synKlatt formant synthesizer
Stars: ✭ 18 (-5.26%)