Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+1452.63%)

Mutual labels: speech, speech-synthesis

melgan

MelGAN implementation with Multi-Band and Full Band supports...

Stars: ✭ 54 (+184.21%)

Mutual labels: speech, speech-synthesis

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (+731.58%)

Mutual labels: speech, speech-synthesis

Java Speech Api

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Stars: ✭ 490 (+2478.95%)

Mutual labels: speech, speech-synthesis

Voice Builder

An open-source text-to-speech (TTS) voice building tool

Stars: ✭ 362 (+1805.26%)

Mutual labels: speech, speech-synthesis

spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Stars: ✭ 52 (+173.68%)

Mutual labels: speech, speech-synthesis

Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Stars: ✭ 73 (+284.21%)

Mutual labels: speech, speech-synthesis

Wsay

Windows "say"

Stars: ✭ 36 (+89.47%)

Mutual labels: speech, speech-synthesis

Wavenet vocoder

WaveNet vocoder

Stars: ✭ 1,926 (+10036.84%)

Mutual labels: speech, speech-synthesis

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

Stars: ✭ 65 (+242.11%)

Mutual labels: speech, speech-synthesis

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Stars: ✭ 33 (+73.68%)

Mutual labels: speech, speech-synthesis

linear16

Converts an audio file to a LINEAR16 file compatible with Google Speech.

Stars: ✭ 14 (-26.32%)

Mutual labels: speech

Speech Feature Extraction

Feature extraction from the speech signal is the initial stage of any speech recognition system.

Stars: ✭ 78 (+310.53%)

Mutual labels: speech

spoken-word

Spoken Word

Stars: ✭ 46 (+142.11%)

Mutual labels: speech-synthesis

opensnips

Open source projects related to Snips https://snips.ai/.

Stars: ✭ 50 (+163.16%)

Mutual labels: speech

speech-transformer

Transformer implementation specialized in speech recognition tasks using PyTorch.

Stars: ✭ 40 (+110.53%)

Mutual labels: speech

nlp-class

A Natural Language Processing course taught by Professor Ghassemi

Stars: ✭ 95 (+400%)

Mutual labels: speech

DeepSegmentor

Sequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)

Stars: ✭ 17 (-10.53%)

Mutual labels: speech

VAD-LTSD

Efficient voice activity detection algorithm using long-term speech information

Stars: ✭ 37 (+94.74%)

Mutual labels: speech

TASNET

Time-domain Audio Separation Network (IN PYTORCH)

Stars: ✭ 18 (-5.26%)

Mutual labels: speech

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Stars: ✭ 13,870 (+72900%)

Mutual labels: speech

Relation-Network-PyTorch

Implementation of Relation Network and Recurrent Relational Network using PyTorch v1.3. Original papers: RN https://arxiv.org/abs/1706.01427, RRN https://arxiv.org/abs/1711.08028

Stars: ✭ 17 (-10.53%)

Mutual labels: pytorch-implementation

Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Stars: ✭ 149 (+684.21%)

Mutual labels: speech-synthesis

JD-NMF

Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)

Stars: ✭ 20 (+5.26%)

Mutual labels: speech

deepspeech.mxnet

An MXNet implementation of Baidu's DeepSpeech architecture

Stars: ✭ 82 (+331.58%)

Mutual labels: speech

TinyCog

Small Robot, Toy Robot platform

Stars: ✭ 29 (+52.63%)

Mutual labels: speech-synthesis

ConvLSTM-PyTorch

ConvLSTM/ConvGRU (Encoder-Decoder) with PyTorch on Moving-MNIST

Stars: ✭ 202 (+963.16%)

Mutual labels: pytorch-implementation

spokestack-ios

Spokestack: give your iOS app a voice interface!

Stars: ✭ 27 (+42.11%)

Mutual labels: speech-synthesis

D-TDNN

PyTorch implementation of Densely Connected Time Delay Neural Network

Stars: ✭ 60 (+215.79%)

Mutual labels: speech

SelfOrganizingMap-SOM

PyTorch implementation of a Self-Organizing Map (SOM). Uses the MNIST dataset as a demo.

Stars: ✭ 33 (+73.68%)

Mutual labels: pytorch-implementation

Audio Signal Processing

Audio or speech signal processing guide.

Stars: ✭ 45 (+136.84%)

Mutual labels: speech

kaldi helpers

🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.

Stars: ✭ 13 (-31.58%)

Mutual labels: speech

WaveGrad2

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Stars: ✭ 55 (+189.47%)

Mutual labels: speech-synthesis

DAF3D

Deep Attentive Features for Prostate Segmentation in 3D Transrectal Ultrasound

Stars: ✭ 60 (+215.79%)

Mutual labels: pytorch-implementation

Daft-Exprt

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Stars: ✭ 41 (+115.79%)

Mutual labels: speech-synthesis

nabaztag-php

A simple PHP implementation of a Nabaztag server

Stars: ✭ 14 (-26.32%)

Mutual labels: speech

HTK

The Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.

Stars: ✭ 23 (+21.05%)

Mutual labels: speech

speech to text

How to use the Google Cloud Speech API to transcribe audio/video files.

Stars: ✭ 35 (+84.21%)

Mutual labels: speech

ExtensibleTTS-PyTorch

An extensible speech synthesis system built with PyTorch; the original code is from r9y9's https://github.com/r9y9/nnmnkwii_gallery

Stars: ✭ 25 (+31.58%)

Mutual labels: speech-synthesis

tldr

TLDR is an unsupervised dimensionality reduction method that combines neighborhood embedding learning with the simplicity and effectiveness of recent self-supervised learning losses

Stars: ✭ 95 (+400%)

Mutual labels: pytorch-implementation

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Stars: ✭ 205 (+978.95%)

Mutual labels: speech-synthesis

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (+1078.95%)

Mutual labels: speech

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies