The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems for speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many other tasks.

Stars: ✭ 242 (+29.41%)

Mutual labels: speech, speech-processing

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+57.75%)

Mutual labels: speech, speech-processing

Wavenet Enhancement

Speech Enhancement using Bayesian WaveNet

Stars: ✭ 86 (-54.01%)

Mutual labels: speech, wavenet

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (+19.79%)

Mutual labels: speech, speech-processing

Neural Voice Cloning With Few Samples

Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu

Stars: ✭ 211 (+12.83%)

Mutual labels: speech, speech-processing

LIUM

Scripts for LIUM SpkDiarization tools

Stars: ✭ 28 (-85.03%)

Mutual labels: speech, speech-processing

Pysptk

A Python wrapper for the Speech Signal Processing Toolkit (SPTK).

Stars: ✭ 297 (+58.82%)

Mutual labels: speech, speech-processing

Pytorch Asr

ASR with PyTorch

Stars: ✭ 124 (-33.69%)

Mutual labels: speech

Tutorial separation

This repo summarizes tutorials, datasets, papers, code and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.

Stars: ✭ 151 (-19.25%)

Mutual labels: speech-processing

Code Switching Papers

A curated list of research papers and resources on code-switching

Stars: ✭ 122 (-34.76%)

Mutual labels: speech

Tts

Text-to-Speech for Arduino

Stars: ✭ 118 (-36.9%)

Mutual labels: speech

Pytorch Kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Stars: ✭ 2,097 (+1021.39%)

Mutual labels: speech

Zzz Retired openstt

RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:

Stars: ✭ 146 (-21.93%)

Mutual labels: speech-processing

Speech And Text Unity Ios Android

Speech to text in Unity on iOS using native speech recognition

Stars: ✭ 117 (-37.43%)

Mutual labels: speech

Tf Kaldi Speaker

Neural speaker recognition/verification system based on Kaldi and Tensorflow

Stars: ✭ 117 (-37.43%)

Mutual labels: speech-processing

Tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Stars: ✭ 1,756 (+839.04%)

Mutual labels: speech

Durian

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

Stars: ✭ 111 (-40.64%)

Mutual labels: speech

Numpy Ml

Machine learning, in numpy

Stars: ✭ 11,100 (+5835.83%)

Mutual labels: wavenet

Kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Stars: ✭ 11,151 (+5863.1%)

Mutual labels: speech

Aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Stars: ✭ 1,942 (+938.5%)

Mutual labels: speech

A Convolutional Recurrent Neural Network For Real Time Speech Enhancement

A minimal unofficial implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch

Stars: ✭ 123 (-34.22%)

Mutual labels: speech-processing

Chatbot Watson Android

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

Stars: ✭ 169 (-9.63%)

Mutual labels: speech

Deepvoice3 pytorch

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Stars: ✭ 1,654 (+784.49%)

Mutual labels: speech-processing

Dtln

Tensorflow 2.x implementation of the DTLN real time speech denoising model. With TF-lite, ONNX and real-time audio processing support.

Stars: ✭ 147 (-21.39%)

Mutual labels: speech-processing

Nonautoreggenprogress

Tracking the progress in non-autoregressive generation (translation, transcription, etc.)

Stars: ✭ 118 (-36.9%)

Mutual labels: speech-processing

Siricontrol System

Control anything with Siri voice commands.

Stars: ✭ 180 (-3.74%)

Mutual labels: speech

Tacotron asr

Speech Recognition Using Tacotron

Stars: ✭ 165 (-11.76%)

Mutual labels: speech

Diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Stars: ✭ 139 (-25.67%)

Mutual labels: speech

Wave U Net For Speech Enhancement

Implements Wave-U-Net in PyTorch and adapts it to speech enhancement.

Stars: ✭ 106 (-43.32%)

Mutual labels: speech-processing

Holobot

HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.

Stars: ✭ 114 (-39.04%)

Mutual labels: speech

Speech Enhancement

Deep neural network based speech enhancement toolkit

Stars: ✭ 167 (-10.7%)

Mutual labels: speech-processing

Pytorch Gan Timeseries

GANs for time series generation in pytorch

Stars: ✭ 109 (-41.71%)

Mutual labels: wavenet

Wavegrad

A fast, high-quality neural vocoder.

Stars: ✭ 138 (-26.2%)

Mutual labels: speech

Python Speech recognition

A simple example of using the Baidu speech recognition API with Python.

Stars: ✭ 106 (-43.32%)

Mutual labels: speech

Source Separation Wavenet

A neural network for end-to-end music source separation

Stars: ✭ 185 (-1.07%)

Mutual labels: wavenet

Delta

DELTA is a deep learning based natural language and speech processing platform.

Stars: ✭ 1,479 (+690.91%)

Mutual labels: speech

Fast Wavenet

Speedy Wavenet generation using dynamic programming ⚡

Stars: ✭ 1,705 (+811.76%)

Mutual labels: wavenet

Nsynth wavenet

parallel wavenet based on nsynth

Stars: ✭ 100 (-46.52%)

Mutual labels: wavenet

Pytorch Kaldi Neural Speaker Embeddings

A lightweight neural speaker embedding extractor based on Kaldi and PyTorch.

Stars: ✭ 99 (-47.06%)

Mutual labels: speech-processing

Tts Papers

🐸 collection of TTS papers

Stars: ✭ 160 (-14.44%)

Mutual labels: speech

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Stars: ✭ 135 (-27.81%)

Mutual labels: speech

Audiomate

Python library for handling audio datasets.

Stars: ✭ 99 (-47.06%)

Mutual labels: speech

Wikipron

Massively multilingual pronunciation mining

Stars: ✭ 99 (-47.06%)

Mutual labels: speech

Voice activity detection

Voice Activity Detection based on Deep Learning & TensorFlow

Stars: ✭ 132 (-29.41%)

Mutual labels: speech

Gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

Stars: ✭ 1,303 (+596.79%)

Mutual labels: speech

End2end Asr Pytorch

End-to-End Automatic Speech Recognition on PyTorch

Stars: ✭ 175 (-6.42%)

Mutual labels: speech

Vocgan

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

Stars: ✭ 158 (-15.51%)

Mutual labels: speech-processing

Avpi

Open-source voice command macro software

Stars: ✭ 130 (-30.48%)

Mutual labels: speech

Audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Stars: ✭ 1,262 (+574.87%)

Mutual labels: speech

Julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine

Stars: ✭ 1,258 (+572.73%)

Mutual labels: speech

Voc

A physical model of the human vocal tract using literate programming, based on Pink Trombone.

Stars: ✭ 129 (-31.02%)

Mutual labels: speech

Tts

Tools to convert text to speech 📚💬

Stars: ✭ 84 (-55.08%)

Mutual labels: speech
