pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

Stars: ✭ 2,097 (+709.65%)

Mutual labels: speech, kaldi

Kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Stars: ✭ 11,151 (+4205.41%)

Mutual labels: speech, kaldi

Lhotse

Stars: ✭ 236 (-8.88%)

Mutual labels: speech, kaldi

opensnips

Open source projects related to Snips https://snips.ai/.

Stars: ✭ 50 (-80.69%)

Mutual labels: speech, kaldi

Voice2Mesh

CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

Stars: ✭ 67 (-74.13%)

Mutual labels: speech

nlp-class

A Natural Language Processing course taught by Professor Ghassemi

Stars: ✭ 95 (-63.32%)

Mutual labels: speech

spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Stars: ✭ 52 (-79.92%)

Mutual labels: speech

Speech256

An FPGA implementation of a classic 1980s speech synthesizer. Built for the Retro Challenge 2017/10.

Stars: ✭ 51 (-80.31%)

Mutual labels: speech

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (-13.51%)

Mutual labels: speech

LIUM

Scripts for LIUM SpkDiarization tools

Stars: ✭ 28 (-89.19%)

Mutual labels: speech

gtranscribe

Software for interview transcription

Stars: ✭ 12 (-95.37%)

Mutual labels: speech

KaldiBasedSpeakerVerification

Kaldi-based speaker verification

Stars: ✭ 43 (-83.4%)

Mutual labels: kaldi

jackpair

p2p speech encrypting device with analog audio interface suitable for GSM phones

Stars: ✭ 26 (-89.96%)

Mutual labels: speech

speech-transformer

Transformer implementation specialized in speech recognition tasks using PyTorch.

Stars: ✭ 40 (-84.56%)

Mutual labels: speech

VAD-LTSD

Efficient voice activity detection algorithm using long-term speech information

Stars: ✭ 37 (-85.71%)

Mutual labels: speech

flite-go

Go bindings for Flite (festival-lite)

Stars: ✭ 14 (-94.59%)

Mutual labels: speech

ser-with-w2v2

Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'

Stars: ✭ 40 (-84.56%)

Mutual labels: speech

fade

A Simulation Framework for Auditory Discrimination Experiments

Stars: ✭ 12 (-95.37%)

Mutual labels: speech

JD-NMF

Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)

Stars: ✭ 20 (-92.28%)

Mutual labels: speech

D-TDNN

PyTorch implementation of Densely Connected Time Delay Neural Network

Stars: ✭ 60 (-76.83%)

Mutual labels: speech

speech to text

How to use the Google Cloud Speech API to transcribe audio/video files.

Stars: ✭ 35 (-86.49%)

Mutual labels: speech

speech-to-text

Mixed-lingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras

Stars: ✭ 61 (-76.45%)

Mutual labels: kaldi

TASNET

Time-domain Audio Separation Network (in PyTorch)

Stars: ✭ 18 (-93.05%)

Mutual labels: speech

tt-vae-gan

Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.

Stars: ✭ 37 (-85.71%)

Mutual labels: speech

Audio Signal Processing

Audio or speech signal processing guide.

Stars: ✭ 45 (-82.63%)

Mutual labels: speech

SER-datasets

A collection of datasets for the purpose of emotion recognition/detection in speech.

Stars: ✭ 74 (-71.43%)

Mutual labels: speech

web-speech-demo

Learn how to build a simple text-to-speech voice app for the web using the Web Speech API.

Stars: ✭ 19 (-92.66%)

Mutual labels: speech

minutes

🔭 Speaker diarization via transfer learning

Stars: ✭ 25 (-90.35%)

Mutual labels: speech

Speech Feature Extraction

Feature extraction of speech signal is the initial stage of any speech recognition system.

Stars: ✭ 78 (-69.88%)

Mutual labels: speech

speech recognition ctc

Chinese speech recognition using CTC with Keras

Stars: ✭ 40 (-84.56%)

Mutual labels: speech

linear16

Converts an audio file to LINEAR16 Google-speech compatible file.

Stars: ✭ 14 (-94.59%)

Mutual labels: speech

editts

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

Stars: ✭ 74 (-71.43%)

Mutual labels: speech

DeepSegmentor

Sequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)

Stars: ✭ 17 (-93.44%)

Mutual labels: speech

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (-39%)

Mutual labels: speech

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Stars: ✭ 13,870 (+5255.21%)

Mutual labels: speech

Noise2Noise-audio denoising without clean training data

Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…

Stars: ✭ 49 (-81.08%)

Mutual labels: speech

deepspeech.mxnet

An MXNet implementation of Baidu's DeepSpeech architecture

Stars: ✭ 82 (-68.34%)

Mutual labels: speech

MelNet-SpeechGeneration

Implementation of MelNet in PyTorch to generate high-fidelity audio samples

Stars: ✭ 19 (-92.66%)

Mutual labels: speech

kaldi-timit-sre-ivector

Develops a speaker recognition model based on i-vectors using the TIMIT database