The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".

Stars: ✭ 77 (+28.33%)

Mutual labels: speech

Timit

The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Stars: ✭ 202 (+236.67%)

Mutual labels: speech

Voiceprint-recognition-Speaker-recognition

A complete project for voiceprint (speaker) recognition.

Stars: ✭ 82 (+36.67%)

Mutual labels: speaker-recognition

Lingvo

Stars: ✭ 2,361 (+3835%)

Mutual labels: speech

icassp2019-latex-template

Official ICASSP 2019 LaTeX template

Stars: ✭ 21 (-65%)

Mutual labels: speech

AESRC2020

A deep accent recognition network

Stars: ✭ 35 (-41.67%)

Mutual labels: speaker-recognition

VQMIVC

Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!

Stars: ✭ 278 (+363.33%)

Mutual labels: speech

Durian

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.

Stars: ✭ 111 (+85%)

Mutual labels: speech

Vq Vae Speech

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

Stars: ✭ 187 (+211.67%)

Mutual labels: speech

ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 179 (+198.33%)

Mutual labels: speech

Siricontrol System

Control anything with Siri voice commands.

Stars: ✭ 180 (+200%)

Mutual labels: speech

audio noise clustering

An experiment with a variety of clustering (and clustering-like) techniques to reduce noise in an audio speech recording. Results: https://dodiku.github.io/audio_noise_clustering/results/

Stars: ✭ 24 (-60%)

Mutual labels: speech

Deep speaker Speaker recognition system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

Stars: ✭ 174 (+190%)

Mutual labels: speech

Voice-ML

MobileNet trained with VoxCeleb dataset and used for voice verification

Stars: ✭ 15 (-75%)

Mutual labels: speaker-verification

MajorDomo-Scenarios

Scenarios for the MajorDomo home automation system

Stars: ✭ 12 (-80%)

Mutual labels: speech

Tts Papers

🐸 collection of TTS papers

Stars: ✭ 160 (+166.67%)

Mutual labels: speech

wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

Stars: ✭ 205 (+241.67%)

Mutual labels: speech

Wavenet vocoder

WaveNet vocoder

Stars: ✭ 1,926 (+3110%)

Mutual labels: speech

Shifter

Pitch shifter using WSOLA and resampling, implemented in Python 3

Stars: ✭ 22 (-63.33%)

Mutual labels: speech

Wavegrad

A fast, high-quality neural vocoder.

Stars: ✭ 138 (+130%)

Mutual labels: speech

Python Speech recognition

A simple example of using the Baidu speech recognition API with Python.

Stars: ✭ 106 (+76.67%)

Mutual labels: speech

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Stars: ✭ 135 (+125%)

Mutual labels: speech

speaker extraction

Target speaker extraction and verification for multi-talker speech

Stars: ✭ 85 (+41.67%)

Mutual labels: speaker-verification

Avpi

Open-source voice command macro software

Stars: ✭ 130 (+116.67%)

Mutual labels: speech

TF-Speech-Recognition-Challenge-Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

Stars: ✭ 58 (-3.33%)

Mutual labels: speech

Asr audio data links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 128 (+113.33%)

Mutual labels: speech

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

Stars: ✭ 65 (+8.33%)

Mutual labels: speech

Audiomate

Python library for handling audio datasets.

Stars: ✭ 99 (+65%)

Mutual labels: speech

RE-VERB

Speaker diarization system using an LSTM

Stars: ✭ 22 (-63.33%)

Mutual labels: speaker-diarization

Naver-AI-Hackathon-Speech

2019 Clova AI Hackathon : Speech - Rank 12 / Team Kai.Lib

Stars: ✭ 26 (-56.67%)

Mutual labels: speech

Wikipron

Massively multilingual pronunciation mining

Stars: ✭ 99 (+65%)

Mutual labels: speech

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+391.67%)

Mutual labels: speech

Code Switching Papers

A curated list of research papers and resources on code-switching

Stars: ✭ 122 (+103.33%)

Mutual labels: speech

meta-embeddings

Meta-embeddings are a probabilistic generalization of embeddings in machine learning.

Stars: ✭ 22 (-63.33%)

Mutual labels: speaker-recognition

Speech And Text Unity Ios Android

Speech to text in Unity on iOS using native speech recognition

Stars: ✭ 117 (+95%)

Mutual labels: speech

UHV-OTS-Speech

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

Stars: ✭ 94 (+56.67%)

Mutual labels: speaker-diarization

Holobot

HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.

Stars: ✭ 114 (+90%)

Mutual labels: speech

room-impulse-responses

A list of publicly available room impulse response datasets and scripts to download them.

Stars: ✭ 143 (+138.33%)

Mutual labels: speech

idear

🎙️ Handsfree Audio Development Interface

Stars: ✭ 84 (+40%)

Mutual labels: speech

temporal-depth-segmentation

Source code (train/test) accompanying the paper entitled "Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach" in CVPR 2019 (https://arxiv.org/abs/1903.10764).

Stars: ✭ 20 (-66.67%)

Mutual labels: temporal-convolutional-network

Gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

Stars: ✭ 1,303 (+2071.67%)

Mutual labels: speech

browser-apis

🦄 Cool & Fun Browser Web APIs 🥳

Stars: ✭ 21 (-65%)

Mutual labels: speech

Wavenet Enhancement

Speech Enhancement using Bayesian WaveNet

Stars: ✭ 86 (+43.33%)

Mutual labels: speech

Audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Stars: ✭ 1,262 (+2003.33%)

Mutual labels: speech

opensource-voice-tools

A repo listing known open source voice tools, ordered by where they sit in the voice stack

Stars: ✭ 21 (-65%)

Mutual labels: speech

lectures-all

Central repository for all lectures on deep learning at UPC ETSETB TelecomBCN.

Stars: ✭ 46 (-23.33%)

Mutual labels: speech

Julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine

Stars: ✭ 1,258 (+1996.67%)

Mutual labels: speech

Tts

Tools to convert text to speech 📚💬

Stars: ✭ 84 (+40%)

Mutual labels: speech

AutoSpeech

[InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei Zha, Zhangyang Wang

Stars: ✭ 195 (+225%)

Mutual labels: speaker-recognition

Deepspeech

A PaddlePaddle implementation of ASR.

Stars: ✭ 1,219 (+1931.67%)

Mutual labels: speech

Openasr

A PyTorch-based end-to-end speech recognition system.

Stars: ✭ 69 (+15%)

Mutual labels: speech

lidbox

End-to-end spoken language identification out of the box.