Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland. - Mirrored from https://gitlab.idiap.ch/bob/bob

Stars: ✭ 38 (-36.67%)

Mutual labels: speaker-recognition, speaker-verification

kaldi-timit-sre-ivector

Develops a speaker recognition model based on i-vectors using the TIMIT database

Stars: ✭ 17 (-71.67%)

Mutual labels: speaker-recognition, speaker-verification

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (+273.33%)

Mutual labels: speech, speaker-verification

speaker-recognition-papers

Share some recent speaker recognition papers and their implementations.

Stars: ✭ 92 (+53.33%)

Mutual labels: speaker-recognition, speaker-verification

KaldiBasedSpeakerVerification

Kaldi based speaker verification

Stars: ✭ 43 (-28.33%)

Mutual labels: speaker-recognition, speaker-verification

wavenet-classifier

Keras implementation of DeepMind's WaveNet for supervised learning tasks

Stars: ✭ 54 (-10%)

Mutual labels: speaker-recognition, speaker-verification

Delta

DELTA is a deep learning based natural language and speech processing platform.

Stars: ✭ 1,479 (+2365%)

Mutual labels: speech, speaker-verification

meta-SR

PyTorch implementation of Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs (Interspeech, 2020)

Stars: ✭ 58 (-3.33%)

Mutual labels: speaker-recognition, speaker-verification

minutes

🔭 Speaker diarization via transfer learning

Stars: ✭ 25 (-58.33%)

Mutual labels: speech, speaker-diarization

Kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Stars: ✭ 11,151 (+18485%)

Mutual labels: speech, speaker-verification

spatio-temporal-brain

A Deep Graph Neural Network Architecture for Modelling Spatio-temporal Dynamics in rs-fMRI Data

Stars: ✭ 22 (-63.33%)

Mutual labels: temporal-convolutional-network

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Stars: ✭ 33 (-45%)

Mutual labels: speech

eidos-audition

Collection of auditory models.

Stars: ✭ 25 (-58.33%)

Mutual labels: speech

cape

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

Stars: ✭ 29 (-51.67%)

Mutual labels: speech

CVC

CVC: Contrastive Learning for Non-parallel Voice Conversion (INTERSPEECH 2021, in PyTorch)

Stars: ✭ 45 (-25%)

Mutual labels: speech

StyleSpeech

Official implementation of Meta-StyleSpeech and StyleSpeech

Stars: ✭ 161 (+168.33%)

Mutual labels: speech

icassp2019-latex-template

ICASSP 2019 official LaTeX template

Stars: ✭ 21 (-65%)

Mutual labels: speech

SignDetect

This application was developed to help speechless people interact with others with ease. It detects voice and converts the input speech into a sign-language-based video.

Stars: ✭ 21 (-65%)

Mutual labels: speech

kaldi ag training

Docker image and scripts for training fine-tuned or completely personal Kaldi speech models, particularly for use with kaldi-active-grammar.

Stars: ✭ 14 (-76.67%)

Mutual labels: speech

ventib

📈 Ventib records your voice, transcribes it in real time, and performs speech pattern analysis to give you objective statistics about how you speak.

Stars: ✭ 43 (-28.33%)

Mutual labels: speech

AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice

Stars: ✭ 108 (+80%)

Mutual labels: speech

NBSS

The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".

Stars: ✭ 77 (+28.33%)

Mutual labels: speech

Voiceprint-recognition-Speaker-recognition

A complete voiceprint recognition (speaker recognition) project.

Stars: ✭ 82 (+36.67%)

Mutual labels: speaker-recognition

AESRC2020

A deep accent recognition network

Stars: ✭ 35 (-41.67%)

Mutual labels: speaker-recognition

Voice-ML

MobileNet trained with VoxCeleb dataset and used for voice verification

Stars: ✭ 15 (-75%)

Mutual labels: speaker-verification

pytorch-pcen

PyTorch reimplementation of per-channel energy normalization for audio.

Stars: ✭ 80 (+33.33%)

Mutual labels: speech

ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 179 (+198.33%)

Mutual labels: speech

audio_noise_clustering

https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.

Stars: ✭ 24 (-60%)

Mutual labels: speech

MajorDomo-Scenarios

Scenarios for the Majordomo home automation system

Stars: ✭ 12 (-80%)

Mutual labels: speech

Shifter

Pitch shifter using WSOLA and resampling, implemented in Python 3

Stars: ✭ 22 (-63.33%)

Mutual labels: speech

wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

Stars: ✭ 205 (+241.67%)

Mutual labels: speech

txt2speech

Convert text to speech using Google Translate API

Stars: ✭ 38 (-36.67%)

Mutual labels: speech

speaker extraction

Target speaker extraction and verification for multi-talker speech

Stars: ✭ 85 (+41.67%)

Mutual labels: speaker-verification

data-at-hand-mobile

Mobile application for exploring fitness data using both speech and touch interaction.

Stars: ✭ 50 (-16.67%)

Mutual labels: speech

aframe-speech-controls-component

Alternative forms of input for in-VR interaction with the content of a scene

Stars: ✭ 13 (-78.33%)

Mutual labels: speech

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (+48.33%)

Mutual labels: speech

anycontrol

Voice control for your websites and applications

Stars: ✭ 53 (-11.67%)

Mutual labels: speech

TF-Speech-Recognition-Challenge-Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

Stars: ✭ 58 (-3.33%)

Mutual labels: speech

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

Stars: ✭ 65 (+8.33%)

Mutual labels: speech

Multimodal-Gesture-Recognition-with-LSTMs-and-CTC

An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.

Stars: ✭ 25 (-58.33%)

Mutual labels: speech

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+391.67%)

Mutual labels: speech

meta-embeddings

Meta-embeddings are a probabilistic generalization of embeddings in machine learning.

Stars: ✭ 22 (-63.33%)

Mutual labels: speaker-recognition

KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

Stars: ✭ 21 (-65%)

Mutual labels: speech

react-native-speech-bubble

💬 A speech bubble dialog component for React Native.

Stars: ✭ 50 (-16.67%)

Mutual labels: speech

UHV-OTS-Speech

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

Stars: ✭ 94 (+56.67%)

Mutual labels: speaker-diarization

room-impulse-responses

A list of publicly available room impulse response datasets and scripts to download them.

Stars: ✭ 143 (+138.33%)

Mutual labels: speech

VQMIVC

Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!

Stars: ✭ 278 (+363.33%)

Mutual labels: speech

temporal-depth-segmentation

Source code (train/test) accompanying the paper entitled "Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach" in CVPR 2019 (https://arxiv.org/abs/1903.10764).

Stars: ✭ 20 (-66.67%)

Mutual labels: temporal-convolutional-network

idear

🎙️ Handsfree Audio Development Interface

Stars: ✭ 84 (+40%)

Mutual labels: speech

RE-VERB

Speaker diarization system using an LSTM

Stars: ✭ 22 (-63.33%)

Mutual labels: speaker-diarization

Naver-AI-Hackathon-Speech

2019 Clova AI Hackathon : Speech - Rank 12 / Team Kai.Lib

Stars: ✭ 26 (-56.67%)

Mutual labels: speech
