[WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.

Stars: ✭ 33 (-26.67%)

Mutual labels: contrastive-learning

Volute

Raspberry Pi + Nodejs = Speech Robot

Stars: ✭ 224 (+397.78%)

Mutual labels: speech

NBSS

The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".

Stars: ✭ 77 (+71.11%)

Mutual labels: speech

Speech Enhancement

Deep learning for audio denoising

Stars: ✭ 207 (+360%)

Mutual labels: speech

awesome-graph-self-supervised-learning-based-recommendation

A curated list of awesome graph & self-supervised-learning-based recommendation.

Stars: ✭ 37 (-17.78%)

Mutual labels: contrastive-learning

Neural Voice Cloning With Few Samples

Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu

Stars: ✭ 211 (+368.89%)

Mutual labels: speech

cape

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

Stars: ✭ 29 (-35.56%)

Mutual labels: speech

Timit

The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Stars: ✭ 202 (+348.89%)

Mutual labels: speech

Lingvo

Stars: ✭ 2,361 (+5146.67%)

Mutual labels: speech

Revisiting-Contrastive-SSL

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]

Stars: ✭ 81 (+80%)

Mutual labels: contrastive-learning

Emotion Classification From Audio Files

Understanding emotions from audio files using neural networks and multiple datasets.

Stars: ✭ 189 (+320%)

Mutual labels: speech

day2night

Image2Image Translation Research

Stars: ✭ 46 (+2.22%)

Mutual labels: cyclegan

Vq Vae Speech

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

Stars: ✭ 187 (+315.56%)

Mutual labels: speech

ventib

📈 Ventib records your voice, transcribes it in realtime, and performs speech pattern analysis to give you objective statistics about how you speak.

Stars: ✭ 43 (-4.44%)

Mutual labels: speech

Siricontrol System

Control anything with Siri voice commands.

Stars: ✭ 180 (+300%)

Mutual labels: speech

Parametric-Contrastive-Learning

Parametric Contrastive Learning (ICCV2021)

Stars: ✭ 155 (+244.44%)

Mutual labels: contrastive-learning

Deep speaker Speaker recognition system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

Stars: ✭ 174 (+286.67%)

Mutual labels: speech

object-aware-contrastive

Object-aware Contrastive Learning for Debiased Scene Representation (NeurIPS 2021)

Stars: ✭ 44 (-2.22%)

Mutual labels: contrastive-learning

Pytorch Kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Stars: ✭ 2,097 (+4560%)

Mutual labels: speech

UPIT

A fastai/PyTorch package for unpaired image-to-image translation.

Stars: ✭ 94 (+108.89%)

Mutual labels: cyclegan

Tts Papers

🐸 collection of TTS papers

Stars: ✭ 160 (+255.56%)

Mutual labels: speech

txt2speech

Convert text to speech using Google Translate API

Stars: ✭ 38 (-15.56%)

Mutual labels: speech

Wavenet vocoder

WaveNet vocoder

Stars: ✭ 1,926 (+4180%)

Mutual labels: speech

KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

Stars: ✭ 21 (-53.33%)

Mutual labels: speech

Wavegrad

A fast, high-quality neural vocoder.

Stars: ✭ 138 (+206.67%)

Mutual labels: speech

CLSA

Official implementation of "Contrastive Learning with Stronger Augmentations"

Stars: ✭ 48 (+6.67%)

Mutual labels: contrastive-learning

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Stars: ✭ 135 (+200%)

Mutual labels: speech

MediumVC

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

Stars: ✭ 46 (+2.22%)

Mutual labels: voice-conversion

Avpi

Open-source voice command macro software

Stars: ✭ 130 (+188.89%)

Mutual labels: speech

Supervised-Contrastive-Learning-in-TensorFlow-2

Implements the ideas presented in https://arxiv.org/pdf/2004.11362v1.pdf by Khosla et al.

Stars: ✭ 117 (+160%)

Mutual labels: contrastive-learning

Asr audio data links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 128 (+184.44%)

Mutual labels: speech

gans-2.0

Generative Adversarial Networks in TensorFlow 2.0

Stars: ✭ 76 (+68.89%)

Mutual labels: cyclegan

Kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Stars: ✭ 11,151 (+24680%)

Mutual labels: speech

TF-Speech-Recognition-Challenge-Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

Stars: ✭ 58 (+28.89%)

Mutual labels: speech

Code Switching Papers

A curated list of research papers and resources on code-switching

Stars: ✭ 122 (+171.11%)

Mutual labels: speech

GeDML

Generalized Deep Metric Learning.

Stars: ✭ 30 (-33.33%)

Mutual labels: contrastive-learning

Speech And Text Unity Ios Android

Speech to text in Unity on iOS using native Speech Recognition

Stars: ✭ 117 (+160%)

Mutual labels: speech

Multimodal-Gesture-Recognition-with-LSTMs-and-CTC

An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.

Stars: ✭ 25 (-44.44%)

Mutual labels: speech

Holobot

HoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.

Stars: ✭ 114 (+153.33%)

Mutual labels: speech

lidbox

End-to-end spoken language identification out of the box.

Stars: ✭ 39 (-13.33%)

Mutual labels: speech

Python Speech recognition

A simple example of using the Baidu speech recognition API with Python.