seongmin-kye / meta-SR

Licence: other

PyTorch implementation of Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs (Interspeech 2020)

Labels

speaker-recognition speaker-verification meta-learning short-utterances

Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs

PyTorch code for the following paper:

Title: Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs. [paper]
Authors: Seong Min Kye, Youngmoon Jung, Hae Beom Lee, Sung Ju Hwang, Hoirin Kim
Conference: Interspeech, 2020.

Abstract

In practical settings, a speaker recognition system needs to identify a speaker given a short utterance, while the enrollment utterance may be relatively long. However, existing speaker recognition models perform poorly with such short utterances. To solve this problem, we introduce a meta-learning framework for imbalance length pairs. Specifically, we use a Prototypical Network and train it with a support set of long utterances and a query set of short utterances of varying lengths. Further, since optimizing only for the classes in the given episode may be insufficient for learning discriminative embeddings for unseen classes, we additionally enforce the model to classify both the support and the query set against the entire set of classes in the training set. By combining these two learning schemes, our model outperforms existing state-of-the-art speaker verification models learned with a standard supervised learning framework on short utterances (1-2 seconds) on the VoxCeleb datasets. We also validate our proposed model for unseen speaker identification, on which it also achieves significant performance gains over the existing approaches.
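The two learning schemes described in the abstract can be sketched as an episodic prototypical loss plus a global classification loss. This is a minimal pure-Python illustration, not the repository's implementation: the function names, the squared Euclidean distance (taken from the original Prototypical Networks formulation), the dot-product global classifier, and the equal weighting of the two terms are all assumptions made for the example.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def prototypes(support, support_labels, n_way):
    """Average the support embeddings of each episode class."""
    dim = len(support[0])
    sums = [[0.0] * dim for _ in range(n_way)]
    counts = [0] * n_way
    for emb, y in zip(support, support_labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += emb[d]
    return [[s / counts[c] for s in sums[c]] for c in range(n_way)]

def episodic_loss(protos, query, query_labels):
    """Cross-entropy of each query against the episode prototypes.
    Logit = negative squared Euclidean distance (an assumption)."""
    total = 0.0
    for emb, y in zip(query, query_labels):
        logits = [-sum((a - b) ** 2 for a, b in zip(emb, p)) for p in protos]
        total -= log_softmax(logits)[y]
    return total / len(query)

def global_loss(embeddings, global_labels, class_weights):
    """Cross-entropy against every training class (logit = dot product)."""
    total = 0.0
    for emb, y in zip(embeddings, global_labels):
        logits = [sum(a * b for a, b in zip(emb, w)) for w in class_weights]
        total -= log_softmax(logits)[y]
    return total / len(embeddings)

def combined_loss(support, support_labels, query, query_labels,
                  support_global, query_global, class_weights, n_way):
    """Episode loss on the queries plus global loss on support and query.
    The 1:1 weighting of the two terms is an assumption."""
    protos = prototypes(support, support_labels, n_way)
    l_episode = episodic_loss(protos, query, query_labels)
    l_global = global_loss(support + query,
                           support_global + query_global, class_weights)
    return l_episode + l_global
```

Here the support embeddings would come from long utterances and the query embeddings from short ones; the episode labels index the classes in the episode, while the global labels index the full training set.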

Requirements

Python 3.6
PyTorch 1.3.1

Data preparation

The following commands download and prepare the VoxCeleb dataset for training. The preparation code is based on VoxCeleb_trainer, with slight changes.

python dataprep.py --save_path /root/home/voxceleb --download --user USERNAME --password PASSWORD 
python dataprep.py --save_path /root/home/voxceleb --extract
python dataprep.py --save_path /root/home/voxceleb --convert

In addition to the Python dependencies, wget and ffmpeg must be installed on the system.
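Before running the preparation commands, it may help to confirm that both tools are on the PATH. The snippet below is a small optional check, not part of the repository; the function name `missing_tools` is made up for this example.

```python
import shutil

def missing_tools(tools=("wget", "ffmpeg")):
    """Return the names of required command-line tools not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing))
    else:
        print("wget and ffmpeg are available.")
```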

Feature extraction

In configure.py, set save_path to the dataset directory. For example, in meta-SR/configure.py line 2:

save_path = '/root/home/voxceleb'

Then extract the acoustic features (40-dimensional mel filterbank).

python feat_extract/feature_extraction.py

Training examples

Softmax:

python train.py --loss_type softmax --use_GC False --n_shot 1 --n_query 0 --use_variable False --nb_class_train 256

Prototypical without global classification:

python train.py --loss_type prototypical --use_GC False --n_shot 1 --n_query 2 --use_variable True --nb_class_train 100

Prototypical with global classification:

python train.py --loss_type prototypical --use_GC True --n_shot 1 --n_query 2 --use_variable True --nb_class_train 100

To use fixed-length queries, set --use_variable False.
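To make the roles of --nb_class_train, --n_shot, --n_query and --use_variable more concrete, here is a rough sketch of how one training episode could be assembled. It is an illustration under stated assumptions rather than the code in train.py: `sample_episode`, `crop_query`, the `utts_by_speaker` dictionary, and the 100 to 200 frame crop range are all hypothetical.

```python
import random

def sample_episode(utts_by_speaker, nb_class, n_shot, n_query, rng=random):
    """Pick nb_class speakers, then n_shot support and n_query query
    utterances for each of them. Labels are episode-local (0..nb_class-1)."""
    speakers = rng.sample(sorted(utts_by_speaker), nb_class)
    support, query = [], []
    for label, spk in enumerate(speakers):
        utts = rng.sample(utts_by_speaker[spk], n_shot + n_query)
        support += [(u, label) for u in utts[:n_shot]]
        query += [(u, label) for u in utts[n_shot:]]
    return support, query

def crop_query(frames, use_variable, min_len=100, max_len=200, rng=random):
    """Randomly crop a query to a short segment of frames.
    With use_variable the length is drawn from [min_len, max_len];
    otherwise it is fixed to max_len. The range is an assumption."""
    length = rng.randint(min_len, max_len) if use_variable else max_len
    length = min(length, len(frames))
    start = rng.randint(0, len(frames) - length)
    return frames[start:start + length]
```

For example, `sample_episode(data, nb_class=100, n_shot=1, n_query=2)` would mirror the prototypical commands above, with each query then passed through `crop_query` while the support utterances stay long.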

Evaluation

To evaluate the k-th checkpoint in the n-th folder, replace n and k in the commands below.

Speaker verification for full utterance:

python EER_full.py --n_folder n --cp_num k --data_type vox2

If you trained the model on VoxCeleb1, set --data_type vox1.
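The verification scripts report the equal error rate (EER), the operating point at which the false-acceptance rate equals the false-rejection rate. The sketch below shows one simple way to estimate it from trial scores; it is not taken from EER_full.py, the function name is made up, and tied scores are not treated specially.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER from verification trials.

    scores: higher means "more likely the same speaker".
    labels: 1 for a same-speaker (target) trial, 0 otherwise.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need both target and non-target trials")

    # Sweep the threshold from the highest score down, accepting one
    # more trial at each step.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    false_accepts = 0
    false_rejects = n_target          # nothing accepted yet
    best_gap, eer = 1.0, 0.5          # the "reject everything" point
    for i in order:
        if labels[i] == 1:
            false_rejects -= 1
        else:
            false_accepts += 1
        far = false_accepts / n_nontarget
        frr = false_rejects / n_target
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

The function returns a fraction between 0 and 1; multiply by 100 if a percentage is wanted.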

Speaker verification for short utterance:

python EER_short.py --n_folder n --cp_num k --test_length 100

For example, to test on 2-second utterances, set --test_length 200.
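Since a 2-second test corresponds to --test_length 200, the flag appears to count frames at 100 frames per second. Assuming that linear relationship holds for other durations (an inference from the example, not something the repository states here), a small helper can convert seconds to the flag value. The name `frames_for_duration` is hypothetical.

```python
# Inferred from "2 seconds -> --test_length 200"; treat as an assumption.
FRAMES_PER_SECOND = 100

def frames_for_duration(seconds):
    """Convert a duration in seconds to a --test_length value in frames."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return round(seconds * FRAMES_PER_SECOND)

if __name__ == "__main__":
    for seconds in (1, 2, 5):
        print("python EER_short.py --n_folder n --cp_num k "
              f"--test_length {frames_for_duration(seconds)}")
```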

Unseen speaker identification:

python identification.py --n_folder n --cp_num k --nb_class_test 100 --test_length 100
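Conceptually, unseen speaker identification assigns a short test utterance to the closest of the enrolled speakers, here --nb_class_test of them. The following is a minimal sketch of that idea using cosine similarity between embeddings; the similarity measure and all function names are assumptions for illustration and may differ from identification.py.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def identify(test_embedding, enrolled):
    """Return the enrolled speaker whose embedding is most similar.
    enrolled maps speaker name -> enrollment embedding."""
    return max(enrolled, key=lambda spk: cosine(test_embedding, enrolled[spk]))

def identification_accuracy(trials, enrolled):
    """Fraction of (embedding, true_speaker) trials identified correctly."""
    correct = sum(identify(emb, enrolled) == spk for emb, spk in trials)
    return correct / len(trials)
```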

Pretrained model

A pretrained model can be downloaded from here. If you put this model into meta-SR/saved_model/baseline_000 and run the following script, you should get an EER of 2.08.

python EER_full.py --n_folder 0 --cp_num 100 --data_type vox2

Citation

Please cite the following if you make use of the code.

@inproceedings{kye2020meta,
  title={Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs},
  author={Kye, Seong Min and Jung, Youngmoon and Lee, Hae Beom and Hwang, Sung Ju and Kim, Hoirin},
  booktitle={Interspeech},
  year={2020}
}

Acknowledgments

This code is based on the implementation of SR_tutorial and VoxCeleb_trainer. I would like to thank Youngmoon Jung, Joon Son Chung and Sung Ju Hwang for helpful discussions.
