doerlbh / MiniVox

Licence: other
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".

Programming Languages

Cuda, MATLAB, C++, Python, Shell, C, M

MiniVox

Code for our papers:

ACML 2021 "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox"

INTERSPEECH 2020 "VoiceID on the fly: A speaker recognition system that learns from scratch"

by Baihan Lin (Columbia) and Xinxin Zhang (NYU).

For the latest full paper: https://arxiv.org/abs/2006.04376

All experimental results can be reproduced using the code in this repository. If you have any questions about our work, feel free to contact me by email.

Abstract

We propose a novel machine learning framework for real-time multi-speaker diarization and recognition that requires no prior registration or pretraining and operates in a fully online learning setting. Our contributions are two-fold. First, we propose a new benchmark to evaluate the rarely studied problem of fully online speaker diarization. We build upon existing datasets of real-world utterances to automatically curate MiniVox, an experimental environment that generates infinite configurations of continuous multi-speaker speech streams. Second, we consider the practical problem of online learning with episodically revealed rewards and introduce a solution based on semi-supervised and self-supervised learning methods. Additionally, we provide a working web-based recognition system that interactively handles the cold-start problem of adding new users by transferring representations of old arms to new ones with an extendable contextual bandit. We demonstrate that our proposed method obtains robust performance in the online MiniVox framework given either cepstrum-based representations or deep neural network embeddings.
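
To make the idea of an extendable contextual bandit with episodically revealed rewards more concrete, here is a minimal sketch in Python. It is not the code from this repository (which is in MATLAB) and it is not the algorithm from the papers. It assumes a standard LinUCB-style learner with one arm per speaker. The class GrowingLinUCB, the function run_stream, the warm start that averages the statistics of existing arms, and the self-labelling rule used when no feedback is revealed are all hypothetical choices made for illustration.

```python
import numpy as np


class GrowingLinUCB:
    """LinUCB-style contextual bandit whose set of arms (speakers) can grow online."""

    def __init__(self, dim, alpha=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration strength
        self.A = []         # per-arm (dim x dim) design matrices
        self.b = []         # per-arm reward-weighted context sums

    @property
    def n_arms(self):
        return len(self.A)

    def add_arm(self):
        """Add an arm; warm-start it from the mean of existing arms (an assumption)."""
        if self.A:
            A_new = np.mean(self.A, axis=0)
            b_new = np.mean(self.b, axis=0)
        else:
            A_new = np.eye(self.dim)
            b_new = np.zeros(self.dim)
        self.A.append(np.array(A_new, dtype=float))
        self.b.append(np.array(b_new, dtype=float))
        return self.n_arms - 1

    def select(self, x):
        """Return the arm with the highest upper confidence bound for context x."""
        x = np.asarray(x, dtype=float)
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Standard LinUCB update of one arm with an observed (or assumed) reward."""
        x = np.asarray(x, dtype=float)
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


def run_stream(bandit, contexts, labels, revealed):
    """Process a stream of feature vectors and return the fraction guessed correctly.

    Labels must introduce speakers in the order 0, 1, 2, ... because they also act
    as a hypothetical oracle that signals when a new speaker has to be added.
    """
    correct = 0
    for x, y, seen in zip(contexts, labels, revealed):
        while y >= bandit.n_arms:
            bandit.add_arm()
        guess = bandit.select(x)
        correct += int(guess == y)
        if seen:
            # Feedback revealed: bandit reward for the chosen arm only.
            bandit.update(guess, x, float(guess == y))
        else:
            # No feedback: self-label the guess as correct (an assumption).
            bandit.update(guess, x, 1.0)
    return correct / len(labels)
```

In this sketch a new arm inherits the averaged statistics of the arms that already exist, which is one simple way to avoid starting a new speaker from nothing. Calling run_stream with a revealed list that is mostly False mimics a stream in which feedback arrives only occasionally.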

Info

Language: Matlab

Platform: MacOS, Linux, Windows

by Baihan Lin, Jan 2020

Citation

If you find this work helpful, please try out the models and cite our papers. Thanks!

@inproceedings{lin2021speaker,
  title={{Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox}},
  author={Lin, Baihan and Zhang, Xinxin},
  booktitle={Asian Conference on Machine Learning},
  year={2021},
  pages={},
  organization={PMLR}
}

@inproceedings{lin2020voiceid,
  title={{VoiceID on the fly: A speaker recognition system that learns from scratch}},
  author={Lin, Baihan and Zhang, Xinxin},
  booktitle={INTERSPEECH},
  year={2020}
}

Requirements

Acknowledgements

The pretrained CNN model was obtained from https://github.com/a-nagrani/VGGVox. We modified many of the original files and included our own comparison.