Jungjee / Rawnet
Author's repository for reproducing the RawNet 1 and 2 papers, with pre-trained model weights and speaker embeddings. RawNet2 is implemented in PyTorch, and RawNet1 is implemented in both PyTorch and Keras.
Overview
This repository includes implementations of speaker verification systems that take raw waveforms as input.
It currently contains three systems, all written in Python.
Detailed instructions for each system are given in its own README file.
RawNet2_modified
- Code refactoring
- ResNet-like model implementation in PyTorch
- Deeper architecture
- Improved feature map scaling method
- α-feature map scaling for raw waveform speaker verification
- Only the abstract is in English
- Angular loss function adopted
- Performance
- EER 1.91% on the VoxCeleb1 original trial, trained using VoxCeleb2
- Will be used as a baseline system for the authors' future work
RawNet2
- Improved performance over RawNet
- DNN speaker embedding extraction with raw waveform inputs
- Cosine similarity back-end
- EER reduced from 4.8% to 2.56%
- VoxCeleb1 original trial
- Uses a technique named feature map scaling
- Scales feature maps in a manner similar to squeeze-and-excitation
- Implemented in PyTorch.
- Published as a conference paper in Interspeech 2020.
@article{jung2020improved,
title={Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms},
author={Jung, Jee-weon and Kim, Seung-bin and Shim, Hye-jin and Kim, Ju-ho and Yu, Ha-Jin},
journal={Proc. Interspeech 2020},
pages={3583--3587},
year={2020}
}
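The squeeze-and-excitation-style scaling mentioned for RawNet2 can be illustrated with a minimal, framework-free sketch. It is not the repository's PyTorch code: the function name `feature_map_scaling`, the toy weights, and the purely multiplicative form are assumptions made for illustration, and the formulation in the paper may differ.

```python
import math


def sigmoid(value):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def feature_map_scaling(feature_map, weights, bias):
    """Scale each channel of a feature map by a learned per-channel factor.

    feature_map: list of C channels, each a list of T floats.
    weights, bias: parameters of a hypothetical C x C linear layer.
    Returns the scaled feature map and the per-channel scales.
    """
    n_channels = len(feature_map)
    # "Squeeze": summarize each channel by its average over time.
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    # "Excite": a linear layer followed by a sigmoid gives one scale per channel.
    scales = [
        sigmoid(sum(weights[i][j] * pooled[j] for j in range(n_channels)) + bias[i])
        for i in range(n_channels)
    ]
    # Apply the scale multiplicatively to every time step of its channel.
    scaled = [
        [value * scales[i] for value in channel]
        for i, channel in enumerate(feature_map)
    ]
    return scaled, scales


# Toy input: 2 channels, 2 time steps, identity weights (illustrative values only).
toy_map = [[1.0, 3.0], [2.0, 2.0]]
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_bias = [0.0, 0.0]
scaled_map, scales = feature_map_scaling(toy_map, toy_weights, toy_bias)
```

Because the scales come from a sigmoid, each one lies strictly between 0 and 1, so in this multiplicative variant a channel can only be attenuated, never amplified.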
RawNet
- DNN-based speaker embedding extractor used with another DNN-based classifier
- Built on top of authors' previous works on raw waveform speaker verification
- EER 4.8% with cosine similarity back-end, 4.0% with the proposed concat&mul back-end
- VoxCeleb1 original trial
- Implemented in Keras and PyTorch
- Published as a conference paper in Interspeech 2019.
@article{jung2019RawNet,
title={RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification},
author={Jung, Jee-weon and Heo, Hee-soo and Kim, Ju-ho and Shim, Hye-jin and Yu, Ha-jin},
journal={Proc. Interspeech 2019},
pages={1268--1272},
year={2019}
}
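As a rough illustration of how a cosine similarity back-end and the EER figures quoted above relate, the sketch below scores a trial by cosine similarity between two embeddings and estimates the equal error rate by sweeping thresholds over the observed scores. The function names and the toy scores are hypothetical, and this simple threshold sweep is an assumption, not the evaluation code used in the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(scores, labels):
    """Approximate the EER from trial scores.

    scores: similarity score per trial (higher means "same speaker").
    labels: 1 for target trials, 0 for non-target trials.
    A trial is accepted when its score is >= the threshold.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(scores)):
        false_accepts = sum(
            1 for s, label in zip(scores, labels) if s >= threshold and label == 0
        )
        false_rejects = sum(
            1 for s, label in zip(scores, labels) if s < threshold and label == 1
        )
        far = false_accepts / n_nontarget  # false acceptance rate
        frr = false_rejects / n_target     # false rejection rate
        # Keep the operating point where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap = abs(far - frr)
            eer = (far + frr) / 2
    return eer


# Toy trial list (made-up scores, not results from any system).
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

In practice each score would come from `cosine_similarity` applied to the enrollment and test embeddings of a trial. On a large trial list an ROC-based routine is usually preferred over this quadratic sweep.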