
PiotrTa / Huawei-Challenge-Speaker-Identification

Licence: other
Trained speaker-embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.

Programming Languages

  • Jupyter Notebook
  • Python

Projects that are alternatives of or similar to Huawei-Challenge-Speaker-Identification

dropclass speaker
DropClass and DropAdapt - repository for the paper accepted to Speaker Odyssey 2020
Stars: ✭ 20 (-41.18%)
Mutual labels:  speaker-recognition, speaker-verification, speaker-identification, speaker-embedding
open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Stars: ✭ 841 (+2373.53%)
Mutual labels:  voice-recognition, speech-processing, voice-activity-detection
spokestack-ios
Spokestack: give your iOS app a voice interface!
Stars: ✭ 27 (-20.59%)
Mutual labels:  voice-recognition, speech-processing, voice-activity-detection
bob
Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland. - Mirrored from https://gitlab.idiap.ch/bob/bob
Stars: ✭ 38 (+11.76%)
Mutual labels:  speaker-recognition, speaker-verification, speech-processing
KaldiBasedSpeakerVerification
Kaldi based speaker verification
Stars: ✭ 43 (+26.47%)
Mutual labels:  speaker-recognition, speaker-verification, speaker-identification
Speaker-Recognition
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
Stars: ✭ 94 (+176.47%)
Mutual labels:  speaker-recognition, speaker-verification, speaker-identification
react-native-spokestack
Spokestack: give your React Native app a voice interface!
Stars: ✭ 53 (+55.88%)
Mutual labels:  voice-recognition, speech-processing, voice-activity-detection
GE2E-Loss
Pytorch implementation of Generalized End-to-End Loss for speaker verification
Stars: ✭ 72 (+111.76%)
Mutual labels:  speaker-recognition, speaker-verification, speaker-identification
D-TDNN
PyTorch implementation of Densely Connected Time Delay Neural Network
Stars: ✭ 60 (+76.47%)
Mutual labels:  speaker-recognition, speaker-verification, speaker-embedding
wavenet-classifier
Keras Implementation of Deepmind's WaveNet for Supervised Learning Tasks
Stars: ✭ 54 (+58.82%)
Mutual labels:  speaker-recognition, speaker-verification, speaker-identification
cobra
On-device voice activity detection (VAD) powered by deep learning.
Stars: ✭ 76 (+123.53%)
Mutual labels:  voice-recognition, voice-activity-detection
VoiceprintRecognition-Pytorch
Voiceprint recognition implemented with the EcapaTdnn model
Stars: ✭ 140 (+311.76%)
Mutual labels:  voice-recognition, speaker-recognition
VoiceprintRecognition-Keras
A voiceprint recognition model implemented with Keras
Stars: ✭ 70 (+105.88%)
Mutual labels:  voice-recognition, speaker-recognition
speakerIdentificationNeuralNetworks
⇨ The Speaker Recognition System consists of two phases, Feature Extraction and Recognition. ⇨ In the Extraction phase, the speaker's voice is recorded and a number of typical features are extracted to form a model. ⇨ During the Recognition phase, a speech sample is compared against a previously created voice print stored in the database. ⇨ The hi…
Stars: ✭ 26 (-23.53%)
Mutual labels:  speaker-recognition, speaker-identification
VoiceprintRecognition-PaddlePaddle
Voiceprint recognition implemented with PaddlePaddle
Stars: ✭ 57 (+67.65%)
Mutual labels:  voice-recognition, speaker-recognition
spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (+52.94%)
Mutual labels:  voice-recognition, voice-activity-detection
UHV-OTS-Speech
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
Stars: ✭ 94 (+176.47%)
Mutual labels:  speech-processing, speaker-identification
Speaker-Identification
A program for automatic speaker identification using deep learning techniques.
Stars: ✭ 84 (+147.06%)
Mutual labels:  speaker-recognition, speaker-verification
meta-SR
Pytorch implementation of Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs (Interspeech, 2020)
Stars: ✭ 58 (+70.59%)
Mutual labels:  speaker-recognition, speaker-verification
kaldi-timit-sre-ivector
Develop speaker recognition model based on i-vector using TIMIT database
Stars: ✭ 17 (-50%)
Mutual labels:  speaker-recognition, speaker-verification

Huawei Speaker Identification Challenge

Code for the Huawei speaker identification challenge. In this repo, you can find two neural networks trained to embed speech segments. The large PyTorch model is trained using the pyannote.audio library. The second architecture is a lighter version of the large one, designed to run on the Huawei Neural-network Processing Unit (NPU); its training pipeline is written from scratch in TensorFlow.

  • Models uploaded
  • Notebooks for speech utterance embedding visualization added
  • Notebook for speaker identification added

To do:

  • Upload the training pipeline
  • Documentation
  • Clean the library

If you want to play around with the code:

  • Clone the repo
  • Install dependencies (still need to be documented)
  • Create two folders: one for the speakers to be enrolled and one for queries, and place WAV files with a 16 kHz sampling rate in each.
  • Open run_identification.ipynb and adjust the folder paths.
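Once enrollment and query embeddings exist, identification reduces to a nearest-neighbor comparison. The sketch below illustrates that final step with cosine similarity; the function name, the toy 4-d vectors, and the speaker names are illustrative assumptions, not the repo's actual API:

```python
import numpy as np

def identify(query_emb, enrolled):
    """Return the enrolled speaker whose embedding has the highest
    cosine similarity to the query embedding.

    `enrolled` maps speaker name -> embedding vector (hypothetical layout).
    """
    q = query_emb / np.linalg.norm(query_emb)
    scores = {name: float(q @ (e / np.linalg.norm(e)))
              for name, e in enrolled.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy 4-d "embeddings" standing in for real 512-d model outputs
enrolled = {"alice": np.array([1.0, 0.0, 0.0, 0.0]),
            "bob":   np.array([0.0, 1.0, 0.0, 0.0])}
name, score = identify(np.array([0.9, 0.1, 0.0, 0.0]), enrolled)
print(name)  # the query is far closer to "alice"
```

In practice a minimum-similarity threshold would be added so that unknown speakers are rejected rather than mapped to the nearest enrolled one.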

You can also try visualizing the embeddings for your dataset; take a look at embedding_visualization.ipynb.
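High-dimensional utterance embeddings have to be projected to 2-D before plotting. The notebook may well use t-SNE; as a dependency-free stand-in, here is a plain-NumPy PCA projection (the embedding dimension of 128 and the function name are assumptions for the sketch):

```python
import numpy as np

def pca_2d(embeddings):
    """Project (n_utterances, dim) embeddings onto their top-2
    principal components for a 2-D scatter plot."""
    X = embeddings - embeddings.mean(axis=0)
    # SVD of the centered matrix; rows of vt are principal directions
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 128))  # 10 fake 128-d utterance embeddings
coords = pca_2d(emb)
print(coords.shape)  # -> (10, 2)
```

Embeddings from the same speaker should form visible clusters in the projected plane if the model is working.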

Models:

  • large_model.pt (PyTorch): two bi-LSTM layers (2×512 hidden dimension) plus two fully connected layers on top, with a tanh activation at the end.
  • best_model.pb (TensorFlow): a static LSTM (1×128 hidden dimension) plus two fully connected layers with a tanh activation. Can be run on the Huawei Kirin 970.
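The large model's description above can be sketched as a PyTorch module. This is a reconstruction from the README's wording, not the checkpoint's exact definition: the input feature size (59), the pooling strategy, and the inner FC width are assumptions.

```python
import torch
import torch.nn as nn

class LargeEmbedder(nn.Module):
    """Sketch of large_model.pt as described: two bi-LSTM layers
    (512 hidden units per direction) followed by two fully connected
    layers with a final tanh. Sizes not stated in the README are guesses."""
    def __init__(self, n_features=59, emb_dim=512):
        super().__init__()
        self.lstm = nn.LSTM(n_features, 512, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.fc1 = nn.Linear(2 * 512, 512)
        self.fc2 = nn.Linear(512, emb_dim)

    def forward(self, x):            # x: (batch, frames, n_features)
        out, _ = self.lstm(x)        # (batch, frames, 2*512)
        pooled = out.mean(dim=1)     # average over time
        return torch.tanh(self.fc2(torch.relu(self.fc1(pooled))))

model = LargeEmbedder()
emb = model(torch.randn(3, 100, 59))  # 3 utterances, 100 frames each
print(emb.shape)  # -> torch.Size([3, 512])
```

The tanh at the end bounds each embedding coordinate to [-1, 1], which keeps cosine-similarity scores well behaved.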

The training data comprised the LibriSpeech subsets “train-clean-100” and “train-clean-360” (together 1,105 English speakers), the Clarin Polish dataset (552 speakers), and a Huawei dataset containing 165 English speakers.

Data preprocessing:

All audio data was converted to WAV format and resampled to 16 kHz using ffmpeg. Speech activity detection was done offline, before training: non-speech segments were removed with a pre-trained support vector machine classifier. Silence removal reduced the overall size of the dataset from 67.4 GB to 55 GB.
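The silence-removal step can be approximated with a simple frame-energy gate. This is only a stand-in for the pre-trained SVM classifier the authors used (which is not reproduced here); the threshold and frame length are arbitrary choices:

```python
import numpy as np

def drop_nonspeech(samples, sr=16000, frame_ms=30, threshold=0.01):
    """Keep only frames whose RMS energy exceeds `threshold`.
    A crude energy-based stand-in for the SVM speech activity detector."""
    frame = int(sr * frame_ms / 1000)
    n = len(samples) // frame
    frames = samples[:n * frame].reshape(n, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return frames[rms > threshold].ravel()

sr = 16000
t = np.arange(sr) / sr
speech = 0.1 * np.sin(2 * np.pi * 220 * t)   # 1 s of synthetic "speech"
silence = np.zeros(sr)                       # 1 s of silence
trimmed = drop_nonspeech(np.concatenate([silence, speech]), sr)
print(len(trimmed) / sr)  # roughly 1 s of the 2 s input survives
```

A real detector also has to handle low-energy speech (whispers, fricatives), which is why the authors used a trained classifier instead of a fixed threshold.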

Some of the code for the PyTorch network comes from: https://github.com/pyannote/pyannote-audio
