
zeroQiaoba / Ivector Xvector

Extract xvector and ivector under kaldi


Summary of Kaldi for ivector and xvector

Files List

ivector/

  • conf/: configuration files for MFCC and VAD

  • wav/: test audio (you can also use your own wav path, see Step 1)

    • Only flac (requires flac), wav, and sph (requires sph2pipe) are supported
  • model_3000h/: pre-trained model

  • enroll.sh: main processing script

  • data/: stores extracted features (generated at run time)

    • utt2spk, wav.scp: both generated by make_data.py
    • spk2utt: generated from utt2spk
    • log/: stores all logs
    • tmp/: stores all temporary files

xvector/

  • conf/: configuration files for MFCC and VAD

  • wav/: test audio (you can also use your own wav path, see Step 1)

    • Only flac (requires flac), wav, and sph (requires sph2pipe) are supported
  • exp/: pre-trained model

  • enroll.sh: main processing script

  • data/: stores extracted features (generated at run time)

    • utt2spk, wav.scp: both generated by make_data.py
    • spk2utt: generated from utt2spk
    • log/: stores all logs
    • tmp/: stores all temporary files

format_norm.py: converts ark format to npz format
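
The three data/ tables listed above follow the usual Kaldi data-directory layout: one "key value" line per entry, sorted by key. The repository's make_data.py is not shown here, so the following is only a minimal sketch of how such tables could be built; the function name build_kaldi_tables and the rule "utterance id = file name without extension" are assumptions made for illustration. It handles plain file paths only (flac and sph inputs would need a decoding command in wav.scp, which is not covered).

```python
from collections import defaultdict
from pathlib import Path


def build_kaldi_tables(wav_paths, speaker_of=None):
    """Return (wav_scp, utt2spk, spk2utt) as lists of text lines.

    speaker_of maps an utterance id to a speaker id. When it is None, each
    utterance is treated as its own speaker (an assumption of this sketch).
    """
    # Assumed rule: utterance id = file name without its extension.
    utts = {Path(p).stem: str(p) for p in wav_paths}

    wav_scp, utt2spk = [], []
    spk2utt_map = defaultdict(list)
    for utt in sorted(utts):  # Kaldi expects tables sorted by key
        spk = speaker_of[utt] if speaker_of else utt
        wav_scp.append(f"{utt} {utts[utt]}")
        utt2spk.append(f"{utt} {spk}")
        spk2utt_map[spk].append(utt)

    # spk2utt is the inverse of utt2spk: one line per speaker.
    spk2utt = [f"{spk} {' '.join(spk2utt_map[spk])}" for spk in sorted(spk2utt_map)]
    return wav_scp, utt2spk, spk2utt


wav_scp, utt2spk, spk2utt = build_kaldi_tables(
    ["wav/a_001.wav", "wav/a_002.wav", "wav/b_001.wav"],
    speaker_of={"a_001": "a", "a_002": "a", "b_001": "b"},
)
print("\n".join(spk2utt))
```

In a real Kaldi setup, spk2utt is normally derived from utt2spk with the script utils/utt2spk_to_spk2utt.pl rather than by hand.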

Extract features: ivector and xvector

Step 0: Preparation

  • First, install Kaldi.

  • Then change into the ivector/ or xvector/ folder

  • Set KALDI_ROOT in path.sh to your own Kaldi root directory

  • Add symbolic links to the Kaldi recipe directories:

ln -s $KALDI_ROOT/egs/sre16/v2/steps ./
ln -s $KALDI_ROOT/egs/sre16/v2/sid ./
ln -s $KALDI_ROOT/egs/sre16/v2/utils ./
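
If you prefer to set up the links from a script, the three ln -s commands above could be reproduced roughly as follows. This is a sketch, not part of the repository: the helper name link_kaldi_dirs is hypothetical, and it skips any name that already exists in the destination instead of overwriting it.

```python
import os
from pathlib import Path


def link_kaldi_dirs(kaldi_root, dest=".", names=("steps", "sid", "utils")):
    """Create dest/<name> -> <kaldi_root>/egs/sre16/v2/<name> symlinks.

    Returns the list of names for which a new link was created.
    """
    recipe = Path(kaldi_root) / "egs" / "sre16" / "v2"
    created = []
    for name in names:
        target = recipe / name
        link = Path(dest) / name
        if not target.exists():
            raise FileNotFoundError(f"missing Kaldi directory: {target}")
        if link.is_symlink() or link.exists():
            continue  # leave an existing link or directory untouched
        # Use an absolute target so the link works from any directory.
        os.symlink(os.path.abspath(target), link, target_is_directory=True)
        created.append(name)
    return created


# Example (requires KALDI_ROOT to be set in the environment):
# link_kaldi_dirs(os.environ["KALDI_ROOT"])
```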

Step 1: Extract ivector and xvector

This step is based on the pre-trained xvector model in kaldi and on kaidi-sre-code

  • Extract ivector: cd into ivector/ and run enroll.sh
bash enroll.sh wav_path
# for example: bash enroll.sh ./wav
  • Extract xvector: cd into xvector/ and run enroll.sh
## Case 1: extract xvector without speaker info
#          for example: bash enroll.sh ./wav 1
bash enroll.sh wav_path 1

## Case 2: extract xvector with speaker info
#### Step 1: generate the speaker.txt file
#### Only suitable for files laid out as 'wav_root/speaker_id/wav_name'
#### For other layouts, write your own generate_speaker.py
python generate_speaker.py wav_dir speaker.txt
#### Step 2: extract xvector with speaker info
#### For example: bash enroll.sh ./speaker.txt 2
bash enroll.sh ./speaker.txt 2
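
For audio that is laid out as 'wav_root/speaker_id/wav_name', a speaker list can be derived from the directory names alone. The sketch below illustrates that idea; it is not the repository's generate_speaker.py. In particular, the output format (one 'wav_path speaker_id' line per file) and the function names are assumptions, so check what enroll.sh actually expects before using anything like this.

```python
from pathlib import Path


def list_speakers(wav_root, exts=(".wav", ".flac", ".sph")):
    """Yield (wav_path, speaker_id) for files stored as wav_root/speaker_id/wav_name."""
    root = Path(wav_root)
    for path in sorted(root.glob("*/*")):
        # The speaker id is taken from the name of the parent directory.
        if path.is_file() and path.suffix.lower() in exts:
            yield path, path.parent.name


def write_speaker_file(wav_root, out_path):
    """Write one 'wav_path speaker_id' line per audio file (hypothetical format)."""
    rows = list(list_speakers(wav_root))
    with open(out_path, "w", encoding="utf-8") as f:
        for path, spk in rows:
            f.write(f"{path} {spk}\n")
    return len(rows)


# Example: write_speaker_file("./wav", "speaker.txt")
```

Files with other extensions are ignored, which keeps stray notes or logs in the speaker folders out of the list.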

Step 2: Read the generated ivector and xvector

In this section, we convert the ivector and xvector files from Kaldi ark format to NumPy array format

i-vector in data/feat/ivectors_enroll_mfcc

  • spk_ivector.ark: i-vector for each speaker
  • ivector.1.ark: i-vector for each utterance (400-d i-vector)

x-vector in data/feat/xvectors_enroll_mfcc

  • spk_xvector.ark: x-vector for each speaker
  • xvector.1.ark: x-vector for each utterance (512-d x-vector)
## print name and feats from ark to txt
$KALDI_ROOT/src/bin/copy-vector ark:ivector/data/feat/ivectors_enroll_mfcc/ivector.1.ark ark,t:- >ivector.txt

$KALDI_ROOT/src/bin/copy-vector ark:xvector/data/feat/xvectors_enroll_mfcc/xvector.1.ark ark,t:- >xvector.txt

## Alternatively, convert the text output to npz (np.array) format, which holds (data_path ['pic_path'], ivector or xvector)
python format_norm.py --vector_path='xvector.txt' --save_path='x_vector.npz'
python format_norm.py --vector_path='ivector.txt' --save_path='i_vector.npz'
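
The text files written by copy-vector with ark,t hold one vector per line in the form "utt_id  [ v1 v2 ... ]". To show what the conversion to arrays involves, here is a minimal parser for that layout. It is a sketch and not the code of format_norm.py; the function name read_text_vectors and the npz key names in the final comment are assumptions.

```python
import numpy as np


def read_text_vectors(lines):
    """Parse lines of the form 'utt_id  [ v1 v2 ... ]' into (ids, matrix)."""
    ids, rows = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Everything before '[' is the key; everything after it is the values.
        key, _, rest = line.partition("[")
        values = rest.rstrip("]").split()
        ids.append(key.strip())
        rows.append([float(v) for v in values])
    return ids, np.asarray(rows, dtype=np.float32)


sample = ["utt1  [ 0.5 -1.0 2.0 ]", "utt2  [ 1.5 0.0 -0.5 ]"]
ids, vectors = read_text_vectors(sample)
print(ids, vectors.shape)
# To persist (key names are hypothetical):
# np.savez("x_vector_demo.npz", ids=np.array(ids), vectors=vectors)
```

An open file can be passed directly, for example read_text_vectors(open("xvector.txt")), since the function only iterates over lines.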

Other summary

## combine several data directories into one
utils/combine_data.sh

## fix data directory xxx so that it conforms to the Kaldi format
utils/fix_data_dir.sh xxx

## take a subset of a data directory
utils/subset_data_dir.sh

## check whether a directory or a file exists
if [ -d "./data" ]; then echo "directory exists"; fi
if [ -f "./data/1.txt" ]; then echo "file exists"; fi

## xvector/run.sh
Uses four data folders:
	sre_combined (source domain, augmented data, for training)
	sre16_major (unlabeled target domain, for model adaptation)
	sre16_eval_enroll (labeled target domain, for training)
	sre16_eval_test (unlabeled target domain, for testing)
Main pipeline: xvector -> mean subtraction -> transform (LDA) -> length normalization -> classifier (PLDA / adapted PLDA)

## awk: split by space and print the third field
echo '1 2 3' |awk '{print $3}'  # print 3
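
The front part of the pipeline noted under xvector/run.sh (mean subtraction, a linear transform such as LDA, then length normalization) can be written in a few lines of NumPy. The sketch below only illustrates those three steps: the function name preprocess is hypothetical, the random matrix stands in for a trained LDA transform, scaling to length sqrt(k) is a convention chosen for this sketch, and PLDA scoring is not included.

```python
import numpy as np


def preprocess(xvectors, mean, transform):
    """Centre, project, and length-normalize a batch of embeddings.

    xvectors:  (n, d) array, one embedding per row
    mean:      (d,) global mean, estimated on in-domain data
    transform: (k, d) projection matrix, e.g. a trained LDA transform
    """
    centred = xvectors - mean                   # mean subtraction
    projected = centred @ transform.T           # linear transform to k dims
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    norms = np.maximum(norms, 1e-12)            # guard against zero vectors
    # Scale every row to length sqrt(k) (the convention used in this sketch).
    return projected * (np.sqrt(projected.shape[1]) / norms)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 512))                   # stand-in for 512-d x-vectors
lda = rng.normal(size=(150, 512))               # stand-in for a trained LDA matrix
out = preprocess(x, x.mean(axis=0), lda)
print(out.shape)
```

After this step all vectors have the same length, so a downstream classifier compares directions rather than magnitudes.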