RicherMans / Plda

An LDA/PLDA estimator using KALDI in python for speaker verification tasks

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Plda

Speech Aligner
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.
Stars: ✭ 259 (+204.71%)
Mutual labels:  kaldi
Awesome Kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
Stars: ✭ 393 (+362.35%)
Mutual labels:  kaldi
Theano Kaldi Rnn
THEANO-KALDI-RNNs is a project implementing various Recurrent Neural Networks (RNNs) for RNN-HMM speech recognition. The Theano Code is coupled with the Kaldi decoder.
Stars: ✭ 31 (-63.53%)
Mutual labels:  kaldi
Vosk Android Demo
Offline speech recognition for Android with Vosk library.
Stars: ✭ 271 (+218.82%)
Mutual labels:  kaldi
Espnet
End-to-End Speech Processing Toolkit
Stars: ✭ 4,533 (+5232.94%)
Mutual labels:  kaldi
Eesen
The official repository of the Eesen project
Stars: ✭ 738 (+768.24%)
Mutual labels:  kaldi
speech-to-text
mixlingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras
Stars: ✭ 61 (-28.24%)
Mutual labels:  kaldi
Dragonfire
the open-source virtual assistant for Ubuntu based Linux distributions
Stars: ✭ 1,120 (+1217.65%)
Mutual labels:  kaldi
Zamia Speech
Open tools and data for cloudless automatic speech recognition
Stars: ✭ 374 (+340%)
Mutual labels:  kaldi
Kaldi Io
c++ Kaldi IO lib (static and dynamic).
Stars: ✭ 22 (-74.12%)
Mutual labels:  kaldi
Vosk Server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Stars: ✭ 277 (+225.88%)
Mutual labels:  kaldi
Asr theory
Speech recognition theory, papers, and slides (PPT)
Stars: ✭ 344 (+304.71%)
Mutual labels:  kaldi
Pykaldi
A Python wrapper for Kaldi
Stars: ✭ 756 (+789.41%)
Mutual labels:  kaldi
Docker Kaldi Gstreamer Server
Dockerfile for kaldi-gstreamer-server.
Stars: ✭ 266 (+212.94%)
Mutual labels:  kaldi
Voxceleb Ivector
Voxceleb1 i-vector based speaker recognition system
Stars: ✭ 36 (-57.65%)
Mutual labels:  kaldi
dropclass speaker
DropClass and DropAdapt - repository for the paper accepted to Speaker Odyssey 2020
Stars: ✭ 20 (-76.47%)
Mutual labels:  kaldi
Montreal Forced Aligner
Command line utility for forced alignment using Kaldi
Stars: ✭ 490 (+476.47%)
Mutual labels:  kaldi
Ivector Xvector
Extract xvector and ivector under kaldi
Stars: ✭ 67 (-21.18%)
Mutual labels:  kaldi
Nhyai
AI-powered content moderation supporting pornography detection, violence/terrorism detection, language identification, sensitive-text detection and video inspection, plus various OCR capabilities such as ID card, driver's licence, vehicle licence, business licence, bank card, handwriting, licence plate and business card recognition; the features can be tried out on the website.
Stars: ✭ 60 (-29.41%)
Mutual labels:  kaldi
Espresso
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Stars: ✭ 808 (+850.59%)
Mutual labels:  kaldi

PLDA

Installation

Make sure that you have KALDI compiled and installed. Further, make sure that KALDI was compiled with the --shared option during ./configure (e.g. ./configure --shared). The ATLAS bundled with KALDI is sufficient for PLDA to work; if compilation errors occur, the most likely cause is that not all of the ATLAS libraries were installed successfully.

So that the build can locate KALDI correctly, please run:

export KALDI_ROOT=/your/path/to/root

If your ATLAS is installed in a different directory, please set the variable ATLAS_DIR, e.g.:

export ATLAS_DIR=/your/atlas/dir

Then just run:

git clone https://github.com/RicherMans/PLDA
cd PLDA
mkdir build && cd build && cmake ../ && make

By default, cmake installs the python package into your /usr/lib directory. If this is not desired, pass the option -DUSER=ON to cmake to install the package only for the current user.

Voilà, the python library is copied to your local user's installation path.

Usage

Generally this library is used for LDA/PLDA scoring. First, a model needs to be fitted using LDA or PLDA.

For LDA:

import numpy as np
from liblda import LDA

lda = LDA()

n_samples, featdim = 500, 200
X = np.random.rand(n_samples, featdim)
Y = np.random.randint(0, 2, n_samples).astype('uint')  # binary labels

lda.fit(X, Y)

For PLDA:

import numpy as np
from liblda import PLDA

plda = PLDA()

n_samples, featdim = 500, 200
X = np.random.rand(n_samples, featdim)
Y = np.random.randint(0, 2, n_samples).astype('uint')  # binary labels

plda.fit(X, Y)

Note that in the LDA case the model is fitted on enrolment data, while for PLDA background data is used (which can be any labelled data).
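Background data suffices for PLDA because it only estimates global statistics: PLDA views every vector as a shared global mean plus a class-dependent offset plus residual noise. A toy numpy sketch of this generative view (dimensions and noise scale invented for illustration, not taken from liblda):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generative view behind PLDA: an observation is a global mean,
# plus an offset shared by all samples of one class, plus residual noise.
featdim = 5
global_mean = rng.normal(size=featdim)
class_offset = rng.normal(size=featdim)          # between-class variability
observation = global_mean + class_offset + 0.1 * rng.normal(size=featdim)  # within-class noise
```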

The PLDA fit also accepts one extra argument:

# Transform the features to a given target dimension first; by default the
# dimension is kept.
targetdim = 10
plda.fit(X, Y, targetdim)

After fitting, LDA can be used to directly score any incoming utterance using predict_log_proba(SAMPLE):

pred = np.random.rand(featdim)[np.newaxis,...]
scores = lda.predict_log_proba(pred)

The predict_log_proba method returns a list where each element represents the log-likelihood of the indexed class.
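Since each entry indexes a class, one way to turn the returned scores into a hard decision is to take the argmax. The score values below are invented for illustration; real ones would come from lda.predict_log_proba(...):

```python
import numpy as np

# Made-up log-likelihood scores for a single sample over two classes.
scores = [[-2.3, -0.1]]

# The predicted class is the index with the highest log-likelihood.
predicted_class = int(np.argmax(scores[0]))
print(predicted_class)  # -> 1
```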

For PLDA one can also apply standard score-normalization methods such as z-norm (other norms are not implemented yet). For this, transform your enrolment vectors (labelled ENROL_X, ENROL_Y) into the PLDA space and then normalize them using other data (but do not reuse the background data from .fit()). It is generally recommended to use a held-out set for this estimation. The normalization procedure then estimates the mean and variance of the scores of the enrolment models against the held-out set (Otherdata).
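The arithmetic behind z-norm can be sketched without the library: the scores of one enrolment model against the held-out cohort yield a mean and standard deviation, and each raw trial score is shifted and scaled by them. The numbers below are invented for illustration:

```python
import numpy as np

# Invented scores of one enrolment model against a held-out cohort.
cohort_scores = np.array([1.0, 2.0, 3.0, 4.0])

# Z-norm: centre and scale a raw trial score by the cohort statistics.
raw_score = 3.0
znormed = (raw_score - cohort_scores.mean()) / cohort_scores.std()
print(round(znormed, 4))  # -> 0.4472
```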

ENROL_X = np.random.rand(n_samples, featdim)
ENROL_Y = np.arange(n_samples, dtype='uint')

# Optional smoothing factor between 0 and 1; it affects the covariance
# matrix and can improve performance.
smoothing = 0.5

# Transform the features to a given target dimension first; by default the
# dimension is kept.
targetdim = 10

transformed_vectors = plda.transform(ENROL_X, ENROL_Y, targetdim, smoothing)

# Held-out data used to estimate the normalization statistics.
m_samples = 300
Otherdata = np.random.rand(m_samples, featdim)
plda.norm(Otherdata, transformed_vectors)

Note that if targetdim is given, all future plda.transform() calls also need to pass the same targetdim.

Finally, any model can be scored against an utterance:

Models_X = np.random.rand(n_samples, featdim)
Models_Y = np.arange(n_samples, dtype='uint')
transformed_vectors = plda.transform(Models_X, Models_Y)

testutt_x = np.random.rand(n_samples, featdim)
testutt_y = np.arange(n_samples, dtype='uint')

transformedtest_vectors = plda.transform(testutt_x, testutt_y)

for model, modelvec in transformed_vectors.items():
  for testutt, testvec in transformedtest_vectors.items():
    # model is an integer representing the current class; needed for normalization
    # modelvec is a tuple of (samplesize, datavector)
    # testvec is a tuple of (samplesize, datavector)
    score = plda.score(model, modelvec, testvec)

Note that the model id is necessary only when normalizing with z-norm.
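In a verification task, the scores produced by the loop above are typically compared against a tuned threshold to accept or reject each trial. A minimal sketch with invented scores and a hypothetical threshold (neither comes from liblda):

```python
# Invented trial scores keyed by (model, test utterance); real values would
# come from plda.score(...) in the scoring loop.
trial_scores = {(0, "utt1"): 1.7, (0, "utt2"): -0.4}

# Hypothetical operating threshold, tuned e.g. on a development set.
threshold = 0.0

# Accept a trial when its score exceeds the threshold.
decisions = {trial: score > threshold for trial, score in trial_scores.items()}
```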
