jimbozhang / Kaldi Gop

Licence: other

Computes the GMM-based Goodness of Pronunciation (GOP). Based on Kaldi.

Labels

speech-recognition kaldi

kaldi-gop

This project computes GMM-based GOP (Goodness of Pronunciation) using Kaldi.

Notes about the DNN-based implementation

This implementation is GMM-based. For a DNN-based implementation, see the recipe in Kaldi's official repository:

https://github.com/kaldi-asr/kaldi/tree/master/egs/gop_speechocean762

GOP-DNN is expected to perform much better than GOP-GMM.

How to build

./build.sh

Run the example

cd egs/gop-compute
./run.sh

Theory

GOP was first proposed for conventional GMM-HMM based systems (Witt et al., 2000). It is defined as the duration-normalised log posterior:

$$ GOP(p)=\frac{1}{t_e-t_s+1} \log p(p|\mathbf o) $$

where $\mathbf o$ are the input observations, $p$ is the canonical phone, and $t_s, t_e$ are the start and end frame indexes of the phone segment.

Assuming $p(q_i)\approx p(q_j)$ for any $q_i, q_j$, we have:

$$ \log p(p|\mathbf o)=\log\frac{p(\mathbf o|p)p(p)}{\sum_{q\in Q} p(\mathbf o|q)p(q)} \approx\log\frac{p(\mathbf o|p)}{\sum_{q\in Q} p(\mathbf o|q)} $$

where $Q$ is the whole phone set.

The numerator of the equation is calculated from the forced-alignment result, and the denominator is calculated from Viterbi decoding with an unconstrained phone loop.
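The following is a minimal sketch of how the two quantities combine into a score for one phone segment. It is not the code in this repository. The function name `gop_score` and its inputs are hypothetical: it assumes you already have per-frame log-likelihoods for the segment from forced alignment (numerator) and from the phone-loop decoding (denominator), both covering frames $t_s$ to $t_e$. It also assumes that the best phone-loop path stands in for the sum over $Q$, since Viterbi decoding yields a single best path.

```python
from typing import Sequence


def gop_score(numerator_loglikes: Sequence[float],
              denominator_loglikes: Sequence[float]) -> float:
    """Duration-normalised log-likelihood ratio for one phone segment.

    numerator_loglikes: per-frame log-likelihoods of the canonical phone
        (hypothetical input, e.g. taken from a forced alignment).
    denominator_loglikes: per-frame log-likelihoods along the best
        unconstrained phone-loop path over the same frames (hypothetical input).
    """
    if len(numerator_loglikes) != len(denominator_loglikes):
        raise ValueError("both sequences must cover the same frames")
    num_frames = len(numerator_loglikes)  # equals t_e - t_s + 1
    if num_frames == 0:
        raise ValueError("the segment must contain at least one frame")

    log_numerator = sum(numerator_loglikes)      # log p(o | p)
    log_denominator = sum(denominator_loglikes)  # stands in for log sum_q p(o | q)
    return (log_numerator - log_denominator) / num_frames


if __name__ == "__main__":
    # Made-up numbers for a three-frame segment, for illustration only.
    print(gop_score([-5.0, -6.0, -4.0], [-4.0, -5.0, -4.0]))
```

Under these assumptions, a score near zero means the canonical phone explains the segment about as well as the best unconstrained path, and more negative scores indicate a poorer match.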
