
amirharati / kaldi-alligner

Licence: other
Scripts to align a given WAV file to its transcription using models trained with Kaldi

Programming Languages

shell
python
perl

Projects that are alternatives of or similar to kaldi-alligner

rustfst
Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (+333.33%)
Mutual labels:  kaldi, asr, kaldi-asr
Speech To Text Russian
A project for Russian speech recognition based on pykaldi.
Stars: ✭ 151 (+529.17%)
Mutual labels:  kaldi, asr
opensnips
Open source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (+108.33%)
Mutual labels:  kaldi, asr
Zeroth
Kaldi-based Korean ASR (한국어 음성인식) open-source project
Stars: ✭ 248 (+933.33%)
Mutual labels:  kaldi, asr
Espresso
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Stars: ✭ 808 (+3266.67%)
Mutual labels:  kaldi, asr
Vosk Api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Stars: ✭ 1,357 (+5554.17%)
Mutual labels:  kaldi, asr
Pytorch Kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+8637.5%)
Mutual labels:  kaldi, asr
Asr theory
Speech recognition theory, papers, and slides (PPT)
Stars: ✭ 344 (+1333.33%)
Mutual labels:  kaldi, asr
asr24
24-hour Automatic Speech Recognition
Stars: ✭ 27 (+12.5%)
Mutual labels:  kaldi, asr
kaldi ag training
Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.
Stars: ✭ 14 (-41.67%)
Mutual labels:  kaldi, kaldi-asr
Pykaldi
A Python wrapper for Kaldi
Stars: ✭ 756 (+3050%)
Mutual labels:  kaldi, asr
kaldi-long-audio-alignment
Long audio alignment using Kaldi
Stars: ✭ 21 (-12.5%)
Mutual labels:  kaldi, asr
Eesen
The official repository of the Eesen project
Stars: ✭ 738 (+2975%)
Mutual labels:  kaldi, asr
Pytorch Asr
ASR with PyTorch
Stars: ✭ 124 (+416.67%)
Mutual labels:  kaldi, asr
Zamia Speech
Open tools and data for cloudless automatic speech recognition
Stars: ✭ 374 (+1458.33%)
Mutual labels:  kaldi, asr
Py Kaldi Asr
Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.
Stars: ✭ 156 (+550%)
Mutual labels:  kaldi, asr
Vosk Android Demo
Offline speech recognition for Android with Vosk library.
Stars: ✭ 271 (+1029.17%)
Mutual labels:  kaldi, asr
Vosk Server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Stars: ✭ 277 (+1054.17%)
Mutual labels:  kaldi, asr
Aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+7991.67%)
Mutual labels:  alignment, forced-alignment
torchain
WIP: pytorch FFI wrapper for Kaldi chain loss (a.k.a. Lattice Free MMI)
Stars: ✭ 20 (-16.67%)
Mutual labels:  kaldi, asr
Kaldi Aligner: A simple script to create time alignments for given speech/transcription pairs.
The script also enriches the transcription with [laughter] and [noise] markers.
It does not use forced alignment; instead, it builds a bigram LM from the input transcription (after enriching the transcription with markers).
After creating the language model, it builds an HCLG graph, uses the Kaldi decoder to generate a lattice, and finally uses the lattice to obtain the alignment information.
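
As a rough illustration of the first step, the sketch below counts bigrams over a transcription in which a marker may appear at any word boundary. It is not the code used by these scripts: the function names `enrich` and `bigram_counts` are hypothetical, and inserting one marker at each boundary is an assumed enrichment strategy chosen only for the example. In the actual pipeline the LM is estimated with SRILM.

```python
from collections import Counter

# Markers named in this README.
MARKERS = ["[noise]", "[laughter]"]


def enrich(words, markers=MARKERS):
    """Return the plain sentence plus one variant per (marker, boundary).

    Assumed strategy for illustration only: each variant contains exactly
    one marker, inserted before, between, or after the original words.
    """
    words = list(words)
    variants = [words]
    for marker in markers:
        for i in range(len(words) + 1):
            variants.append(words[:i] + [marker] + words[i:])
    return variants


def bigram_counts(sentences):
    """Count adjacent token pairs, with <s> and </s> as sentence boundaries."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + list(sentence) + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


if __name__ == "__main__":
    counts = bigram_counts(enrich("my name is amir".split()))
    for (left, right), n in sorted(counts.items()):
        print(left, right, n)
```

Because every variant is counted, a bigram LM estimated from these counts gives nonzero probability to a marker at any position, which is what lets the decoder hypothesize noise or laughter where the audio calls for it.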

Requirements:
1- Kaldi toolkit. From: https://github.com/kaldi-asr/kaldi
2- SRILM (also available under Kaldi/tools). From: http://www.speech.sri.com/projects/srilm/
3- Python 3
4- bash (only tested under Linux)

After installing Kaldi and SRILM:
Open path.sh and change export KALDI_ROOT=/home/amir/Projects/kaldi so that it points to your Kaldi path.
Also make sure the SRILM binaries (specifically ngram-count) are on the PATH.
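
The two setup conditions above can be checked before running anything. The following is a minimal sketch, not part of this repository: `missing_tools` and `check_setup` are hypothetical helper names, and the check assumes KALDI_ROOT has been exported into the environment (for example by sourcing path.sh in the same shell).

```python
import os
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def check_setup(kaldi_root, tools=("ngram-count",)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not kaldi_root or not os.path.isdir(kaldi_root):
        problems.append(f"KALDI_ROOT is not a directory: {kaldi_root!r}")
    for tool in missing_tools(tools):
        problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    # KALDI_ROOT is only visible here if it was exported in the calling shell.
    found = check_setup(os.environ.get("KALDI_ROOT"))
    for line in found or ["setup looks fine"]:
        print(line)
```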

Before running the aligner:
Download the pretrained ASpIRE chain model by running:
sh sownload_extract.sh
This script downloads the model and also runs some preparation commands.
Alternatively, you can train your own model. However, you might need to update the scripts accordingly.

Example:
bash align.sh example/trans.txt example/test.wav data/lang_chain/ out.ctm  out_phone.ctm  out_transid_seq.txt  lpf.txt

cat out.ctm
test.wav 1 0.070 0.840 [noise]
test.wav 1 0.910 0.320 my
test.wav 1 1.240 0.300 name
test.wav 1 1.540 0.340 is
test.wav 1 1.880 0.300 [noise]
test.wav 1 2.180 0.780 [laughter]
test.wav 1 2.960 0.600 [noise]
test.wav 1 3.630 0.360 amir
test.wav 1 4.000 0.480 <unk>
test.wav 1 4.510 1.610 [noise]

Note that OOV (out-of-vocabulary) words are replaced with <unk>. The script adds [noise]/[laughter] markers where needed.
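
The output above follows the usual CTM layout: file, channel, start time, duration, and word. The sample is consistent with this, since each start plus duration lands at or just before the next start. The sketch below reads such lines and drops the bracketed markers. It is an illustration only; `CtmEntry`, `parse_ctm`, and `spoken_words` are hypothetical names and not part of this repository.

```python
from typing import NamedTuple


class CtmEntry(NamedTuple):
    utt: str         # file or utterance id, e.g. test.wav
    channel: str     # channel field, "1" in the example output
    start: float     # start time in seconds
    duration: float  # duration in seconds
    word: str        # word, <unk>, or a [noise]/[laughter] marker

    @property
    def end(self):
        """End time in seconds (start + duration)."""
        return self.start + self.duration


def parse_ctm(lines):
    """Parse CTM lines, skipping lines with fewer than five fields."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue
        utt, channel, start, duration, word = parts[:5]
        entries.append(CtmEntry(utt, channel, float(start), float(duration), word))
    return entries


def spoken_words(entries):
    """Keep only entries that are not bracketed markers such as [noise]."""
    return [e for e in entries
            if not (e.word.startswith("[") and e.word.endswith("]"))]


if __name__ == "__main__":
    with open("out.ctm") as f:
        for entry in spoken_words(parse_ctm(f)):
            print(f"{entry.start:.3f} {entry.end:.3f} {entry.word}")
```

Note that `spoken_words` keeps <unk> entries, because they still mark a region where a word was spoken.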