funcwj / Setk

Licence: apache-2.0
Tools for Speech Enhancement integrated with Kaldi

SETK: Speech Enhancement Tools integrated with Kaldi

Here are some speech enhancement/separation tools integrated with Kaldi. I use them for front-end data processing.

Python Scripts

  • Supervised (mask-based) adaptive beamformer (GEVD/MVDR/MCWF...)
  • Data conversion among MATLAB, Numpy and Kaldi
  • Data visualization (TF-mask, spatial/spectral features, beam pattern...)
  • Unified data and IO handlers for Kaldi's scripts, archives, wave and numpy's ndarray...
  • Unsupervised mask estimation (CGMM/CACGMM)
  • Spatial/Spectral feature computation
  • DS (delay-and-sum) beamformer, SD (super-directive) beamformer
  • AuxIVA, WPE & WPD, FB (Fixed Beamformer)
  • Mask computation (iam, irm, ibm, psm, crm)
  • RIR simulation (1D/2D arrays)
  • Single channel speech separation (TF spectral masking)
  • Si-SDR/SDR/WER evaluation
  • Pywebrtc vad wrapper
  • Mask-based source localization
  • Noise suppression
  • Data simulation
  • ...
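
As a rough illustration of two items in the list above (mask computation and Si-SDR evaluation), here is a minimal NumPy sketch. It is not taken from the SETK scripts: the function names and the specific IRM definition are assumptions made for this example, and the repository's own implementations may differ.

```python
import numpy as np


def ideal_binary_mask(speech_mag, noise_mag):
    """IBM: 1 where the speech magnitude exceeds the noise magnitude, else 0."""
    return (speech_mag > noise_mag).astype(np.float32)


def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-8):
    """IRM, using one common definition: sqrt(|S|^2 / (|S|^2 + |N|^2))."""
    s_pow, n_pow = speech_mag ** 2, noise_mag ** 2
    return np.sqrt(s_pow / (s_pow + n_pow + eps))


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D time-domain signals."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    residual = estimate - target
    return 10 * np.log10(
        (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    )
```

The masks take magnitude spectrograms of the same shape (for example, frames x frequency bins). Because Si-SDR is invariant to the scale of the estimate, a rescaled copy of the reference scores very high.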

Please check the usage instructions for the scripts.

Kaldi Commands

  • Compute time-frequency masks (ibm, irm, etc.)
  • Compute phase & magnitude spectrogram & complex STFT
  • Separate target component using input masks
  • Wave reconstruction from enhanced spectral features
  • Complex matrix/vector class
  • MVDR/GEVD beamformer (depends on T-F masks, not very stable)
  • Fixed beamformer
  • Compute angular spectrogram based on SRP-PHAT
  • RIR generator (based on RIR-Generator)
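
To make the mask-dependent MVDR item above more concrete, the following NumPy sketch shows one widely used formulation (the reference-channel form, w = (Rn^-1 Rs) u / trace(Rn^-1 Rs)). It is an assumption-laden illustration rather than the Kaldi command's implementation: the function names, array layouts and the diagonal-loading constant are all hypothetical choices made here.

```python
import numpy as np


def spatial_covariance(stft, mask, eps=1e-8):
    """Mask-weighted spatial covariance matrices.

    stft: complex array, shape (channels, frames, bins)
    mask: real array, shape (frames, bins)
    returns: complex array, shape (bins, channels, channels)
    """
    weighted = stft * mask[None]
    cov = np.einsum("ctf,dtf->fcd", weighted, stft.conj())
    return cov / (mask.sum(axis=0)[:, None, None] + eps)


def mvdr_weights(cov_speech, cov_noise, ref_channel=0, diag_load=1e-6):
    """Reference-channel MVDR weights, shape (bins, channels)."""
    num_ch = cov_noise.shape[-1]
    # Diagonal loading (an assumed regulariser) keeps the solve well conditioned.
    load = diag_load * np.trace(cov_noise, axis1=1, axis2=2).real / num_ch
    cov_noise = cov_noise + load[:, None, None] * np.eye(num_ch)
    numerator = np.linalg.solve(cov_noise, cov_speech)  # Rn^-1 Rs per bin
    trace = np.trace(numerator, axis1=1, axis2=2)[:, None]
    return numerator[:, :, ref_channel] / (trace + 1e-8)


def apply_beamformer(weights, stft):
    """Compute y[t, f] = w[f]^H x[t, f]; returns shape (frames, bins)."""
    return np.einsum("fc,ctf->tf", weights.conj(), stft)
```

In a mask-based pipeline, the speech and noise covariances would come from `spatial_covariance` with a speech mask and its complement. Poorly estimated masks give poorly conditioned covariances, which is one plausible reason such beamformers can be unstable.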

To build the sources, you first need to compile Kaldi with the --shared flag and patch matrix/matrix-common.h so that MatrixTransposeType also covers the conjugate cases:

typedef enum {
    kTrans          = 112,  // CblasTrans
    kNoTrans        = 111,  // CblasNoTrans
    kConjTrans      = 113,  // CblasConjTrans
    kConjNoTrans    = 114   // CblasConjNoTrans
} MatrixTransposeType;

Then run:

mkdir build
cd build
export KALDI_ROOT=/path/to/kaldi/root
export OPENFST_ROOT=/path/to/openfst/root
# on UNIX, Kaldi needs to be compiled with OpenBLAS
export OPENBLAS_ROOT=/path/to/openblas/root
cmake ..
make -j

I now mainly work on the sptk package; development of the Kaldi-based code has stopped.
