202 open source projects that are alternatives to or similar to HTK

Feature extraction from the speech signal is the initial stage of any speech recognition system.

Stars: ✭ 78 (+239.13%)

Mutual labels: speech

Multimodal-Gesture-Recognition-with-LSTMs-and-CTC

An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.

Stars: ✭ 25 (+8.7%)

Mutual labels: speech

mchmm

Markov Chains and Hidden Markov Models in Python

Stars: ✭ 89 (+286.96%)

Mutual labels: hmm

react-native-speech-bubble

💬 A speech bubble dialog component for React Native.

Stars: ✭ 50 (+117.39%)

Mutual labels: speech

TASNET

Time-domain Audio Separation Network (IN PYTORCH)

Stars: ✭ 18 (-21.74%)

Mutual labels: speech

idear

🎙️ Handsfree Audio Development Interface

Stars: ✭ 84 (+265.22%)

Mutual labels: speech

Phomeme

Simple sentence mixing tool (work in progress)

Stars: ✭ 18 (-21.74%)

Mutual labels: speech

browser-apis

🦄 Cool & Fun Browser Web APIs 🥳

Stars: ✭ 21 (-8.7%)

Mutual labels: speech

speech-transformer

Transformer implementation specialized in speech recognition tasks using PyTorch.

Stars: ✭ 40 (+73.91%)

Mutual labels: speech

Gse

Efficient multilingual NLP and text segmentation in Go; supports English, Chinese, Japanese, and other languages.

Stars: ✭ 1,695 (+7269.57%)

Mutual labels: hmm

citar

Citar HMM part-of-speech tagger

Stars: ✭ 16 (-30.43%)

Mutual labels: hmm

unsupervised-pos-tagging

Unsupervised part-of-speech tag estimation

Stars: ✭ 16 (-30.43%)

Mutual labels: hmm

CIP

Basic exercises in Chinese information processing

Stars: ✭ 32 (+39.13%)

Mutual labels: hmm

Wavegrad

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Stars: ✭ 245 (+965.22%)

Mutual labels: speech

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Stars: ✭ 33 (+43.48%)

Mutual labels: speech

Kerasdeepspeech

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

Stars: ✭ 245 (+965.22%)

Mutual labels: speech

VAD-LTSD

Efficient voice activity detection algorithm using long-term speech information

Stars: ✭ 37 (+60.87%)

Mutual labels: speech

Lhotse

Stars: ✭ 236 (+926.09%)

Mutual labels: speech

HMMBase.jl

Hidden Markov Models for Julia.

Stars: ✭ 83 (+260.87%)

Mutual labels: hmm

Setk

Tools for Speech Enhancement integrated with Kaldi

Stars: ✭ 227 (+886.96%)

Mutual labels: speech

Audio Signal Processing

Audio or speech signal processing guide.

Stars: ✭ 45 (+95.65%)

Mutual labels: speech

Volute

Raspberry Pi + Nodejs = Speech Robot

Stars: ✭ 224 (+873.91%)

Mutual labels: speech

mahjong

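Open-source Chinese word segmentation toolkit: Chinese word segmentation Web API, Lucene Chinese word segmentation, and mixed Chinese-English word segmentation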

Stars: ✭ 40 (+73.91%)

Mutual labels: hmm

Speech Enhancement

Deep learning for audio denoising

Stars: ✭ 207 (+800%)

Mutual labels: speech

JD-NMF

Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)

Stars: ✭ 20 (-13.04%)

Mutual labels: speech

Neural Voice Cloning With Few Samples

Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu

Stars: ✭ 211 (+817.39%)

Mutual labels: speech

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (+286.96%)

Mutual labels: speech

Timit

The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.

Stars: ✭ 202 (+778.26%)

Mutual labels: speech

LinLP

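Natural language processing practice in Python, such as new word discovery, topic models, hidden Markov model part-of-speech tagging, Word2Vec, and sentiment analysis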

Stars: ✭ 43 (+86.96%)

Mutual labels: hmm

Lingvo

Stars: ✭ 2,361 (+10165.22%)

Mutual labels: speech

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

Stars: ✭ 65 (+182.61%)

Mutual labels: speech

Emotion Classification From Audio Files

Understanding emotions from audio files using neural networks and multiple datasets.

Stars: ✭ 189 (+721.74%)

Mutual labels: speech

D-TDNN

PyTorch implementation of Densely Connected Time Delay Neural Network

Stars: ✭ 60 (+160.87%)

Mutual labels: speech

Vq Vae Speech

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

Stars: ✭ 187 (+713.04%)

Mutual labels: speech

room-impulse-responses

A list of publicly available room impulse response datasets and scripts to download them.

Stars: ✭ 143 (+521.74%)

Mutual labels: speech

Siricontrol System

Control anything with Siri voice commands.

Stars: ✭ 180 (+682.61%)

Mutual labels: speech

web-speech-demo

Learn how to build a simple text-to-speech voice app for the web using the Web Speech API.

Stars: ✭ 19 (-17.39%)

Mutual labels: speech

Deep speaker Speaker recognition system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

Stars: ✭ 174 (+656.52%)

Mutual labels: speech

lidbox

End-to-end spoken language identification out of the box.

Stars: ✭ 39 (+69.57%)

Mutual labels: speech

Pytorch Kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Stars: ✭ 2,097 (+9017.39%)

Mutual labels: speech

melgan

MelGAN implementation with Multi-Band and Full-Band support...

Stars: ✭ 54 (+134.78%)

Mutual labels: speech

Tts Papers

🐸 collection of TTS papers

Stars: ✭ 160 (+595.65%)

Mutual labels: speech

SignDetect

This application is developed to help speechless people interact with others with ease. It detects voice and converts the input speech into a sign language based video.

Stars: ✭ 21 (-8.7%)

Mutual labels: speech

Wavenet vocoder

WaveNet vocoder

Stars: ✭ 1,926 (+8273.91%)

Mutual labels: speech

speech to text

How to use the Google Cloud Speech API to transcribe audio/video files.

Stars: ✭ 35 (+52.17%)

Mutual labels: speech

Wavegrad

A fast, high-quality neural vocoder.

Stars: ✭ 138 (+500%)

Mutual labels: speech

NBSS

The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".

Stars: ✭ 77 (+234.78%)

Mutual labels: speech

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Stars: ✭ 135 (+486.96%)

Mutual labels: speech

voice-based-email-for-blind

Emailing System for visually impaired persons

Stars: ✭ 35 (+52.17%)

Mutual labels: speech

Avpi

Open source voice command macro software