Speech Feature ExtractionFeature extraction from the speech signal is the initial stage of any speech recognition system.
Stars: ✭ 78 (+239.13%)
Multimodal-Gesture-Recognition-with-LSTMs-and-CTCAn end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.
Stars: ✭ 25 (+8.7%)
mchmmMarkov Chains and Hidden Markov Models in Python
Stars: ✭ 89 (+286.96%)
TASNETTime-domain Audio Separation Network (in PyTorch)
Stars: ✭ 18 (-21.74%)
idear🎙️ Handsfree Audio Development Interface
Stars: ✭ 84 (+265.22%)
PhomemeSimple sentence mixing tool (work in progress)
Stars: ✭ 18 (-21.74%)
browser-apis🦄 Cool & Fun Browser Web APIs 🥳
Stars: ✭ 21 (-8.7%)
speech-transformerTransformer implementation specialized in speech recognition tasks using PyTorch.
Stars: ✭ 40 (+73.91%)
GseGo efficient multilingual NLP and text segmentation; supports English, Chinese, Japanese, and other languages.
Stars: ✭ 1,695 (+7269.57%)
citarCitar HMM part-of-speech tagger
Stars: ✭ 16 (-30.43%)
CIPBasic exercises in Chinese information processing
Stars: ✭ 32 (+39.13%)
WavegradImplementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+965.22%)
Zero-Shot-TTSUnofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (+43.48%)
KerasdeepspeechA Keras CTC implementation of Baidu's DeepSpeech for model experimentation
Stars: ✭ 245 (+965.22%)
VAD-LTSDEfficient voice activity detection algorithm using long-term speech information
Stars: ✭ 37 (+60.87%)
Lhotse
Stars: ✭ 236 (+926.09%)
HMMBase.jlHidden Markov Models for Julia.
Stars: ✭ 83 (+260.87%)
SetkTools for Speech Enhancement integrated with Kaldi
Stars: ✭ 227 (+886.96%)
VoluteRaspberry Pi + Nodejs = Speech Robot
Stars: ✭ 224 (+873.91%)
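mahjongOpen-source Chinese word segmentation toolkit: Chinese word segmentation Web API, Lucene Chinese word segmentation, and mixed Chinese-English word segmentation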
Stars: ✭ 40 (+73.91%)
JD-NMFJoint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)
Stars: ✭ 20 (-13.04%)
simple-obs-sttSpeech-to-text and keyboard input captions for OBS.
Stars: ✭ 89 (+286.96%)
TimitThe DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.
Stars: ✭ 202 (+778.26%)
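LinLPNatural language processing practice in Python, such as new word discovery, topic models, hidden Markov model part-of-speech tagging, Word2Vec, and sentiment analysis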
Stars: ✭ 43 (+86.96%)
LingvoLingvo
Stars: ✭ 2,361 (+10165.22%)
TFGANTFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis
Stars: ✭ 65 (+182.61%)
D-TDNNPyTorch implementation of Densely Connected Time Delay Neural Network
Stars: ✭ 60 (+160.87%)
Vq Vae SpeechPyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Stars: ✭ 187 (+713.04%)
room-impulse-responsesA list of publicly available room impulse response datasets and scripts to download them.
Stars: ✭ 143 (+521.74%)
web-speech-demoLearn how to build a simple text-to-speech voice app for the web using the Web Speech API.
Stars: ✭ 19 (-17.39%)
lidboxEnd-to-end spoken language identification out of the box.
Stars: ✭ 39 (+69.57%)
Pytorch Kaldipytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+9017.39%)
melganMelGAN implementation with Multi-Band and Full Band supports...
Stars: ✭ 54 (+134.78%)
Tts Papers🐸 collection of TTS papers
Stars: ✭ 160 (+595.65%)
SignDetectThis application is developed to help speechless people interact with others with ease. It detects voice and converts the input speech into a sign language based video.
Stars: ✭ 21 (-8.7%)
speech to textHow to use the Google Cloud Speech API to transcribe audio/video files.
Stars: ✭ 35 (+52.17%)
WavegradA fast, high-quality neural vocoder.
Stars: ✭ 138 (+500%)
NBSSThe official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".
Stars: ✭ 77 (+234.78%)
AllosaurusAllosaurus is a pretrained universal phone recognizer for more than 2000 languages
Stars: ✭ 135 (+486.96%)
AvpiAn open-source voice command macro software
Stars: ✭ 130 (+465.22%)
Asr audio data linksA list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 128 (+456.52%)
libfmplibfmp - Python package for teaching and learning Fundamentals of Music Processing (FMP)
Stars: ✭ 71 (+208.7%)
ventib📈 Ventib records your voice, transcribes it in realtime, and performs speech pattern analysis to give you objective statistics about how you speak.
Stars: ✭ 43 (+86.96%)
reacnetgeneratorAn automatic reaction network generator for reactive molecular dynamics simulation
Stars: ✭ 25 (+8.7%)
KARENKAREN: Unifying Hatespeech Detection and Benchmarking
Stars: ✭ 18 (-21.74%)
nlp-classA Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (+313.04%)
gtranscribeSoftware for interview transcription
Stars: ✭ 12 (-47.83%)
txt2speechConvert text to speech using Google Translate API
Stars: ✭ 38 (+65.22%)