scripty
Speech to text bot for Discord using Mozilla's DeepSpeech
Stars: ✭ 14 (-98.89%)
Vosk Server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Stars: ✭ 277 (-77.98%)
python-soxr
Fast and high quality sample-rate conversion library for Python
Stars: ✭ 25 (-98.01%)
Wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
Stars: ✭ 5,907 (+369.55%)
Transformer-Transducer
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)
Stars: ✭ 61 (-95.15%)
Vosk Android Demo
Offline speech recognition for Android with Vosk library.
Stars: ✭ 271 (-78.46%)
timit-preprocessor
Extract MFCC vectors and phones from the TIMIT dataset
Stars: ✭ 14 (-98.89%)
Parrots
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese.
Stars: ✭ 48 (-96.18%)
Nara wpe
Different implementations of "Weighted Prediction Error" for speech dereverberation
Stars: ✭ 265 (-78.93%)
rustfst
Rust re-implementation of OpenFST, a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Stars: ✭ 104 (-91.73%)
Awesome Diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Stars: ✭ 673 (-46.5%)
DCASE2020 task1
Code for DCASE 2020 task 1a and task 1b.
Stars: ✭ 72 (-94.28%)
Iter Reason
Code for Iterative Reasoning Paper (CVPR 2018)
Stars: ✭ 263 (-79.09%)
Aurio
Audio Fingerprinting & Retrieval for .NET
Stars: ✭ 84 (-93.32%)
Speech Aligner
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.
Stars: ✭ 259 (-79.41%)
fast-mixer
Mini recording and mixing studio for Android
Stars: ✭ 47 (-96.26%)
Speech recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
Stars: ✭ 5,999 (+376.87%)
ml-with-audio
HF's ML for Audio study group
Stars: ✭ 104 (-91.73%)
HotVoice
Adds Speech Recognition support to AutoHotkey, via a C# DLL
Stars: ✭ 41 (-96.74%)
Tacotron
Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
Stars: ✭ 493 (-60.81%)
Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-94.2%)
Noise2Noise-audio denoising without clean training data
Source code for the paper "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper tackles the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-96.1%)
api
Speechly public API definitions and generated code
Stars: ✭ 15 (-98.81%)
Libreasr
💬 An On-Premises, Streaming Speech Recognition System
Stars: ✭ 633 (-49.68%)
rnnt decoder cuda
An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.
Stars: ✭ 60 (-95.23%)
hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Stars: ✭ 88 (-93%)
twang
Library for advanced audio synthesis in pure Rust.
Stars: ✭ 83 (-93.4%)
Patter
Speech-to-text in PyTorch
Stars: ✭ 71 (-94.36%)
salutejs
SmartApp Framework for building skills for the "Салют" (Salute) family of virtual assistants in JavaScript
Stars: ✭ 35 (-97.22%)
DuME
A fast, versatile, easy-to-use and cross-platform Media Encoder based on FFmpeg
Stars: ✭ 66 (-94.75%)
Android-TTS-STT
One-line solution for Android Text to Speech (TTS) & Speech to Text (STT) conversion
Stars: ✭ 77 (-93.88%)
Phormatics
Using A.I. and computer vision to build a virtual personal fitness trainer. (Most Startup-Viable Hack - HackNYU2018)
Stars: ✭ 79 (-93.72%)
Aca Code
MATLAB scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
Stars: ✭ 67 (-94.67%)
Pncc
An implementation of Power Normalized Cepstral Coefficients (PNCC)
Stars: ✭ 40 (-96.82%)
speech-to-text
Mixlingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras
Stars: ✭ 61 (-95.15%)
minutes
🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-98.01%)
telltime
iOS application to tell the time in the British way 🇬🇧⏰
Stars: ✭ 49 (-96.1%)
Openimager
Image processing toolkit in R
Stars: ✭ 45 (-96.42%)
video-audio-tools
Simple, practical tools to process and edit video and audio with Python + FFmpeg.
Stars: ✭ 164 (-86.96%)
UnitySoundManager
Sound manager with 3 tracks, language system, pooling system, fade in/out effects, EventTrigger system and more.
Stars: ✭ 55 (-95.63%)
Beethoven
🎸 A maestro of pitch detection.
Stars: ✭ 601 (-52.23%)
SimpleCompressor
Code and theory of a look-ahead compressor / limiter.
Stars: ✭ 70 (-94.44%)
ruby-magic
Simple interface to libmagic for the Ruby programming language
Stars: ✭ 23 (-98.17%)
Chords.py
Neural networks applied to recognizing guitar chords, using Python and AutoML.NET with C# and .NET Core
Stars: ✭ 24 (-98.09%)
Sytody
A Flutter "speech to todo" app example
Stars: ✭ 79 (-93.72%)
ctc-asr
End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
Stars: ✭ 112 (-91.1%)
pydiogment
📣 Python library for audio augmentation
Stars: ✭ 64 (-94.91%)
kospeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
Stars: ✭ 456 (-63.75%)
Mycroft Precise
A lightweight, simple-to-use, RNN wake word listener
Stars: ✭ 481 (-61.76%)
scim
[WIP] Speech recognition toolbox written in Nim. Based on Arraymancer.
Stars: ✭ 17 (-98.65%)
tsunami
A simple but powerful audio editor
Stars: ✭ 41 (-96.74%)
Tika Python
Tika-Python is a Python binding to the Apache Tika™ REST services, allowing Tika to be called natively from Python.
Stars: ✭ 997 (-20.75%)
Q
C++ Library for Audio Digital Signal Processing
Stars: ✭ 481 (-61.76%)