Wsay
Windows "say"
Stars: ✭ 36 (-82.44%)
FAST-RIR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Stars: ✭ 90 (-56.1%)
SignDetect
This application was developed to help people who cannot speak interact with others with ease. It detects voice and converts the input speech into a sign-language-based video.
Stars: ✭ 21 (-89.76%)
Lightspeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Stars: ✭ 31 (-84.88%)
gtranscribe
Software for interview transcription
Stars: ✭ 12 (-94.15%)
Audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
Stars: ✭ 1,262 (+515.61%)
pie
Baidu Cloud streaming speech recognition client SDK
Stars: ✭ 62 (-69.76%)
IMS-Toucan
Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
Stars: ✭ 295 (+43.9%)
Voice Gender
Gender recognition by voice and speech analysis
Stars: ✭ 248 (+20.98%)
NBSS
The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".
Stars: ✭ 77 (-62.44%)
cape
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
Stars: ✭ 29 (-85.85%)
DeepSegmentor
Sequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)
Stars: ✭ 17 (-91.71%)
Tts
Tools to convert text to speech 📚💬
Stars: ✭ 84 (-59.02%)
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+6665.85%)
ventib
📈 Ventib records your voice, transcribes it in realtime, and performs speech pattern analysis to give you objective statistics about how you speak.
Stars: ✭ 43 (-79.02%)
Praat
Praat: Doing Phonetics By Computer
Stars: ✭ 675 (+229.27%)
pytorch-pcen
PyTorch reimplementation of per-channel energy normalization for audio.
Stars: ✭ 80 (-60.98%)
Segan
Speech Enhancement Generative Adversarial Network in TensorFlow
Stars: ✭ 661 (+222.44%)
Wavegrad
Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+19.51%)
Chatbot Watson Android
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Stars: ✭ 169 (-17.56%)
Dragonfly
Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
Stars: ✭ 209 (+1.95%)
Tacotron
Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.
Stars: ✭ 493 (+140.49%)
Subsync
Subtitle Speech Synchronizer
Stars: ✭ 379 (+84.88%)
data-at-hand-mobile
Mobile application for exploring fitness data using both speech and touch interaction.
Stars: ✭ 50 (-75.61%)
Espnet
End-to-End Speech Processing Toolkit
Stars: ✭ 4,533 (+2111.22%)
Libfaceid
libfaceid is a research framework for prototyping face recognition solutions. It seamlessly integrates multiple detection, recognition and liveness models with speech synthesis and speech recognition.
Stars: ✭ 354 (+72.68%)
CVC
CVC: Contrastive Learning for Non-parallel Voice Conversion (INTERSPEECH 2021, in PyTorch)
Stars: ✭ 45 (-78.05%)
Alan Sdk Ios
Alan AI iOS SDK adds a voice assistant or chatbot to your app. Supports Swift, Objective-C.
Stars: ✭ 318 (+55.12%)
Tts Papers
🐸 Collection of TTS papers
Stars: ✭ 160 (-21.95%)
Xr3player
🎧 🎼 Advanced JavaFX Media Player
Stars: ✭ 472 (+130.24%)
Multimodal-Gesture-Recognition-with-LSTMs-and-CTC
An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.
Stars: ✭ 25 (-87.8%)
Cboard
AAC communication system with text-to-speech for the browser
Stars: ✭ 437 (+113.17%)
Cidlib
The CIDLib general purpose C++ development environment
Stars: ✭ 179 (-12.68%)
Aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+847.32%)
Kaldi Onnx
Kaldi model converter to ONNX
Stars: ✭ 174 (-15.12%)
StyleSpeech
Official implementation of Meta-StyleSpeech and StyleSpeech
Stars: ✭ 161 (-21.46%)
Pocketsphinx
PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop
Stars: ✭ 2,934 (+1331.22%)
Stl
The ITU-T Software Tool Library (G.191)
Stars: ✭ 44 (-78.54%)
Shifter
Pitch shifter using WSOLA and resampling, implemented in Python 3
Stars: ✭ 22 (-89.27%)
Kaldiio
A pure Python module for reading and writing Kaldi ark files
Stars: ✭ 160 (-21.95%)
obvi
A Polymer 3+ web component / button for doing speech recognition
Stars: ✭ 54 (-73.66%)
Tacotron pytorch
PyTorch implementation of Tacotron speech synthesis model.
Stars: ✭ 242 (+18.05%)
Tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Stars: ✭ 1,756 (+756.59%)