https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.

✭ 24

python TeX HTML CSS machine-learning clustering dsp scikit-learn speech audio-analysis data-reduction noise-reduction audio-processing

Shifter

Pitch shifter using WSOLA and resampling, implemented in Python 3

✭ 22

python shell signal-processing speech voice-control voice-conversion speech-processing

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

✭ 89

typescript HTML rust CSS javascript twitch angular azure webrtc speech captions tts subtitles speech-recognition speech-to-text obs stt text-animation tauri akita stt-plugins

TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis

✭ 65

python speech tts speech-synthesis gan frequency-domain tfgan fidelity-speech-synthesis

KeenASR-Android-PoC

A proof-of-concept app using the KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

✭ 21

java offline voice-commands speech voice-recognition speech-recognition voice-chat speech-to-text voice-control voice-assistant speech-to-text-android on-device

room-impulse-responses

A list of publicly available room impulse response datasets and scripts to download them.

✭ 143

shell speech acoustics room-impulse-response

opensource-voice-tools

A repo listing known open source voice tools, ordered by where they sit in the voice stack

✭ 21

TeX chatbot voice corpus speech conversational-ui tts speech-recognition stt asr

lidbox

End-to-end spoken language identification out of the box.

✭ 39

python big-data deep-learning tensorflow speech audio-analysis language-recognition language-identification spoken-language-recognition spoken-language-identification

FAST-RIR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

✭ 90

python shell deep-learning neural-network speech impulse-response generative-adversarial-network automatic-speech-recognition rir augmentation acoustics room-impulse-response synthetic-data conditional-generation diffuse-scattering implicit-neural-representation

SignDetect

This application was developed to help people who cannot speak interact with others with ease. It detects voice and converts the input speech into a sign-language-based video.

✭ 21

python CSS javascript HTML shell nodejs flask video voice sign-language speech

eidos-audition

Collection of auditory models.

✭ 25

C++ Starlark python shell pipeline algorithms signal-processing speech perception auditory

NBSS

The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".

✭ 77

python speech pytorch multi-channel separation narrow-band full-band

cape

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

✭ 29

python audio text speech pytorch transformer vit cape positional-encoder positional-encoding visual-transformer positional-embedding

icassp2019-latex-template

ICASSP 2019 official LaTeX template

✭ 21

TeX latex conference signal-processing speech ieee acoustics icassp icassp-2019

ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

✭ 179

shell data speech speech-recognition audio-data speech-to-text asr speech-activities

ventib

📈 Ventib records your voice, transcribes it in real time, and performs speech pattern analysis to give you objective statistics about how you speak.

✭ 43

javascript python HTML CSS chart statistics analytics speech speech-pattern-analysis

pytorch-pcen

PyTorch reimplementation of per-channel energy normalization for audio.

✭ 80

python audio speech pytorch

wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

✭ 205

python pyaudio speech speech-recognition speech-to-text asr wav2vec wav2vec2

txt2speech

Convert text to speech using the Google Translate API

✭ 38

ruby bing speech

anycontrol

Voice control for your websites and applications

✭ 53

javascript voice speech speech-recognition speech-to-text voice-control voice-assistant speech-api anycontrol

TF-Speech-Recognition-Challenge-Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

✭ 58

Jupyter Notebook python shell raspberry-pi deep-learning neural-network tensorflow scikit-learn speech recurrent-neural-networks speech-recognition ensemble-learning convolutional-neural-networks audio-recognition

Multimodal-Gesture-Recognition-with-LSTMs-and-CTC

An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.

✭ 25

python tensorflow keras speech lstm ctc skeletal multimodal-gesture-recognition

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

✭ 295

python text-to-speech deep-learning toolkit speech pytorch tts speech-synthesis speech-processing

react-native-speech-bubble

💬 A speech bubble dialog component for React Native.

✭ 50

javascript ui react-native component dialog speech bubble typewriter introduction speech-bubble

VQMIVC

Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!

✭ 278

Jupyter Notebook python shell perl speech voice-conversion one-shot disentanglement-learning speech-generation
