A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server with voice commands; it can also be useful for hearing-impaired people.

✭ 35

javascript speech-recognition discord-bot speech speech-to-text voice-commands

Wsay

Windows "say"

✭ 36

windows command-line-tool speech text-to-speech tts speech-synthesis

Lightspeech

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

✭ 31

python pytorch speech text-to-speech tts speech-synthesis

Pykaldi

A Python wrapper for Kaldi

✭ 756

python numpy wrapper speech-recognition speech language-model feature-extraction asr kaldi

Annyang

💬 Speech recognition for your site

✭ 6,216

javascript HTML hacktoberfest speech-recognition speech speech-to-text voice

Praat

Praat: Doing Phonetics By Computer

✭ 675

c speech

Segan

Speech Enhancement Generative Adversarial Network in TensorFlow

✭ 661

python deep-learning tensorflow deep-neural-networks gan speech generative-model generative-adversarial-networks

Speech Emotion Analyzer

The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)

✭ 633

python3 jupyter-notebook deep-learning data-science keras neural-network natural-language-processing deep-neural-networks speech-recognition speech voice natural-language-understanding emotion

Vad

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

✭ 622

matlab data lstm speech-recognition attention speech dnn

Nodejs Speech

Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.

✭ 545

typescript nodejs machine-learning speech speech-to-text

Sonus

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

✭ 532

javascript node speech-recognition speech speech-to-text alexa voice-recognition voice-control

Speech Denoising Wavenet

A neural network for end-to-end speech denoising

✭ 516

python deep-learning machine-learning neural-networks speech speech-processing end-to-end wavenet

Tacotron

Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.

✭ 493

html machine-learning audio speech tts tacotron

Java Speech Api

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

✭ 490

java api google speech-recognition speech speech-to-text speech-synthesis recognition

Xr3player

🎧 🎼 Advanced JavaFX Media Player

✭ 472

java javafx mp3 speech audio-processing audio-player audio-visualizer web-browser

Cboard

AAC communication system with text-to-speech for the browser

✭ 437

javascript languages symbols progressive-web-app speech text-to-speech tts aac

Specaugment

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

✭ 408

python pytorch tensorflow speech-recognition speech data-augmentation

Neural sp

End-to-end ASR/LM implementation with PyTorch

✭ 408

python pytorch streaming speech-recognition transformer attention-mechanism attention seq2seq speech language-model asr sequence-to-sequence ctc

Awesome Kaldi

This is a list of features, scripts, blogs and resources for making better use of Kaldi ( http://kaldi-asr.org/ )

✭ 393

awesome-list speech-recognition speech speech-to-text kaldi

Voice Converter Cyclegan

Voice Converter Using CycleGAN and Non-Parallel Data

✭ 384

python speech cyclegan

Tts

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

✭ 305

python jupyter-notebook deep-learning pytorch speech text-to-speech tts tacotron

Voice Builder

An open-source text-to-speech (TTS) voice building tool

✭ 362

javascript nlp speech text-to-speech tts speech-synthesis

Inaspeechsegmenter

CNN-based audio segmentation toolkit. Detects speech, music and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

✭ 352

python music segmentation speech noise audio-analysis

Tts

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

✭ 5,427

python Jupyter Notebook deep-learning pytorch speech text-to-speech tts tacotron vocoder tensorflow2 tacotron2 melgan speaker-encoder dataset-analysis glow-tts multiband-melgan gantts

Ios 10 Sampler

Code examples for new APIs of iOS 10.

✭ 3,341

swift objective c ios demo convolutional-neural-networks cnn speech metal image-recognition ios10 uiviewpropertyanimator metal-performance-shaders metal-cnn

Android Speech

Android speech recognition and text to speech made easy

✭ 310

java android speech tts recognition

Css10

CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages

✭ 302

html dataset speech speech-to-text

Pocketsphinx Python

Python interface to CMU Sphinxbase and Pocketsphinx libraries

✭ 298

python speech-recognition speech voice

Pysptk

A python wrapper for Speech Signal Processing Toolkit (SPTK).

✭ 297

python speech dsp speech-synthesis speech-processing python-wrapper

Sednn

Deep-learning-based speech enhancement using Keras or PyTorch, made easy to use

✭ 288

python deep-learning deep-neural-networks speech

Speech Vad Demo

Integrates WebRTC's VAD for splitting audio files

✭ 259

c webrtc speech

Speech Aligner

speech-aligner is a tool that generates phoneme-level time-alignment annotations from human speech and its transcription.

✭ 259

cpp speech kaldi

Amazing Python Scripts

🚀 Curated collection of amazing Python scripts, from basic to advanced, with automation task scripts.

✭ 229

python jupyter-notebook machine-learning artificial-intelligence speech calculator webcam projects

Noise2Noise-audio denoising without clean training data

Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper tackles the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…

✭ 49

Jupyter Notebook python deep-learning speech autoencoder data-collection noise-reduction speech-enhancement speech-denoising noise-removal noise2noise audio-denoising audio-enhancement

hifigan-denoiser

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

✭ 88

python waveform speech pytorch gan wavenet speech-processing denoising denoiser hifigan

minutes

🔭 Speaker diarization via transfer learning

✭ 25

python machine-learning library speech transfer-learning ubc speaker-diarization

flite-go

Go bindings for Flite (festival-lite)

✭ 14

go shell speech

sova-asr

SOVA ASR (Automatic Speech Recognition)

✭ 123

python javascript CSS HTML Dockerfile speech speech-recognition automatic-speech-recognition speech-to-text stt asr wav2letter asr-model

tt-vae-gan

Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.

✭ 37

python music speech generative-adversarial-network variational-autoencoder timbre timbre-transfer voice-conversion-gan

Speech256

An FPGA implementation of a classic 1980s speech synthesizer. Done for the Retro Challenge 2017/10.

✭ 51

Verilog python fpga speech synthesizer verilog hdl retrochallenge

editts

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

✭ 74

python cython text-to-speech speech pytorch tts speech-synthesis speech-edit

ser-with-w2v2

Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'