This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.

Stars: ✭ 51 (-75.12%)

Mutual labels: speech-to-text

Esp8266sam

Speech synthesis for ESP8266 using S.A.M. port

Stars: ✭ 199 (-2.93%)

Mutual labels: speech

Clovacall

ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020)

Stars: ✭ 151 (-26.34%)

Mutual labels: speech-recognition

Transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Stars: ✭ 55,742 (+27091.22%)

Mutual labels: speech-recognition

Deepspeech German

Automatic Speech Recognition (ASR) - German

Stars: ✭ 179 (-12.68%)

Mutual labels: speech-recognition

Nonocaptcha

An asynchronous Python library to automate solving ReCAPTCHA v2 using audio

Stars: ✭ 744 (+262.93%)

Mutual labels: speech-to-text

YouTube-Tutorials--Italian

📂 Source Code for (some of) the Programming Tutorials from my Italian YouTube Channel and website ProgrammareInPython.it. This is just a small portion of the content: please visit the website for more.

Stars: ✭ 28 (-86.34%)

Mutual labels: speech-recognition

Parrots

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese.

Stars: ✭ 48 (-76.59%)

Mutual labels: speech-recognition

bingspeech-api-client

Microsoft Bing Speech API client in node.js

Stars: ✭ 32 (-84.39%)

Mutual labels: speech-to-text

Project alias

Alias is a teachable “parasite” designed to give users more control over their smart assistants, in terms of both customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake-word/sound; once trained, Alias can take control of the user's home assistant by activating it for them.

Stars: ✭ 1,577 (+669.27%)

Mutual labels: speech-recognition

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (-22.93%)

Mutual labels: speech

Cortex M Kws

Cortex M KWS example with Tengine Lite.

Stars: ✭ 45 (-78.05%)

Mutual labels: speech-recognition

Athena

A free and open-source replacement for Google Assistant on Android devices, meant to integrate with the Sapphire Framework. It contains both speech-to-text and text-to-speech services. It does not require Google services or network connectivity.

Stars: ✭ 73 (-64.39%)

Mutual labels: speech-to-text

Jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Stars: ✭ 158 (-22.93%)

Mutual labels: speech-to-text

Wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

Stars: ✭ 5,907 (+2781.46%)

Mutual labels: speech-recognition

Formant Analyzer

iOS application for finding formants in spoken sounds

Stars: ✭ 43 (-79.02%)

Mutual labels: speech-recognition

kaldi-alligner

scripts to align a given wave to its transcription using trained models by Kaldi

Stars: ✭ 24 (-88.29%)

Mutual labels: asr

Sounder

An intent recognizing algorithm to predict the intent of a given text.

Stars: ✭ 118 (-42.44%)

Mutual labels: speech-recognition

Praat

Praat: Doing Phonetics By Computer

Stars: ✭ 675 (+229.27%)

Mutual labels: speech

React Native Dialogflow

A React-Native Bridge for the Google Dialogflow (API.AI) SDK

Stars: ✭ 182 (-11.22%)

Mutual labels: speech

Swiftspeech

A speech recognition framework designed for SwiftUI.

Stars: ✭ 149 (-27.32%)

Mutual labels: speech-recognition

Pansori

Tools for ASR Corpus Generation from Online Video

Stars: ✭ 106 (-48.29%)

Mutual labels: speech-recognition

Awesome Diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Stars: ✭ 673 (+228.29%)

Mutual labels: speech-recognition

KAREN

KAREN: Unifying Hatespeech Detection and Benchmarking

Stars: ✭ 18 (-91.22%)

Mutual labels: speech

Dialectid e2e

End to End Dialect Identification using Convolutional Neural Network

Stars: ✭ 40 (-80.49%)

Mutual labels: speech

Cidlib

The CIDLib general purpose C++ development environment

Stars: ✭ 179 (-12.68%)

Mutual labels: speech-recognition

Awesome Speech Recognition Speech Synthesis Papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Stars: ✭ 2,085 (+917.07%)

Mutual labels: speech-recognition

Segan

Speech Enhancement Generative Adversarial Network in TensorFlow

Stars: ✭ 661 (+222.44%)

Mutual labels: speech

Voice

🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)

Stars: ✭ 993 (+384.39%)

Mutual labels: speech-recognition

nlp-class

A Natural Language Processing course taught by Professor Ghassemi

Stars: ✭ 95 (-53.66%)

Mutual labels: speech

Wsay

Windows "say"

Stars: ✭ 36 (-82.44%)

Mutual labels: speech

Speech Recognition Neural Network

This is the end-to-end Speech Recognition neural network, deployed in Keras. This was my final project for Artificial Intelligence Nanodegree @Udacity.

Stars: ✭ 148 (-27.8%)

Mutual labels: speech-recognition

Ios ml

List of Machine Learning, AI, NLP solutions for iOS. The most recent version of this article can be found on my blog.

Stars: ✭ 1,409 (+587.32%)

Mutual labels: speech-recognition

Voicy

@voicybot Telegram bot main repository

Stars: ✭ 620 (+202.44%)

Mutual labels: speech-to-text

Tfg Voice Conversion

Deep Learning-based Voice Conversion system

Stars: ✭ 115 (-43.9%)

Mutual labels: speech

Emotion Classification From Audio Files

Understanding emotions from audio files using neural networks and multiple datasets.

Stars: ✭ 189 (-7.8%)

Mutual labels: speech

Kaldi Gop

Computes the GMM-based Goodness of Pronunciation (GOP). Based on Kaldi.

Stars: ✭ 104 (-49.27%)

Mutual labels: speech-recognition

Speech Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

Stars: ✭ 565 (+175.61%)

Mutual labels: asr

Listen Attend Spell

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.

Stars: ✭ 147 (-28.29%)

Mutual labels: asr

Speech recognition

Chinese speech recognition

Stars: ✭ 534 (+160.49%)

Mutual labels: speech-recognition

Arvutaja

An Android app for voice actions in Estonian and English

Stars: ✭ 28 (-86.34%)

Mutual labels: speech-recognition

formulas-python

Ritchie CLI formulas in Python 🐍

Stars: ✭ 17 (-91.71%)

Mutual labels: speech-recognition

Tensorflow-Keyword-Spotting

Keyword spotting using various architectures, such as a convolutional VGGNet, a 1D convolutional network, and CTC.

Stars: ✭ 27 (-86.83%)

Mutual labels: speech-recognition

Speechtotext Websockets Java

SDK & Sample to do speech recognition using websockets in Java

Stars: ✭ 11 (-94.63%)

Mutual labels: speech-to-text

gtranscribe

Software for interview transcription

Stars: ✭ 12 (-94.15%)

Mutual labels: speech

Aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Stars: ✭ 1,942 (+847.32%)

Mutual labels: speech

Ctcdecoder

Connectionist Temporal Classification (CTC) decoding algorithms: best path, prefix search, beam search and token passing. Implemented in Python.

Stars: ✭ 529 (+158.05%)

Mutual labels: speech-recognition

Speech Denoising Wavenet

A neural network for end-to-end speech denoising

Stars: ✭ 516 (+151.71%)

Mutual labels: speech

Tacotron

Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.

Stars: ✭ 493 (+140.49%)

Mutual labels: speech

Mycroft Precise

A lightweight, simple-to-use, RNN wake word listener

Stars: ✭ 481 (+134.63%)

Mutual labels: speech-recognition

Xr3player

🎧 🎼 Advanced JavaFX Media Player

Stars: ✭ 472 (+130.24%)

Mutual labels: speech

Zerospeech Tts Without T

A Pytorch implementation for the ZeroSpeech 2019 challenge.

Stars: ✭ 100 (-51.22%)

Mutual labels: asr

Rhasspy

Offline private voice assistant for many human languages

Stars: ✭ 458 (+123.41%)

Mutual labels: speech-recognition

Wavenet vocoder

WaveNet vocoder

Stars: ✭ 1,926 (+839.51%)

Mutual labels: speech

301-360 of 528 similar projects
