The J.A.R.V.I.S. Speech API is a simple, efficient speech API written in Java. It includes a recognizer, a synthesizer, and a microphone capture utility; the recognizer and synthesizer are backed by Google's speech services. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Stars: ✭ 490 (+142.57%)

Mutual labels: speech

MajorDomo-Scenarios

Scenarios for the MajorDomo home automation system

Stars: ✭ 12 (-94.06%)

Mutual labels: speech

Wavenet vocoder

WaveNet vocoder

Stars: ✭ 1,926 (+853.47%)

Mutual labels: speech

Phomeme

Simple sentence mixing tool (work in progress)

Stars: ✭ 18 (-91.09%)

Mutual labels: speech

Cboard

AAC communication system with text-to-speech for the browser

Stars: ✭ 437 (+116.34%)

Mutual labels: speech

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Stars: ✭ 33 (-83.66%)

Mutual labels: speech

Python Speech recognition

A simple example of using the Baidu speech recognition API with Python.

Stars: ✭ 106 (-47.52%)

Mutual labels: speech

audio noise clustering

https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.

Stars: ✭ 24 (-88.12%)

Mutual labels: speech

Neural sp

End-to-end ASR/LM implementation with PyTorch

Stars: ✭ 408 (+101.98%)

Mutual labels: speech

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (-55.94%)

Mutual labels: speech

Siricontrol System

Control anything with Siri voice commands.

Stars: ✭ 180 (-10.89%)

Mutual labels: speech

KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

Stars: ✭ 21 (-89.6%)

Mutual labels: speech

Voice Converter Cyclegan

Voice Converter Using CycleGAN and Non-Parallel Data

Stars: ✭ 384 (+90.1%)

Mutual labels: speech

opensource-voice-tools

A repo listing known open source voice tools, ordered by where they sit in the voice stack

Stars: ✭ 21 (-89.6%)

Mutual labels: speech

Audiomate

Python library for handling audio datasets.

Stars: ✭ 99 (-50.99%)

Mutual labels: speech

FAST-RIR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Stars: ✭ 90 (-55.45%)

Mutual labels: speech

Voice Builder

An opensource text-to-speech (TTS) voice building tool

Stars: ✭ 362 (+79.21%)

Mutual labels: speech

eidos-audition

Collection of auditory models.

Stars: ✭ 25 (-87.62%)

Mutual labels: speech

Wavegrad

A fast, high-quality neural vocoder.

Stars: ✭ 138 (-31.68%)

Mutual labels: speech

cape

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

Stars: ✭ 29 (-85.64%)

Mutual labels: speech

Tts

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Stars: ✭ 5,427 (+2586.63%)

Mutual labels: speech

ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 179 (-11.39%)

Mutual labels: speech

Gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

Stars: ✭ 1,303 (+545.05%)

Mutual labels: speech

pytorch-pcen

PyTorch reimplementation of per-channel energy normalization for audio.

Stars: ✭ 80 (-60.4%)

Mutual labels: speech

Android Speech

Android speech recognition and text to speech made easy

Stars: ✭ 310 (+53.47%)

Mutual labels: speech

txt2speech

Convert text to speech using Google Translate API

Stars: ✭ 38 (-81.19%)

Mutual labels: speech

Emotion Classification From Audio Files

Understanding emotions from audio files using neural networks and multiple datasets.

Stars: ✭ 189 (-6.44%)

Mutual labels: speech

TF-Speech-Recognition-Challenge-Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

Stars: ✭ 58 (-71.29%)

Mutual labels: speech

Pocketsphinx Python

Python interface to CMU Sphinxbase and Pocketsphinx libraries

Stars: ✭ 298 (+47.52%)

Mutual labels: speech

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+46.04%)

Mutual labels: speech

Audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Stars: ✭ 1,262 (+524.75%)

Mutual labels: speech

VQMIVC

Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!

Stars: ✭ 278 (+37.62%)

Mutual labels: speech

Sednn

Deep-learning-based speech enhancement using Keras or PyTorch, made easy to use

Stars: ✭ 288 (+42.57%)

Mutual labels: speech

Naver-AI-Hackathon-Speech

2019 Clova AI Hackathon : Speech - Rank 12 / Team Kai.Lib

Stars: ✭ 26 (-87.13%)

Mutual labels: speech

Allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Stars: ✭ 135 (-33.17%)

Mutual labels: speech

lectures-all

Central repository for all lectures on deep learning at UPC ETSETB TelecomBCN.

Stars: ✭ 46 (-77.23%)

Mutual labels: speech

Speech Aligner

speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.

Stars: ✭ 259 (+28.22%)

Mutual labels: speech

Wavegrad

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

Stars: ✭ 245 (+21.29%)

Mutual labels: speech

Tts

Tools to convert text to speech 📚💬

Stars: ✭ 84 (-58.42%)

Mutual labels: speech

Kerasdeepspeech

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

Stars: ✭ 245 (+21.29%)

Mutual labels: speech

Noise2Noise-audio denoising without clean training data

Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…

Stars: ✭ 49 (-75.74%)

Mutual labels: speech

Lhotse

Stars: ✭ 236 (+16.83%)

Mutual labels: speech

Deep speaker Speaker recognition system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

Stars: ✭ 174 (-13.86%)

Mutual labels: speech

Setk

Tools for Speech Enhancement integrated with Kaldi

Stars: ✭ 227 (+12.38%)