Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Transfers the timbre of one audio source to another.

Stars: ✭ 37 (-94.52%)

Mutual labels: speech

web-speech-demo

Learn how to build a simple text-to-speech voice app for the web using the Web Speech API.

Stars: ✭ 19 (-97.19%)

Mutual labels: speech

Inaspeechsegmenter

CNN-based audio segmentation toolkit. Detects speech, music and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Stars: ✭ 352 (-47.85%)

Mutual labels: speech

jarvis

Jarvis Home Automation

Stars: ✭ 81 (-88%)

Mutual labels: speech

Xr3player

🎧 🎼 Advanced JavaFX Media Player

Stars: ✭ 472 (-30.07%)

Mutual labels: speech

spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Stars: ✭ 52 (-92.3%)

Mutual labels: speech

Css10

CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages

Stars: ✭ 302 (-55.26%)

Mutual labels: speech

jackpair

P2P speech-encrypting device with an analog audio interface, suitable for GSM phones

Stars: ✭ 26 (-96.15%)

Mutual labels: speech

Sonus

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

Stars: ✭ 532 (-21.19%)

Mutual labels: speech

nabaztag-php

A simple PHP implementation of a Nabaztag server

Stars: ✭ 14 (-97.93%)

Mutual labels: speech

Speech Vad Demo

Integrates WebRTC's VAD to split audio files

Stars: ✭ 259 (-61.63%)

Mutual labels: speech

speech to text

How to use the Google Cloud Speech API to transcribe audio/video files.

Stars: ✭ 35 (-94.81%)

Mutual labels: speech

Awesome Kaldi

This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )

Stars: ✭ 393 (-41.78%)

Mutual labels: speech

Audio Signal Processing

Audio or speech signal processing guide.

Stars: ✭ 45 (-93.33%)

Mutual labels: speech

hifigan-denoiser

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Stars: ✭ 88 (-86.96%)

Mutual labels: speech

sova-asr

SOVA ASR (Automatic Speech Recognition)

Stars: ✭ 123 (-81.78%)

Mutual labels: speech

gtranscribe

Software for interview transcription

Stars: ✭ 12 (-98.22%)

Mutual labels: speech

Voice Builder

An open-source text-to-speech (TTS) voice building tool

Stars: ✭ 362 (-46.37%)

Mutual labels: speech

Speech256

An FPGA implementation of a classic 1980s speech synthesizer. Done for the Retro Challenge 2017/10.

Stars: ✭ 51 (-92.44%)

Mutual labels: speech

Java Speech Api

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Stars: ✭ 490 (-27.41%)

Mutual labels: speech

ser-with-w2v2

Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'

Stars: ✭ 40 (-94.07%)

Mutual labels: speech

Tts

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Stars: ✭ 5,427 (+704%)

Mutual labels: speech

torch-asg

Auto Segmentation Criterion (ASG) implemented in PyTorch

Stars: ✭ 42 (-93.78%)

Mutual labels: speech

Nodejs Speech

Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.

Stars: ✭ 545 (-19.26%)

Mutual labels: speech

Fre-GAN-pytorch

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Stars: ✭ 73 (-89.19%)

Mutual labels: speech

Android Speech

Android speech recognition and text to speech made easy

Stars: ✭ 310 (-54.07%)

Mutual labels: speech

SER-datasets

A collection of datasets for the purpose of emotion recognition/detection in speech.

Stars: ✭ 74 (-89.04%)

Mutual labels: speech

Cboard

AAC communication system with text-to-speech for the browser

Stars: ✭ 437 (-35.26%)

Mutual labels: speech

speech recognition ctc

Chinese speech recognition using CTC with Keras

Stars: ✭ 40 (-94.07%)

Mutual labels: speech

Pocketsphinx Python

Python interface to CMU Sphinxbase and Pocketsphinx libraries

Stars: ✭ 298 (-55.85%)

Mutual labels: speech

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (-76.59%)

Mutual labels: speech

Speech Emotion Analyzer

A neural network model capable of detecting five different emotions in male and female speech audio. (Deep Learning, NLP, Python)

Stars: ✭ 633 (-6.22%)

Mutual labels: speech

MelNet-SpeechGeneration

Implementation of MelNet in PyTorch to generate high-fidelity audio samples

Stars: ✭ 19 (-97.19%)

Mutual labels: speech

Sednn

Deep-learning-based speech enhancement using Keras or PyTorch, designed to be easy to use

Stars: ✭ 288 (-57.33%)

Mutual labels: speech

HTK

The Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.

Stars: ✭ 23 (-96.59%)

Mutual labels: speech

Neural sp

End-to-end ASR/LM implementation with PyTorch

Stars: ✭ 408 (-39.56%)

Mutual labels: speech

opensnips

Open source projects related to Snips https://snips.ai/.

Stars: ✭ 50 (-92.59%)

Mutual labels: speech

Speech Aligner

speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription

Stars: ✭ 259 (-61.63%)

Mutual labels: speech

nlp-class

A Natural Language Processing course taught by Professor Ghassemi

Stars: ✭ 95 (-85.93%)

Mutual labels: speech

Speech Denoising Wavenet

A neural network for end-to-end speech denoising

Stars: ✭ 516 (-23.56%)

Mutual labels: speech

Voice2Mesh

CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

Stars: ✭ 67 (-90.07%)

Mutual labels: speech

Noise2Noise-audio denoising without clean training data

Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…

Stars: ✭ 49 (-92.74%)

Mutual labels: speech

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (-66.81%)

Mutual labels: speech

Voice Converter Cyclegan

Voice Converter Using CycleGAN and Non-Parallel Data

Stars: ✭ 384 (-43.11%)

Mutual labels: speech

minutes

🔭 Speaker diarization via transfer learning