WatbotAn Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Stars: ✭ 64 (-45.3%)
Fre-GAN-pytorchFre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-37.61%)
SER-datasetsA collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (-36.75%)
GttsPython library and CLI tool to interface with Google Translate's text-to-speech API
Stars: ✭ 1,303 (+1013.68%)
speech recognition ctcUse ctc to do chinese speech recognition by keras / 通过keras和ctc实现中文语音识别
Stars: ✭ 40 (-65.81%)
Java Speech ApiThe J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Stars: ✭ 490 (+318.8%)
ttslearnttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+35.04%)
Syn SpeechSyn.Speech is a flexible speaker independent continuous speech recognition engine for Mono and .NET framework
Stars: ✭ 57 (-51.28%)
MelNet-SpeechGenerationImplementation of MelNet in PyTorch to generate high-fidelity audio samples
Stars: ✭ 19 (-83.76%)
CboardAAC communication system with text-to-speech for the browser
Stars: ✭ 437 (+273.5%)
HTKThe Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.
Stars: ✭ 23 (-80.34%)
opensnipsOpen source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (-57.26%)
Neural spEnd-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+248.72%)
nlp-classA Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (-18.8%)
StlThe ITU-T Software Tool Library (G.191)
Stars: ✭ 44 (-62.39%)
Voice2MeshCVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (-42.74%)
UniSpeechUniSpeech - Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (+91.45%)
AudioData manipulation and transformation for audio signal processing, powered by PyTorch
Stars: ✭ 1,262 (+978.63%)
gtranscribeSoftware for interview transcription
Stars: ✭ 12 (-89.74%)
Voice BuilderAn opensource text-to-speech (TTS) voice building tool
Stars: ✭ 362 (+209.4%)
linear16Converts an audio file to LINEAR16 Google-speech compatible file.
Stars: ✭ 14 (-88.03%)
Dialectid e2eEnd to End Dialect Identification using Convolutional Neural Network
Stars: ✭ 40 (-65.81%)
DeepSegmentorSequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)
Stars: ✭ 17 (-85.47%)
Tts🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+4538.46%)
datasets🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+11754.7%)
HolobotHoloBot is a reusable 3D interface that allows HoloLens & VR users to interact with any bot using Mixed Reality & Speech.
Stars: ✭ 114 (-2.56%)
deepspeech.mxnetA MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (-29.91%)
Android SpeechAndroid speech recognition and text to speech made easy
Stars: ✭ 310 (+164.96%)
kaldi helpers🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-88.89%)
WsayWindows "say"
Stars: ✭ 36 (-69.23%)
data-at-hand-mobileMobile application for exploring fitness data using both speech and touch interaction.
Stars: ✭ 50 (-57.26%)
Pocketsphinx PythonPython interface to CMU Sphinxbase and Pocketsphinx libraries
Stars: ✭ 298 (+154.7%)
AdaSpeechAdaSpeech: Adaptive Text to Speech for Custom Voice
Stars: ✭ 108 (-7.69%)
TtsTools to convert text to speech 📚💬
Stars: ✭ 84 (-28.21%)
Sednndeep learning based speech enhancement using keras or pytorch, make it easy to use
Stars: ✭ 288 (+146.15%)
PhomemeSimple sentence mixing tool (work in progress)
Stars: ✭ 18 (-84.62%)
PykaldiA Python wrapper for Kaldi
Stars: ✭ 756 (+546.15%)
Zero-Shot-TTSUnofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (-71.79%)
Speech Alignerspeech-aligner,是一个从“人声语音”及其“语言文本”,产生音素级别时间对齐标注的工具。speech-aligner, is a tool that generate phoneme-level alignment between human speech and its transcription
Stars: ✭ 259 (+121.37%)
audio noise clusteringhttps://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.
Stars: ✭ 24 (-79.49%)
AudiomatePython library for handling audio datasets.
Stars: ✭ 99 (-15.38%)
simple-obs-sttSpeech-to-text and keyboard input captions for OBS.
Stars: ✭ 89 (-23.93%)
Noise2Noise-audio denoising without clean training dataSource code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-58.12%)
KeenASR-Android-PoCA proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
Stars: ✭ 21 (-82.05%)
PraatPraat: Doing Phonetics By Computer
Stars: ✭ 675 (+476.92%)
opensource-voice-toolsA repo listing known open source voice tools, ordered by where they sit in the voice stack
Stars: ✭ 21 (-82.05%)
minutes🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-78.63%)
FAST-RIRThis is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Stars: ✭ 90 (-23.08%)
OpenasrA pytorch based end2end speech recognition system.
Stars: ✭ 69 (-41.03%)
sova-asrSOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (+5.13%)
DurianImplementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: ✭ 111 (-5.13%)
WikipronMassively multilingual pronunciation mining
Stars: ✭ 99 (-15.38%)
Nlp Paper自然语言处理领域下的对话语音领域,整理相关论文(附阅读笔记),复现模型以及数据处理等(代码含TensorFlow和PyTorch两版本)
Stars: ✭ 67 (-42.74%)
VadVoice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Stars: ✭ 622 (+431.62%)
edittsOfficial implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (-36.75%)