Keras KaldiKeras Interface for Kaldi ASR
Stars: ✭ 124 (+439.13%)
speech-to-text-code-patternReact app using the Watson Speech to Text service to transform voice audio into written text.
Stars: ✭ 37 (+60.87%)
Project aliasAlias is a teachable “parasite” that is designed to give users more control over their smart assistants, both when it comes to customisation and privacy. Through a simple app the user can train Alias to react on a custom wake-word/sound, and once trained, Alias can take control over your home assistant by activating it for you.
Stars: ✭ 1,577 (+6756.52%)
NonautoreggenprogressTracking the progress in non-autoregressive generation (translation, transcription, etc.)
Stars: ✭ 118 (+413.04%)
voice gender detection♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).
Stars: ✭ 51 (+121.74%)
Rnn TransducerMXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
Stars: ✭ 114 (+395.65%)
TF-Speech-Recognition-Challenge-SolutionSource code of the model used in Tensorflow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in top 5% in private leaderboard.
Stars: ✭ 58 (+152.17%)
Ml RoadMachine Learning Resources, Practice and Research
Stars: ✭ 1,776 (+7621.74%)
talkieText-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
Stars: ✭ 43 (+86.96%)
vocoPrivacy friendly voice control for the Candle Controller / WebThings Gateway
Stars: ✭ 18 (-21.74%)
PansoriTools for ASR Corpus Generation from Online Video
Stars: ✭ 106 (+360.87%)
spokestack-tray-androidA UI component that makes it easy to add voice interaction to your app.
Stars: ✭ 13 (-43.48%)
IMS-ToucanText-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
Stars: ✭ 295 (+1182.61%)
DeltaDELTA is a deep learning based natural language and speech processing platform.
Stars: ✭ 1,479 (+6330.43%)
ppg-vcPPG-Based Voice Conversion
Stars: ✭ 154 (+569.57%)
Wav2letter.pytorchA fully convolution-network for speech-to-text, built on pytorch.
Stars: ✭ 104 (+352.17%)
megsA merged version of multiple open-source German speech datasets.
Stars: ✭ 21 (-8.7%)
Vosk ApiOffline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Stars: ✭ 1,357 (+5800%)
VoiceBridgeVoiceBridge - an AI-TOOLKIT Open Source C++ Speech Recognition Toolkit
Stars: ✭ 17 (-26.09%)
Factorized TdnnPyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
Stars: ✭ 98 (+326.09%)
EmotionalConversionStarGANThis repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".
Stars: ✭ 92 (+300%)
Ai Study人工智能学习资料超全整理,包含机器学习基础ML、深度学习基础DL、计算机视觉CV、自然语言处理NLP、推荐系统、语音识别、图神经网路、算法工程师面试题
Stars: ✭ 93 (+304.35%)
Deep Learning DrizzleDrench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
Stars: ✭ 9,717 (+42147.83%)
ttsflowtensorflow speech synthesis c++ inference for voicenet
Stars: ✭ 17 (-26.09%)
SpeechlessSpeech-to-text based on wav2letter built for transfer learning
Stars: ✭ 89 (+286.96%)
VAENAR-TTSPyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
Stars: ✭ 66 (+186.96%)
JuliusOpen-Source Large Vocabulary Continuous Speech Recognition Engine
Stars: ✭ 1,258 (+5369.57%)
WavegradImplementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+965.22%)
B.e.n.j.i.B.E.N.J.I.- The Impossible Missions Force's digital assistant
Stars: ✭ 83 (+260.87%)
Tensorflow-Keyword-SpottingKeyword spotting using various architecture like convolutional vggnet , 1D convolutional network and CTC.
Stars: ✭ 27 (+17.39%)
Deepspeech Websocket ServerServer & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments
Stars: ✭ 79 (+243.48%)
NormitTranslations with speech synthesis in your terminal as a node package
Stars: ✭ 219 (+852.17%)
Sytodya Flutter "speech to todo" app example
Stars: ✭ 79 (+243.48%)
InimesedAn Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.
Stars: ✭ 65 (+182.61%)
PyspeechrevThis python code performs an efficient speech reverberation starting from a dataset of close-talking speech signals and a collection of acoustic impulse responses.
Stars: ✭ 74 (+221.74%)
wav2letterFacebook AI Research's Automatic Speech Recognition Toolkit
Stars: ✭ 6,026 (+26100%)
Catch-A-WaveformOfficial pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)
Stars: ✭ 117 (+408.7%)
Asr benchmarkProgram to benchmark various speech recognition APIs
Stars: ✭ 71 (+208.7%)
mdmTerminal2Голосовой терминал для MajorDoMo
Stars: ✭ 24 (+4.35%)
edittsOfficial implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (+221.74%)
OpenasrA pytorch based end2end speech recognition system.
Stars: ✭ 69 (+200%)
DeepSpeech-APIThe code enables users to use Mozilla's Deep Speech model over the Web Browser.
Stars: ✭ 31 (+34.78%)
Dragonfirethe open-source virtual assistant for Ubuntu based Linux distributions
Stars: ✭ 1,120 (+4769.57%)
A chronology of deep learningTracing back and exposing in chronological order the main ideas in the field of deep learning, to help everyone better understand the current intense research in AI.
Stars: ✭ 47 (+104.35%)
TFGANTFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis
Stars: ✭ 65 (+182.61%)
Fre-GAN-pytorchFre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (+217.39%)
VocganVocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Stars: ✭ 158 (+586.96%)
YourTTSYourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Stars: ✭ 217 (+843.48%)
YouTube-Tutorials--Italian📂 Source Code for (some of) the Programming Tutorials from my Italian YouTube Channel and website ProgrammareInPython.it. This is just a small portion of the content: please visit the website for more.
Stars: ✭ 28 (+21.74%)
Voice2MeshCVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (+191.3%)
syn-speech-samplesAn application that demostrate the usage of Syn.Speech library for Speech Recognition
Stars: ✭ 24 (+4.35%)