Vchsm
C++11 algorithm implementation for voice conversion using harmonic plus stochastic models
Stars: ✭ 38 (-70.77%)
room-impulse-responses
A list of publicly available room impulse response datasets and scripts to download them.
Stars: ✭ 143 (+10%)
Imagedetect
✂️ Detect and crop faces, barcodes, and text in images with the iOS 11 Vision API.
Stars: ✭ 286 (+120%)
Hand-Digits-Recognition
Recognize your own handwritten digits with TensorFlow, embedded in a PyQt5 GUI. The neural network was trained on MNIST.
Stars: ✭ 11 (-91.54%)
Teaspeak
The TeaSpeak server issue tracker
Stars: ✭ 81 (-37.69%)
brasiltts
Brasil TTS is a set of Brazilian Portuguese speech synthesizers that read screens aloud for people with visual impairments. It converts text to audio, giving blind and low-vision users access to the content displayed on screen. Although the main target audience of text-to-speech systems such as Brasil TTS consists of…
Stars: ✭ 34 (-73.85%)
Disgord
Go module for interacting with Discord's documented bot interface: Gateway, REST requests, and voice
Stars: ✭ 277 (+113.08%)
Discordspeechbot
A speech-to-text bot for Discord with music commands and more, built with Node.js. Ideal for controlling your Discord server with voice commands; it can also be useful for hearing-impaired people.
Stars: ✭ 35 (-73.08%)
lidbox
End-to-end spoken language identification out of the box.
Stars: ✭ 39 (-70%)
Noisetorch
Real-time microphone noise suppression on Linux.
Stars: ✭ 5,199 (+3899.23%)
awesome-rhasspy
Carefully curated list of projects and resources for the voice assistant Rhasspy
Stars: ✭ 50 (-61.54%)
Alan Sdk Pcf
Alan AI Power Apps SDK adds a voice assistant or chatbot to your Microsoft Power Apps project.
Stars: ✭ 128 (-1.54%)
Iter Reason
Code for Iterative Reasoning Paper (CVPR 2018)
Stars: ✭ 263 (+102.31%)
JustAnotherVoiceChat
TeamSpeak 3 plugin to control 3D voice communication in games
Stars: ✭ 21 (-83.85%)
Wsay
Windows "say"
Stars: ✭ 36 (-72.31%)
karen
Open-source voice assistant
Stars: ✭ 19 (-85.38%)
Speech Aligner
speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.
Stars: ✭ 259 (+99.23%)
Deepspeech
A PaddlePaddle implementation of ASR.
Stars: ✭ 1,219 (+837.69%)
Amazing Python Scripts
🚀 Curated collection of amazing Python scripts, from basic to advanced, including automation task scripts.
Stars: ✭ 229 (+76.15%)
NBSS
The official repo of "Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training", "Multichannel Speech Separation with Narrow-band Conformer" and "NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer".
Stars: ✭ 77 (-40.77%)
Lightspeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Stars: ✭ 31 (-76.15%)
Noise2Noise-audio denoising without clean training data
Source code for the paper "Speech Denoising without Clean Training Data: a Noise2Noise Approach", accepted at the INTERSPEECH 2021 conference. The paper tackles the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-62.31%)
Midi2voice
Singing synthesis from a MIDI file
Stars: ✭ 102 (-21.54%)
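EnglishStu
English-learning software that integrates Youdao Translate and iFLYTEK, with translation, read-aloud examples, and reading assessment features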
Stars: ✭ 27 (-79.23%)
minutes
🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-80.77%)
pytorch-pcen
PyTorch reimplementation of per-channel energy normalization for audio.
Stars: ✭ 80 (-38.46%)
Aaya
Personal Voice Assistant
Stars: ✭ 20 (-84.62%)
txt2speech
Convert text to speech using the Google Translate API
Stars: ✭ 38 (-70.77%)
ruby-magic
Simple interface to libmagic for the Ruby programming language
Stars: ✭ 23 (-82.31%)
Phormatics
Using A.I. and computer vision to build a virtual personal fitness trainer. (Most Startup-Viable Hack - HackNYU2018)
Stars: ✭ 79 (-39.23%)
deepspeech.mxnet
An MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (-36.92%)
AlexaAndroid
No description or website provided.
Stars: ✭ 15 (-88.46%)
Multimodal-Gesture-Recognition-with-LSTMs-and-CTC
An end-to-end system that performs temporal recognition of gesture sequences using speech and skeletal input. The model combines three networks with a CTC output layer that recognises gestures from a continuous stream.
Stars: ✭ 25 (-80.77%)
Vc With Gan
Voice Conversion with GANs
Stars: ✭ 13 (-90%)
Voice-Denoising-AN
A Conditional Generative Adversarial Network (cGAN) adapted for source denoising of noisy voice auditory images. The base architecture is adapted from Pix2Pix.
Stars: ✭ 42 (-67.69%)
idear
🎙️ Handsfree Audio Development Interface
Stars: ✭ 84 (-35.38%)
Tts
Text-to-Speech for Arduino
Stars: ✭ 118 (-9.23%)
tt-vae-gan
Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.
Stars: ✭ 37 (-71.54%)
lectures-all
Central repository for all lectures on deep learning at UPC ETSETB TelecomBCN.
Stars: ✭ 46 (-64.62%)
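Xunfei Clj
Clojure wrapper for the iFLYTEK (Xunfei) speech SDK, usable from the Emacs/Vim editors or the command line, for voice reminders, speech recognition, speech-to-command conversion, and more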
Stars: ✭ 26 (-80%)
Voiceripple
Voice record button that shows a ripple effect in response to the user's voice
Stars: ✭ 379 (+191.54%)
D-TDNN
PyTorch implementation of Densely Connected Time Delay Neural Network
Stars: ✭ 60 (-53.85%)
editts
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (-43.08%)
Speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems for speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many other tasks.
Stars: ✭ 242 (+86.15%)
Vonage Dotnet Sdk
Nexmo REST API client for .NET, ASP.NET, and ASP.NET MVC, written in C#. API support for SMS, Voice, Text-to-Speech, Numbers, Verify (2FA) and more.
Stars: ✭ 76 (-41.54%)
kaldi helpers
🙊 A set of scripts for preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-90%)
download audioset
📁 This repo makes it easy to download the raw audio files from AudioSet (32.45 GB, 632 classes).
Stars: ✭ 53 (-59.23%)
Voc
A physical model of the human vocal tract using literate programming, based on Pink Trombone.
Stars: ✭ 129 (-0.77%)
Univoice
P2P VoIP in Unity
Stars: ✭ 128 (-1.54%)
Dcnets
Implementation of "Decoupled Networks" (CVPR'18).
Stars: ✭ 115 (-11.54%)
Syn Speech
Syn.Speech is a flexible, speaker-independent continuous speech recognition engine for Mono and the .NET Framework
Stars: ✭ 57 (-56.15%)