Vq Vae Speech: PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Stars: ✭ 187 (-85.14%)
Voice Builder: An open-source text-to-speech (TTS) voice building tool
Stars: ✭ 362 (-71.22%)
web-speech-demo: Learn how to build a simple text-to-speech voice app for the web using the Web Speech API.
Stars: ✭ 19 (-98.49%)
Chatbot Watson Android: An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
Stars: ✭ 169 (-86.57%)
Giada: Your Hardcore Loop Machine.
Stars: ✭ 903 (-28.22%)
Aeneas: aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+54.37%)
Tacotron: A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Stars: ✭ 1,756 (+39.59%)
App: 🤖 A GitHub App to automate acknowledging contributors to your open source projects
Stars: ✭ 358 (-71.54%)
facet: Facet is a live coding system for algorithmic music
Stars: ✭ 72 (-94.28%)
Rnn ctc: Recurrent Neural Network and Long Short-Term Memory (LSTM) with Connectionist Temporal Classification, implemented in Theano. Includes a toy training example.
Stars: ✭ 220 (-82.51%)
Voc: A physical model of the human vocal tract using literate programming, based on Pink Trombone.
Stars: ✭ 129 (-89.75%)
CCAligner: 🔮 Word-by-word audio subtitle synchronisation tool and API. Developed under GSoC 2017 with CCExtractor.
Stars: ✭ 131 (-89.59%)
Dragonfly: Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
Stars: ✭ 209 (-83.39%)
Code Switching Papers: A curated list of research papers and resources on code-switching
Stars: ✭ 122 (-90.3%)
gtranscribe: Software for interview transcription
Stars: ✭ 12 (-99.05%)
Inaspeechsegmenter: CNN-based audio segmentation toolkit. Detects speech, music and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Stars: ✭ 352 (-72.02%)
Durian: Implementation of the paper "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf).
Stars: ✭ 111 (-91.18%)
linear16: Converts an audio file to a LINEAR16 file compatible with Google speech services.
Stars: ✭ 14 (-98.89%)
Gtts: Python library and CLI tool to interface with Google Translate's text-to-speech API
Stars: ✭ 1,303 (+3.58%)
Arcan: Arcan - [Display Server, Multimedia Framework, Game Engine] -> "Desktop Engine"
Stars: ✭ 885 (-29.65%)
Audio: Data manipulation and transformation for audio signal processing, powered by PyTorch
Stars: ✭ 1,262 (+0.32%)
speech-transformer: Transformer implementation specialized in speech recognition tasks using PyTorch.
Stars: ✭ 40 (-96.82%)
Labelimg: 🖍️ LabelImg is a graphical image annotation tool for labeling object bounding boxes in images
Stars: ✭ 16,088 (+1178.86%)
All Contributors Cli: Tool to help automate adding contributor acknowledgements according to the all-contributors specification ✨
Stars: ✭ 345 (-72.58%)
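Idcardrecognition: 🇨🇳 Recognition of second-generation mainland China ID cards 🆔: automatically reads the information on the card (name, gender, ethnicity, address, ID number) and crops the ID photo. iOS developer discussion groups: group 1: 446310206, group 2: 426087546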
Stars: ✭ 191 (-84.82%)
Ccpd: [ECCV 2018] CCPD: a diverse and well-annotated dataset for license plate detection and recognition
Stars: ✭ 1,252 (-0.48%)
Text Detector: A tool that allows you to detect and translate text.
Stars: ✭ 173 (-86.25%)
Lc Finder: An image annotation and object detection tool written in C
Stars: ✭ 163 (-87.04%)
Dplug: Audio plugin framework. VST2/VST3/AU/AAX/LV2 for Linux/macOS/Windows.
Stars: ✭ 341 (-72.89%)
Faceid: An implementation of YOLO v2 for direct facial recognition within the detection layer.
Stars: ✭ 144 (-88.55%)
Speechpy: 💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
Stars: ✭ 833 (-33.78%)
Labelbox: Labelbox is the fastest way to annotate data to build and ship computer vision applications.
Stars: ✭ 1,588 (+26.23%)
JD-NMF: Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)
Stars: ✭ 20 (-98.41%)
Dcnets: Implementation for <Decoupled Networks> in CVPR'18.
Stars: ✭ 115 (-90.86%)
Php Opencv Examples: Tutorial for computer vision and machine learning in PHP 7/8 with OpenCV (installation + examples + documentation)
Stars: ✭ 333 (-73.53%)
R8brain Free Src: High-quality pro audio sample rate converter / resampler C++ library
Stars: ✭ 238 (-81.08%)
Otto: Sampler, Sequencer, Multi-engine synth and effects - in a box! [WIP]
Stars: ✭ 2,390 (+89.98%)
openface mass compare: An openface script that runs a REST server. Posted images are compared against a large dataset, and the most likely match is returned. Works with https://hub.docker.com/r/uoacer/openface-mass-compare/
Stars: ✭ 22 (-98.25%)
Ios 10 Sampler: Code examples for new APIs of iOS 10.
Stars: ✭ 3,341 (+165.58%)
deep avsr: A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Stars: ✭ 104 (-91.73%)
Tts: Tools to convert text to speech 📚💬
Stars: ✭ 84 (-93.32%)
Wav2letter: Speech recognition model based on the FAIR research paper, built using PyTorch.
Stars: ✭ 78 (-93.8%)
Mtcnn: Face detection and alignment with MTCNN
Stars: ✭ 66 (-94.75%)
Uspeech: Speech recognition toolkit for the Arduino
Stars: ✭ 448 (-64.39%)
rosecho: Tianbot Rosecho (Tianecho), a Chinese-language voice human-machine interaction module with plug-and-play ROS support
Stars: ✭ 28 (-97.77%)
Kaldi Active Grammar: Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
Stars: ✭ 196 (-84.42%)