Fre-GAN-pytorchFre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-74.65%)
AdaSpeechAdaSpeech: Adaptive Text to Speech for Custom Voice
Stars: ✭ 108 (-62.5%)
RkdOfficial pytorch Implementation of Relational Knowledge Distillation, CVPR 2019
Stars: ✭ 257 (-10.76%)
SER-datasetsA collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (-74.31%)
PhomemeSimple sentence mixing tool (work in progress)
Stars: ✭ 18 (-93.75%)
Twitter Sent DnnDeep Neural Network for Sentiment Analysis on Twitter
Stars: ✭ 270 (-6.25%)
Zero-Shot-TTSUnofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (-88.54%)
speech recognition ctcUse ctc to do chinese speech recognition by keras / 通过keras和ctc实现中文语音识别
Stars: ✭ 40 (-86.11%)
audio noise clusteringhttps://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.
Stars: ✭ 24 (-91.67%)
Deep Learning In ProductionIn this repository, I will share some useful notes and references about deploying deep learning-based models in production.
Stars: ✭ 3,104 (+977.78%)
simple-obs-sttSpeech-to-text and keyboard input captions for OBS.
Stars: ✭ 89 (-69.1%)
ttslearnttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (-45.14%)
KeenASR-Android-PoCA proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html
Stars: ✭ 21 (-92.71%)
opensource-voice-toolsA repo listing known open source voice tools, ordered by where they sit in the voice stack
Stars: ✭ 21 (-92.71%)
MelNet-SpeechGenerationImplementation of MelNet in PyTorch to generate high-fidelity audio samples
Stars: ✭ 19 (-93.4%)
FAST-RIRThis is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Stars: ✭ 90 (-68.75%)
Noise2Noise-audio denoising without clean training dataSource code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-82.99%)
HTKThe Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.
Stars: ✭ 23 (-92.01%)
capeContinuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
Stars: ✭ 29 (-89.93%)
Awesome Speech EnhancementA tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Stars: ✭ 257 (-10.76%)
ASR-Audio-Data-LinksA list of publically available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 179 (-37.85%)
opensnipsOpen source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (-82.64%)
pytorch-pcenPyTorch reimplementation of per-channel energy normalization for audio.
Stars: ✭ 80 (-72.22%)
minutes🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-91.32%)
txt2speechConvert text to speech using Google Translate API
Stars: ✭ 38 (-86.81%)
nlp-classA Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (-67.01%)
TF-Speech-Recognition-Challenge-SolutionSource code of the model used in Tensorflow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in top 5% in private leaderboard.
Stars: ✭ 58 (-79.86%)
Bigdata18Transfer learning for time series classification
Stars: ✭ 284 (-1.39%)
IMS-ToucanText-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
Stars: ✭ 295 (+2.43%)
Voice2MeshCVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (-76.74%)
VQMIVCOfficial implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!
Stars: ✭ 278 (-3.47%)
sova-asrSOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (-57.29%)
UniSpeechUniSpeech - Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (-22.22%)
lectures-allCentral repository for all lectures on deep learning at UPC ETSETB TelecomBCN.
Stars: ✭ 46 (-84.03%)
Deepcvendor independent deep learning library, compiler and inference framework microcomputers and micro-controllers
Stars: ✭ 260 (-9.72%)
gtranscribeSoftware for interview transcription
Stars: ✭ 12 (-95.83%)
Speech256An FPGA implementation of a classic 80ies speech synthesizer. Done for the Retro Challenge 2017/10.
Stars: ✭ 51 (-82.29%)
DeepregMedical image registration using deep learning
Stars: ✭ 245 (-14.93%)
linear16Converts an audio file to LINEAR16 Google-speech compatible file.
Stars: ✭ 14 (-95.14%)
Computer Vision Guide📖 This guide is to help you understand the basics of the computerized image and develop computer vision projects with OpenCV. Includes Python, Java, JavaScript, C# and C++ examples.
Stars: ✭ 244 (-15.28%)
Generative Inpainting PytorchA PyTorch reimplementation for paper Generative Image Inpainting with Contextual Attention (https://arxiv.org/abs/1801.07892)
Stars: ✭ 242 (-15.97%)
DeepSegmentorSequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)
Stars: ✭ 17 (-94.1%)
Dlwpt CodeCode for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.
Stars: ✭ 3,054 (+960.42%)
ser-with-w2v2Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
Stars: ✭ 40 (-86.11%)
LcnnLCNN: End-to-End Wireframe Parsing
Stars: ✭ 234 (-18.75%)
datasets🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+4715.97%)
Nanodet⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB(int8) / 1.8MB (fp16) and run 97FPS on cellphone🔥
Stars: ✭ 3,640 (+1163.89%)
Speech Alignerspeech-aligner,是一个从“人声语音”及其“语言文本”,产生音素级别时间对齐标注的工具。speech-aligner, is a tool that generate phoneme-level alignment between human speech and its transcription
Stars: ✭ 259 (-10.07%)
DarkonToolkit to Hack Your Deep Learning Models
Stars: ✭ 231 (-19.79%)
torch-asgAuto Segmentation Criterion (ASG) implemented in pytorch
Stars: ✭ 42 (-85.42%)
D-TDNNPyTorch implementation of Densely Connected Time Delay Neural Network
Stars: ✭ 60 (-79.17%)
kaldi helpers🙊 A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Stars: ✭ 13 (-95.49%)
deepspeech.mxnetA MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (-71.53%)