Segan: Speech Enhancement Generative Adversarial Network in TensorFlow
Stars: ✭ 661 (+129.51%)
Pytorch Kaldi: pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
Stars: ✭ 2,097 (+628.13%)
Speech Emotion Analyzer: A neural network model capable of detecting five different emotions in male and female speech audio. (Deep Learning, NLP, Python)
Stars: ✭ 633 (+119.79%)
LIUM: Scripts for LIUM SpkDiarization tools
Stars: ✭ 28 (-90.28%)
Amazing Python Scripts: 🚀 Curated collection of Amazing Python scripts from basic to advanced, with automation task scripts.
Stars: ✭ 229 (-20.49%)
fade: A Simulation Framework for Auditory Discrimination Experiments
Stars: ✭ 12 (-95.83%)
KAREN: Unifying Hatespeech Detection and Benchmarking
Stars: ✭ 18 (-93.75%)
L2c: Learning to Cluster. A deep clustering strategy.
Stars: ✭ 262 (-9.03%)
flite-go: Go bindings for Flite (festival-lite)
Stars: ✭ 14 (-95.14%)
TASNET: Time-domain Audio Separation Network (in PyTorch)
Stars: ✭ 18 (-93.75%)
spokestack-android: Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Stars: ✭ 52 (-81.94%)
Chaidnn: HLS-based Deep Neural Network Accelerator Library for Xilinx Ultrascale+ MPSoCs
Stars: ✭ 258 (-10.42%)
jackpair: P2P speech-encrypting device with an analog audio interface, suitable for GSM phones
Stars: ✭ 26 (-90.97%)
Dlpython course: Examples for the course "Programming Deep Neural Networks in Python"
Stars: ✭ 266 (-7.64%)
nabaztag-php: A simple PHP implementation of a Nabaztag server
Stars: ✭ 14 (-95.14%)
hifigan-denoiser: HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Stars: ✭ 88 (-69.44%)
speech to text: How to use the Google Cloud Speech API to transcribe audio/video files.
Stars: ✭ 35 (-87.85%)
Pose Residual Network Pytorch: Code for the Pose Residual Network introduced in the paper 'MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network' https://arxiv.org/abs/1807.04067
Stars: ✭ 277 (-3.82%)
tt-vae-gan: Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.
Stars: ✭ 37 (-87.15%)
web-speech-demo: Learn how to build a simple text-to-speech voice app for the web using the Web Speech API.
Stars: ✭ 19 (-93.4%)
Speech Feature Extraction: Feature extraction from the speech signal is the initial stage of any speech recognition system.
Stars: ✭ 78 (-72.92%)
editts: Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (-74.31%)
speech-transformer: Transformer implementation specialized in speech recognition tasks using PyTorch.
Stars: ✭ 40 (-86.11%)
Fre-GAN-pytorch: Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-74.65%)
Rkd: Official PyTorch implementation of Relational Knowledge Distillation, CVPR 2019
Stars: ✭ 257 (-10.76%)
SER-datasets: A collection of datasets for the purpose of emotion recognition/detection in speech.
Stars: ✭ 74 (-74.31%)
Twitter Sent Dnn: Deep Neural Network for Sentiment Analysis on Twitter
Stars: ✭ 270 (-6.25%)
speech recognition ctc: Chinese speech recognition using CTC with Keras
Stars: ✭ 40 (-86.11%)
Deep Learning In Production: In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
Stars: ✭ 3,104 (+977.78%)
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (-45.14%)
MelNet-SpeechGeneration: Implementation of MelNet in PyTorch to generate high-fidelity audio samples
Stars: ✭ 19 (-93.4%)
Noise2Noise-audio denoising without clean training data: Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the heavy dependence on clean speech data of deep learning based audio denoising methods by showing that it is possible to train deep speech denoisi…
Stars: ✭ 49 (-82.99%)
HTK: The Hidden Markov Model Toolkit (HTK) from the University of Cambridge, with issues fixed.
Stars: ✭ 23 (-92.01%)
Awesome Speech Enhancement: A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Stars: ✭ 257 (-10.76%)
opensnips: Open source projects related to Snips https://snips.ai/.
Stars: ✭ 50 (-82.64%)
minutes: 🔭 Speaker diarization via transfer learning
Stars: ✭ 25 (-91.32%)
nlp-class: A Natural Language Processing course taught by Professor Ghassemi
Stars: ✭ 95 (-67.01%)
Bigdata18: Transfer learning for time series classification
Stars: ✭ 284 (-1.39%)
Voice2Mesh: CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Stars: ✭ 67 (-76.74%)
sova-asr: SOVA ASR (Automatic Speech Recognition)
Stars: ✭ 123 (-57.29%)
UniSpeech: Large Scale Self-Supervised Learning for Speech
Stars: ✭ 224 (-22.22%)
Deepc: Vendor-independent deep learning library, compiler and inference framework for microcomputers and micro-controllers
Stars: ✭ 260 (-9.72%)
gtranscribe: Software for interview transcription
Stars: ✭ 12 (-95.83%)
Speech256: An FPGA implementation of a classic 1980s speech synthesizer. Done for the Retro Challenge 2017/10.
Stars: ✭ 51 (-82.29%)
linear16: Converts an audio file to a LINEAR16 file compatible with Google Speech.
Stars: ✭ 14 (-95.14%)
DeepSegmentor: Sequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)
Stars: ✭ 17 (-94.1%)
ser-with-w2v2: Official implementation of the INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
Stars: ✭ 40 (-86.11%)
VAD-LTSD: Efficient voice activity detection algorithm using long-term speech information
Stars: ✭ 37 (-87.15%)
datasets: 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Stars: ✭ 13,870 (+4715.97%)
Speech Aligner: speech-aligner is a tool that generates phoneme-level alignments between human speech and its transcription.
Stars: ✭ 259 (-10.07%)
jarvis: Jarvis Home Automation
Stars: ✭ 81 (-71.87%)
JD-NMF: Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)
Stars: ✭ 20 (-93.06%)
deepspeech.mxnet: An MXNet implementation of Baidu's DeepSpeech architecture
Stars: ✭ 82 (-71.53%)