https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.

Stars: ✭ 24 (-91.67%)

Mutual labels: speech

Deep Learning In Production

In this repository, I will share some useful notes and references about deploying deep learning-based models in production.

Stars: ✭ 3,104 (+977.78%)

Mutual labels: deep-neural-networks

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (-69.1%)

Mutual labels: speech

ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Stars: ✭ 158 (-45.14%)

Mutual labels: speech

KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

Stars: ✭ 21 (-92.71%)

Mutual labels: speech

Awesome Distributed Deep Learning

A curated list of awesome Distributed Deep Learning resources.

Stars: ✭ 277 (-3.82%)

Mutual labels: deep-neural-networks

opensource-voice-tools

A repo listing known open source voice tools, ordered by where they sit in the voice stack

Stars: ✭ 21 (-92.71%)

Mutual labels: speech

MelNet-SpeechGeneration

Implementation of MelNet in PyTorch to generate high-fidelity audio samples

Stars: ✭ 19 (-93.4%)

Mutual labels: speech

FAST-RIR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Stars: ✭ 90 (-68.75%)

Mutual labels: speech

Noise2Noise-audio denoising without clean training data

Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the heavy dependence of deep-learning-based audio denoising methods on clean speech data by showing that it is possible to train deep speech denoisi…

Stars: ✭ 49 (-82.99%)

Mutual labels: speech

eidos-audition

Collection of auditory models.

Stars: ✭ 25 (-91.32%)

Mutual labels: speech

HTK

The Hidden Markov Model Toolkit (HTK) from University of Cambridge, with fixed issues.

Stars: ✭ 23 (-92.01%)

Mutual labels: speech

cape

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

Stars: ✭ 29 (-89.93%)

Mutual labels: speech

Awesome Speech Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

Stars: ✭ 257 (-10.76%)

Mutual labels: deep-neural-networks

ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

Stars: ✭ 179 (-37.85%)

Mutual labels: speech

opensnips

Open source projects related to Snips https://snips.ai/.

Stars: ✭ 50 (-82.64%)

Mutual labels: speech

pytorch-pcen

PyTorch reimplementation of per-channel energy normalization for audio.

Stars: ✭ 80 (-72.22%)

Mutual labels: speech

minutes

🔭 Speaker diarization via transfer learning

Stars: ✭ 25 (-91.32%)

Mutual labels: speech

txt2speech

Convert text to speech using Google Translate API

Stars: ✭ 38 (-86.81%)

Mutual labels: speech

nlp-class

A Natural Language Processing course taught by Professor Ghassemi

Stars: ✭ 95 (-67.01%)

Mutual labels: speech

TF-Speech-Recognition-Challenge-Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge). The solution ranked in the top 5% on the private leaderboard.

Stars: ✭ 58 (-79.86%)

Mutual labels: speech

Bigdata18

Transfer learning for time series classification

Stars: ✭ 284 (-1.39%)

Mutual labels: deep-neural-networks

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Stars: ✭ 295 (+2.43%)

Mutual labels: speech

Voice2Mesh

CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

Stars: ✭ 67 (-76.74%)

Mutual labels: speech

VQMIVC

Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!

Stars: ✭ 278 (-3.47%)

Mutual labels: speech

sova-asr

SOVA ASR (Automatic Speech Recognition)

Stars: ✭ 123 (-57.29%)

Mutual labels: speech

Naver-AI-Hackathon-Speech

2019 Clova AI Hackathon: Speech - Rank 12 / Team Kai.Lib

Stars: ✭ 26 (-90.97%)

Mutual labels: speech

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Stars: ✭ 224 (-22.22%)

Mutual labels: speech

lectures-all

Central repository for all lectures on deep learning at UPC ETSETB TelecomBCN.

Stars: ✭ 46 (-84.03%)

Mutual labels: speech

Deepc

Vendor-independent deep learning library, compiler and inference framework for microcomputers and microcontrollers

Stars: ✭ 260 (-9.72%)

Mutual labels: deep-neural-networks

Probabilistic Face Embeddings

(ICCV 2019) Uncertainty-aware Face Representation and Recognition

Stars: ✭ 253 (-12.15%)

Mutual labels: deep-neural-networks

gtranscribe

Software for interview transcription

Stars: ✭ 12 (-95.83%)

Mutual labels: speech

Mixture Density Networks For Distribution And Uncertainty Estimation

A generic Mixture Density Networks (MDN) implementation for distribution and uncertainty estimation by using Keras (TensorFlow)

Stars: ✭ 249 (-13.54%)

Mutual labels: deep-neural-networks

Speech256

An FPGA implementation of a classic '80s speech synthesizer. Done for the Retro Challenge 2017/10.

Stars: ✭ 51 (-82.29%)

Mutual labels: speech

Deepreg

Medical image registration using deep learning

Stars: ✭ 245 (-14.93%)

Mutual labels: deep-neural-networks

linear16

Converts an audio file to a LINEAR16 file compatible with Google Speech.

Stars: ✭ 14 (-95.14%)

Mutual labels: speech

Computer Vision Guide

📖 This guide is to help you understand the basics of the computerized image and develop computer vision projects with OpenCV. Includes Python, Java, JavaScript, C# and C++ examples.

Stars: ✭ 244 (-15.28%)

Mutual labels: deep-neural-networks

Bmw Tensorflow Inference Api Gpu

This is a repository for an object detection inference API using the TensorFlow framework.

Stars: ✭ 277 (-3.82%)

Mutual labels: deep-neural-networks

Generative Inpainting Pytorch

A PyTorch reimplementation of the paper Generative Image Inpainting with Contextual Attention (https://arxiv.org/abs/1801.07892)

Stars: ✭ 242 (-15.97%)

Mutual labels: deep-neural-networks

DeepSegmentor

Sequence Segmentation using Joint RNN and Structured Prediction Models (ICASSP 2017)

Stars: ✭ 17 (-94.1%)

Mutual labels: speech

Dlwpt Code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.

Stars: ✭ 3,054 (+960.42%)

Mutual labels: deep-neural-networks

ser-with-w2v2

Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'

Stars: ✭ 40 (-86.11%)

Mutual labels: speech

Lcnn

LCNN: End-to-End Wireframe Parsing

Stars: ✭ 234 (-18.75%)

Mutual labels: deep-neural-networks

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Stars: ✭ 13,870 (+4715.97%)

Mutual labels: speech

Nanodet

⚡ Super fast and lightweight anchor-free object detection model. 🔥 Only 980 KB (int8) / 1.8 MB (fp16), and runs at 97 FPS on a cellphone 🔥

Stars: ✭ 3,640 (+1163.89%)

Mutual labels: deep-neural-networks

Speech Aligner

speech-aligner is a tool that generates phoneme-level time alignments between human speech and its transcription.

Stars: ✭ 259 (-10.07%)

Mutual labels: speech

Darkon

Toolkit to Hack Your Deep Learning Models

Stars: ✭ 231 (-19.79%)

Mutual labels: deep-neural-networks

torch-asg

Auto Segmentation Criterion (ASG) implemented in PyTorch

Stars: ✭ 42 (-85.42%)

Mutual labels: speech

D-TDNN

PyTorch implementation of Densely Connected Time Delay Neural Network