SIP-Lab / CNN-VAD

Licence: MIT license
A Convolutional Neural Network based Voice Activity Detector for Smartphones

Programming Languages

Jupyter Notebook
CMake
C++
C
MATLAB
Objective-C++

Convolutional Neural Network based Voice Activity Detector

This GitHub repository contains the code accompanying the following paper:

A Convolutional Neural Network Smartphone App for Real-Time Voice Activity Detection
Abhishek Sehgal and Nasser Kehtarnavaz - University of Texas at Dallas
https://ieeexplore.ieee.org/document/8278160/

Abstract: This paper presents a smartphone app that performs real-time voice activity detection based on convolutional neural network. Real-time implementation issues are discussed showing how the slow inference time associated with convolutional neural networks is addressed. The developed smartphone app is meant to act as a switch for noise reduction in the signal processing pipelines of hearing devices, enabling noise estimation or classification to be conducted in noise-only parts of noisy speech signals. The developed smartphone app is compared with a previously developed voice activity detection app as well as with two highly cited voice activity detection algorithms. The experimental results indicate that the developed app using convolutional neural network outperforms the previously developed smartphone app.
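
To make the "switch" role described in the abstract concrete, the sketch below gates a noise estimate with per-frame VAD decisions: the estimate is updated only on frames flagged as noise-only and frozen during speech. This is a minimal illustration, not the code from this repository or the paper. The function names, the exponential-smoothing update, and the `alpha` parameter are assumptions chosen for clarity, and the VAD flags are taken as given rather than produced by the CNN.

```python
from typing import Iterable, List, Sequence


def update_noise_estimate(
    noise_psd: Sequence[float],
    frame_psd: Sequence[float],
    is_speech: bool,
    alpha: float = 0.9,
) -> List[float]:
    """Return the noise power estimate after observing one frame.

    Hypothetical helper: the estimate is frozen while the VAD reports
    speech and is exponentially smoothed toward the frame's power
    spectrum on noise-only frames. The smoothing rule is an assumption,
    not the method used in the paper.
    """
    if is_speech:
        # Speech present: leave the noise estimate untouched.
        return list(noise_psd)
    # Noise-only frame: blend the old estimate with the new observation.
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_psd, frame_psd)]


def track_noise(
    frames_psd: Iterable[Sequence[float]],
    vad_flags: Iterable[bool],
    initial_psd: Sequence[float],
    alpha: float = 0.9,
) -> List[float]:
    """Run the VAD-gated update over a sequence of frames."""
    noise = list(initial_psd)
    for frame_psd, is_speech in zip(frames_psd, vad_flags):
        noise = update_noise_estimate(noise, frame_psd, is_speech, alpha)
    return noise


if __name__ == "__main__":
    # Three frames with two frequency bins each; the middle frame is speech.
    frames = [[1.0, 1.0], [9.0, 9.0], [3.0, 3.0]]
    flags = [False, True, False]
    print(track_noise(frames, flags, initial_psd=[1.0, 1.0], alpha=0.5))
    # prints [2.0, 2.0]
```

In the usage example the loud middle frame is ignored because it is flagged as speech, so the estimate moves only toward the two noise-only frames.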

Resources

Supporting materials related to this work are available via the following links:

Link Description
https://ieeexplore.ieee.org/document/8278160/ IEEE Access Manuscript
http://www.utdallas.edu/~kehtar/CNN-VAD.mp4 Video clip of the Convolutional Neural Network VAD running in real time on Android and iOS smartphones

Getting Started

A User's Guide is provided that describes how to run the code for training and for real-time operation on Android and iOS smartphone platforms.

License and Citation

The code is licensed under the MIT license.

If you use any code from this repository, please cite the following paper:

  • A. Sehgal and N. Kehtarnavaz, "A Convolutional Neural Network Smartphone App for Real-Time Voice Activity Detection," IEEE Access, vol. 6, pp. 9017-9026, Feb 2018. (Open Access)