gkonovalov / android-vad

Licence: MIT license

This VAD library processes audio in real time using a Gaussian Mixture Model (GMM), which helps identify the presence of human speech in an audio sample that contains a mixture of speech and noise.

Labels

audio android real-time offline webrtc gaussian-mixture-models vad gmm audio-processing voice-activity-detection

Android Voice Activity Detection (VAD)

This VAD library processes audio in real time using a Gaussian Mixture Model (GMM), which helps identify the presence of human speech in an audio sample that contains a mixture of speech and noise. The VAD works offline, and all processing is done on the device.

The library is based on the WebRTC VAD from Google, which is reportedly one of the best available: it's fast, modern and free. This algorithm has found wide adoption and has recently become one of the gold standards for delay-sensitive scenarios like web-based interaction.

If you are looking for higher accuracy and faster processing time, I recommend using Deep Neural Networks (DNN). For reference, please see the following paper with a DNN vs GMM comparison.

Parameters

The VAD library only accepts a 16-bit mono PCM audio stream and works with the following sample rates, frame sizes and classifiers.

Valid Sample Rate	Valid Frame Size
8000Hz	80, 160, 240
16000Hz	160, 320, 480
32000Hz	320, 640, 960
48000Hz	480, 960, 1440

Valid Classifiers
NORMAL
LOW_BITRATE
AGGRESSIVE
VERY_AGGRESSIVE
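
Every frame size in the table above works out to 10, 20 or 30 ms of audio at the given sample rate. The sketch below uses that observation to check a sample rate and frame size pair before configuring the VAD. `FrameSizes` and its methods are hypothetical helpers written for illustration, not part of the android-vad API.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of android-vad) that mirrors the table above. */
final class FrameSizes {
    // Sample rates listed in the table.
    private static final int[] SAMPLE_RATES_HZ = {8000, 16000, 32000, 48000};
    // Each listed frame size equals 10, 20 or 30 ms of samples.
    private static final int[] FRAME_DURATIONS_MS = {10, 20, 30};

    static boolean isSupportedRate(int sampleRateHz) {
        return Arrays.stream(SAMPLE_RATES_HZ).anyMatch(rate -> rate == sampleRateHz);
    }

    /** Returns the frame sizes (in samples) listed for a supported sample rate. */
    static int[] validFrameSizes(int sampleRateHz) {
        if (!isSupportedRate(sampleRateHz)) {
            throw new IllegalArgumentException("Unsupported sample rate: " + sampleRateHz);
        }
        int[] sizes = new int[FRAME_DURATIONS_MS.length];
        for (int i = 0; i < sizes.length; i++) {
            sizes[i] = sampleRateHz / 1000 * FRAME_DURATIONS_MS[i];
        }
        return sizes;
    }

    /** True only if the pair appears in the table. */
    static boolean isValid(int sampleRateHz, int frameSize) {
        if (!isSupportedRate(sampleRateHz)) {
            return false;
        }
        for (int size : validFrameSizes(sampleRateHz)) {
            if (size == frameSize) {
                return true;
            }
        }
        return false;
    }
}
```

For example, `FrameSizes.isValid(16000, 160)` is true, while `FrameSizes.isValid(16000, 200)` and `FrameSizes.isValid(44100, 441)` are false.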

Silence duration (ms) - this parameter is used by the Continuous Speech detector. Its value defines the necessary and sufficient duration of negative results for the audio to be recognized as silence.

Voice duration (ms) - this parameter is used by the Continuous Speech detector. Its value defines the necessary and sufficient duration of positive results for the audio to be recognized as speech.
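
To illustrate how the two durations could interact, here is a minimal sketch of a smoothing state machine over per-frame results. It is not the library's actual implementation: the class name `SpeechDebouncer`, the requirement that results be consecutive, and the counter reset behaviour are all assumptions made for this example.

```java
/** Hypothetical smoothing of per-frame VAD results; not android-vad's own code. */
final class SpeechDebouncer {
    private final int frameMillis;
    private final int voiceDurationMillis;
    private final int silenceDurationMillis;

    private int voiceRunMillis = 0;    // length of the current run of speech frames
    private int silenceRunMillis = 0;  // length of the current run of non-speech frames
    private boolean speaking = false;  // smoothed state reported to the caller

    SpeechDebouncer(int sampleRateHz, int frameSize,
                    int voiceDurationMillis, int silenceDurationMillis) {
        // e.g. 160 samples at 16000 Hz is 10 ms per frame
        this.frameMillis = frameSize * 1000 / sampleRateHz;
        this.voiceDurationMillis = voiceDurationMillis;
        this.silenceDurationMillis = silenceDurationMillis;
    }

    /** Feeds one per-frame decision and returns the smoothed speech state. */
    boolean update(boolean frameIsSpeech) {
        if (frameIsSpeech) {
            voiceRunMillis += frameMillis;
            silenceRunMillis = 0;  // assumption: runs must be consecutive
            if (voiceRunMillis >= voiceDurationMillis) {
                speaking = true;
            }
        } else {
            silenceRunMillis += frameMillis;
            voiceRunMillis = 0;
            if (silenceRunMillis >= silenceDurationMillis) {
                speaking = false;
            }
        }
        return speaking;
    }
}
```

With 10 ms frames and both durations set to 500 ms, this sketch switches to speech on the 50th consecutive speech frame and back to silence on the 50th consecutive non-speech frame, so short pauses inside an utterance do not flip the state.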

Recommended parameters:

Sample Rate - 16 kHz
Frame Size - 160
Mode - VERY_AGGRESSIVE
Silence Duration - 500 ms
Voice Duration - 500 ms

Usage

VAD supports 2 different ways of detecting speech:

The Continuous Speech listener is designed to detect long utterances without returning false positive results when the user pauses between sentences.

// audioFrame holds the 16-bit mono PCM samples to analyze;
// its length must match the configured frame size (160 here).
short[] audioFrame = new short[160];

Vad vad = new Vad(VadConfig.newBuilder()
        .setSampleRate(VadConfig.SampleRate.SAMPLE_RATE_16K)
        .setFrameSize(VadConfig.FrameSize.FRAME_SIZE_160)
        .setMode(VadConfig.Mode.VERY_AGGRESSIVE)
        .setSilenceDurationMillis(500)
        .setVoiceDurationMillis(500)
        .build());

vad.start();

vad.addContinuousSpeechListener(audioFrame, new VadListener() {
    @Override
    public void onSpeechDetected() {
        // speech detected!
    }

    @Override
    public void onNoiseDetected() {
        // noise detected!
    }
});

vad.stop();

The Speech detector is designed to detect speech/noise in small audio frames and returns a result for every frame. This method will not work for long utterances.

// audioFrame holds the 16-bit mono PCM samples to analyze;
// its length must match the configured frame size (160 here).
short[] audioFrame = new short[160];

Vad vad = new Vad(VadConfig.newBuilder()
        .setSampleRate(VadConfig.SampleRate.SAMPLE_RATE_16K)
        .setFrameSize(VadConfig.FrameSize.FRAME_SIZE_160)
        .setMode(VadConfig.Mode.VERY_AGGRESSIVE)
        .build());

vad.start();

boolean isSpeech = vad.isSpeech(audioFrame);

vad.stop();
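
Since the VAD takes `short[]` frames of a fixed size, raw PCM bytes have to be converted and split first. The following is a sketch of one way to do that in plain Java. `PcmFrames` is a hypothetical helper, not part of android-vad, and it assumes the bytes are little-endian 16-bit mono samples; adjust the byte order if your source differs.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of android-vad) for preparing VAD input frames. */
final class PcmFrames {
    /**
     * Splits raw 16-bit mono PCM bytes into frames of frameSize samples.
     * Assumes little-endian byte order; trailing bytes that do not fill
     * a whole frame are dropped.
     */
    static List<short[]> split(byte[] pcmBytes, int frameSize) {
        ByteBuffer buffer = ByteBuffer.wrap(pcmBytes).order(ByteOrder.LITTLE_ENDIAN);
        List<short[]> frames = new ArrayList<>();
        int bytesPerFrame = frameSize * 2; // two bytes per 16-bit sample
        while (buffer.remaining() >= bytesPerFrame) {
            short[] frame = new short[frameSize];
            for (int i = 0; i < frameSize; i++) {
                frame[i] = buffer.getShort();
            }
            frames.add(frame);
        }
        return frames;
    }
}
```

Each returned frame can then be passed to `vad.isSpeech(frame)` as in the snippet above, provided `frameSize` matches the frame size the VAD was configured with.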

Requirements

Android VAD supports Android 4.1 (Jelly Bean) and later.

Development

To open the project in Android Studio:

Go to File menu or the Welcome Screen
Click on Open...
Navigate to VAD's root directory.
Select settings.gradle

Download

Gradle is the only supported build configuration, so just add the dependency to your project build.gradle file:

Add the JitPack repository to your root build.gradle at the end of repositories:

allprojects {
   repositories {
     maven { url 'https://jitpack.io' }
   }
}

Add the dependency

dependencies {
    implementation 'com.github.gkonovalov:android-vad:1.0.1'
}

You can also download the precompiled AAR library and APK files from GitHub's releases page.

Georgiy Konovalov 2021 (c) MIT License
