py-lidbox / lidbox

Licence: MIT license

End-to-end spoken language identification out of the box.

Programming Languages

python


Projects that are alternatives of or similar to lidbox

lingua-go

👄 The most accurate natural language detection library for Go, suitable for long and short text alike

Stars: ✭ 684 (+1653.85%)

Mutual labels: language-recognition, language-identification

audio noise clustering

https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.

Stars: ✭ 24 (-38.46%)

Mutual labels: speech, audio-analysis

Inaspeechsegmenter

CNN-based audio segmentation toolkit. Detects speech, music, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Stars: ✭ 352 (+802.56%)

Mutual labels: speech, audio-analysis

ytpriv

YT metadata exporter

Stars: ✭ 28 (-28.21%)

Mutual labels: big-data

sgd

An R package for large scale estimation with stochastic gradient descent

Stars: ✭ 55 (+41.03%)

Mutual labels: big-data

dislib

The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.

Stars: ✭ 39 (+0%)

Mutual labels: big-data

SignDetect

This application is developed to help speechless people interact with others with ease. It detects voice and converts the input speech into a sign language based video.

Stars: ✭ 21 (-46.15%)

Mutual labels: speech

eidos-audition

Collection of auditory models.

Stars: ✭ 25 (-35.9%)

Mutual labels: speech

merkle-db

High-scalability analytics database built on immutable merkle-trees

Stars: ✭ 44 (+12.82%)

Mutual labels: big-data

awesome-coder-resources

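A refueling station on your programming journey! [Continuously updated; stars and return visits welcome] [Contents: programming, learning, and reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]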

Stars: ✭ 54 (+38.46%)

Mutual labels: big-data

dialectID siam

Dialect identification using Siamese network

Stars: ✭ 15 (-61.54%)

Mutual labels: language-recognition

Quantitative-Big-Imaging-2018

(Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018

Stars: ✭ 50 (+28.21%)

Mutual labels: big-data

leetspeek

Open and collaborative content from leet hackers!

Stars: ✭ 11 (-71.79%)

Mutual labels: big-data

mmtf-spark

Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.

Stars: ✭ 20 (-48.72%)

Mutual labels: big-data

awesome-tools

curated list of awesome tools and libraries for specific domains

Stars: ✭ 31 (-20.51%)

Mutual labels: big-data

Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

Stars: ✭ 126 (+223.08%)

Mutual labels: big-data

javaer-mind

Mind maps for Java programmers' advanced learning

Stars: ✭ 66 (+69.23%)

Mutual labels: big-data

phoenix-queryserver

Apache Phoenix Query Server

Stars: ✭ 33 (-15.38%)

Mutual labels: big-data

couchdb-pkg

Apache CouchDB Packaging support files

Stars: ✭ 24 (-38.46%)

Mutual labels: big-data

da-tacos

A Dataset for Cover Song Identification and Understanding

Stars: ✭ 50 (+28.21%)

Mutual labels: audio-analysis


lidbox

Spoken language identification (LId) out of the box using TensorFlow.
Models implemented with tf.keras.
Metadata handling with pandas DataFrames.
High-performance, parallel preprocessing pipelines with tf.data.
Simple spectral and cepstral feature extraction on the GPU with tf.signal.
Average detection cost (C_avg) implemented as a tf.keras.metrics.Metric subclass.
Angular proximity loss implemented as a tf.keras.losses.Loss subclass.
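
To make the C_avg item above more concrete, here is a minimal pure-Python sketch of an average detection cost in the style commonly used in NIST language recognition evaluations. It is not lidbox's tf.keras.metrics.Metric implementation, which may differ in details. The function name `average_detection_cost`, the trial format (true language plus the set of accepted languages), and the defaults `p_target=0.5`, `c_miss=1.0`, `c_fa=1.0` are assumptions made for illustration.

```python
def average_detection_cost(trials, languages, p_target=0.5, c_miss=1.0, c_fa=1.0):
    """Average detection cost over all target languages (illustrative sketch).

    trials: iterable of (true_language, accepted_languages) pairs, where
            accepted_languages holds every language the system accepted.
    languages: all languages in the closed set.
    """
    languages = list(languages)
    n = len(languages)
    if n < 2:
        raise ValueError("need at least two languages")
    # Prior probability of each individual non-target language.
    p_nontarget = (1.0 - p_target) / (n - 1)

    # Group the accepted-language sets by the true language of each trial.
    by_true = {lang: [] for lang in languages}
    for true_lang, accepted in trials:
        by_true[true_lang].append(set(accepted))

    def rate(accepted_sets, predicate):
        # Fraction of trials satisfying the predicate; 0.0 if there are no trials.
        if not accepted_sets:
            return 0.0
        return sum(1 for a in accepted_sets if predicate(a)) / len(accepted_sets)

    total = 0.0
    for target in languages:
        # Miss: a trial of the target language where the target was not accepted.
        p_miss = rate(by_true[target], lambda a: target not in a)
        cost = c_miss * p_target * p_miss
        for nontarget in languages:
            if nontarget != target:
                # False alarm: a trial of another language where the target was accepted.
                p_fa = rate(by_true[nontarget], lambda a: target in a)
                cost += c_fa * p_nontarget * p_fa
        total += cost
    return total / n


# Hypothetical toy trials: one of the two Finnish utterances is mistaken for English.
toy_trials = [("fi", {"fi"}), ("fi", {"en"}), ("en", {"en"}), ("sv", {"sv"})]
print(average_detection_cost(toy_trials, ["fi", "en", "sv"]))  # prints 0.125
```

A perfect system scores 0.0. Under these default costs and priors, a system that accepts or rejects every language for every utterance scores 0.5.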

Why would I want to use this?

You need a simple, deep-learning-based speech classification pipeline. For example: waveform -> VAD filter -> augment audio data -> serialize all data to a single binary file -> extract log-scale Mel-spectra or MFCC -> use DNN/CNN/LSTM/GRU/attention (etc.) to classify by signal labels.
You want to train a language vector/embedding extractor model (e.g. x-vector) on large amounts of data.
You have a TensorFlow/Keras model that you train on the GPU and want the tf.data.Dataset extraction pipeline to also run on the GPU.
You want an end-to-end pipeline that uses TensorFlow 2 as much as possible.

Why would I not want to use this?

You are happy doing everything with Kaldi or some other toolkit.
You don't want to debug by reading the source code when something goes wrong.
You don't want to install TensorFlow 2 and configure its dependencies (CUDA etc.).
You want to train phoneme recognizers or use CTC.

Examples

Installing

Python 3.7 or 3.8 is required.

From source

python3 -m pip install https://github.com/py-lidbox/lidbox/archive/master.zip

Most recent version from PyPI

python3 -m pip install 'lidbox==1.0.0rc0'

TensorFlow

TensorFlow 2 is not included in the package requirements because you might want to configure it yourself, for example to get the GPU working.

If you don't want to customize anything and instead prefer something that just works for now, the following should be enough:

python3 -m pip install tensorflow
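
After installing, a quick sanity check can confirm that the requirements described above appear to be met. The following is a minimal sketch using only the Python standard library. The function name `check_lidbox_environment` is hypothetical and not part of lidbox. It only checks that the interpreter is Python 3.7 or 3.8 and that the `lidbox` and `tensorflow` packages can be found, not that they work correctly or that a GPU is configured.

```python
import importlib.util
import sys


def check_lidbox_environment(version_info=sys.version_info):
    """Return a dict describing whether the documented requirements appear to be met."""
    # The installation notes state that Python 3.7 or 3.8 is required.
    python_ok = tuple(version_info[:2]) in {(3, 7), (3, 8)}
    return {
        "python_ok": python_ok,
        # find_spec returns None when a top-level package cannot be located.
        "lidbox_installed": importlib.util.find_spec("lidbox") is not None,
        "tensorflow_installed": importlib.util.find_spec("tensorflow") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_lidbox_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If `tensorflow_installed` is reported as `no`, install TensorFlow separately as described above, since it is not pulled in automatically.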

Editable install

If you plan on making changes to the code, it is easier to install lidbox as a Python package in setuptools develop mode:

git clone --depth 1 https://github.com/py-lidbox/lidbox.git
python3 -m pip install --editable ./lidbox

Then, if you make changes to the code, there's no need to reinstall the package, since the changes take effect immediately. Just be careful not to make changes while lidbox is running: TensorFlow uses its autograph package to convert some of the Python functions to TF graphs, and this conversion might fail if the code changes underneath it.

Citing `lidbox`

@inproceedings{Lindgren2020,
    author={Matias Lindgren and Tommi Jauhiainen and Mikko Kurimo},
    title={{Releasing a Toolkit and Comparing the Performance of Language Embeddings Across Various Spoken Language Identification Datasets}},
    year=2020,
    booktitle={Proc. Interspeech 2020},
    pages={467--471},
    doi={10.21437/Interspeech.2020-2706},
    url={http://dx.doi.org/10.21437/Interspeech.2020-2706}
}
