aishoot / Speech_Feature_Extraction

Licence: other
Feature extraction from the speech signal is the first stage of most speech recognition systems.

Speech Feature Extraction

This repository describes feature extraction methods for speech signals.

Free speech datasets

  • OpenSLR: OpenSLR is a site devoted to hosting speech and language resources, such as training corpora for speech recognition and software related to speech recognition.
  • VoxForge: VoxForge mirrors the Open Speech Data Corpus for German from the LT and Telecooperation groups, with 35 hours of speech from about 180 speakers.
  • TIMIT: The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.
  • Mozilla Speech: Mozilla released the world's second-largest public voice data set on Nov 29, 2017.
  • Open Data for Deep Learning

File description

  • feature_extraction_functions.py: a set of feature extraction functions from RDShi-SpeakerCount.
  • MFCC: Mel-frequency cepstral coefficients calculation.
    • MFCC.py, MFCCTest.py: compute MFCC features.
    • FeatureExtraction.ipynb: speech preprocessing, including data loading, pre-emphasis, framing, windowing, Fourier transform, power spectrum, filter banks, MFCCs, and mean normalization.
  • Volume: volume calculation.
  • ZeroCR: Zero-Crossing Rate calculation.
  • Pitch: Pitch calculation and pitch tracking.
  • Timbre: spectrogram drawing.
  • VAD: end-point detection (EPD), also known as speech detection or voice activity detection (VAD).
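
The MFCC steps listed for FeatureExtraction.ipynb can be sketched roughly as follows. This is a minimal NumPy-only illustration and not the repository's own code: the function names (frame_signal, mel_filterbank, dct_ii, mfcc) and the default parameters (25 ms frames, 10 ms step, 512-point FFT, 26 filters, 13 coefficients, pre-emphasis 0.97) are assumptions chosen because they are common defaults, and the DCT is written out by hand to avoid a SciPy dependency.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def frame_signal(signal, frame_len, frame_step):
    """Split a 1-D signal into overlapping frames, zero-padding the tail."""
    n_frames = 1 + max(0, int(np.ceil((len(signal) - frame_len) / frame_step)))
    pad = (n_frames - 1) * frame_step + frame_len - len(signal)
    padded = np.append(signal, np.zeros(pad))
    idx = np.arange(frame_len)[None, :] + frame_step * np.arange(n_frames)[:, None]
    return padded[idx]

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    """Orthonormal DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    scale = np.full(n_out, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return x @ (basis * scale[:, None]).T

def mfcc(signal, sample_rate, frame_ms=25, step_ms=10, n_fft=512,
         n_filters=26, n_ceps=13, pre_emph=0.97):
    # Pre-emphasis: first-order high-pass filter that boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])
    # Framing and Hamming windowing.
    frame_len = int(round(sample_rate * frame_ms / 1000))
    frame_step = int(round(sample_rate * step_ms / 1000))
    frames = frame_signal(emphasized, frame_len, frame_step) * np.hamming(frame_len)
    # Fourier transform and power spectrum (frames are zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filter-bank energies, floored before the log to avoid log(0).
    fb_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_fb = np.log(np.maximum(fb_energy, np.finfo(float).eps))
    # Cepstral coefficients via DCT, then per-coefficient mean normalization.
    ceps = dct_ii(log_fb, n_ceps)
    return ceps - ceps.mean(axis=0, keepdims=True)

# Usage on a synthetic one-second 440 Hz tone (stand-in for real speech).
sr = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
features = mfcc(tone, sr)
print(features.shape)  # (99, 13)
```

Each row of the result is one frame and each column one cepstral coefficient. Note that the sketch assumes the frame length does not exceed n_fft; longer frames would be truncated by the FFT call.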
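
The time-domain features in the list above (volume, zero-crossing rate, pitch, and end-point detection) can be illustrated with a short sketch. It is not taken from the repository: the function names, the 30 dB energy margin, and the 60 to 400 Hz pitch search range are all assumptions made for illustration, and the autocorrelation pitch estimator and energy-threshold detector shown here are simple textbook variants rather than the methods the repository necessarily uses.

```python
import numpy as np

def frames_of(signal, frame_len, frame_step):
    """Overlapping frames without padding (an incomplete tail is dropped)."""
    n_frames = 1 + (len(signal) - frame_len) // frame_step
    idx = np.arange(frame_len)[None, :] + frame_step * np.arange(n_frames)[:, None]
    return signal[idx]

def volume_db(frames):
    """Per-frame energy in decibels, after removing each frame's mean."""
    centered = frames - frames.mean(axis=1, keepdims=True)
    energy = np.sum(centered ** 2, axis=1)
    return 10.0 * np.log10(np.maximum(energy, np.finfo(float).eps))

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def autocorr_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Pitch estimate in Hz from the autocorrelation peak between fmin and fmax."""
    frame = frame - frame.mean()
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sample_rate / fmax), int(sample_rate / fmin)
    lag = lo + np.argmax(acf[lo:hi + 1])
    return sample_rate / lag

def energy_vad(vol_db, margin_db=30.0):
    """Mark frames within margin_db of the loudest frame as speech."""
    return vol_db > vol_db.max() - margin_db

# Usage: 0.5 s of faint noise, 0.5 s of a 200 Hz tone, 0.5 s of faint noise.
sr = 16000
rng = np.random.default_rng(0)
noise = lambda n: 1e-3 * rng.standard_normal(n)
tone = 0.5 * np.sin(2 * np.pi * 200 * np.arange(sr // 2) / sr)
signal = np.concatenate([noise(sr // 2), tone, noise(sr // 2)])

frames = frames_of(signal, frame_len=512, frame_step=256)
vol = volume_db(frames)
zcr = zero_crossing_rate(frames)
speech = energy_vad(vol)
pitch = autocorr_pitch(frames[40], sr)  # frame 40 lies inside the tone

print("frames:", len(frames))
print("speech frames:", np.flatnonzero(speech)[[0, -1]])
print("estimated pitch (Hz):", round(pitch, 1))
```

In this toy signal the tone frames have high volume and a low zero-crossing rate, while the noise frames show the opposite, which is why the two features are often combined for end-point detection. A fixed margin below the loudest frame is only a starting point; real recordings usually need an adaptive threshold.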

Requirements

Anaconda3 (Python 3.x)