haoheliu / torchsubband

License: MIT
PyTorch implementation of subband decomposition

Programming Languages

HTML
Python

Projects that are alternatives to or similar to torchsubband

Sincnet
SincNet is a neural architecture for efficiently processing raw audio samples.
Stars: ✭ 764 (+1112.7%)
Mutual labels:  signal-processing, speech-recognition, speech-processing
spafe
🔉 spafe: Simplified Python Audio Features Extraction
Stars: ✭ 310 (+392.06%)
Mutual labels:  signal-processing, speech-processing
SpleeterRT
Real-time monaural source separation based on a fully convolutional neural network operating in the time-frequency domain.
Stars: ✭ 111 (+76.19%)
Mutual labels:  signal-processing, speech-enhancement
pyssp
Python speech signal processing library
Stars: ✭ 18 (-71.43%)
Mutual labels:  signal-processing, speech-processing
Nonautoreggenprogress
Tracking the progress in non-autoregressive generation (translation, transcription, etc.)
Stars: ✭ 118 (+87.3%)
Mutual labels:  speech-recognition, speech-processing
Zzz Retired openstt
RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:
Stars: ✭ 146 (+131.75%)
Mutual labels:  speech-recognition, speech-processing
Chinese-automatic-speech-recognition
Chinese speech recognition
Stars: ✭ 147 (+133.33%)
Mutual labels:  signal-processing, speech-recognition
Awesome Diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Stars: ✭ 673 (+968.25%)
Mutual labels:  speech-recognition, speech-processing
Dla
Deep learning for audio processing
Stars: ✭ 142 (+125.4%)
Mutual labels:  signal-processing, speech-recognition
Surfboard
Novoic's audio feature extraction library
Stars: ✭ 318 (+404.76%)
Mutual labels:  signal-processing, speech-processing
Keras Sincnet
Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)
Stars: ✭ 47 (-25.4%)
Mutual labels:  speech-recognition, speech-processing
UHV-OTS-Speech
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
Stars: ✭ 94 (+49.21%)
Mutual labels:  speech-recognition, speech-processing
Formant Analyzer
iOS application for finding formants in spoken sounds
Stars: ✭ 43 (-31.75%)
Mutual labels:  speech-recognition, speech-processing
Speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Stars: ✭ 242 (+284.13%)
Mutual labels:  speech-recognition, speech-processing
Pncc
An implementation of Power Normalized Cepstral Coefficients (PNCC)
Stars: ✭ 40 (-36.51%)
Mutual labels:  speech-recognition, speech-processing
Shifter
Pitch shifter using WSOLA and resampling, implemented in Python 3
Stars: ✭ 22 (-65.08%)
Mutual labels:  signal-processing, speech-processing
Espnet
End-to-End Speech Processing Toolkit
Stars: ✭ 4,533 (+7095.24%)
Mutual labels:  speech-recognition, speech-enhancement
Uspeech
Speech recognition toolkit for the Arduino
Stars: ✭ 448 (+611.11%)
Mutual labels:  speech-recognition, speech-processing
Awesome Speech Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Stars: ✭ 257 (+307.94%)
Mutual labels:  signal-processing, speech-processing
Tutorial separation
This repo summarizes tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
Stars: ✭ 151 (+139.68%)
Mutual labels:  signal-processing, speech-processing

torchsubband

This is a package for subband decomposition.

It can transform a waveform into three kinds of invertible subband feature representations, which are potentially useful for music source separation and similar tasks.
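
To illustrate what it means for a subband representation to be transformed back into the original waveform, here is a minimal two-band sketch in plain Python. It uses Haar filters, the simplest pair that reconstructs exactly, purely as an assumption for illustration. The filters torchsubband actually uses are not described here, and `analysis` and `synthesis` are hypothetical names, not part of the package.

```python
import math

SCALE = 1 / math.sqrt(2)


def analysis(signal):
    """Split an even-length signal into a low band and a high band.

    Each band has half as many samples as the input (Haar filters,
    chosen only for illustration).
    """
    low = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return low, high


def synthesis(low, high):
    """Rebuild the full-band signal from the two Haar subbands."""
    signal = []
    for l, h in zip(low, high):
        signal.append((l + h) * SCALE)
        signal.append((l - h) * SCALE)
    return signal


waveform = [0.5, -0.25, 1.0, 0.75, -0.6, 0.1, 0.0, 0.3]
low, high = analysis(waveform)
restored = synthesis(low, high)
max_error = max(abs(a - b) for a, b in zip(waveform, restored))
```

Each band holds half the samples, so the total amount of data is unchanged, and `max_error` stays at floating-point rounding level.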

Usage

Installation

pip install torchsubband

A simple example:

from torchsubband import SubbandDSP
import torch

# SubbandDSP is an nn.Module
model = SubbandDSP(subband=2)  # number of subbands: 1, 2, 4, or 8
batchsize = 3  # any positive integer
channel = 1  # any positive integer
length = 44100 * 2  # number of samples; any positive integer
waveform = torch.randn((batchsize, channel, length))

# Subband waveform, and back to the full-band waveform
subwav = model.wav_to_sub(waveform)
reconstruct_1 = model.sub_to_wav(subwav, length=length)

# Subband magnitude spectrogram plus phase (cos, sin), and back
sub_spec, cos, sin = model.wav_to_mag_phase_sub_spec(waveform)
reconstruct_2 = model.mag_phase_sub_spec_to_wav(sub_spec, cos, sin, length=length)

# Subband complex spectrogram, and back
sub_complex_spec = model.wav_to_complex_sub_spec(waveform)
reconstruct_3 = model.complex_sub_spec_to_wav(sub_complex_spec, length=length)
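
The magnitude representation is returned together with `cos` and `sin`. Assuming these hold the cosine and sine of each bin's phase (the method name suggests this, but it is not spelled out here), the sketch below shows why that triple is enough to recover a complex value. It is plain Python on a few scalar bins, and the helper names are hypothetical, not part of torchsubband.

```python
import cmath
import math


def to_mag_cos_sin(value):
    """Decompose a complex bin into magnitude, cos(phase), and sin(phase)."""
    magnitude = abs(value)
    phase = cmath.phase(value)
    return magnitude, math.cos(phase), math.sin(phase)


def from_mag_cos_sin(magnitude, cos_phase, sin_phase):
    """Rebuild the complex bin: real = mag * cos, imag = mag * sin."""
    return complex(magnitude * cos_phase, magnitude * sin_phase)


bins = [complex(3, 4), complex(-1, 2), complex(0, -0.5)]
restored_bins = [from_mag_cos_sin(*to_mag_cos_sin(z)) for z in bins]
max_bin_error = max(abs(a - b) for a, b in zip(bins, restored_bins))
```

Keeping the phase as a (cos, sin) pair avoids the wrap-around at ±π that a raw angle has, which is one plausible reason for this layout.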

Reconstruction loss

The following table shows the reconstruction quality. We ran subband decomposition followed by reconstruction on a set of audio clips and compared the result with the original.

Subbands    L1 loss    PESQ    SI-SDR (dB)
2           1e-6       4.64    61.8
4           1e-6       4.64    58.9
8           5e-5       4.64    58.2
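
For reference, the L1 loss and SI-SDR figures can be computed from a reference signal and its reconstruction as sketched below. This is a plain-Python illustration of the standard definitions, not the evaluation code shipped with torchsubband, and the function names `l1_loss` and `si_sdr` are hypothetical. PESQ is left out because it requires a dedicated implementation.

```python
import math


def l1_loss(reference, estimate):
    """Mean absolute difference between two equal-length signals."""
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


def si_sdr(reference, estimate):
    """Scale-invariant signal-to-distortion ratio in dB.

    The estimate is projected onto the reference; whatever is left over
    counts as distortion. Undefined when the estimate matches the
    reference exactly up to scale (zero distortion energy).
    """
    dot = sum(r * e for r, e in zip(reference, estimate))
    energy = sum(r * r for r in reference)
    scale = dot / energy
    target = [scale * r for r in reference]
    distortion = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)
    return 10 * math.log10(target_energy / distortion_energy)


reference = [1.0, -1.0, 1.0, -1.0]
estimate = [1.0, -1.0, 1.0, -0.9]
loss = l1_loss(reference, estimate)    # about 0.025
quality = si_sdr(reference, estimate)  # about 27.05 dB
```

Because of the projection step, rescaling the estimate leaves SI-SDR unchanged, while the L1 loss does depend on scale.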

You can also check the package yourself by running the built-in test below, which prints some evaluation output.

from torchsubband import test
test()

Citation

If you find our code useful for your research, please consider citing:

    @misc{liu2021cwspresunet,
        title={CWS-PResUNet: Music Source Separation with Channel-wise Subband Phase-aware ResUNet},
        author={Haohe Liu and Qiuqiang Kong and Jiafeng Liu},
        year={2021},
        eprint={2112.04685},
        archivePrefix={arXiv},
        primaryClass={cs.SD}
    }
    @inproceedings{Liu2020,   
      author={Haohe Liu and Lei Xie and Jian Wu and Geng Yang},   
      title={{Channel-Wise Subband Input for Better Voice and Accompaniment Separation on High Resolution Music}},   
      year=2020,   
      booktitle={Proc. Interspeech 2020},   
      pages={1241--1245},   
      doi={10.21437/Interspeech.2020-2555},   
      url={http://dx.doi.org/10.21437/Interspeech.2020-2555}   
    }