
JusperLee / UtterancePIT-Speech-Separation

Licence: other
Training code with multi-GPU support, based on funcwj's uPIT implementation, with the data loading rebuilt on PyTorch's DataLoader.


UtterancePIT-Speech-Separation

funcwj's original code is available in his repository: uPIT-for-speech-separation (https://github.com/funcwj/uPIT-for-speech-separation)
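
For readers new to the technique: utterance-level permutation invariant training (uPIT) scores every assignment of model outputs to reference speakers over the whole utterance and trains on the assignment with the lowest loss. The sketch below illustrates that idea in plain Python. It is not the code from this repository or from funcwj's; the function names are hypothetical, and plain mean squared error on toy nested lists stands in for the spectral loss used in practice.

```python
from itertools import permutations


def utterance_mse(estimate, reference):
    """Mean squared error over all frames and bins of one utterance."""
    total, count = 0.0, 0
    for est_frame, ref_frame in zip(estimate, reference):
        for e, r in zip(est_frame, ref_frame):
            total += (e - r) ** 2
            count += 1
    return total / count


def upit_loss(estimates, references):
    """Return (best_loss, best_perm) for a single utterance.

    estimates[i] and references[j] are frames x bins nested lists.
    best_perm[i] is the index of the reference assigned to output i.
    """
    num_spk = len(references)
    best_loss, best_perm = float("inf"), None
    # Try every output-to-speaker assignment, scored on the full utterance.
    for perm in permutations(range(num_spk)):
        loss = sum(
            utterance_mse(estimates[i], references[j])
            for i, j in enumerate(perm)
        ) / num_spk
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy data: two speakers, two frames, two bins. The outputs are swapped
# relative to the references, so the best assignment is (1, 0).
ref_a = [[1.0, 0.0], [1.0, 0.0]]
ref_b = [[0.0, 1.0], [0.0, 1.0]]
est_0 = [[0.1, 0.9], [0.0, 1.0]]  # close to ref_b
est_1 = [[0.9, 0.1], [1.0, 0.0]]  # close to ref_a

loss, perm = upit_loss([est_0, est_1], [ref_a, ref_b])
print(perm, loss)
```

The exhaustive search grows factorially with the number of speakers, which is acceptable for the two- or three-speaker mixtures this kind of model usually targets.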

Demo Pages: Results of pure speech separation model

Accomplished goals

  • Support multi-GPU training
  • Use PyTorch's built-in DataLoader
  • Provide pre-trained models
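
The first two goals can be sketched as follows. This is a minimal illustration, not the repository's training code: `ToyMixtureDataset` and `ToyMaskNet` are hypothetical stand-ins, the data is random, and `torch.nn.DataParallel` is assumed as the multi-GPU mechanism (the README does not say which one the project uses). The snippet falls back to a single device when fewer than two GPUs are visible.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class ToyMixtureDataset(Dataset):
    """Hypothetical dataset: random mixture features and per-speaker targets."""

    def __init__(self, num_utts=8, num_frames=10, num_bins=129, num_spk=2):
        self.mix = torch.rand(num_utts, num_frames, num_bins)
        self.targets = torch.rand(num_utts, num_spk, num_frames, num_bins)

    def __len__(self):
        return self.mix.shape[0]

    def __getitem__(self, idx):
        return self.mix[idx], self.targets[idx]


class ToyMaskNet(nn.Module):
    """Hypothetical model: predicts one mask per speaker for each frame."""

    def __init__(self, num_bins=129, num_spk=2, hidden=32):
        super().__init__()
        self.num_spk = num_spk
        self.net = nn.Sequential(
            nn.Linear(num_bins, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_bins * num_spk),
            nn.Sigmoid(),
        )

    def forward(self, mix):
        batch, frames, bins = mix.shape
        masks = self.net(mix).view(batch, frames, self.num_spk, bins)
        return masks.permute(0, 2, 1, 3)  # batch x speakers x frames x bins


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ToyMaskNet()
if torch.cuda.device_count() > 1:
    # Replicates the model and splits each batch across the visible GPUs.
    model = nn.DataParallel(model)
model = model.to(device)

loader = DataLoader(ToyMixtureDataset(), batch_size=4, shuffle=True, num_workers=0)
for mix, targets in loader:
    mix, targets = mix.to(device), targets.to(device)
    estimates = model(mix) * mix.unsqueeze(1)  # apply the masks to the mixture
    loss = ((estimates - targets) ** 2).mean()
```

Because `DataLoader` handles batching and shuffling, the dataset class only needs `__len__` and `__getitem__`.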

Python Library Versions

  • PyTorch==1.3.0
  • tqdm==4.32.1
  • librosa==0.7.1
  • scipy==1.3.0
  • numpy==1.16.4
  • PyYAML==5.1.1

How to Use This Repository

  1. Generate the dataset using create-speaker-mixtures.zip with WSJ0 or TIMIT.

  2. Prepare the scp file. Each line of the scp file has the form "filename path".

     python create_scp.py
  3. Prepare the cmvn file. Cepstral mean and variance normalization (CMVN) is a computationally efficient normalization technique for robust speech recognition.

     # Calculated by the compute_cmvn.py script:
     python compute_cmvn.py ./tt_mix.scp ./cmvn.dict
  4. Edit the yaml configuration, mainly the scp paths and the cmvn path. Also set num_spk in run_pit.py to the number of speakers.

  5. Training:

    sh train.sh
  6. Inference:

    sh test.sh
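
Steps 2 and 3 above can be sketched in plain Python. This is only an illustration of the "filename path" scp format and of what CMVN statistics are. It does not reproduce create_scp.py or compute_cmvn.py, and it makes no assumption about the layout of cmvn.dict. The function names, paths, and toy feature values are hypothetical.

```python
import math


def parse_scp(lines):
    """Map each filename to its path; every non-empty line is 'filename path'."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, path = line.split(maxsplit=1)
        table[key] = path
    return table


def compute_cmvn(feature_matrices):
    """Per-dimension mean and standard deviation over all frames."""
    dim = len(feature_matrices[0][0])
    total = [0.0] * dim
    total_sq = [0.0] * dim
    num_frames = 0
    for feats in feature_matrices:
        for frame in feats:
            for d, value in enumerate(frame):
                total[d] += value
                total_sq[d] += value * value
            num_frames += 1
    mean = [s / num_frames for s in total]
    std = [
        math.sqrt(max(sq / num_frames - m * m, 0.0))
        for sq, m in zip(total_sq, mean)
    ]
    return mean, std


def apply_cmvn(feats, mean, std, eps=1e-8):
    """Normalize each frame to zero mean and unit variance per dimension."""
    return [
        [(v - m) / (s + eps) for v, m, s in zip(frame, mean, std)]
        for frame in feats
    ]


# Hypothetical scp content and toy 2-dimensional features per utterance.
scp = parse_scp([
    "utt1.wav /data/mix/utt1.wav",
    "utt2.wav /data/mix/utt2.wav",
])
features = {
    "utt1.wav": [[1.0, 10.0], [3.0, 10.0]],
    "utt2.wav": [[5.0, 20.0], [7.0, 20.0]],
}

mean, std = compute_cmvn([features[key] for key in scp])
normalized = apply_cmvn(features["utt1.wav"], mean, std)
```

The statistics are accumulated over every frame of every utterance listed in the scp file, then the same mean and standard deviation are applied to each utterance.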
    

Reference

  • Kolbæk M, Yu D, Tan Z H, et al. Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2017, 25(10): 1901-1913.
  • https://github.com/funcwj/uPIT-for-speech-separation