santi-pdp / Segan_pytorch

License: MIT
Speech Enhancement Generative Adversarial Network in PyTorch


Projects that are alternatives to or similar to Segan_pytorch

Deep-Learning
It contains the coursework and the practice I have done while learning Deep Learning.🚀 👨‍💻💥 🚩🌈
Stars: ✭ 21 (-91.32%)
Mutual labels:  deeplearning, gans
Cnn Paper2
🎨🎨 Deep learning convolutional neural network tutorials: image recognition, object detection, semantic segmentation, instance segmentation, face recognition, neural style transfer, GANs and more 🎨🎨 https://dataxujing.github.io/CNN-paper2/
Stars: ✭ 77 (-68.18%)
Mutual labels:  deeplearning, gans
HistoGAN
Reference code for the paper HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms (CVPR 2021).
Stars: ✭ 158 (-34.71%)
Mutual labels:  deeplearning, gans
Icface
ICface: Interpretable and Controllable Face Reenactment Using GANs
Stars: ✭ 122 (-49.59%)
Mutual labels:  deeplearning, gans
Contrastive Unpaired Translation
Contrastive unpaired image-to-image translation, faster and lighter training than cyclegan (ECCV 2020, in PyTorch)
Stars: ✭ 822 (+239.67%)
Mutual labels:  deeplearning, gans
Cartoonize
A demo webapp to convert images and videos into cartoons!
Stars: ✭ 215 (-11.16%)
Mutual labels:  deeplearning, gans
Awesome Gans And Deepfakes
A curated list of GAN & Deepfake papers and repositories.
Stars: ✭ 224 (-7.44%)
Mutual labels:  gans
Bmw Yolov4 Inference Api Gpu
This is a repository for a no-code object detection inference API using the Yolov3 and Yolov4 Darknet framework.
Stars: ✭ 237 (-2.07%)
Mutual labels:  deeplearning
Deepfashion
Apparel detection using deep learning
Stars: ✭ 223 (-7.85%)
Mutual labels:  deeplearning
Pytorch cifar10
Pretrained TorchVision models on CIFAR10 dataset (with weights)
Stars: ✭ 219 (-9.5%)
Mutual labels:  deeplearning
Gordon cnn
A small convolutional neural network deep learning framework implemented in C++.
Stars: ✭ 241 (-0.41%)
Mutual labels:  deeplearning
Hierarchical Attention Networks Pytorch
Hierarchical Attention Networks for document classification
Stars: ✭ 239 (-1.24%)
Mutual labels:  deeplearning
Awesome Real World Rl
Great resources for making Reinforcement Learning work in real-life situations. Papers, projects and more.
Stars: ✭ 234 (-3.31%)
Mutual labels:  gans
Deeplearning cv notes
📓 Deep learning and CV notes.
Stars: ✭ 223 (-7.85%)
Mutual labels:  deeplearning
Book deeplearning in pytorch source
Stars: ✭ 236 (-2.48%)
Mutual labels:  gans
Bert Attributeextraction
Using BERT for attribute extraction in knowledge graphs, via fine-tuning and feature extraction. Uses BERT-based fine-tuning and feature extraction to extract attributes of Baidu Baike person entries for a knowledge graph.
Stars: ✭ 224 (-7.44%)
Mutual labels:  deeplearning
Finegan
FineGAN: Unsupervised Hierarchical Disentanglement for Fine-grained Object Generation and Discovery
Stars: ✭ 240 (-0.83%)
Mutual labels:  gans
My Awesome Ai Bookmarks
Curated list of my reads, implementations and core concepts of Artificial Intelligence, Deep Learning and Machine Learning, by the best folks in the world.
Stars: ✭ 223 (-7.85%)
Mutual labels:  deeplearning
Retinaface
A remake of https://github.com/biubug6/Pytorch_Retinaface
Stars: ✭ 226 (-6.61%)
Mutual labels:  deeplearning
Learningdl
Companion code for a three-month course that teaches deep learning from scratch (TensorFlow edition).
Stars: ✭ 238 (-1.65%)
Mutual labels:  deeplearning

Speech Enhancement Generative Adversarial Network in PyTorch

Requirements

SoundFile==0.10.2
scipy==1.1.0
librosa==0.6.1
h5py==2.8.0
numba==0.38.0
torch==0.4.1
matplotlib==2.2.2
numpy==1.14.3
pyfftw==0.10.4
tensorboardX==1.4
torchvision==0.2.1

Ahoprocessing tools (ahoproc_tools) are also needed; the public repo can be found here.
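
As a quick sanity check of the environment (this is not part of the repo, just a sketch), something along these lines verifies that the pinned packages above are importable; the names listed are the standard import names for each package:

import importlib

# Import names for the pinned requirements above (SoundFile installs as
# "soundfile"; tensorboardX and torchvision keep their package names).
packages = ["soundfile", "scipy", "librosa", "h5py", "numba",
            "torch", "matplotlib", "numpy", "pyfftw",
            "tensorboardX", "torchvision"]

for name in packages:
    try:
        mod = importlib.import_module(name)
        print(name, getattr(mod, "__version__", "unknown"))
    except ImportError:
        print(name, "MISSING")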

Audio Samples

The latest denoising audio samples, together with baselines, can be found on the SEGAN+ samples website. SEGAN is the vanilla version (like the one in the TensorFlow repo), whereas SEGAN+ is the shallower, improved version used by this repo's default parameters.

The voicing/dewhispering audio samples can be found on the whispersegan samples website. Artifacts can now be mitigated further with --interf_pair fake signals, with more data than we had available (just 20 minutes with 1 speaker per model), and with longer training sessions of more than 100 epochs.

Pretrained Models

SEGAN+ generator weights are released and can be downloaded at this link. Make sure you place this file in the ckpt_segan+ directory so it works with the matching train.opts config file in that folder. The script run_segan+_clean.sh will read the checkpoint in that directory, as it is configured to be used with this referenced file.
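
As a rough, non-authoritative sketch of what using such a checkpoint looks like (the real loading code lives in this repo's scripts; the filename, Generator class name and constructor arguments below are assumptions), a downloaded checkpoint can be inspected and bound to a model roughly like this:

import torch

# Hypothetical filename: use whatever checkpoint file you actually downloaded
ckpt_path = "ckpt_segan+/segan+_generator.ckpt"

# Load on CPU and look at what the checkpoint contains before binding it
state = torch.load(ckpt_path, map_location="cpu")
print(type(state))
if isinstance(state, dict):
    print(list(state.keys())[:10])

# With the repo's own generator class (name and arguments are assumptions):
# from segan.models import Generator
# G = Generator(...)
# G.load_state_dict(state)
# G.eval()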

Introduction to scripts

Two models are ready to train and use for wav2wav speech enhancement conversions. SEGAN+ is an improved version of SEGAN [1], denoising utterances with its generator network (G).

SEGAN+_G

To train this model, run the following command:

python train.py --save_path ckpt_segan+ --batch_size 300 \
		--clean_trainset data/clean_trainset \
		--noisy_trainset data/noisy_trainset \
		--cache_dir data/cache

Read run_segan+_train.sh for more guidance. This uses the default parameters to structure both G and D, but they can be tuned with many options. For example, one can use --d_pretrained_ckpt and/or --g_pretrained_ckpt to specify a pre-trained checkpoint to start from, in order to fine-tune some characteristics of the enhancement system, such as language, as in [2].
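
For orientation, the SEGAN objective described in [1] combines a least-squares (LSGAN) adversarial term with an L1 term between the enhanced and clean waveforms, weighted by a factor of 100. The sketch below illustrates that objective in PyTorch; it is not this repo's training code, and the exact weighting, averaging and options may differ from the implementation:

import torch
import torch.nn.functional as F

def generator_loss(d_fake_out, enhanced, clean, l1_weight=100.0):
    # LSGAN term: push D's score on (enhanced, noisy) pairs towards 1,
    # plus an L1 term keeping the enhanced waveform close to the clean one.
    adv = 0.5 * torch.mean((d_fake_out - 1.0) ** 2)
    return adv + l1_weight * F.l1_loss(enhanced, clean)

def discriminator_loss(d_real_out, d_fake_out):
    # LSGAN term: real (clean, noisy) pairs -> 1, fake (enhanced, noisy) -> 0.
    real = 0.5 * torch.mean((d_real_out - 1.0) ** 2)
    fake = 0.5 * torch.mean(d_fake_out ** 2)
    return real + fake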

Files are cleaned by specifying the generator weights checkpoint, its config file from training, and appropriate paths for the input and output files (to use the recommended soundfile wav writer backend, specify the --soundfile flag):

python clean.py --g_pretrained_ckpt ckpt_segan+/<weights_ckpt_for_G> \
		--cfg_file ckpt_segan+/train.opts --synthesis_path enhanced_results \
		--test_files data/noisy_testset --soundfile

Read run_segan+_clean.sh for more guidance.
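
Conceptually, what a cleaning pass does per file boils down to the outline below: read a noisy wav, run it through the trained generator and write the enhanced result with the soundfile backend. This is only a sketch, not clean.py itself: the real generator also takes a latent noise input and works on fixed-size chunks, the 16 kHz rate is an assumption, and enhance_file and its signature are not part of this repo.

import soundfile as sf
import torch

def enhance_file(G, in_path, out_path, sample_rate=16000):
    # Read the noisy waveform as float32 and shape it as (batch, channel, time)
    wav, sr = sf.read(in_path, dtype="float32")
    assert sr == sample_rate, "unexpected sample rate"
    x = torch.from_numpy(wav).view(1, 1, -1)

    # Run the already-trained, eval-mode generator without gradients
    with torch.no_grad():
        enhanced = G(x).squeeze().cpu().numpy()

    # Write the enhanced waveform with the soundfile backend
    sf.write(out_path, enhanced, sample_rate)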

There is also WSEGAN, which stands for the dewhispering SEGAN [3]. This system is activated (instead of vanilla SEGAN) by specifying the --wsegan flag. Additionally, the --misalign_pair flag adds another fake pair to the adversarial loss, indicating that content changes between the input and output of G are undesirable, which improved our results in [3].
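
To illustrate the idea behind --misalign_pair (again only a sketch, not the repo's code; the exact construction of the extra pair may differ), it amounts to giving the discriminator one more pair to reject, built so that its members do not match in content:

import torch

def discriminator_loss_with_misalign(d_real_out, d_fake_out, d_misaligned_out):
    # Usual LSGAN terms: real pairs -> 1, generated (fake) pairs -> 0,
    # plus an extra fake term for pairs whose content does not match,
    # so D also penalises content changes between G's input and output.
    real = 0.5 * torch.mean((d_real_out - 1.0) ** 2)
    fake = 0.5 * torch.mean(d_fake_out ** 2)
    misaligned = 0.5 * torch.mean(d_misaligned_out ** 2)
    return real + fake + misaligned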

References:

  1. SEGAN: Speech Enhancement Generative Adversarial Network (Pascual et al. 2017)
  2. Language and Noise Transfer in Speech Enhancement GAN (Pascual et al. 2018)
  3. Whispered-to-voiced Alaryngeal Speech Conversion with GANs (Pascual et al. 2018)

Cite

@article{pascual2017segan,
  title={SEGAN: Speech Enhancement Generative Adversarial Network},
  author={Pascual, Santiago and Bonafonte, Antonio and Serr{\`a}, Joan},
  journal={arXiv preprint arXiv:1703.09452},
  year={2017}
}

Notes

  • Multi-GPU is not supported yet in this framework.
  • Virtual Batch Norm is not included as in the very first SEGAN code, since results similar to those of the original paper can be obtained with regular BatchNorm in D (and ONLY in D).
  • If using this code, parts of it, or developments from it, please cite the above reference.
  • We do not provide any support or assistance for the supplied code, nor do we offer any other compilation/variant of it.
  • We assume no responsibility regarding the provided code.