GAMMA-UMD / IR-GAN

Licence: MIT License
Augmenting Room Impulse Response

Programming Languages

MATLAB, Python, Jupyter Notebook, JavaScript, C, HTML

Related Works

  1. TS-RIR: Translated synthetic room impulse responses for speech augmentation (IEEE ASRU 2021)
  2. FAST-RIR: Fast neural diffuse room impulse response generator

IR-GAN (INTERSPEECH 2021)

This is the official implementation of IR-GAN, an extension of WaveGAN that augments room impulse responses (RIRs). More details on this project are available at https://gamma.umd.edu/pro/speech/ir-gan.

Video: https://www.youtube.com/watch?v=_v5rDmDXvD0

Requirements

tensorflow-gpu==1.12.0
scipy==1.0.0
matplotlib==3.0.2
librosa==0.6.2
ffmpeg==4.2.1
cuda==9.0.176
cudnn==7.6.5
Matlab

Datasets

To train WaveGAN to map low-dimensional latent vectors to the high-dimensional space in which room impulse responses lie, use the following recorded RIRs from BUT ReverbDB. Unzip the RIR directory inside the IR-GAN folder.

https://drive.google.com/file/d/1YX1XEpJ2W1cZD4Dn7d5CRBVPOFLUKG4B/view?usp=sharing

You can generate RIRs using the following trained models (https://drive.google.com/file/d/1IktFk27UnJx7ycGlOnc71VX7GuFRwR7L/view?usp=sharing). Copy these trained models to the RIR_Generation folder.

IR Statistics Toolbox

The following MATLAB toolbox is required to calculate room impulse response statistics (https://www.mathworks.com/matlabcentral/fileexchange/42566-impulse-response-acoustic-information-calculator).

Christopher Hummersone (2020). Impulse response acoustic information calculator (https://github.com/IoSR-Surrey/MatlabToolbox), GitHub. Retrieved October 31, 2020.

Train a WaveGAN

You can train WaveGAN to generate RIRs using the following commands:

export CUDA_VISIBLE_DEVICES=0
python3 train_wavegan.py train ./train --data_dir RIR/ --data_first_slice --data_pad_end --data_fast_wav
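
As a rough illustration of what the --data_first_slice and --data_pad_end flags imply for a single RIR clip, the sketch below keeps only the first fixed-length window of a signal and zero-pads the end when the signal is shorter. The function name first_slice_padded and the 16384-sample window are assumptions made for this example; this is not the loader used by train_wavegan.py.

```python
import numpy as np

def first_slice_padded(rir, slice_len=16384):
    """Return the first slice_len samples of rir, zero-padded at the end.

    Hypothetical helper: slice_len=16384 is an assumed window length.
    """
    rir = np.asarray(rir, dtype=np.float32)
    out = np.zeros(slice_len, dtype=np.float32)
    n = min(len(rir), slice_len)  # number of samples actually copied
    out[:n] = rir[:n]             # anything beyond n stays zero (padding)
    return out
```

Because most of an RIR's energy sits near its start, keeping only the first slice usually preserves the direct sound and early reflections, while the padding gives every training example the same length.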

Generate RIR

Copy the trained models from the train directory, or download the trained models (see Datasets above), to the RIR_Generation folder. You can then generate constrained RIRs using the following command:

python3 Augment_RIR.py

You can edit the number of RIRs to be generated inside the file Augment_RIR.py.
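
The generated RIRs are intended for far-field speech augmentation. A common way to apply an RIR is to convolve it with clean speech, as in the minimal sketch below. The function reverberate is a hypothetical helper written for this example and is not part of this repository; it assumes both signals are 1-D arrays at the same sampling rate.

```python
import numpy as np
from scipy.signal import fftconvolve

def reverberate(speech, rir):
    """Convolve clean speech with an RIR and rescale to the original peak.

    Hypothetical helper for illustration; assumes matching sampling rates.
    """
    speech = np.asarray(speech, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    # Full convolution, truncated to the length of the input speech.
    wet = fftconvolve(speech, rir, mode="full")[: len(speech)]
    peak = np.max(np.abs(wet))
    if peak > 0:
        # Match the peak level of the clean signal to avoid clipping.
        wet *= np.max(np.abs(speech)) / peak
    return wet
```

Truncating to the input length drops the reverberant tail after the utterance ends; keep the full convolution output instead if that tail matters for your task.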

You can generate intermediate RIRs whose direct-to-reverberant ratio (DRR) lies between given upper and lower limits using the following command:

python3 Vector_Arithmatic.py

You can edit the upper and lower limits inside the file Vector_Arithmatic.py.
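
For reference, DRR is commonly estimated as the ratio, in dB, of the energy in a short window around the direct-path peak to the energy of everything after that window. The sketch below follows that common definition; the 2.5 ms half-window and the names drr_db and within_drr are assumptions for this example. It is not the computation used by Vector_Arithmatic.py or by the MATLAB toolbox, whose results may differ.

```python
import numpy as np

def drr_db(rir, fs, direct_ms=2.5):
    """Estimate the direct-to-reverberant ratio of an RIR in dB.

    Assumed definition: direct energy is taken within +/- direct_ms of the
    largest absolute peak; reverberant energy is everything after it.
    Samples before the direct window are ignored.
    """
    rir = np.asarray(rir, dtype=np.float64)
    peak = int(np.argmax(np.abs(rir)))
    half = int(round(direct_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    end = min(peak + half + 1, len(rir))
    direct = np.sum(rir[start:end] ** 2)
    reverb = np.sum(rir[end:] ** 2)
    if reverb == 0:
        return float("inf")  # no reverberant tail at all
    return 10.0 * np.log10(direct / reverb)

def within_drr(rir, fs, lower_db, upper_db):
    """Check whether an RIR's estimated DRR lies inside the given limits."""
    return lower_db <= drr_db(rir, fs) <= upper_db
```

A check like within_drr could be used to verify that generated RIRs fall inside the limits you set.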

Attribution

If you use this code in your research, please consider citing

@inproceedings{ratnarajah21_interspeech,
  author={Anton Ratnarajah and Zhenyu Tang and Dinesh Manocha},
  title={{IR-GAN: Room Impulse Response Generator for Far-Field Speech Recognition}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={286--290},
  doi={10.21437/Interspeech.2021-230}
}
@inproceedings{donahue2019wavegan,
  title={Adversarial Audio Synthesis},
  author={Donahue, Chris and McAuley, Julian and Puckette, Miller},
  booktitle={ICLR},
  year={2019}
}

If you use recorded RIR from BUT ReverbDB, please consider citing

@article{DBLP:journals/jstsp/SzokeSMPC19,
  author    = {Igor Sz{\"{o}}ke and
               Miroslav Sk{\'{a}}cel and
               Ladislav Mosner and
               Jakub Paliesek and
               Jan Honza Cernock{\'{y}}},
  title     = {Building and Evaluation of a Real Room Impulse Response Dataset},
  journal   = {{IEEE} J. Sel. Top. Signal Process.},
  volume    = {13},
  number    = {4},
  pages     = {863--876},
  year      = {2019}
}