ws-choi / AMSS-Net

Licence: MIT license
A PyTorch implementation of the paper: "AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries" (ACM Multimedia 2021)

AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries

An official implementation of the paper: "AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries"


This repository does not contain the complete source code yet.

We will upload the remaining code after refactoring it for better readability.


News

Our paper has been accepted to ACM Multimedia (ACMMM) 2021!


1. Installation

(Optional)

conda create -n amss
conda activate amss

(Install)

conda install pytorch=1.7.1 cudatoolkit=11.0 -c pytorch
conda install -c conda-forge ffmpeg librosa
conda install -c anaconda jupyter
pip install torchtext musdb museval pytorch_lightning wandb pydub pysndfx

You also need to install sox:

  • Linux: conda install -c conda-forge sox
  • Windows: download
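After installing, it can help to confirm that everything is actually visible from the active environment. The following is a minimal sketch, not part of the repository: it assumes the import names match the package names listed above, and it only checks whether the modules and the `ffmpeg` and `sox` executables can be found, without importing or running them.

```python
import importlib.util
import shutil

# Import names for the packages installed above (assumed to match the
# package names used in the conda/pip commands).
REQUIRED_MODULES = [
    "torch", "torchtext", "librosa", "musdb", "museval",
    "pytorch_lightning", "wandb", "pydub", "pysndfx",
]
# Command-line tools that should be on PATH.
REQUIRED_TOOLS = ["ffmpeg", "sox"]


def missing_requirements(modules=REQUIRED_MODULES, tools=REQUIRED_TOOLS):
    """Return (missing_modules, missing_tools) without importing anything."""
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return missing_modules, missing_tools


if __name__ == "__main__":
    modules, tools = missing_requirements()
    print("missing Python modules:", modules or "none")
    print("missing command-line tools:", tools or "none")
```

An empty result for both lists suggests the environment is ready; it does not verify versions or CUDA support.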

2. Dataset: Musdb18

1. Download

  1. Full dataset

    • The entire dataset is hosted on Zenodo, and users must request access.
    • The tracks can only be used for academic purposes.
    • Requests are checked manually.
    • Once your request is accepted, you can download the full dataset.

  2. Or the sample dataset

    • Download the sample version of MUSDB18, which contains 7-second excerpts, with this script:

      import musdb
      musdb.DB(root='etc/musdb18_dev', download=True)
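Once the sample set is downloaded, a quick look at the tracks can confirm that it loaded correctly. This is a small sketch under stated assumptions: `load_sample_db` and `describe_track` are hypothetical helper names, not part of musdb or this repository, and the code relies on each musdb track exposing `name`, `rate`, and an `audio` array shaped (samples, channels).

```python
def load_sample_db(root="etc/musdb18_dev"):
    """Open (and download on first use) the 7-second MUSDB18 sample set."""
    import musdb  # imported lazily so describe_track works without musdb
    return musdb.DB(root=root, download=True)


def describe_track(track):
    """Summarise one track: name, sample rate, channels, length in seconds."""
    n_samples, n_channels = track.audio.shape  # audio is (samples, channels)
    return {
        "name": track.name,
        "rate": track.rate,
        "channels": n_channels,
        "seconds": n_samples / track.rate,
    }


# Example usage (triggers the download, so it is left commented out):
# for track in load_sample_db().tracks:
#     print(describe_track(track))
```

For the sample version, each track should report roughly 7 seconds of stereo audio.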

2. Generate wave files

  • Run:

    musdbconvert <your_DIR> <target_DIR>
  • musdbconvert is installed automatically when you install musdb with:

    pip install musdb
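After conversion, a quick sanity check of the target directory can catch incomplete tracks before training starts. The sketch below is an assumption-laden helper, not part of musdb or this repository: it assumes the wav version uses `train` and `test` subfolders with one folder per track containing `mixture.wav`, `drums.wav`, `bass.wav`, `other.wav`, and `vocals.wav`. Adjust `EXPECTED_FILES` or `subsets` if your layout differs.

```python
from pathlib import Path

# Files assumed to be present in every converted track folder.
EXPECTED_FILES = ("mixture.wav", "drums.wav", "bass.wav", "other.wav", "vocals.wav")


def incomplete_tracks(wav_root, subsets=("train", "test")):
    """Return {track_dir: [missing files]} for tracks lacking any expected wav."""
    problems = {}
    for subset in subsets:
        subset_dir = Path(wav_root) / subset
        if not subset_dir.is_dir():
            problems[str(subset_dir)] = ["<subset directory missing>"]
            continue
        for track_dir in sorted(p for p in subset_dir.iterdir() if p.is_dir()):
            missing = [f for f in EXPECTED_FILES if not (track_dir / f).is_file()]
            if missing:
                problems[str(track_dir)] = missing
    return problems
```

Calling `incomplete_tracks("<target_DIR>")` returns an empty dict when every track folder contains all five files.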

3. Train script example

  • AMSS-Net
python train.py \
  --musdb_root ../../repos/musdb18_wav \
  --pre_trained_word_embedding glove.6B.100d \
  --embedding_dim 100 \
  --task task2 \
  --model isolasion_smpocm \
  --n_fft 4096 \
  --gpus 4 \
  --distributed_backend ddp \
  --sync_batchnorm True \
  --save_top_k 3 \
  --min_epochs 100 \
  --num_head 6 \
  --num_latent_source 8 \
  --optimizer adam \
  --batch_size 4 \
  --enable_pl_optimizer True \
  --train_loss spec_mse \
  --val_loss raw_l1 \
  --check_val_every_n_epoch 10 \
  --lr 0.0001 \
  --precision 16 \
  --num_worker 32 \
  --pin_memory True \
  --seed 2020 \
  --deterministic True \
  --n_blocks 9 \
  --run_id your_run_id \
  --log wandb
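Because the training command carries many flags, it may be easier to keep them in a dictionary and build the command from it, for example when launching several runs with different settings. The sketch below is only an illustration: `build_command` and `TRAIN_OPTIONS` are hypothetical names, and the flag names and values are copied verbatim from the command above rather than checked against `train.py`.

```python
import shlex
import sys


def build_command(script, options):
    """Turn {"flag": value} pairs into an argv list: python script --flag value ..."""
    argv = [sys.executable, script]
    for flag, value in options.items():
        argv += [f"--{flag}", str(value)]
    return argv


# Flags and values copied from the training command above.
TRAIN_OPTIONS = {
    "musdb_root": "../../repos/musdb18_wav",
    "pre_trained_word_embedding": "glove.6B.100d",
    "embedding_dim": 100,
    "task": "task2",
    "model": "isolasion_smpocm",
    "n_fft": 4096,
    "gpus": 4,
    "distributed_backend": "ddp",
    "sync_batchnorm": True,
    "save_top_k": 3,
    "min_epochs": 100,
    "num_head": 6,
    "num_latent_source": 8,
    "optimizer": "adam",
    "batch_size": 4,
    "enable_pl_optimizer": True,
    "train_loss": "spec_mse",
    "val_loss": "raw_l1",
    "check_val_every_n_epoch": 10,
    "lr": 0.0001,
    "precision": 16,
    "num_worker": 32,
    "pin_memory": True,
    "seed": 2020,
    "deterministic": True,
    "n_blocks": 9,
    "run_id": "your_run_id",
    "log": "wandb",
}

command = build_command("train.py", TRAIN_OPTIONS)
print(shlex.join(command))  # inspect the command before launching it
# To launch: subprocess.run(command, check=True) from the folder containing train.py
```

Overriding a single entry, such as `{**TRAIN_OPTIONS, "gpus": 1, "batch_size": 2}`, yields a variant command without retyping the rest.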

4. Evaluation script example

auto_task2_eval.py --musdb_root ../../repos/musdb18_wav --ckpt_root etc/checkpoints/ --model isolasion_smpocm --cuda True --batch_size 8 --logger wandb