
lucidrains / ddpm-proteins

Licence: MIT License
A denoising diffusion probabilistic model (DDPM) tailored for conditional generation of protein distograms

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to ddpm-proteins

gcWGAN
Guided Conditional Wasserstein GAN for De Novo Protein Design
Stars: ✭ 38 (-30.91%)
Mutual labels:  protein-structure, generative-model
CHyVAE
Code for our paper -- Hyperprior Induced Unsupervised Disentanglement of Latent Representations (AAAI 2019)
Stars: ✭ 18 (-67.27%)
Mutual labels:  generative-model
hotspot3d
3D hotspot mutation proximity analysis tool
Stars: ✭ 43 (-21.82%)
Mutual labels:  protein-structure
timbre painting
Hierarchical fast and high-fidelity audio generation
Stars: ✭ 67 (+21.82%)
Mutual labels:  generative-model
DeepCov
Fully convolutional neural networks for protein residue-residue contact prediction
Stars: ✭ 36 (-34.55%)
Mutual labels:  protein-structure
DiffuseVAE
A combination of VAEs and diffusion models for efficient, controllable and high-fidelity generation from low-dimensional latents
Stars: ✭ 81 (+47.27%)
Mutual labels:  generative-model
vae-torch
Variational autoencoder for anomaly detection (in PyTorch).
Stars: ✭ 38 (-30.91%)
Mutual labels:  generative-model
celeba-gan-pytorch
Generative Adversarial Networks in PyTorch
Stars: ✭ 35 (-36.36%)
Mutual labels:  generative-model
RG-Flow
This is the project page for the paper "RG-Flow: a hierarchical and explainable flow model based on renormalization group and sparse prior". Paper link: https://arxiv.org/abs/2010.00029
Stars: ✭ 58 (+5.45%)
Mutual labels:  generative-model
TriangleGAN
TriangleGAN, ACM MM 2019.
Stars: ✭ 28 (-49.09%)
Mutual labels:  generative-model
VSCoding-Sequence
VSCode Extension for interactively visualising protein structure data in the editor
Stars: ✭ 41 (-25.45%)
Mutual labels:  protein-structure
Generalized-PixelVAE
PixelVAE with or without regularization
Stars: ✭ 64 (+16.36%)
Mutual labels:  generative-model
BtcDet
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection
Stars: ✭ 104 (+89.09%)
Mutual labels:  generative-model
generative deep learning
Generative Deep Learning Sessions led by Anugraha Sinha (Machine Learning Tokyo)
Stars: ✭ 24 (-56.36%)
Mutual labels:  generative-model
swd
Unsupervised video and image generation
Stars: ✭ 50 (-9.09%)
Mutual labels:  generative-model
graph-nvp
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Stars: ✭ 69 (+25.45%)
Mutual labels:  generative-model
hPDB
PDB parser in Haskell
Stars: ✭ 20 (-63.64%)
Mutual labels:  protein-structure
DVAE
Official implementation of Dynamical VAEs
Stars: ✭ 75 (+36.36%)
Mutual labels:  generative-model
cgan-face-generator
Face generator from sketches using cGAN (pix2pix) model
Stars: ✭ 52 (-5.45%)
Mutual labels:  generative-model
srVAE
VAE with RealNVP prior and Super-Resolution VAE in PyTorch. Code release for https://arxiv.org/abs/2006.05218.
Stars: ✭ 56 (+1.82%)
Mutual labels:  generative-model

Denoising Diffusion Probabilistic Model for Proteins

Implementation of Denoising Diffusion Probabilistic Models in PyTorch. It is a new approach to generative modeling that may have the potential to rival GANs. It uses denoising score matching to estimate the gradient of the data distribution, followed by Langevin sampling to sample from the true distribution. This implementation was transcribed from the official TensorFlow version.
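
The training objective this describes reduces to simple noise regression. Below is a minimal sketch of that loss under a linear noise schedule; it is illustrative only (not this library's internals), and model stands in for any network that predicts noise from a corrupted input and a timestep.

import torch

# Minimal DDPM training loss: corrupt x0 at a random timestep, regress the noise.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)               # linear beta schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # cumulative product of (1 - beta)

def diffusion_loss(model, x0):
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))                   # random timestep per sample
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise    # forward (noising) process
    return torch.nn.functional.mse_loss(model(x_t, t), noise)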

This specific repository uses a heavily modified version of the U-Net for learning on protein structure, with eventual conditioning from the MSA Transformer's attention heads.
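
As a concrete illustration of that conditioning idea, the sketch below pulls row-attention maps from the pretrained MSA Transformer via the fair-esm package; how those maps are wired into the U-Net here is an assumption on my part, not this repo's verbatim code.

import torch
import esm

# Load the pretrained MSA Transformer (fair-esm package).
model, alphabet = esm.pretrained.esm_msa1_t12_100M_UR50S()
batch_converter = alphabet.get_batch_converter()
model.eval()

# One MSA: (label, aligned sequence) pairs of equal length.
msa = [('seq0', 'MKTAYIAKQR'), ('seq1', 'MKTAYLAKQR')]
_, _, tokens = batch_converter([msa])

with torch.no_grad():
    out = model(tokens, need_head_weights=True)

# (batch, layers, heads, L+1, L+1) -> drop the BOS token, merge layer/head dims
attn = out['row_attentions'][..., 1:, 1:]
features = attn.flatten(1, 2)  # (batch, layers*heads, L, L) conditioning maps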

(sample image at around 40k iterations)

Install

$ pip install ddpm-proteins

Training

We are using Weights & Biases for experiment tracking.

First, you need to log in:

$ wandb login

Then you will need to cache all the MSA attention embeddings by running the command below. For some reason, it needs to be run multiple times before all the proteins are cached correctly (it does work, though); I'll get around to fixing this.

$ python cache.py
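
What a cache pass like this amounts to is computing each protein's MSA attention embedding once and persisting it to disk. A hedged sketch of that pattern (the real cache.py differs in its details; compute_fn is a placeholder):

import os
import torch

CACHE_DIR = os.path.expanduser('~/.cache.ddpm-proteins')
os.makedirs(CACHE_DIR, exist_ok=True)

def cached_embedding(protein_id, compute_fn):
    path = os.path.join(CACHE_DIR, f'{protein_id}.pt')
    if os.path.exists(path):
        return torch.load(path)        # cache hit: reuse the stored tensor
    emb = compute_fn(protein_id)       # e.g. MSA Transformer attention maps
    torch.save(emb, path)              # cache miss: compute once, persist
    return emb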

Finally, you can begin training by invoking

$ python train.py

If you would like to clear or recompute the cache (i.e. after changing the MSA fetching function), just run

$ rm -rf ~/.cache.ddpm-proteins

Todo

Usage

import torch
from ddpm_proteins import Unet, GaussianDiffusion

model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
)

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,   # number of steps
    loss_type = 'l1'    # L1 or L2
)

training_images = torch.randn(8, 3, 128, 128)
loss = diffusion(training_images)
loss.backward()
# after a lot of training

sampled_images = diffusion.sample(batch_size = 4)
sampled_images.shape # (4, 3, 128, 128)
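
Since this repository targets protein distograms rather than RGB images, sampled maps should be symmetric (the distance from residue i to j equals that from j to i). A small illustrative post-processing step, not part of this package's API:

# Hypothetical post-processing: distance maps are symmetric, so average
# each sampled map with its transpose.
def symmetrize(x):
    return 0.5 * (x + x.transpose(-1, -2))

distograms = symmetrize(sampled_images)  # same shape, now symmetric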

Or, if you simply want to pass in a folder name and the desired image dimensions, you can use the Trainer class to easily train a model.

from ddpm_proteins import Unet, GaussianDiffusion, Trainer

model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
).cuda()

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,   # number of steps
    loss_type = 'l1'    # L1 or L2
).cuda()

trainer = Trainer(
    diffusion,
    'path/to/your/images',
    train_batch_size = 32,
    train_lr = 2e-5,
    train_num_steps = 700000,         # total training steps
    gradient_accumulate_every = 2,    # gradient accumulation steps
    ema_decay = 0.995,                # exponential moving average decay
    fp16 = True                       # turn on mixed precision training with apex
)

trainer.train()

Samples and model checkpoints will be logged to ./results periodically.
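
To resume from one of those checkpoints, the upstream denoising-diffusion-pytorch Trainer exposes a milestone-based load method; assuming this repo's Trainer mirrors it, resuming looks like:

trainer.load(2)   # loads ./results/model-2.pt (milestone number assumed)
trainer.train()   # continue training from that checkpoint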

Citations

@misc{ho2020denoising,
    title   = {Denoising Diffusion Probabilistic Models},
    author  = {Jonathan Ho and Ajay Jain and Pieter Abbeel},
    year    = {2020},
    eprint  = {2006.11239},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}
@inproceedings{anonymous2021improved,
    title   = {Improved Denoising Diffusion Probabilistic Models},
    author  = {Anonymous},
    booktitle = {Submitted to International Conference on Learning Representations},
    year    = {2021},
    url     = {https://openreview.net/forum?id=-NEXDKk8gZ},
    note    = {under review}
}
@article{Rao2021.02.12.430858,
    author  = {Rao, Roshan and Liu, Jason and Verkuil, Robert and Meier, Joshua and Canny, John F. and Abbeel, Pieter and Sercu, Tom and Rives, Alexander},
    title   = {MSA Transformer},
    year    = {2021},
    publisher = {Cold Spring Harbor Laboratory},
    url     = {https://www.biorxiv.org/content/early/2021/02/13/2021.02.12.430858},
    journal = {bioRxiv}
}