
lucidrains / ddpm-proteins

Licence: MIT License
A denoising diffusion probabilistic model (DDPM) tailored for conditional generation of protein distograms

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to ddpm-proteins

gcWGAN
Guided Conditional Wasserstein GAN for De Novo Protein Design
Stars: ✭ 38 (-30.91%)
Mutual labels:  protein-structure, generative-model
CHyVAE
Code for our paper -- Hyperprior Induced Unsupervised Disentanglement of Latent Representations (AAAI 2019)
Stars: ✭ 18 (-67.27%)
Mutual labels:  generative-model
hotspot3d
3D hotspot mutation proximity analysis tool
Stars: ✭ 43 (-21.82%)
Mutual labels:  protein-structure
timbre painting
Hierarchical fast and high-fidelity audio generation
Stars: ✭ 67 (+21.82%)
Mutual labels:  generative-model
DeepCov
Fully convolutional neural networks for protein residue-residue contact prediction
Stars: ✭ 36 (-34.55%)
Mutual labels:  protein-structure
DiffuseVAE
A combination of VAEs and diffusion models for efficient, controllable and high-fidelity generation from low-dimensional latents
Stars: ✭ 81 (+47.27%)
Mutual labels:  generative-model
vae-torch
Variational autoencoder for anomaly detection (in PyTorch).
Stars: ✭ 38 (-30.91%)
Mutual labels:  generative-model
celeba-gan-pytorch
Generative Adversarial Networks in PyTorch
Stars: ✭ 35 (-36.36%)
Mutual labels:  generative-model
RG-Flow
This is the project page for the paper "RG-Flow: a hierarchical and explainable flow model based on renormalization group and sparse prior". Paper link: https://arxiv.org/abs/2010.00029
Stars: ✭ 58 (+5.45%)
Mutual labels:  generative-model
TriangleGAN
TriangleGAN, ACM MM 2019.
Stars: ✭ 28 (-49.09%)
Mutual labels:  generative-model
VSCoding-Sequence
VSCode Extension for interactively visualising protein structure data in the editor
Stars: ✭ 41 (-25.45%)
Mutual labels:  protein-structure
Generalized-PixelVAE
PixelVAE with or without regularization
Stars: ✭ 64 (+16.36%)
Mutual labels:  generative-model
BtcDet
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection
Stars: ✭ 104 (+89.09%)
Mutual labels:  generative-model
generative deep learning
Generative Deep Learning Sessions led by Anugraha Sinha (Machine Learning Tokyo)
Stars: ✭ 24 (-56.36%)
Mutual labels:  generative-model
swd
Unsupervised video and image generation
Stars: ✭ 50 (-9.09%)
Mutual labels:  generative-model
graph-nvp
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Stars: ✭ 69 (+25.45%)
Mutual labels:  generative-model
hPDB
PDB parser in Haskell
Stars: ✭ 20 (-63.64%)
Mutual labels:  protein-structure
DVAE
Official implementation of Dynamical VAEs
Stars: ✭ 75 (+36.36%)
Mutual labels:  generative-model
cgan-face-generator
Face generator from sketches using cGAN (pix2pix) model
Stars: ✭ 52 (-5.45%)
Mutual labels:  generative-model
srVAE
VAE with RealNVP prior and Super-Resolution VAE in PyTorch. Code release for https://arxiv.org/abs/2006.05218.
Stars: ✭ 56 (+1.82%)
Mutual labels:  generative-model

Denoising Diffusion Probabilistic Model for Proteins

Implementation of Denoising Diffusion Probabilistic Models in PyTorch. It is a new approach to generative modeling that may have the potential to rival GANs. It uses denoising score matching to estimate the gradient of the data distribution, followed by Langevin sampling to sample from the true distribution. This implementation was transcribed from the official TensorFlow version.
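
The training objective this describes reduces to simple noise regression. Below is a minimal sketch of that loss under a linear noise schedule; it is illustrative only (not this library's internals), and model stands in for any network that predicts noise from a corrupted input and a timestep.

import torch

# Minimal DDPM training loss: corrupt x0 at a random timestep, regress the noise.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)               # linear beta schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # cumulative product of (1 - beta)

def diffusion_loss(model, x0):
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))                   # random timestep per sample
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise    # forward (noising) process
    return torch.nn.functional.mse_loss(model(x_t, t), noise)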

This specific repository uses a heavily modified version of the U-Net for learning on protein structure, with eventual conditioning from the MSA Transformer's attention heads.
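
As a concrete illustration of that conditioning idea, the sketch below pulls row-attention maps from the pretrained MSA Transformer via the fair-esm package; how those maps are wired into the U-Net here is an assumption on my part, not this repo's verbatim code.

import torch
import esm

# Load the pretrained MSA Transformer (fair-esm package).
model, alphabet = esm.pretrained.esm_msa1_t12_100M_UR50S()
batch_converter = alphabet.get_batch_converter()
model.eval()

# One MSA: (label, aligned sequence) pairs of equal length.
msa = [('seq0', 'MKTAYIAKQR'), ('seq1', 'MKTAYLAKQR')]
_, _, tokens = batch_converter([msa])

with torch.no_grad():
    out = model(tokens, need_head_weights=True)

# (batch, layers, heads, L+1, L+1) -> drop the BOS token, merge layer/head dims
attn = out['row_attentions'][..., 1:, 1:]
features = attn.flatten(1, 2)  # (batch, layers*heads, L, L) conditioning maps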

(sample image at around 40k iterations)

Install

$ pip install ddpm-proteins

Training

We are using Weights & Biases for experiment tracking.

First, you need to log in:

$ wandb login

Then you will need to cache all the MSA attention embeddings by running the command below. For some reason, it needs to be run multiple times before all the proteins are cached correctly (it does work, though); I'll get around to fixing this.

$ python cache.py
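
What a cache pass like this amounts to is computing each protein's MSA attention embedding once and persisting it to disk. A hedged sketch of that pattern (the real cache.py differs in its details; compute_fn is a placeholder):

import os
import torch

CACHE_DIR = os.path.expanduser('~/.cache.ddpm-proteins')
os.makedirs(CACHE_DIR, exist_ok=True)

def cached_embedding(protein_id, compute_fn):
    path = os.path.join(CACHE_DIR, f'{protein_id}.pt')
    if os.path.exists(path):
        return torch.load(path)        # cache hit: reuse the stored tensor
    emb = compute_fn(protein_id)       # e.g. MSA Transformer attention maps
    torch.save(emb, path)              # cache miss: compute once, persist
    return emb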

Finally, you can begin training by invoking

$ python train.py

If you would like to clear or recompute the cache (i.e. after changing the MSA fetching function), just run

$ rm -rf ~/.cache.ddpm-proteins

Todo

Usage

import torch
from ddpm_proteins import Unet, GaussianDiffusion

model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
)

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,   # number of steps
    loss_type = 'l1'    # L1 or L2
)

training_images = torch.randn(8, 3, 128, 128)
loss = diffusion(training_images)
loss.backward()
# after a lot of training

sampled_images = diffusion.sample(batch_size = 4)
sampled_images.shape # (4, 3, 128, 128)
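
Since this repository targets protein distograms rather than RGB images, sampled maps should be symmetric (the distance from residue i to j equals that from j to i). A small illustrative post-processing step, not part of this package's API:

# Hypothetical post-processing: distance maps are symmetric, so average
# each sampled map with its transpose.
def symmetrize(x):
    return 0.5 * (x + x.transpose(-1, -2))

distograms = symmetrize(sampled_images)  # same shape, now symmetric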

Or, if you simply want to pass in a folder name and the desired image dimensions, you can use the Trainer class to easily train a model.

from ddpm_proteins import Unet, GaussianDiffusion, Trainer

model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
).cuda()

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,   # number of steps
    loss_type = 'l1'    # L1 or L2
).cuda()

trainer = Trainer(
    diffusion,
    'path/to/your/images',
    train_batch_size = 32,
    train_lr = 2e-5,
    train_num_steps = 700000,         # total training steps
    gradient_accumulate_every = 2,    # gradient accumulation steps
    ema_decay = 0.995,                # exponential moving average decay
    fp16 = True                       # turn on mixed precision training with apex
)

trainer.train()

Samples and model checkpoints will be logged to ./results periodically.
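
To resume from one of those checkpoints, the upstream denoising-diffusion-pytorch Trainer exposes a milestone-based load method; assuming this repo's Trainer mirrors it, resuming looks like:

trainer.load(2)   # loads ./results/model-2.pt (milestone number assumed)
trainer.train()   # continue training from that checkpoint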

Citations

@misc{ho2020denoising,
    title   = {Denoising Diffusion Probabilistic Models},
    author  = {Jonathan Ho and Ajay Jain and Pieter Abbeel},
    year    = {2020},
    eprint  = {2006.11239},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}
@inproceedings{anonymous2021improved,
    title   = {Improved Denoising Diffusion Probabilistic Models},
    author  = {Anonymous},
    booktitle = {Submitted to International Conference on Learning Representations},
    year    = {2021},
    url     = {https://openreview.net/forum?id=-NEXDKk8gZ},
    note    = {under review}
}
@article{Rao2021.02.12.430858,
    author  = {Rao, Roshan and Liu, Jason and Verkuil, Robert and Meier, Joshua and Canny, John F. and Abbeel, Pieter and Sercu, Tom and Rives, Alexander},
    title   = {MSA Transformer},
    year    = {2021},
    publisher = {Cold Spring Harbor Laboratory},
    url     = {https://www.biorxiv.org/content/early/2021/02/13/2021.02.12.430858},
    journal = {bioRxiv}
}