
tnwei / vqgan-clip-app

License: MIT
Local image generation using VQGAN-CLIP or CLIP guided diffusion

Programming Languages

Python, HTML, Shell

Projects that are alternatives of or similar to vqgan-clip-app

S2ML-Generators
Multiple notebooks which allow the use of various machine learning methods to generate or modify multimedia content
Stars: ✭ 172 (+82.98%)
Mutual labels:  generative-art, vqgan-clip
awesome-generative-deep-art
A curated list of generative deep learning tools, works, models, etc. for artistic uses
Stars: ✭ 172 (+82.98%)
Mutual labels:  generative-art, text2image
TargetCLIP
Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.
Stars: ✭ 158 (+68.09%)
Mutual labels:  image-generation, clip
VQGAN-CLIP-Docker
Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized
Stars: ✭ 58 (-38.3%)
Mutual labels:  generative-art, text2image
worlds
Building Virtual Reality Worlds using Three.js
Stars: ✭ 23 (-75.53%)
Mutual labels:  generative-art
gespensterfelder
A small generative system in clojurescript and Three.js.
Stars: ✭ 57 (-39.36%)
Mutual labels:  generative-art
VQGAN-CLIP
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
Stars: ✭ 2,369 (+2420.21%)
Mutual labels:  text2image
soft-intro-vae-pytorch
[CVPR 2021 Oral] Official PyTorch implementation of Soft-IntroVAE from the paper "Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders"
Stars: ✭ 170 (+80.85%)
Mutual labels:  image-generation
aRtsy
An R package for making generative art using 'ggplot2'.
Stars: ✭ 142 (+51.06%)
Mutual labels:  generative-art
CycleGAN-Models
Models generated by CycleGAN
Stars: ✭ 42 (-55.32%)
Mutual labels:  image-generation
stamps
A language for producing art
Stars: ✭ 116 (+23.4%)
Mutual labels:  generative-art
BeatDrop
BeatDrop Music Visualizer
Stars: ✭ 54 (-42.55%)
Mutual labels:  generative-art
project-code-py
Leetcode using AI
Stars: ✭ 100 (+6.38%)
Mutual labels:  streamlit
universum-contracts
text-to-image generation gems / libraries incl. moonbirds, cyberpunks, coolcats, shiba inu doge, nouns & more
Stars: ✭ 17 (-81.91%)
Mutual labels:  image-generation
generativepy
Library for creating generative art and maths animations
Stars: ✭ 70 (-25.53%)
Mutual labels:  generative-art
Generative-Art
A selection of generative art scripts written in Python
Stars: ✭ 284 (+202.13%)
Mutual labels:  generative-art
GAN-XML-Fixer
No description or website provided.
Stars: ✭ 55 (-41.49%)
Mutual labels:  generative-art
ezancestry
Easy genetic ancestry predictions in Python
Stars: ✭ 38 (-59.57%)
Mutual labels:  streamlit
streamlit-project
This repository provides a simple deployment-ready project layout for a Streamlit app. Simply swap out the code in `app.py` for your own and hit deploy!
Stars: ✭ 33 (-64.89%)
Mutual labels:  streamlit
JRubyArt
JRubyArt a ruby implementation of processing
Stars: ✭ 87 (-7.45%)
Mutual labels:  generative-art

VQGAN-CLIP web app & CLIP guided diffusion web app


Link to repo: tnwei/vqgan-clip-app.

Intro to VQGAN-CLIP

VQGAN-CLIP has been in vogue for generating art using deep learning; searching the r/deepdream subreddit for VQGAN-CLIP yields quite a number of results. In short, VQGAN can generate high-fidelity images, while CLIP can judge how well an image matches a text description. Combined, VQGAN-CLIP takes a text prompt from human input and iteratively updates the generated image until it fits the prompt.
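
Under the hood this is essentially an optimization loop: decode a latent with VQGAN into an image, ask CLIP how well that image matches the prompt, and backpropagate to nudge the latent toward a better match. The sketch below is a minimal illustration of that loop only; ToyGenerator and ToyCLIP are toy stand-ins, not the actual VQGAN and CLIP models this app uses.

# Minimal sketch of the VQGAN-CLIP optimization loop (illustrative, not the repo's code).
# ToyGenerator and ToyCLIP stand in for the real VQGAN decoder and CLIP encoders;
# the point is the structure: optimize a latent so the decoded image's CLIP embedding
# moves closer to the prompt's CLIP embedding.
import torch

class ToyGenerator(torch.nn.Module):  # stand-in for the VQGAN decoder
    def __init__(self, latent_dim=64, img_pixels=3 * 32 * 32):
        super().__init__()
        self.decode = torch.nn.Linear(latent_dim, img_pixels)

    def forward(self, z):
        return torch.tanh(self.decode(z))  # "image" as a flat pixel vector

class ToyCLIP(torch.nn.Module):  # stand-in for CLIP's image encoder
    def __init__(self, img_pixels=3 * 32 * 32, embed_dim=128):
        super().__init__()
        self.encode = torch.nn.Linear(img_pixels, embed_dim)

    def forward(self, img):
        return torch.nn.functional.normalize(self.encode(img), dim=-1)

generator, clip_model = ToyGenerator(), ToyCLIP()
# Pretend CLIP text embedding of the prompt (randomly generated here)
text_embedding = torch.nn.functional.normalize(torch.randn(128), dim=-1)

z = torch.randn(64, requires_grad=True)  # latent to optimize
optimizer = torch.optim.Adam([z], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    image = generator(z)
    image_embedding = clip_model(image)
    loss = 1 - torch.dot(image_embedding, text_embedding)  # maximize cosine similarity
    loss.backward()
    optimizer.step()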

Thanks to the generosity of creators sharing notebooks on Google Colab, the VQGAN-CLIP technique has seen widespread circulation. However, for regular usage across multiple sessions, I prefer a local setup that can be started up rapidly. Thus, this simple Streamlit app for generating VQGAN-CLIP images in a local environment. Screenshot of the UI below:

Screenshot of the UI

Be advised that you need a beefy GPU with lots of VRAM to generate images large enough to be interesting (hello Quadro owners!). For reference, an RTX 2060 can barely manage a 300x300 image. Otherwise, you are best served using the notebooks on Colab.
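
If you are unsure how much VRAM your card has, a quick check with PyTorch (which the app already depends on) is sketched below; this is just a convenience snippet, not part of the repo.

# Quick VRAM check using PyTorch (illustrative convenience snippet).
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024 ** 3:.1f} GB VRAM")
else:
    print("No CUDA-capable GPU detected; consider running on Colab instead.")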

The reference implementation is this Colab notebook originally by Katherine Crowson. The notebook can also be found in this repo hosted by EleutherAI.

Intro to CLIP guided diffusion

In mid-2021, OpenAI released Diffusion Models Beat GANs on Image Synthesis, with the corresponding source code and model checkpoints released on GitHub. The cadre of people that brought us VQGAN-CLIP worked their magic and shared CLIP guided diffusion notebooks for public use. CLIP guided diffusion uses more GPU VRAM, runs slower, and has fixed output sizes determined by the trained model checkpoints, but it is capable of producing more breathtaking images.
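
Conceptually, CLIP guidance works much like classifier guidance: at every reverse-diffusion step, the sample is nudged by the gradient of the CLIP similarity between the current image and the text prompt. The sketch below is a heavily simplified illustration using toy stand-ins (toy_denoise_step, toy_clip_similarity); it is not the sampler used in the notebooks.

# Heavily simplified sketch of CLIP-guided diffusion sampling (illustrative only).
# Each reverse step denoises the sample, then nudges it along the gradient of the
# CLIP similarity between the current image and the text prompt.
import torch

def toy_denoise_step(x, t):
    # Stand-in for one reverse-diffusion step of a pretrained diffusion model
    return x * 0.98 + 0.01 * torch.randn_like(x)

def toy_clip_similarity(image, text_embedding):
    # Stand-in for CLIP: cosine similarity between an "image embedding" and the prompt
    image_embedding = torch.nn.functional.normalize(image.flatten(), dim=-1)
    return torch.dot(image_embedding, text_embedding)

guidance_scale = 5.0  # how strongly CLIP steers the sample
text_embedding = torch.nn.functional.normalize(torch.randn(3 * 16 * 16), dim=-1)

x = torch.randn(3, 16, 16)  # start from pure noise
for t in reversed(range(50)):
    x = x.detach().requires_grad_(True)
    similarity = toy_clip_similarity(x, text_embedding)
    clip_grad = torch.autograd.grad(similarity, x)[0]
    with torch.no_grad():
        x = toy_denoise_step(x, t) + guidance_scale * 0.01 * clip_grad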

Here are a few examples using the prompt "Flowery fragrance intertwined with the freshness of the ocean breeze by Greg Rutkowski", run on the 512x512 HQ Uncond model:

Example output for CLIP guided diffusion

The implementation of CLIP guided diffusion in this repo is based on notebooks from the same EleutherAI/vqgan-clip repo.

Setup

  1. Install the required Python libraries. Using conda, run conda env create -f environment.yml
  2. Git clone this repo. After that, cd into the repo and run:
    • git clone https://github.com/CompVis/taming-transformers (Update to pip install if either of these two PRs is merged)
    • git clone https://github.com/crowsonkb/guided-diffusion (Update to pip install if this PR is merged)
  3. Download the pretrained weights and config files using the links provided in the files listed below. Note that all of the links are commented out by default. It is recommended to download them one by one, as some of the downloads can take a while. A quick way to verify the setup is sketched after this list.
    • For VQGAN-CLIP: download-weights.sh. You'll want to at least have both the ImageNet weights, which are used in the reference notebook.
    • For CLIP guided diffusion: download-diffusion-weights.sh.
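
As a quick sanity check after the steps above, something along these lines (run from the repo root) can confirm the cloned repos are in place; extend the list with whichever weight and config files you downloaded. The snippet is illustrative and not part of the repo.

# Hypothetical post-setup check: verify the cloned repos exist in the repo root.
# Add the paths of the weight/config files you downloaded to check those too.
from pathlib import Path

expected = ["taming-transformers", "guided-diffusion"]

for item in expected:
    status = "ok" if Path(item).exists() else "MISSING"
    print(f"{status:7s} {item}")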

Usage

  • VQGAN-CLIP: streamlit run app.py, launches web app on localhost:8501 if available
  • CLIP guided diffusion: streamlit run diffusion_app.py, launches web app on localhost:8501 if available
  • Image gallery: python gallery.py, launches a gallery viewer on localhost:5000. More on this below.

In the web app, select settings in the sidebar, key in the text prompt, and click Run to generate images using VQGAN-CLIP. When done, the web app will display the output image as well as a video compilation showing the progression of image generation. You can save them directly through the browser's right-click menu.

A one-time download of additional pre-trained weights will occur before generating the first image. This might take a few minutes depending on your internet connection.

If you have multiple GPUs, specify the GPU you want to use by adding -- --gpu X. An extra double dash is required to bypass Streamlit argument parsing. Example commands:

# Use 2nd GPU
streamlit run app.py -- --gpu 1

# Use 3rd GPU
streamlit run diffusion_app.py -- --gpu 2

See: tips and tricks

Output and gallery viewer

Each run's metadata and output are saved to the output/ directory, organized into subfolders named with the timestamp when the run was launched plus a unique run ID. Example output dir:

$ tree output
├── 20210920T232927-vTf6Aot6
│   ├── anim.mp4
│   ├── details.json
│   └── output.PNG
└── 20210920T232935-9TJ9YusD
    ├── anim.mp4
    ├── details.json
    └── output.PNG

The gallery viewer reads from output/ and visualizes previous runs together with saved metadata.

Screenshot of the gallery viewer
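
As a rough illustration of what the gallery viewer works from, the sketch below scans output/ and prints each run folder alongside whatever metadata its details.json holds; the exact keys inside details.json are not assumed here, and the snippet is not part of the repo.

# Illustrative scan of output/, similar in spirit to what a gallery viewer reads.
# Each run folder holds anim.mp4, output.PNG and details.json with run metadata.
import json
from pathlib import Path

output_dir = Path("output")
if output_dir.is_dir():
    for run_dir in sorted(p for p in output_dir.iterdir() if p.is_dir()):
        details_path = run_dir / "details.json"
        if not details_path.is_file():
            continue
        details = json.loads(details_path.read_text())
        print(run_dir.name)
        for key, value in details.items():
            print(f"  {key}: {value}")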

If the details are too much, call python gallery.py --kiosk instead to only show the images and their prompts.
