Partial Convolutions for Image Inpainting using Keras

Keras implementation of "Image Inpainting for Irregular Holes Using Partial Convolutions", https://arxiv.org/abs/1804.07723. A huge shoutout to the authors Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao and Bryan Catanzaro from NVIDIA for releasing this awesome paper; it's been a great learning experience for me to implement the architecture, the partial convolutional layer, and the loss functions.

Dependencies

  • Python 3.6
  • Keras 2.2.4
  • Tensorflow 1.12

How to use this repository

The easiest way to try a few predictions with this algorithm is to go to www.fixmyphoto.ai, where I've deployed it as a serverless React application with AWS Lambda functions handling inference.

If you want to dig into the code, the new PConv2D Keras layer and the UNet-like architecture built from these partial convolutional layers live in libs/pconv_layer.py and libs/pconv_model.py, respectively; this is where the bulk of the implementation can be found. Beyond this I've set up five jupyter notebooks, which detail the steps I went through while implementing the network, namely:

Step 1: Creating random irregular masks
Step 2: Implementing and testing the implementation of the PConv2D layer
Step 3: Implementing and testing the UNet architecture with PConv2D layers
Step 4: Training & testing the final architecture on ImageNet
Step 5: Simplistic attempt at predicting arbitrary image sizes through image chunking

Pre-trained weights

I've ported the VGG16 weights from PyTorch to Keras, which means the same 1/255 pixel scaling used in PyTorch can be applied before feeding images to the VGG16 network.
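
As an illustration, here's a minimal sketch of how a VGG16 feature extractor for the perceptual and style losses could be set up with these weights; whether the converted .h5 file matches this exact topology is an assumption, and the layer names come from keras.applications:

import numpy as np
from keras.applications import VGG16
from keras.models import Model

# Build VGG16 without the classifier head and load the converted weights
# (path matches the CLI example below; exact topology match is assumed)
vgg = VGG16(weights=None, include_top=False)
vgg.load_weights('./data/logs/pytorch_to_keras_vgg16.h5')

# Expose pool1, pool2 and pool3 outputs for the perceptual/style losses
outputs = [vgg.get_layer(name).output
           for name in ('block1_pool', 'block2_pool', 'block3_pool')]
feature_extractor = Model(inputs=vgg.input, outputs=outputs)

# Inputs are expected in [0, 1], i.e. raw pixels scaled by 1/255
features = feature_extractor.predict(np.random.rand(1, 512, 512, 3))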

Training on your own dataset

You can either go directly to the Step 4 notebook, or alternatively use the CLI (make sure to download the converted VGG16 weights first):

python main.py \
    --name MyDataset \
    --train TRAINING_PATH \
    --validation VALIDATION_PATH \
    --test TEST_PATH \
    --vgg_path './data/logs/pytorch_to_keras_vgg16.h5'

Implementation details

Full details of the implementation are in the paper itself; I'll try to summarize the key points here.

Mask Creation

In the paper, random irregular masks are created using a technique based on occlusion/dis-occlusion between two consecutive frames of videos. Instead, I've opted for a simple mask-generator function that uses OpenCV to draw random irregular shapes, which I then use as masks. Plugging in a different mask-generation technique later should not be a problem, and I think the end results are pretty decent with this method as well.
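
For illustration, here's a minimal sketch of such a mask generator; the actual implementation lives in the Step 1 notebook, and the helper name here is hypothetical:

import numpy as np
import cv2

def random_mask(height, width, channels=3):
    """Sketch of an irregular mask generator (hypothetical helper).
    Returns an array where 1 = valid pixel and 0 = hole."""
    mask = np.zeros((height, width, channels), np.uint8)
    size = max(4, int((height + width) * 0.03))

    # Random lines
    for _ in range(np.random.randint(1, 10)):
        x1, x2 = np.random.randint(0, width, 2).tolist()
        y1, y2 = np.random.randint(0, height, 2).tolist()
        cv2.line(mask, (x1, y1), (x2, y2), (1, 1, 1), np.random.randint(3, size))

    # Random filled circles
    for _ in range(np.random.randint(1, 10)):
        x, y = int(np.random.randint(0, width)), int(np.random.randint(0, height))
        cv2.circle(mask, (x, y), np.random.randint(3, size), (1, 1, 1), -1)

    # Random ellipse outlines
    for _ in range(np.random.randint(1, 10)):
        x, y = int(np.random.randint(0, width)), int(np.random.randint(0, height))
        axes = (int(np.random.randint(3, width)), int(np.random.randint(3, height)))
        a1, a2, a3 = np.random.randint(3, 180, 3).tolist()
        cv2.ellipse(mask, (x, y), axes, a1, a2, a3, (1, 1, 1), np.random.randint(3, size))

    return 1 - mask  # invert so that holes are 0 and valid pixels are 1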

Partial Convolution Layer

A key element in this implementation is the partial convolutional layer. Given the convolutional filter W and the corresponding bias b, the following partial convolution is applied instead of a normal convolution:

    x' = Wᵀ (X ⊙ M) · sum(1)/sum(M) + b,   if sum(M) > 0
    x' = 0,                                otherwise

where ⊙ is element-wise multiplication, M is a binary mask of 0s and 1s, X is the window of input values, and sum(1) is the number of elements in the window. Importantly, after each partial convolution the mask is also updated: if the convolution was able to condition its output on at least one valid input, then the mask is removed at that location, i.e.

    m' = 1,   if sum(M) > 0
    m' = 0,   otherwise

The result of this is that, with a sufficiently deep network, the mask will eventually be all ones (i.e. disappear).
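
As a minimal NumPy sketch (illustrative only, not the actual PConv2D layer from libs/pconv_layer.py), the computation at a single output location looks like this:

import numpy as np

def partial_conv_pixel(X, M, W, b):
    """One output pixel of a partial convolution (NumPy sketch).
    X: input patch, M: binary mask patch (same shape), W: filter, b: bias."""
    valid = M.sum()
    if valid > 0:
        # Scale by sum(1)/sum(M) to compensate for the masked-out inputs
        x_out = np.sum(W * (X * M)) * (M.size / valid) + b
        m_out = 1.0  # at least one valid input: unmask this location
    else:
        x_out = 0.0
        m_out = 0.0  # window saw only holes: keep it masked
    return x_out, m_out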

UNet Architecture

Specific details of the architecture can be found in the paper, but essentially it's based on a UNet-like structure, where all normal convolutional layers are replaced with partial convolutional layers, so that in all cases the image is passed through the network alongside the mask. A sketch of one encoder block is shown below.
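
For intuition, here's how one encoder block might be wired, assuming the repo's PConv2D layer takes [image, mask] as input and returns both (see libs/pconv_model.py for the real implementation):

from keras.layers import Activation, BatchNormalization
from libs.pconv_layer import PConv2D

def encoder_block(img_in, mask_in, filters, kernel_size, bn=True):
    """Sketch of one encoder step: image and mask travel through together."""
    conv, mask = PConv2D(filters, kernel_size, strides=2, padding='same')([img_in, mask_in])
    if bn:
        conv = BatchNormalization()(conv)
    conv = Activation('relu')(conv)
    return conv, mask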

Loss Function(s)

The loss function used in the paper is kinda intense, and is best reviewed in the paper itself. In short it includes:

  • Per-pixel losses, both for masked and un-masked regions
  • Perceptual loss based on ImageNet pre-trained VGG-16 (pool1, pool2 and pool3 layers)
  • Style loss on VGG-16 features, both for the predicted image and for the composited image (non-hole pixels set to ground truth)
  • Total variation loss for a 1-pixel dilation of the hole region

The weighting of all these loss terms is as follows (from the paper):

    L_total = L_valid + 6·L_hole + 0.05·L_perceptual + 120·(L_style_out + L_style_comp) + 0.1·L_tv
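
In code, combining the terms is just the weighted sum above; here's a sketch that assumes each individual term has already been computed:

def total_loss(l_valid, l_hole, l_perceptual, l_style_out, l_style_comp, l_tv):
    """Weighted sum of the loss terms, with the weights from the paper."""
    return (1.0 * l_valid
            + 6.0 * l_hole
            + 0.05 * l_perceptual
            + 120.0 * (l_style_out + l_style_comp)
            + 0.1 * l_tv)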

Training Procedure

The network was trained on ImageNet with a batch size of 1, and each epoch was specified to be 10,000 batches long. Training was furthermore performed using the Adam optimizer in two stages, since batch normalization presents an issue for the masked convolutions (the mean and variance are calculated over hole pixels as well).

Stage 1: Learning rate of 0.0001 for 50 epochs, with batch normalization enabled in all layers.

Stage 2: Learning rate of 0.00005 for 50 epochs, with batch normalization disabled in all encoding layers.
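
A sketch of how the stage switch could be implemented (the helper and the layer naming are assumptions, not the repo's exact API):

from keras.optimizers import Adam
from keras.layers import BatchNormalization

def compile_for_stage(model, loss_fn, stage):
    """Stage 1: lr 0.0001 with BN everywhere. Stage 2: lr 0.00005 with BN
    frozen in the encoder, so hole pixels no longer corrupt its statistics."""
    lr = 0.0001 if stage == 1 else 0.00005
    if stage == 2:
        for layer in model.layers:
            # assumes encoder BN layers can be identified by name
            if isinstance(layer, BatchNormalization) and 'enc' in layer.name:
                layer.trainable = False
    model.compile(optimizer=Adam(lr=lr), loss=loss_fn)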

Training time for the images shown was extremely long, but that is likely due to my modest personal setup. The few tests I've run on a 1080Ti (with a batch size of 4) indicate that the full training could take around 10 days, as specified in the paper.
