sayakpaul / Learnable-Image-Resizing

License: Apache-2.0

TF 2 implementation of Learning to Resize Images for Computer Vision Tasks (https://arxiv.org/abs/2103.09950v1).

Learnable-Image-Resizing

TensorFlow 2 implementation of Learning to Resize Images for Computer Vision Tasks by Talebi et al.

Accompanying blog post on keras.io: Learning to Resize in Computer Vision.

The paper proposes a simple framework for optimally learning representations for a given network architecture and a given image resolution (such as 224x224). The authors find that representations that are more coherent with the human perception system do not always improve the performance of vision models. Instead, optimizing the representations so that they are better suited to the models can substantially improve performance.

The diagram presents the proposed learnable resizer module (source: original paper):

Here's what the resized images look like after being passed through a learned resizer:

On the left-hand side, we see the outputs of an untrained learnable resizer. On the right, the outputs are from the same learnable resizer after 10 epochs of training. The images may not make sense to our eyes in terms of perceptual quality, but they help improve the recognition performance of vision models.

About the notebooks

Standard_Training.ipynb: Shows how to train a DenseNet-121 on the Cats and Dogs dataset with bilinear resizing (150 x 150).
Learnable_Resizer.ipynb: Shows how to train the same network with the learnable resizing module included. Here, the inputs are first resized to 300 x 300 and then the learnable resizer module helps learn optimal representations for 150 x 150.

Both notebooks incorporate mixed-precision training along with distributed training.
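
To make the bilinear baseline concrete, here is a dependency-free sketch of bilinear downsampling on a tiny grayscale image. It is not the code used in the notebooks (those rely on TensorFlow); the half-pixel-centre sampling convention is an assumption of this sketch. The helper resize_with_residual and its residual_fn argument are hypothetical names that illustrate one way to view a learnable resizer, namely as a plain bilinear resize plus a learned correction, assuming the module keeps a bilinear skip path as described in the paper.

```python
def bilinear_resize(image, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) with bilinear interpolation.

    Assumption: output pixel centres are mapped back with half-pixel offsets.
    """
    in_h, in_w = len(image), len(image[0])
    scale_y, scale_x = in_h / out_h, in_w / out_w
    out = []
    for i in range(out_h):
        # Map the output pixel centre into input coordinates and clamp.
        y = min(max((i + 0.5) * scale_y - 0.5, 0.0), in_h - 1)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = min(max((j + 0.5) * scale_x - 0.5, 0.0), in_w - 1)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def resize_with_residual(image, out_h, out_w, residual_fn):
    """Hypothetical helper: bilinear resize plus a correction from residual_fn.

    residual_fn stands in for the trained convolutional branch of a resizer.
    """
    base = bilinear_resize(image, out_h, out_w)
    correction = residual_fn(image, out_h, out_w)
    return [
        [b + c for b, c in zip(base_row, corr_row)]
        for base_row, corr_row in zip(base, correction)
    ]


def zero_residual(image, out_h, out_w):
    """Stand-in for a branch that contributes nothing."""
    return [[0.0] * out_w for _ in range(out_h)]


# A 4 x 4 ramp image downsampled to 2 x 2.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
small = bilinear_resize(image, 2, 2)
print(small)  # [[2.5, 4.5], [10.5, 12.5]]
```

With zero_residual, resize_with_residual returns exactly the bilinear output, which is the sense in which the learned branch only has to model a correction on top of the baseline.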

Results

Model	Number of parameters (Million)	Top-1 accuracy
With learnable resizer	7.051717	67.67%
Without learnable resizer	7.039554	60.19%

Both models were trained for only 10 epochs from the same initial checkpoint.
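
The cost and benefit implied by the table can be worked out directly. The short calculation below uses only the figures reported above; the variable names are illustrative.

```python
# Figures copied from the results table.
params_with = 7.051717     # millions of parameters, with learnable resizer
params_without = 7.039554  # millions of parameters, without learnable resizer
acc_with = 67.67           # top-1 accuracy in percent, with learnable resizer
acc_without = 60.19        # top-1 accuracy in percent, without learnable resizer

# Absolute and relative parameter overhead of the resizer module.
extra_params = round((params_with - params_without) * 1_000_000)
relative_overhead = (params_with - params_without) / params_without * 100

# Accuracy difference in percentage points.
accuracy_gain = acc_with - acc_without

print(f"Extra parameters: {extra_params:,}")
print(f"Parameter overhead: {relative_overhead:.2f}%")
print(f"Top-1 gain: {accuracy_gain:.2f} percentage points")
```

In other words, the resizer adds roughly 12 thousand parameters (under 0.2% of the model) while the reported top-1 accuracy rises by 7.48 percentage points in this 10-epoch setup.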

You can reproduce these results with the model weights provided here.

Paper citation

@InProceedings{Talebi_2021_ICCV,
    author    = {Talebi, Hossein and Milanfar, Peyman},
    title     = {Learning To Resize Images for Computer Vision Tasks},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {497-506}
}

Acknowledgements

ML-GDE program for providing GCP credit support.
Mark Doust (of Google) for feedback.
