vision4j / Vision4j Collection

Licence: MIT

Vision4j collection

Collection of computer vision models, ready to be included in a JVM project. The idea is to maintain a list of implementations for different computer vision problems in a plug-and-play format.

Table of Contents

Problems

Classification

Given an image, find the category that the image belongs to. For example, if a model is trained to recognize the categories lion, cheetah and tiger, then for an image from one of those categories it predicts which one it is.

Input Output
[image] tiger
[image] cheetah
[image] lion

Classification is not the problem you were looking for? Go back to table of contents

How do we measure how good a model is?

Given a dataset, count the correctly classified examples and divide that count by the total number of examples in the dataset (classification accuracy).
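
As a minimal sketch of the accuracy computation described above, the following Java snippet compares predicted category indices with the expected ones. The class name, method name and the sample indices are illustrative assumptions and are not part of vision4j.

```java
// Minimal sketch: classification accuracy = correct predictions / total examples.
// AccuracySketch and its method are hypothetical names, not part of vision4j.
final class AccuracySketch {

    /** Returns the fraction of positions where predicted[i] equals expected[i]. */
    static double accuracy(int[] predicted, int[] expected) {
        if (predicted.length != expected.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == expected[i]) {
                correct++;
            }
        }
        return (double) correct / predicted.length;
    }

    public static void main(String[] args) {
        // Made-up category indices for four images; three of the four match.
        int[] predicted = {293, 291, 292, 293};
        int[] expected = {293, 291, 290, 293};
        System.out.println(accuracy(predicted, expected)); // prints 0.75
    }
}
```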

A list of the most important datasets with leaderboard links:

Dataset Leaderboard
ImageNet 2014 Classification + Localization challenge http://image-net.org/challenges/LSVRC/2014/results#clsloc

Implementations available for the classification problem:

Pretrained VGG16 on ImageNet using DeepLearning4j

Trained on the ImageNet dataset.

Paper: Very Deep Convolutional Networks for Large-Scale Image Recognition

Original repo: https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-zoo/src/main/java/org/deeplearning4j/zoo/model/VGG16.java

Implementation license: Apache 2.0

To use this implementation in your project, add the dependency:

<dependency>
    <groupId>com.vision4j</groupId>
    <artifactId>vgg16-deeplearning4j-classifier</artifactId>
    <version>1.3.0</version>
</dependency>

Not what you were looking for? Go back to table of contents

This implementation uses ND4J, so you need one additional dependency for the ND4J backend, chosen according to whether you have a GPU or not. You can read more about it here.

Once you have added the dependency and completed the necessary setup, you can use it like this:

ImageClassifier imageClassifier = new Vgg16DeepLearning4jClassifier();
Category category = imageClassifier.predict(new File("./cheetah.jpg"));
String name = category.getCategoryName(); // cheetah
int index = category.getIndex(); // 293

Minimum required memory for the model: 1.355 GB

Prediction times (in seconds):

Image size 1080Ti K80 CPU (AMD Ryzen)
224x224 0.070 TODO 0.730

GRPC classifier

Delegates to a classifier implemented in another language through a GRPC call.

To use this implementation in your project, add the dependency:

<dependency>
    <groupId>com.vision4j</groupId>
    <artifactId>grpc-classifier</artifactId>
    <version>1.3.1</version>
</dependency>

This implementation requires a running GRPC server that hosts the classifier. You can use any C++, Python or Lua model on the server side. By default, the client communicates over localhost on port 50051, and it is usually faster than the corresponding DeepLearning4j implementation. You can read more about GRPC here. This implementation can be combined with any of the following models:

Keras VGG16 classification

Keras implementation of VGG16, pretrained on ImageNet.

Paper: Very Deep Convolutional Networks for Large-Scale Image Recognition

Original repo: https://github.com/keras-team/keras-applications/blob/master/keras_applications/vgg16.py

Implementation license: MIT

If you have a GPU:

nvidia-docker run -it -p 50051:50051 vision4j/grpc-keras-vgg16-classification:gpu

If you have only a CPU:

docker run -it -p 50051:50051 vision4j/grpc-keras-vgg16-classification

Minimum required memory for the model: TODO

Prediction times (in seconds):

Image size 1080Ti K80 CPU (AMD Ryzen)
TODO TODO TODO TODO

Once you have added the dependency and started the external model, you can use it like this:

ImageClassifier imageClassifier = new GrpcImageNetClassification();
Category category = imageClassifier.predict(new File("./cheetah.jpg"));
String name = category.getCategoryName(); // cheetah
int index = category.getIndex(); // 293

The memory requirements and the prediction times depend on the model that is being delegated to.
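
Because the client expects a server on localhost port 50051 by default, it can help to check that something is listening there before constructing the classifier. The sketch below uses only the JDK socket API; the class name, method name and timeout value are illustrative assumptions, and the check only confirms that a TCP port is open, not that a vision4j model is behind it.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Minimal sketch of a pre-flight check; PortCheckSketch is a hypothetical helper,
// not part of vision4j or of any GRPC library.
final class PortCheckSketch {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing reachable on that port.
            return false;
        }
    }

    public static void main(String[] args) {
        // 50051 is the default classification port mentioned in this document.
        if (isListening("localhost", 50051, 1000)) {
            System.out.println("Something is listening on localhost:50051");
        } else {
            System.out.println("Nothing is listening on localhost:50051; start the Docker container first");
        }
    }
}
```

The same check applies to the other default ports in this document by changing the port number.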

Segmentation

Given an image, predict for each pixel what it is. For example, if a model is trained to recognize person, table and bottle, then for a given image it can predict the boundaries of every category of objects. This per-pixel labelling is usually called semantic segmentation; instance segmentation additionally separates individual objects of the same category.

Input Output
[image] [image]

Segmentation is not the problem you were looking for? Go back to table of contents

How do we measure how good a model is?

Given a dataset, find the average IoU (Intersection over Union) across all images. IoU is calculated by taking the intersection between the ground truth and the prediction and dividing it by the union of the ground truth and the prediction.
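
For a single category, the IoU described above can be sketched on binary masks as follows. The mask representation (a boolean array per image), the class name and the handling of two empty masks are assumptions made for illustration; vision4j's own result types are not used here.

```java
// Minimal sketch: IoU on binary masks and its mean over a dataset.
// MaskIouSketch is a hypothetical helper, not part of vision4j.
final class MaskIouSketch {

    /** IoU of two binary masks that have the same dimensions. */
    static double iou(boolean[][] groundTruth, boolean[][] prediction) {
        int intersection = 0;
        int union = 0;
        for (int y = 0; y < groundTruth.length; y++) {
            for (int x = 0; x < groundTruth[y].length; x++) {
                boolean g = groundTruth[y][x];
                boolean p = prediction[y][x];
                if (g && p) intersection++;
                if (g || p) union++;
            }
        }
        // Convention chosen here: two empty masks count as a perfect match.
        return union == 0 ? 1.0 : (double) intersection / union;
    }

    /** Average IoU over all images of a dataset. */
    static double meanIou(boolean[][][] groundTruths, boolean[][][] predictions) {
        double sum = 0.0;
        for (int i = 0; i < groundTruths.length; i++) {
            sum += iou(groundTruths[i], predictions[i]);
        }
        return sum / groundTruths.length;
    }

    public static void main(String[] args) {
        boolean[][] groundTruth = {{true, true}, {false, false}};
        boolean[][] prediction = {{true, false}, {true, false}};
        // One pixel is shared and three pixels are covered in total, so IoU is 1/3.
        System.out.println(iou(groundTruth, prediction));
    }
}
```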

A list of the most important datasets with leaderboard links:

Dataset Leaderboard
Coco dataset http://cocodataset.org/#detection-leaderboard
Pascal VOC 2012 http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6

Implementations available for the segmentation problem:

GRPC segmentation

Delegates to a segmentation model implemented in another language through a GRPC call.

To use this implementation in your project, add the dependency:

<dependency>
    <groupId>com.vision4j</groupId>
    <artifactId>grpc-segmentation</artifactId>
    <version>1.0.1</version>
</dependency>

This implementation requires a running GRPC segmentation server. You can use any C++, Python or Lua model on the server side. By default, the client communicates over localhost on port 50052, and it is usually faster than the corresponding DeepLearning4j implementation. You can read more about GRPC here. This implementation can be combined with any of the following models:

DeepLabV3 Pascal VOC segmentation

TensorFlow implementation of DeepLabV3. The implementation is provided and maintained by Google.

Paper: Rethinking Atrous Convolution for Semantic Image Segmentation

Original repo: https://github.com/tensorflow/models/tree/master/research/deeplab

Implementation license: MIT

If you have a GPU:

nvidia-docker run -it -p 50052:50052 vision4j/deeplabv3-pascal-voc-segmentation:gpu

If you have only a CPU:

docker run -it -p 50052:50052 vision4j/deeplabv3-pascal-voc-segmentation

Minimum required memory for the model: TODO

Prediction times (in seconds):

Image size 1080Ti K80 CPU (AMD Ryzen)
TODO TODO TODO TODO

Mask R-CNN pretrained on Coco dataset

Mask R-CNN used for semantic segmentation.

Paper: Mask R-CNN

Original repo: https://github.com/matterport/Mask_RCNN

Implementation license: MIT

If you have a GPU:

nvidia-docker run -it -p 50052:50052 vision4j/mask-rcnn-segmentation:gpu

If you have only a CPU:

docker run -it -p 50052:50052 vision4j/mask-rcnn-segmentation

Minimum required memory for the model: TODO

Prediction times (in seconds):

Image size 1080Ti K80 CPU (AMD Ryzen)
TODO TODO TODO TODO

Once you have added the dependency and started the external model, you can use it like this:

Segmentation seg = new PascalVOC2012GrpcSegmentation();
SegmentationResult res = seg.segment(new File("chess.jpg"));
BufferedImage resultImage = res.getBufferedImage();

The memory requirements and the prediction times depend on the model that is being delegated to.

Completion

Given an image and a missing region in that image, fill in the missing region so that it fits with the rest of the image.

Input Output
[image] [image] [image]

Completion is not the problem you were looking for? Go back to table of contents

How do we measure how good a model is?

Different metrics exist for exemplar-based inpainting and for deep-learning-based models. One possible evaluator is the Naturalness Image Quality Evaluator (NIQE); another option is to compare the completed image to the original image. A survey of different quality metrics is available in the paper "A critical survey of state-of-the-art image inpainting quality assessment metrics".
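
One simple way to compare a completed image to the original is the peak signal-to-noise ratio (PSNR). This metric is chosen here only as an illustration and is not named in the text above. The sketch assumes 8-bit RGB images of equal size, and the class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;

// Minimal sketch: PSNR between an original image and a completed image.
// PsnrSketch is a hypothetical helper, not part of vision4j.
final class PsnrSketch {

    /** PSNR in decibels over the R, G and B channels; infinite for identical images. */
    static double psnr(BufferedImage original, BufferedImage completed) {
        int width = original.getWidth();
        int height = original.getHeight();
        if (width != completed.getWidth() || height != completed.getHeight()) {
            throw new IllegalArgumentException("images must have the same dimensions");
        }
        long squaredErrorSum = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int a = original.getRGB(x, y);
                int b = completed.getRGB(x, y);
                // Extract red (shift 16), green (shift 8) and blue (shift 0).
                for (int shift = 16; shift >= 0; shift -= 8) {
                    int diff = ((a >> shift) & 0xFF) - ((b >> shift) & 0xFF);
                    squaredErrorSum += (long) diff * diff;
                }
            }
        }
        double mse = squaredErrorSum / (3.0 * width * height);
        if (mse == 0.0) {
            return Double.POSITIVE_INFINITY;
        }
        return 10.0 * Math.log10(255.0 * 255.0 / mse);
    }

    public static void main(String[] args) {
        // Two in-memory images stand in for the original and the completed result.
        BufferedImage original = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage completed = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        completed.setRGB(0, 0, 0x101010); // one slightly different pixel
        System.out.println(psnr(original, completed) + " dB");
    }
}
```

Higher values mean the completed image is closer to the original. A pixel-wise metric like this one requires the original image and does not measure how natural the result looks, which is what NIQE targets.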

A list of the most important datasets with leaderboard links:

Dataset Leaderboard

Implementations available for the completion problem:

GRPC completion

Delegates to a completion model implemented in another language through a GRPC call.

To use this implementation in your project, add the dependency:

<dependency>
    <groupId>com.vision4j</groupId>
    <artifactId>grpc-completion</artifactId>
    <version>1.0.1</version>
</dependency>

This implementation requires a running GRPC completion server. You can use any C++, Python or Lua model on the server side. By default, the client communicates over localhost on port 50053, and it is usually faster than the corresponding DeepLearning4j implementation. You can read more about GRPC here. This implementation can be combined with any of the following models:

Gimp Resynthesizer Plugin

Wrapper around the resynthesizer plugin for Gimp. It uses a variation of PatchMatch that is close to the one used in Content-Aware Fill in Adobe Photoshop.

Paper: PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing

Original repo: https://github.com/bootchk/resynthesizer

Implementation license: MIT

If you have a GPU:

nvidia-docker run -it -p 50053:50053 vision4j/gimp-completion:gpu

If you have only a CPU:

docker run -it -p 50053:50053 vision4j/gimp-completion

Minimum required memory for the model: TODO

Prediction times (in seconds):

Image size 1080Ti K80 CPU (AMD Ryzen)
TODO TODO TODO TODO

Once you have added the dependency and started the external model, you can use it like this:

Completion completion = new GimpGrpcCompletion();
CompletionResult res = completion.complete(new File("people.jpg"), new File("mask.jpg"));
BufferedImage resultImage = res.getBufferedImage();

The memory requirements and the prediction times depend on the model that is being delegated to.

Detection

Given an image, find the bounding boxes of objects from a given category (or from multiple categories). For example, if the model is trained to recognize the categories car and pedestrian, the output would be the coordinates of the bounding boxes together with their classes.

Input Output
[image] [image]
[image] [image]

Detection is not the problem you were looking for? Go back to table of contents

How do we measure how good a model is?

Given a dataset, find the average IoU (Intersection over Union) across all images. IoU is calculated by taking the intersection between the ground truth and the prediction and dividing it by the union of the ground truth and the prediction.
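
For bounding boxes, the intersection and union are areas of axis-aligned rectangles. The sketch below assumes that a box is given as {xMin, yMin, xMax, yMax}; that layout and the class name are illustrative assumptions and do not reflect how vision4j represents boxes.

```java
// Minimal sketch: IoU of two axis-aligned bounding boxes.
// BoxIouSketch is a hypothetical helper; boxes are {xMin, yMin, xMax, yMax}.
final class BoxIouSketch {

    static double iou(double[] a, double[] b) {
        // Width and height of the overlap, clamped to zero when the boxes do not intersect.
        double overlapWidth = Math.max(0.0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
        double overlapHeight = Math.max(0.0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
        double intersection = overlapWidth * overlapHeight;

        double areaA = (a[2] - a[0]) * (a[3] - a[1]);
        double areaB = (b[2] - b[0]) * (b[3] - b[1]);
        double union = areaA + areaB - intersection;

        // Degenerate boxes with zero total area are treated as not overlapping.
        return union <= 0.0 ? 0.0 : intersection / union;
    }

    public static void main(String[] args) {
        double[] groundTruth = {0, 0, 2, 2};
        double[] prediction = {1, 1, 3, 3};
        // Overlap area is 1 and union area is 4 + 4 - 1 = 7, so IoU is 1/7.
        System.out.println(iou(groundTruth, prediction));
    }
}
```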

A list of the most important datasets with leaderboard links:

Dataset Leaderboard
Coco dataset http://cocodataset.org/#detection-leaderboard
Pascal VOC 2012 http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4

Implementations available for the detection problem:

GRPC detection

Delegates to a detection model implemented in another language through a GRPC call.

To use this implementation in your project, add the dependency:

<dependency>
    <groupId>com.vision4j</groupId>
    <artifactId>grpc-detection</artifactId>
    <version>1.0.0</version>
</dependency>

This implementation requires a running GRPC server that hosts the detection model. You can use any C++, Python or Lua model on the server side. By default, the client communicates over localhost on port 50054, and it is usually faster than the corresponding DeepLearning4j implementation. You can read more about GRPC here. This implementation can be combined with any of the following models:

Mask R-CNN detection pretrained on Coco dataset

Mask R-CNN used for object detection.

Paper: Mask R-CNN

Original repo: https://github.com/matterport/Mask_RCNN

Implementation license: MIT

If you have a GPU:

nvidia-docker run -it -p 50054:50054 vision4j/mask-rcnn-detection:gpu

If you have only a CPU:

docker run -it -p 50054:50054 vision4j/mask-rcnn-detection

Minimum required memory for the model: TODO

Prediction times (in seconds):

Image size 1080Ti K80 CPU (AMD Ryzen)
TODO TODO TODO TODO

Once you have added the dependency and started the external model, you can use it like this:

Detection detection = new CocoDetection();
DetectionResult res = detection.detect(new File("./kites.jpg"));
// res is now a map from categories to lists of bounding boxes

The memory requirements and the prediction times depend on the model that is being delegated to.

Face detection

Given an image, find the bounding boxes of all faces present.

Input Output
[image] [image]

Face detection is not the problem you were looking for? Go back to table of contents

How do we measure how good a model is?

Given a dataset, find the average IoU (Intersection over Union) across all images. IoU is calculated by taking the intersection between the ground truth and the prediction and dividing it by the union of the ground truth and the prediction.

A list of the most important datasets with leaderboard links:

Dataset Leaderboard

Implementations available for the face detection problem:

GRPC face detection

Delegates to a face detection model implemented in another language through a GRPC call.

To use this implementation in your project, add the dependency:

<dependency>
    <groupId>com.vision4j</groupId>
    <artifactId>grpc-face-detection</artifactId>
    <version>1.0.0</version>
</dependency>

This implementation requires a running GRPC face detection server. You can use any C++, Python or Lua model on the server side. By default, the client communicates over localhost on port 50055, and it is usually faster than the corresponding DeepLearning4j implementation. You can read more about GRPC here. This implementation can be combined with any of the following models:

Dlib Face Recognition

Python face_recognition library, based on the dlib C++ library for face recognition.

If you have a GPU:

nvidia-docker run -it -p 50055:50055 vision4j/python-dlib-face-detection:gpu

If you have only a CPU:

docker run -it -p 50055:50055 vision4j/python-dlib-face-detection

Minimum required memory for the model: TODO

Prediction times (in seconds):

Image size 1080Ti K80 CPU (AMD Ryzen)
TODO TODO TODO TODO

Once you have added the dependency and started the external model, you can use it like this:

FaceDetection faceDetection = new GrpcFaceDetection();
FaceDetectionResult faceDetectionResult = faceDetection.detect("obama_biden.jpg");

The memory requirements and the prediction times depend on the model that is being delegated to.