Lego Brick Recognition

This project is inspired by Jacques Mattheij's blog entry: Sorting 2 Metric Tons of Lego who uses a Deep Learning approach to sort Lego bricks. Our first goal is to improve the software side, especially creating a generic dataset for the neural network training. Due to the high number of bricks (10000+) and the related similarities, we plan a meaningful combination of categories and single bricks as classification task.

We would be happy if you would contribute to this project! We could then provide all necessary resource files, e.g. real test images or trained models.

Dataset Generation
- Render a Single Brick
- Data Analysis (Brick Similarity) and Preprocessing
- Render Images
Classification
- Preliminaries
- Training
- Evaluation
- Inference
Requirements / Installation

Dataset Generation

In order to generate images from a single 3d model we use Blender's Python interface and the ImportLDraw module. All 3d models come from a collection of LDraw™ an open standard for LEGO CAD programs that allow the user to create virtual LEGO models. Make sure you have successfully installed the module and set the ldraw path correctly.

Render a Single Brick

Control of the blender script as stand-alone:
blender -b -P dataset/blender/render.py -- -i 9.dat --save ./ --images_per_brick 1 --config thumbnail-example.json
Hint for Mac users to access blender: /Applications/Blender/blender.app/Contents/MacOS/blender -b -P ...

Data Analysis (Brick Similarity) and Preprocessing

To view the data for the first time, we generate thumbnails for each 3d file with fixed lightning conditions, background, camera settings, etc.. Furthermore, we extract some metadata such as category and label. The category distribution of all available bricks is shown below.

To generate thumbnails i.e. for a the category 'Technic' run: python dataset/generate_thumbnails.py --dataset ./ldraw/parts/ --category Technic --output ./technic

It is known that some parts have both different identifiers and are very similar to each other. For instance, a part differs only by small imprint on the front. The question is how to deal with these two cases. For the first one it is essential to identify these visual equal parts that only differ in the part number. The second case can be ignored if all parts are to be distinguished. However, it can be useful to consider bricks with the same shape as one class, which reduces the number of classes and the associated classification complexity.

Based on the generated thumbnails, a brick similarity can be calculated by comparing two images using the structural similarity image metric (SSIM). By setting thresholds, we can identify identical parts as well as shape-similar parts. In future, we plan further experiments and improvements e.g. by measuring the shape similarity.

The 10 most similar bricks with the part id '6549' (included itself) of a subset of all 'Technic' parts:

Similarities to all other parts:

When the script is finished with creating thumbnails, a final CSV is written that contains lists of IDs for identical parts and shape-like parts.

id	label	category	thumbnail_rendered	identical	shape_identical
6549	Technic 4 Speed Shift Lever	Technic	True	6549	2699 3705c01 3737c01 6549 73485
2698c01	Technic Action Figure (Complete)	Technic	True	2698c01	13670 24316 2698c01 3705 370526 4211815 4519 6587 87083 99008 u9011
...	...	...	...	...	...

Render Images

render images 2000 for each id
image size: 224x224x3
brick augmentation:
- 38 official brick colors
- rotation: in x direction due to no standardized origin: 0, 90, 180, 270
- scaling: normalized one dimension and size variation via zooming factor
  [ ] surface texture and reflexion
setting background:
- random images from IndoorCVPR09 dataset vs. noise
- camera position: random on the surface on the upper half of a sphere
- exposure: random spot on the sphere surface, radius and energy fixed
  [ ] varying exposure

Script to render all images: python dataset/generate_dataset.py
TODO: Parameter via argparse

Classification

In a first experiment, we focus on a small subset in order to test the capability of the synthetically generated training material. For this reason we manually selected 15 classes and created a real-world test set. At this point, the knowledge of the identical and form-identical parts is irrelevant and will be considered searately in future.

Training

Common fastai image classification workflow as baseline
Fine-tuning ResNeXt50 (and ResNet34), 5 epochs, fit_one_cycle 1e-3, batch size 42 (128)

Evaluation

Evaluate on real-world testset!

created for 15 bricks. 20 images for each
crop and resize images manually
train object detection model with squared bounding boxes

Results not satisfactory to some extent! The Difference between the synthetic generated training set and real-world validation set is to high!
Possible Reasons (further investigation needed):

brick size differ
lighting conditions differ
resolution differ
background is relevant for generalization
viewing angle differ

Big TODOs

reduce generalization gap by improving rendering process
multi-instance training i.e. stack three images to prevent confusion in classification
object detection model for preprocessing: crop squared images from video stream
inference script for one-camera setting (object detection, ...) and multi-camera setting

Requirements / Installation

Blender 2.79 + ImportLDraw
conda env create -f environment.yml

Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

jtheiner / LegoBrickClassification

Programming Languages

Labels

Projects that are alternatives of or similar to LegoBrickClassification