MidcurveNN
Midcurve by Neural Networks
Description
- Goal: Given a 2D closed shape (closed polygon), find its midcurve (a polyline, closed or open)
- Input: set of points or set of connected lines, non-intersecting, simple, convex, closed polygon
- Output: another set of points or set of connected lines, open/branched polygons possible
Thoughts
Representation Issue
- Shapes cannot be modeled as sequences. A polygon such as an L-shape may look like a sequence of points, but it is not one.
- Not all shapes can be drawn without lifting the pencil, which a sequence requires. Shapes like a Y or concentric O's cannot be modeled as sequences, so the midcurve transformation cannot be modeled as a sequence-to-sequence network.
- How should a geometric figure be represented so it can be fed to a Machine/Deep Learning model, which needs numeric data in vector form?
- How to convert geometric shapes (even after restricting the domain to 2D linear profile shapes) to vectors?
- The closest data structure is a graph, but that is predominantly topological, not geometrical: graphs represent only connectivity, not spatial positions. Here, nodes would have coordinates and arcs would have curvilinear shape. So even Graph Neural Networks, which convolve over the neighbors of a node and pool the output to generate vectors, are not suitable.
- R&D is needed to come up with a geometric-graph embedding that convolves at nodes (having coordinates) around arcs (having geometry, say, a set of point coordinates) and then pools them into a meaningful representation. The real crux is how to formulate pooling so that it aggregates all incident-curve information plus node-coordinate information into a single number.
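The points above can be illustrated with a minimal sketch of a geometric graph: nodes carry (x, y) coordinates and edges are pairs of node ids. The coordinates, the names `node_degrees` and `is_single_stroke`, and the connectivity assumption are all hypothetical and only serve the illustration; this is not the project's actual representation.

```python
from collections import defaultdict

# Hypothetical midcurve of a Y-like shape: node id -> (x, y) coordinates.
y_nodes = {0: (0.0, 0.0), 1: (0.0, 5.0), 2: (-3.0, 9.0), 3: (3.0, 9.0)}
# Edges as pairs of node ids; straight segments need no extra geometry.
y_edges = [(0, 1), (1, 2), (1, 3)]

# Hypothetical L-shaped profile: six corners joined into one closed loop.
l_nodes = {0: (0, 0), 1: (10, 0), 2: (10, 2), 3: (2, 2), 4: (2, 10), 5: (0, 10)}
l_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]


def node_degrees(edges):
    """Count how many edges touch each node."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def is_single_stroke(edges):
    """True if the edges form one open path or one closed loop.

    Assumes the graph is connected. A node of degree 3 or more is a
    branch point, so the figure cannot be written as one plain sequence
    of points without revisiting a node.
    """
    degree = node_degrees(edges)
    ends = sum(1 for d in degree.values() if d == 1)
    branches = sum(1 for d in degree.values() if d > 2)
    return branches == 0 and ends in (0, 2)


print(is_single_stroke(l_edges))  # the closed L profile is one loop
print(is_single_stroke(y_edges))  # the Y midcurve branches at node 1
```

The degree check only captures topology. The open question raised above, how to pool node coordinates and arc geometry into an embedding, is not addressed by this sketch.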
Variable Lengths Issue
- It is a dimension-reduction problem. In 2D, the input is the sketch profile (parametrically 2D), whereas the output is the midcurve (parametrically 1D). Input points are ordered (mostly forming a closed loop, manifold). Output points may not be ordered and can have branches (non-manifold).
- It is a variable-input and variable-output problem, because the number of points and lines differs between input and output.
- It is a network-to-network problem (not sequence-to-sequence) with variable-size inputs and outputs.
- For Encoder-Decoder-like networks, libraries like TensorFlow need fixed-length inputs. If the input has variable length, it is padded with some unused value. But padding is a big issue here, because the padding value cannot be something like 0,0, which is itself a valid point.
- The problem with using seq2seq is that neither input polygons nor output branched midcurves are linearly connected; they may have loops or branches. Need to think more. (more details in LIMITATIONS below)
- Instead of using point lists as i/o, let’s look at the well-established format of images. Images are of constant size, say 64x64 pixels. Colour the profile in an input bitmap (b&w only) and, similarly, the midcurve in an output bitmap. With this as i/o, an LSTM encoder-decoder seq2seq can be applied. Variety in the training data can be generated by shifting/rotating/scaling both input and output. Only 2D sketch profiles for now, with only linear segments and only a single simple polygon with no holes.
- How to vectorise? Represent each point as 2 floats/ints, so a polygon of m points becomes an input vector of 2m floats/ints. For a closed polygon, repeat the first point as the last. The output is a vector of 2n values for n midcurve points; for a closed figure, close it the same way. Prepare training data from the data files used in MIDAS. To generate hundreds or thousands of input profiles, scale both input and output by different factors, then randomly shuffle the entries. Find the maximum number of points in any profile and use that as the fixed length for both input and output. Fill with 0,0? The origin 0,0 could be a valid part of a profile, so is there any other filler, such as NULL? Start with a simple feed-forward NN, then try RNN, LSTM and seq2seq.
- See https://www.tensorflow.org/tutorials/seq2seq, https://www.youtube.com/watch?v=G5RY_SUJih4, A Neural Representation of Sketch Drawings, https://magenta.tensorflow.org/sketch_rnn, https://github.com/tensorflow/magenta/blob/master/magenta/models/sketch_rnn/README.md
- Add plotting capability to show polygons and their midcurves, so that it is easy to debug and to test unseen figures.
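The padding question above (what to fill with, given that 0,0 can be a real point) can be sidestepped with a validity mask kept alongside the padded vector. The following is a minimal sketch of that idea, not necessarily what this project does; the helper names `pad_points` and `unpad_points` and the sample L-shaped profile and midcurve are hypothetical.

```python
def pad_points(points, max_len):
    """Flatten (x, y) points to a fixed-length vector plus a validity mask.

    The mask, not the coordinate value, marks padding, so a real point
    at (0, 0) stays distinguishable from filler.
    """
    if len(points) > max_len:
        raise ValueError("shape has more points than max_len")
    n_pad = max_len - len(points)
    flat = [float(c) for p in points for c in p] + [0.0] * (2 * n_pad)
    mask = [1] * len(points) + [0] * n_pad
    return flat, mask


def unpad_points(flat, mask):
    """Recover the original points by keeping only the masked-in slots."""
    return [(flat[2 * i], flat[2 * i + 1]) for i, m in enumerate(mask) if m]


# Hypothetical L-shaped profile (closed: first point repeated) and midcurve.
profile = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10), (0, 0)]
midcurve = [(1, 10), (1, 1), (10, 1)]

# Use the longest shape as the fixed length for both input and output.
max_len = max(len(profile), len(midcurve))
x_vec, x_mask = pad_points(profile, max_len)
y_vec, y_mask = pad_points(midcurve, max_len)

print(len(x_vec), len(y_vec))  # both vectors have length 2 * max_len
print(y_mask)                  # 1 for real midcurve points, 0 for padding
```

A mask fixes only the filler ambiguity. It does not address the branching problem, since a flat vector still imposes one ordering on the points.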
Dilution to Images
- Images of geometric shapes address both the representation issue and the variable-size issue. The big dilution is that true geometric shapes are like vector images, whereas the images used here are raster images, so approximation has crept in.
- Even after modeling, the predicted output needs to be post-processed to bring it back to geometric form, which is challenging again.
- Thus, this project is divided into two phases:
  - Phase I: Image to Image transformation learning
    - Img2Img: i/o are fixed-size 100x100 bitmaps
    - Populate many samples by scaling/rotating/translating both i/o shapes within the fixed size
    - Use an Encoder-Decoder, as in Semantic Segmentation or Pix2Pix of images, to learn the dimension reduction
  - Phase II: Geometry to Geometry transformation learning
    - Build both input and output polyline graphs, with (x,y) coordinates as node features and edges given as node id pairs. For polylines, the edges are straight lines, so there is no need to store intermediate geometric points as features; for curves, store, say, a fixed number 'n' of sampled points.
    - Build an Image-Segmentation-like Encoder-Decoder network, using Graph Convolution Layers from DGL in place of the usual image-based 2D convolution layers in the usual PyTorch encoder-decoder model.
    - Generate a variety of input-output polyline pairs using geometric transformations (and not image transformations as done in Phase I).
    - See if Variational Graph Auto-Encoders https://github.com/dmlc/dgl/tree/master/examples/pytorch/vgae can help.
- Currently Phase I is under implementation. Phase II can start only after a suitable geometric-graph embedding representation becomes available.
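As a rough illustration of the Phase I data preparation, the sketch below applies the same scale/rotate/translate transform to a profile and its midcurve, then draws each polyline into a fixed 100x100 bitmap. It uses only the standard library. The function names `transform` and `rasterize`, the sample shapes, and the chosen transform parameters are assumptions made for illustration, not the project's actual pipeline.

```python
import math

SIZE = 100  # fixed bitmap size, as stated for Phase I


def transform(points, scale=1.0, angle_deg=0.0, shift=(0.0, 0.0)):
    """Scale, rotate about the origin, then translate (x, y) points."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        (scale * (x * cos_a - y * sin_a) + shift[0],
         scale * (x * sin_a + y * cos_a) + shift[1])
        for x, y in points
    ]


def rasterize(points, size=SIZE):
    """Draw a polyline into a size x size 0/1 bitmap by sampling segments."""
    grid = [[0] * size for _ in range(size)]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Sample at most one pixel apart so the drawn line has no gaps.
        steps = int(max(abs(x1 - x0), abs(y1 - y0))) + 1
        for i in range(steps + 1):
            t = i / steps
            col = round(x0 + t * (x1 - x0))
            row = round(y0 + t * (y1 - y0))
            if 0 <= row < size and 0 <= col < size:
                grid[row][col] = 1
    return grid


# Hypothetical L-shaped profile (closed) and its approximate midcurve.
profile = [(0, 0), (40, 0), (40, 8), (8, 8), (8, 40), (0, 40), (0, 0)]
midcurve = [(4, 40), (4, 4), (40, 4)]

# Apply identical transforms to input and output so the pairs stay aligned.
pairs = []
for scale, angle, shift in [(1.0, 0, (10, 10)), (1.2, 0, (20, 30)), (1.0, 90, (60, 20))]:
    params = dict(scale=scale, angle_deg=angle, shift=shift)
    pairs.append((rasterize(transform(profile, **params)),
                  rasterize(transform(midcurve, **params))))

print(len(pairs), len(pairs[0][0]), len(pairs[0][0][0]))
```

Transforming the geometry before rasterizing, rather than transforming the bitmaps, keeps the lines crisp and is the same mechanism Phase II would need for polyline pairs.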
Publications/Talks
- viXra paper: MidcurveNN: Encoder-Decoder Neural Network for Computing Midcurve of a Thin Polygon, viXra.org e-Print archive, viXra:1904.0429 http://vixra.org/abs/1904.0429
- ODSC proposal https://confengine.com/odsc-india-2019/proposal/10090/midcurvenn-encoder-decoder-neural-network-for-computing-midcurve-of-a-thin-polygon
- CAD Conference 2021, Barcelona, pages 223-225 http://www.cad-conference.net/files/CAD21/CAD21_223-225.pdf
- CAD & Applications 2022 Journal paper 19(6) http://www.cad-journal.net/files/vol_19/CAD_19(6)_2022_1154-1161.pdf
- Google Developers Dev Library https://devlibrary.withgoogle.com/products/ml/repos/yogeshhk-MidcurveNN
Citations
Regarding the state of the art, the closest work related to this paper is from Kulkarni[28].
Kulkarni proposed MidcurveNN, an encoder-decoder neural network to extract mid-curves from
polygonal 2D shapes. The principle is to train the network with both a pixel image of the
polygonal shape and of the final desired mid-curves. Although in an early stage of research,
the network is able to produce reasonably well the mid-curves of simple L-shaped polygons.
The limitations of this work remain in the noisiness of the produced results.
It has not been tested on a large diversity of shapes and is performed on the full shape
globally potentially requiring a high-resolution pixel grid for large models.
[28]Y.H. Kulkarni, MIDCURVENN: Encoder-decoder neural network for computing midcurve of a thin polygon,
in: Open Data Sci. Conf.,2019
Disclaimer:
The author ([email protected]) gives no guarantee of the results of the program. It is just a fun script. A lot of improvements are still to be made, so don’t depend on it at all.