Hand-Drawn Shape Simulation

Boris Kravtsov, PhD
Jul 5, 2022

A Challenge for AI Developers

Photo by Road Ahead on Unsplash

Shape recognition is an essential property of the human intellect. Children as young as two or three can recognize objects even when they differ in size, position, or orientation, or are severely deformed. For example, they quickly identify the hearts in the images above and below.

Fig.1 “Hand-drawn” hearts created by the hdss Python package

Such distorted forms, “hand-drawn shapes,” are handy for testing recognition technologies. Below, we show how to generate them with two simple Python functions from the hdss package, which is available on PyPI. You can also download the hdss code and data from our website.

About the Method

Any shape consists of one or more smooth curves. For example, the heart shape below is composed of two such curves, green and red. In turn, a smooth curve can be described by a small set of control points lying on it, since all intermediate points can be recovered by Bezier interpolation. So one or more text files with control point coordinates fully describe the shape.

Fig.2 Creating a shape by the coordinates of its control points
Fig.3 Plain text files (CSV format) with coordinates of control points.
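
To make the idea concrete, here is a minimal sketch of how a smooth curve could be recovered from a few points lying on it. It is not the hdss implementation: the function names are hypothetical, and the choice of Catmull-Rom tangents to build each cubic Bezier segment is an assumption made only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate one cubic Bezier segment at parameter t in [0, 1]."""
    u = 1.0 - t
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    return (b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0],
            b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1])


def interpolate_curve(points: List[Point], steps: int = 10) -> List[Point]:
    """Return a dense polyline that passes through every control point."""
    n = len(points)
    if n < 2:
        return list(points)
    curve = [points[0]]
    for i in range(n - 1):
        prev_pt = points[max(i - 1, 0)]          # clamp at the first point
        start, end = points[i], points[i + 1]
        next_pt = points[min(i + 2, n - 1)]      # clamp at the last point
        # Assumption: Catmull-Rom tangents converted into inner Bezier handles.
        h1 = (start[0] + (end[0] - prev_pt[0]) / 6.0,
              start[1] + (end[1] - prev_pt[1]) / 6.0)
        h2 = (end[0] - (next_pt[0] - start[0]) / 6.0,
              end[1] - (next_pt[1] - start[1]) / 6.0)
        for k in range(1, steps + 1):
            curve.append(cubic_bezier(start, h1, h2, end, k / steps))
    return curve


# Made-up control points in a [100 x 100] coordinate system.
control = [(0.0, 0.0), (50.0, 80.0), (100.0, 0.0)]
curve = interpolate_curve(control, steps=10)
```

Each pair of neighboring control points yields one Bezier segment, so the dense curve still passes through every original point.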

Any slight change in a control point’s position distorts the original shape near that point (local distortion). By shifting all control points, in other words “adding random noise,” you can generate any number of shapes that look hand-drawn.

Fig.4 Local distortion
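
A minimal sketch of local distortion follows. The helper name, the uniform noise distribution, and the sample coordinates are all assumptions; the way hdss interprets its own noise parameter may differ.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def add_local_noise(points: List[Point], noise: float,
                    rng: random.Random) -> List[Point]:
    """Shift every control point by an independent offset in [-noise, noise]."""
    return [(x + rng.uniform(-noise, noise), y + rng.uniform(-noise, noise))
            for x, y in points]


rng = random.Random(42)  # fixed seed so the run is repeatable
# Made-up control points for one curve of a shape, in [100 x 100] coordinates.
heart_half = [(50.0, 30.0), (75.0, 10.0), (95.0, 40.0), (50.0, 95.0)]
variants = [add_local_noise(heart_half, noise=6.0, rng=rng) for _ in range(5)]
```

When a shape is built from several curves that share endpoints, the shared points would need the same offset in each curve to keep the outline closed.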

Another way to distort a shape is to apply the same random perspective transformation to all control points and then perform Bezier interpolation using their new locations (global distortion).

To define the perspective transformation, we need the coordinates of two corresponding quadrangles:

1. Src: the positions of four crosses randomly placed in the blue zones.

2. Dst: the positions of the corners of the orange square (left image).

Fig.5 Global distortion
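
The sketch below shows one way to compute such a transformation from the two quadrangles and apply it to control points. It is an illustration in pure Python, not the hdss code. All function names are hypothetical, and modeling the “blue zones” as a random jitter around each square corner is an assumption.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate quadrangle: system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_coeffs(src, dst):
    """Return the 8 coefficients of the transform that maps src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def warp_point(coeffs, point):
    """Apply the perspective transform to a single (x, y) point."""
    a, b, c, d, e, f, g, h = coeffs
    x, y = point
    w = g * x + h * y + 1.0
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)


def random_src_quad(size, jitter, rng):
    """Assumption: each 'cross' is a square corner moved by up to +/- jitter."""
    corners = [(0.0, 0.0), (size, 0.0), (size, size), (0.0, size)]
    return [(x + rng.uniform(-jitter, jitter), y + rng.uniform(-jitter, jitter))
            for x, y in corners]


rng = random.Random(7)
src = random_src_quad(100.0, 10.0, rng)                     # random crosses
dst = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]  # square corners
coeffs = perspective_coeffs(src, dst)

# The same transform is applied to every control point of the shape.
control = [(50.0, 30.0), (75.0, 10.0), (95.0, 40.0), (50.0, 95.0)]
warped = [warp_point(coeffs, p) for p in control]
```

Four point pairs give eight linear equations, which is exactly enough to fix the eight unknown coefficients of a perspective transformation.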

Note. You may apply random local distortion, random global distortion, or both. However, we strongly recommend applying both methods together.

How to Use

from pathlib import Path

import hdss
import colors


dir_name = '_HEARTS_'
dir_curves = Path.cwd() / '_CURVES_'

hdss.init(dir_name)

hdss.set_params(
    200,                # shape_size = 200 x 200
    5,                  # number_of_shapes
    True,               # perspective_flag
    6,                  # bezier_noise_param
    colors.maraschino,  # line_color
    2)                  # line_thickness

shape_name = 'heart'
path_curve1 = dir_curves / 'heart_curve1.txt'
path_curve2 = dir_curves / 'heart_curve2.txt'

hdss.create_shapes(
    dir_name,
    shape_name,
    path_curve1,
    path_curve2)

The code above creates 5 “hand-drawn” shapes [200 x 200] in the new directory _HEARTS_: heart_0.png, heart_1.png, heart_2.png, heart_3.png, heart_4.png. The result is shown in Fig.1.

Note. Control point coordinates are always measured for a shape size of [100 x 100]. For other sizes, these coordinates will be updated automatically inside the package.
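
As a rough illustration of what such an update could look like, the sketch below rescales coordinates from the [100 x 100] reference size to another shape size. The function name and the plain proportional scaling are assumptions; the package may handle this differently.

```python
def scale_points(points, shape_size, base_size=100):
    """Rescale control points measured on a base_size grid to shape_size."""
    factor = shape_size / base_size
    return [(x * factor, y * factor) for x, y in points]


# Made-up control points measured on the [100 x 100] grid.
scaled = scale_points([(50, 25), (100, 100)], shape_size=200)
```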

You can download two detailed examples of how to use hdss from our website.

The program above has been tested on the following platforms: Ubuntu 20.04, Windows 11 and macOS Ventura.

A Challenge for AI Developers

Creating technology that recognizes distorted shapes is a real challenge.

Consider four types of star shapes (N = 4).

Fig.6 Star shapes and their control points

With the help of hdss, we created six “hand-drawn” items of each type (M = 6).

Then we shuffle the list of all 24 images and arrange them in an [N x M] table, as in the example below.

Fig.7 Input table

Our goal is to swap the elements of the table so that each column contains items of the same type:

Fig.8 Output table

Thus, we need a function, let’s call it “shapes_swap,” that converts the list of shapes in the input table into the list of shapes in the output table.

"""
n_types - number of different shape types = N
n_items - the number of items of each type = M
list_in - input list of shapes
list_out - output list of shapes
"""

list_out = shapes_swap(n_types, n_items, list_in)
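
The sketch below is not a solution to the challenge. It only shows how a candidate “shapes_swap” could be checked. Everything in it is an assumption made for illustration: shapes are stood in for by integer ids, the table is stored row by row with n_types columns, and the true type of each id is known only to the checker, never to the solver.

```python
import random


def make_input(n_types, n_items, rng):
    """Return (list_in, truth): shuffled item ids and each id's hidden type."""
    truth = {item: item // n_items for item in range(n_types * n_items)}
    list_in = list(truth)
    rng.shuffle(list_in)
    return list_in, truth


def is_solved(list_in, list_out, truth, n_types):
    """True if list_out rearranges list_in so each column holds one type."""
    if sorted(list_in) != sorted(list_out):
        return False  # the output must contain exactly the input items
    # Row-major table: column c holds the elements c, c + n_types, ...
    return all(len({truth[item] for item in list_out[col::n_types]}) == 1
               for col in range(n_types))


def oracle_swap(list_in, truth, n_types, n_items):
    """Cheating reference that reads the hidden types; for testing only."""
    by_type = {t: [item for item in list_in if truth[item] == t]
               for t in range(n_types)}
    return [by_type[t][row] for row in range(n_items) for t in range(n_types)]


rng = random.Random(0)
list_in, truth = make_input(n_types=4, n_items=6, rng=rng)
oracle_out = oracle_swap(list_in, truth, n_types=4, n_items=6)
print(is_solved(list_in, oracle_out, truth, n_types=4))
```

A real “shapes_swap” would receive only the images, so the oracle above serves just to confirm that the checker accepts a correctly sorted table.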

In a fair simulation of human intelligence, we can use only the 24 images mentioned above. Any prior learning is unacceptable, since humans succeed even when they see the table elements, the distorted shapes, for the first time.

The method should also work with other shapes: card suits, chess symbols, etc. It should also support different values of N and M.

Why are we discussing this challenge at all? There are two reasons:

1. For those who believe that a solution like “shapes_swap” exists or can be created soon, it will be interesting to compare the effectiveness of different approaches.

2. Skeptics can take comfort in the fact that formulating important and complex tasks is an excellent catalyst for progress. In mathematics, for example, Hilbert’s famous 23 problems play such a role.

