Human Recognition and AI, Part 2
Samson — A New Recognition Technology
Introduction
What essential properties must a computer technology possess in order to conform to a human recognition system? At a minimum, two key characteristics are required:
- Constant readiness — The system must operate without any preliminary setup.
- Universality — It should recognize objects of various types and origins.
As an illustration, take a look at the picture, where a little girl confidently manipulates unfamiliar 3D shapes. Furthermore, it’s easy to imagine how she could sort a shuffled deck of playing cards by suit, even if she has never seen that deck before. This demonstrates an innate human capacity for recognition.
By contrast, neural networks lack these qualities. They require significant setup — deep learning, pre-training — and are not universal. A model trained on handwritten digits, for example, cannot identify printed letters.
The console application samson.py, introduced here, is the first example of a recognition technology that mimics this “innate ability to recognize”.
Note: We’ll begin with examples and usage instructions, followed by a discussion of the underlying algorithms and links to the source code.
Examples of samson.py in Action
Example 1: “Sorting”
This example presents the card-sorting task mentioned earlier — but with increased complexity. Suit symbols are distorted, and we add an equal number of handwritten digits. All images were created using the INKredible iPad app and scaled to 200×200 pixels.
samson.py is a standard console application with a built-in directory called BASKET, which must be preloaded with the current input data (images) to be sorted. The program shuffles the reading order of the input data, and both the contents of the BASKET and the sorting results are saved as tables in the OUTPUT directory.
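The article does not show how samson.py shuffles the reading order, so the following is only a minimal sketch of one way to do it. The helper name `shuffled_basket` and the optional `seed` parameter are hypothetical and are not part of the Samson project.

```python
import random
import tempfile
from pathlib import Path


def shuffled_basket(dir_basket, seed=None):
    """Return the files of dir_basket in a random reading order.

    Hypothetical helper: passing a seed makes the order repeatable,
    which is convenient when comparing two runs.
    """
    # Sort first so that the result depends only on the seed,
    # not on the order in which the file system lists the entries.
    files = sorted(p for p in Path(dir_basket).iterdir() if p.is_file())
    random.Random(seed).shuffle(files)
    return files


# Small demonstration on a temporary stand-in for the BASKET directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["club.png", "heart.png", "seven.png", "three.png"]:
        (Path(tmp) / name).write_bytes(b"")
    print([p.name for p in shuffled_basket(tmp)])
```

Because clustering works on pairwise distances between all objects, the grouping should not depend on the reading order. Shuffling is a simple way to check that.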
First, let’s download samson-data — the images we will use in this article. For the first example, place the contents of samson-data/example_1 into the BASKET directory, as shown in the figure below.
The program requires no prior setup and is launched in the standard way.
samson.py sorts the BASKET contents into homogeneous groups, or clusters. The user specifies the desired number of clusters via keyboard input. For this example, we’ll set it to 8 (representing four card suits + four types of digits). After execution, we’ll examine the contents of these eight newly formed groups.
The samson.py script (the main part of the Samson project) and a screenshot of its execution in PyCharm on a Mac mini M4 are provided below. Note the computation time.
# samson.py
"""
How to use: python samson.py
"""
from pathlib import Path

from distances import distances
from clustering import clustering
from get_tabular_results import get_tabular_results
from utils.src.dir_support import files_number_in_directory


def main():
    dir_basket = Path.cwd() / 'BASKET'
    n_objects = files_number_in_directory(dir_basket)
    print(f'\nS A M S O N')
    print(f'There are {n_objects} objects in the BASKET folder.')
    input_string = input('Please enter the desired number of clusters: ')
    number_of_clusters = int(input_string)
    D = distances(dir_basket)
    clustering(D, number_of_clusters)
    get_tabular_results()  # result in the directory "OUTPUT"


if __name__ == "__main__":
    main()

After a long wait, 8 files will appear in the OUTPUT directory, each representing a cluster containing the corresponding images.
As seen, Samson accurately sorted all 48 images from the BASKET.
Successful clustering, where all images within a cluster are similar, defines a class of images.
Note: Calculations were performed not on the original images from the BASKET directory (Fig. 1), but on their transformed canonical forms (Fig. 3). Refer to Part 1 of this article for details. This means recognition results are independent of image color or size; only their shapes, represented by contours, are relevant.
Example 2: “Recognition”
Recognition occurs when an object is correctly identified as belonging to a known class.
For instance, we recognize the Apple logo because we’ve seen apples since childhood. Without that prior knowledge, recognizing the logo would be impossible.
To demonstrate recognition, we will create objects that will play the role of the aforementioned apples. For this purpose, we first generate “known” objects: distorted three-, four-, five-, and six-pointed stars using the hdss program.
We place instances of these stars from samson-data/example_2a into the BASKET directory and sort them into 4 groups.
Successful clustering results in 4 clusters, thereby defining new object classes: “3star,” “4star,” “5star,” and “6star.”
Next, we augment the BASKET with new, orange “unknown” objects created in the same way. To do this, we fill the BASKET with images from samson-data/example_2b. While color doesn’t affect the outcome of our computations, it helps to visually distinguish old and new objects in the updated BASKET.
Problem. How can we recognize the orange objects, i.e., determine which class they belong to?
To achieve this, we sort the updated BASKET into 5 groups, yielding the following 5 clusters:
As observed, the previous clusters from Fig. 5 now incorporate both the old and the new “unknown” objects. The system does not differentiate between them, indicating they belong to the same class. For example, unknown_7 belongs to the “5star” class (see Fig. 7). Recognition has occurred!
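The recognition step described above can be sketched as a simple rule: an unknown object takes the class of the known objects that share its cluster. The function `recognize` and the object names below are hypothetical illustrations (only `unknown_7` in the “5star” class comes from the example). The cluster lists are typed in by hand rather than read from the OUTPUT tables.

```python
from collections import Counter


def recognize(clusters, known_labels):
    """Assign each unknown object the majority class of its cluster.

    clusters: list of lists of object names, one inner list per cluster.
    known_labels: dict mapping the names of known objects to class names.
    Returns a dict: unknown name -> class name, or None when the cluster
    contains no known objects (the objects formed a class of their own).
    """
    result = {}
    for members in clusters:
        votes = Counter(known_labels[m] for m in members if m in known_labels)
        label = votes.most_common(1)[0][0] if votes else None
        for m in members:
            if m not in known_labels:
                result[m] = label
    return result


# Hand-written stand-in for a clustering result (hypothetical names).
clusters = [
    ["3star_1", "3star_2", "unknown_2"],
    ["5star_1", "5star_2", "unknown_7"],
    ["unknown_4", "unknown_9"],  # no known members: a new class
]
known_labels = {
    "3star_1": "3star", "3star_2": "3star",
    "5star_1": "5star", "5star_2": "5star",
}
print(recognize(clusters, known_labels))
# {'unknown_2': '3star', 'unknown_7': '5star', 'unknown_4': None, 'unknown_9': None}
```

The majority vote is a design choice made for this sketch. It only matters when a cluster mixes known objects of different classes, which would itself signal an imperfect clustering.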
The objects in the last cluster of Fig. 7, humorously termed “lonely hearts,” were not assigned to any existing class and formed a class of their own.
Example 3: Recognizing MNIST Digits (Two Attempts)
The famous MNIST dataset of handwritten digits is widely used in neural network research, but unfortunately it is not flawless. The original images were heavily degraded to fit machine learning constraints: 128 x 128 pixel images were downscaled to 20 x 20 squares, then centered on a blank 28 x 28 field. We believe it is illegitimate to adjust data to fit a specific technology. A major concern is that competing recognition methods have no access to the original handwritten digit scans.
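To make the size reduction concrete, here is a minimal sketch of the final step, placing a 20 x 20 glyph in the middle of a blank 28 x 28 field. The function `center_on_field` is a hypothetical illustration. It uses plain geometric centering as a simplifying assumption and does not reproduce the actual MNIST preprocessing pipeline.

```python
def center_on_field(glyph, field_size=28, background=0):
    """Place a small square image in the middle of a blank square field."""
    n = len(glyph)
    if any(len(row) != n for row in glyph):
        raise ValueError("glyph must be square")
    if n > field_size:
        raise ValueError("glyph does not fit in the field")
    margin = (field_size - n) // 2
    field = [[background] * field_size for _ in range(field_size)]
    for r, row in enumerate(glyph):
        for c, value in enumerate(row):
            field[margin + r][margin + c] = value
    return field


glyph = [[255] * 20 for _ in range(20)]  # stand-in for a 20 x 20 digit
field = center_on_field(glyph)
print(len(field), len(field[0]))  # 28 28
print(field[3][3], field[4][4], field[23][23], field[24][24])  # 0 255 255 0

# Share of the original 128 x 128 pixel grid that survives downscaling.
print(f"{20 * 20 / (128 * 128):.1%}")  # 2.4%
```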
So, in the first attempt we will load into the BASKET the first 8 samples of the first three digits from the MNIST/data_test set. To do this, copy the contents of samson-data/example_3a into the BASKET directory.
Below is the result of sorting into three groups:
As seen, in the last cluster, the digit “2” was incorrectly grouped with zeros.
In the second attempt, we repeated the experiment using the first 8 samples of the last three digits from the MNIST/data_test set. We copied them from samson-data/example_3b.
Result of sorting into three groups:
The digit “8” was recognized flawlessly. However, “7,” with an additional horizontal stroke, was mistakenly identified as “9.” Despite minor errors, these early results demonstrate Samson’s ability to recognize MNIST digits without prior training — a breakthrough compared to conventional neural networks.
How samson.py Works
samson.py performs clustering (grouping similar objects) using Ward’s method (Ward J. H. Jr.), a technique from classical cluster analysis, applied without any modifications.
Typically, in cluster analysis, study objects are described by features and represented as points in a multidimensional feature space. Distances (or similarities) between objects are calculated as Euclidean distances between these points. The final, mandatory preparatory step before clustering is computing the matrix of pairwise distances between all object-points.
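The preparatory step described above can be sketched in a few lines. This is the classical feature-space case with Euclidean distances, not the Samson `distances` module, and the function name `pairwise_distances` is chosen here only for illustration.

```python
import math


def pairwise_distances(points):
    """Return the symmetric matrix of Euclidean distances between points."""
    n = len(points)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The matrix is symmetric, so each pair is computed once.
            D[i][j] = D[j][i] = math.dist(points[i], points[j])
    return D


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
for row in pairwise_distances(points):
    print(row)
# [0.0, 5.0, 10.0]
# [5.0, 0.0, 5.0]
# [10.0, 5.0, 0.0]
```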
Consequently, cluster analysis is often mistakenly categorized as a method of multivariate statistics. However, cluster analysis relies solely on the matrix of pairwise distances. It is unaware of the nature or number of features, and therefore has no knowledge of the feature space’s dimensionality.
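The point that clustering needs nothing beyond the distance matrix can be illustrated with a compact version of Ward’s method based on the Lance-Williams update formula. This is a sketch written for this explanation, not the `clustering` module of the Samson project. The function name `ward_clusters` is hypothetical, and the loop is unoptimized.

```python
def ward_clusters(D, k):
    """Agglomerate objects into k clusters using Ward's criterion.

    D is a symmetric matrix of pairwise distances; no coordinates or
    features are needed. Returns k clusters as sorted lists of indices.
    """
    n = len(D)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of objects")
    # Ward's update rule operates on squared distances.
    d = {(i, j): D[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    members = {i: [i] for i in range(n)}
    next_id = n  # identifier given to the next merged cluster
    while len(members) > k:
        # Merge the pair of clusters with the smallest Ward distance.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        na, nb = len(members[a]), len(members[b])
        for c in members:
            if c in (a, b):
                continue
            nc = len(members[c])
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Lance-Williams formula for Ward's method.
            d[(c, next_id)] = (
                (na + nc) * d_ac + (nb + nc) * d_bc - nc * d_ab
            ) / (na + nb + nc)
        # Drop every entry that refers to the two merged clusters.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return sorted(sorted(m) for m in members.values())


# Five objects described only by their mutual distances.
positions = [0, 1, 10, 11, 50]
D = [[abs(p - q) for q in positions] for p in positions]
print(ward_clusters(D, 3))  # [[0, 1], [2, 3], [4]]
```

The function never sees `positions`, only `D`. Replacing the Euclidean distances with any other pairwise measure changes the input matrix but not the clustering code.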
In Samson technology, the object is an image, and its sole descriptive feature is the Fourier transform modulus. The similarity between objects is determined by a new metric called “distance,” introduced in Part 1 of this article. This metric is the core of Samson’s strength — its metaphorical “hair”, referencing the biblical Samson.
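One standard property of the Fourier transform modulus is that it does not change when a signal is circularly shifted. The sketch below demonstrates this for a one-dimensional signal, using a naive discrete Fourier transform written for illustration. It does not implement the Samson “distance” from Part 1, and whether Samson relies on this particular property is an assumption made here.

```python
import cmath


def dft_modulus(signal):
    """Return |X[k]| for the discrete Fourier transform of a 1-D signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


signal = [0, 1, 3, 1, 0, 0, 0, 0]
shifted = signal[-3:] + signal[:-3]  # same shape, circularly shifted by 3

original = dft_modulus(signal)
moved = dft_modulus(shifted)

# The moduli agree up to floating-point rounding.
print(all(abs(a - b) < 1e-9 for a, b in zip(original, moved)))  # True
```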
Final Remarks
1. Turing Test Compliance: Samson easily passes the Turing test. If samson.py is on a remote server and objects are sent for recognition, there’s no way to discern whether the program or a system administrator is performing the work.
2. Reliability through Collective Decision: In Part 1, we noted that the distance metric isn’t entirely reliable on its own. However, when integrated into cluster analysis, where a collective decision is made by all “inhabitants” of the BASKET, the metric’s reliability significantly increases. This aligns perfectly with von Neumann’s concept of building a reliable system from unreliable components.
3. Source of “Innate Ability”: The “innate ability to recognize” may stem from the capacity to accurately compute or assess the degree of similarity between the objects that surround us.
4. Code: We have published the entire source code of the Samson project.
5. Continuation: See the final part of this article.
Appendix
For those who are not familiar with the topic of cluster analysis, we recommend downloading the demonstration project cluster-analysis-demo. The only difference from the Samson project is that the BASKET contains points on a plane instead of images. As before, the input parameter defines the number of clusters, but now the contents of these clusters are easy to interpret visually (see examples below).
