Müller-Lyer Illusion Explanation
Based on numerous previous studies, this article attempts to explain the Müller-Lyer illusion, which has lacked a satisfactory explanation since 1889. The author argues that the visual cortex analyzes, first and foremost, the magnitude of the spectrum of the visible image.
Since any image can be treated as an ordinary 2D function, the Fourier transform is widely used in image processing. A useful, although incomplete, description of the Fourier spectrum is its magnitude: the absolute values of the coefficients of the Fourier decomposition of an image. The magnitude will be the main topic of our discussion because we believe that vision, like hearing (according to G. Ohm’s and H. Helmholtz’s research), ignores the spectrum’s phase information.
Let’s list the three main properties of the magnitude:
- The shift of image content (an object) does not affect its magnitude.
- When we rotate an object by a certain angle, its magnitude rotates by the same angle.
- When we enlarge an object k times, its magnitude is compressed k times, and vice versa. In other words, by comparing the magnitudes of similar objects, we can judge the relative sizes of these objects.
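The three properties above can be checked numerically. The following is a minimal sketch in NumPy, not the author's own code: the canvas size, the rectangular test object, and the helper names (`place`, `centered`, `first_zero`) are assumptions made for illustration. The shift is circular, the rotation is limited to 90 degrees because that is exact on a pixel grid, and compression is measured by the position of the first zero of the magnitude along the horizontal frequency axis.

```python
import numpy as np

def magnitude(image):
    """Magnitude of the 2D discrete Fourier transform of an image."""
    return np.abs(np.fft.fft2(image))

def place(obj, size=64, top=10, left=20):
    """Put a small object on an otherwise empty square canvas."""
    canvas = np.zeros((size, size))
    canvas[top:top + obj.shape[0], left:left + obj.shape[1]] = obj
    return canvas

def centered(mag):
    # Move the zero frequency to the middle, then drop the unpaired first
    # row and column so the array is symmetric about its central element.
    return np.fft.fftshift(mag)[1:, 1:]

def first_zero(profile):
    # Index of the first numerically zero sample of a 1D magnitude profile.
    return int(np.argmax(profile < 1e-9 * profile[0]))

obj = np.ones((4, 8))                 # 4 rows x 8 columns
image = place(obj)
mag = magnitude(image)

# Property 1: a (circular) shift does not change the magnitude.
shifted = np.roll(image, shift=(7, -5), axis=(0, 1))
shift_ok = np.allclose(magnitude(shifted), mag)

# Property 2: rotating the object by 90 degrees rotates the magnitude by 90 degrees.
rotation_ok = np.allclose(centered(magnitude(np.rot90(image))),
                          np.rot90(centered(mag)))

# Property 3: doubling the object halves the width of the magnitude.
big = place(np.kron(obj, np.ones((2, 2))))    # 8 rows x 16 columns
zero_small = first_zero(mag[0])               # horizontal frequency axis
zero_big = first_zero(magnitude(big)[0])

print(shift_ok, rotation_ok, zero_small, zero_big)
```

With these sizes, the first zero sits at index 64 / 8 = 8 for the original object and at 64 / 16 = 4 for the doubled one, which is the compression described in property 3.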
Note that we can only analyze the magnitudes of finite, i.e., spatially bounded, small objects. When the size of an object in the image increases, the magnitude of the image is compressed and becomes inaccessible for analysis. Usually, magnitude values are normalized to the range [0, 255] and presented as an ordinary grayscale image. In addition, various color palettes and the logarithmic transformation 1.0 + log(magnitude) are applied.
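A minimal sketch of the two display steps just mentioned, written in NumPy. The function names and the `floor` parameter are assumptions of this sketch: the text gives the transformation as 1.0 + log(magnitude) but does not say how frequencies with zero magnitude are handled, so here values below `floor` are clipped before taking the logarithm.

```python
import numpy as np

def to_grayscale(values):
    """Linearly rescale an array to [0, 255] and return 8-bit integers."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros(values.shape, dtype=np.uint8)
    return np.round(255 * (values - values.min()) / span).astype(np.uint8)

def log_magnitude(mag, floor=1.0):
    # 1.0 + log(magnitude), as in the text. Clipping at `floor` is an
    # assumption of this sketch that avoids log(0).
    return 1.0 + np.log(np.maximum(mag, floor))

# A small test object on an empty canvas (illustrative sizes).
image = np.zeros((64, 64))
image[30:34, 28:36] = 1.0
mag = np.abs(np.fft.fft2(image))

linear_view = to_grayscale(mag)               # plain normalization
log_view = to_grayscale(log_magnitude(mag))   # log transform, then normalization
```

Either array can be saved or displayed as a grayscale picture. The logarithmic version lifts weak frequency components that the linear version leaves almost black.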
For example, let’s create an empty image of size [512 x 512] pixels and add a small object to it, such as the symbol $ of size [12 x 18]. Different representations of the magnitude of such an image are shown below:
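The construction can be sketched as follows. This is an illustration under stated assumptions rather than the code behind the figures: a filled rectangle stands in for the $ glyph, [12 x 18] is read as 12 pixels wide by 18 pixels high, and the object's position is arbitrary.

```python
import numpy as np

SIZE = 512
image = np.zeros((SIZE, SIZE))

# Stand-in for the "$" glyph: a filled block, 18 rows high and 12 columns wide.
top, left = 247, 250
image[top:top + 18, left:left + 12] = 1.0

# Magnitude of the spectrum, with the zero frequency moved to the center.
mag = np.fft.fftshift(np.abs(np.fft.fft2(image)))

# The strongest component is the zero frequency, equal to the sum of all pixels.
peak = np.unravel_index(np.argmax(mag), mag.shape)
print(peak, mag[peak])
```

The resulting `mag` array is what gets normalized, log-transformed, or colored for display.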
The following shows how the magnitude of the image changes after a slight increase in the object’s size and its rotation.
The visible compression of the magnitude reflects the increase in the object’s size (property 3).
Alternatives in image processing
We can often get the same result in image processing in several ways. Take, for example, blurring. This operation suppresses noise while preserving the meaningful details. To blur, we can replace each pixel in an image with the average of the nearby pixels. This is what cv2.blur(image, ksize), a ready-to-use OpenCV function, does. When working with pixels, we are in the image space, or the spatial domain.
An alternative way to achieve blurring is to remove high frequencies in the Fourier spectrum of an image. This action would require applying the Fourier Transform and manipulating the frequencies, i.e., going into the frequency domain.
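The equivalence of the two routes can be sketched in NumPy. This is an illustration, not OpenCV's implementation: both functions are hypothetical helpers, the borders wrap around (cv2.blur handles borders differently), and the averaging filter attenuates high frequencies rather than removing them outright.

```python
import numpy as np

def box_blur_spatial(image, k=3):
    """Spatial domain: average each pixel with its k x k neighbourhood (k odd)."""
    r = k // 2
    total = np.zeros_like(image, dtype=float)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            total += np.roll(image, shift=(dy, dx), axis=(0, 1))
    return total / (k * k)

def box_kernel(shape, k=3):
    """The same k x k averaging kernel, laid out on a full-size canvas."""
    r = k // 2
    kernel = np.zeros(shape, dtype=float)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            kernel[dy % shape[0], dx % shape[1]] = 1.0 / (k * k)
    return kernel

def box_blur_frequency(image, k=3):
    """Frequency domain: multiply the image spectrum by the kernel spectrum."""
    response = np.fft.fft2(box_kernel(image.shape, k))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * response))

rng = np.random.default_rng(0)
image = rng.random((64, 64))
same_result = np.allclose(box_blur_spatial(image), box_blur_frequency(image))

# How strongly the filter passes the lowest and the highest frequency.
response = np.abs(np.fft.fft2(box_kernel(image.shape)))
low_gain, high_gain = response[0, 0], response[32, 32]
```

The zero frequency passes unchanged, while the highest frequency is reduced to one ninth of its amplitude for the 3 x 3 kernel, which is why averaging in the spatial domain amounts to a low-pass filter in the frequency domain.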
Almost all image processing operations can be carried out in either the spatial or the frequency domain; hence, the choice is ours. But what choice has nature made? How does our visual cortex perform image processing? We have several reasons to believe that the visual cortex uses the frequency domain:
- Many researchers, long before us, attributed to the brain the ability to perform frequency analysis. These conclusions followed the proposal (Campbell and Robson, 1968) that the brain could be viewed as a spectrum analysis device. Indeed, there were many attempts to find evidence that the visual cortex performs a spectrum analysis (Pollen and Lee, 1971; Maffei and Fiorentini, 1973; Glezer et al., 1989, among others). Moreover, research in this direction continues to grow.
- Modeling the function of orientation-selective neurons led us to create a high-quality edge detector and clarified the role of such neurons in the visual cortex. All this was possible only in the frequency domain.
- The proposition that the visual cortex uses the frequency domain helps to explain various geometric illusions, particularly the famous “arrow illusion” by Müller-Lyer.
1. Müller-Lyer illusion
Why do identical arrow shafts appear different in length when arrowheads pointing in opposite directions are added to them (see picture on the left)? Do you think the right line is shorter in this picture? If so, you are not alone: everyone, even horses, will agree with you (Cappellato et al., 2020).
Consider line segments of different lengths and their magnitudes (only the low frequencies of the magnitude [4096 x 4096] are shown).
Here (from left to right):
- The long segment.
- The magnitude of the long segment.
- The magnitude of the short segment.
- The short segment.
As expected, according to property 3, the long segment has a more compressed magnitude.
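This comparison can be sketched numerically. The code below is an illustration under assumptions, not the code behind the figure: it uses a [1024 x 1024] canvas instead of [4096 x 4096] to keep the transform light, segments of 64 and 128 pixels, and the width of the central lobe at half of its maximum as a simple measure of compression.

```python
import numpy as np

def segment_image(length, size=1024):
    """A one-pixel-thick horizontal segment centered on an empty canvas."""
    image = np.zeros((size, size))
    start = (size - length) // 2
    image[size // 2, start:start + length] = 1.0
    return image

def low_frequency_magnitude(image, window=128):
    """Centered magnitude, cropped to the low frequencies around zero."""
    mag = np.fft.fftshift(np.abs(np.fft.fft2(image)))
    c, h = image.shape[0] // 2, window // 2
    return mag[c - h:c + h, c - h:c + h]

def half_max_width(mag_crop):
    """Number of samples in the central row that reach half of its maximum."""
    row = mag_crop[mag_crop.shape[0] // 2]
    return int(np.count_nonzero(row >= 0.5 * row.max()))

short_width = half_max_width(low_frequency_magnitude(segment_image(64)))
long_width = half_max_width(low_frequency_magnitude(segment_image(128)))
print(short_width, long_width)
```

The segment that is twice as long gives a central lobe roughly half as wide, in line with property 3.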
And now, let’s create a simple GIF animation. At the beginning (wait for the frame labeled START), we show two identical arrow shafts (segments) and their (identical) magnitudes. We’ll build up the arrowheads step by step and examine how this affects the magnitudes. As we see, depending on the type of arrowhead, the magnitude either compresses (left) or expands (right).
If we are locked in the frequency domain and judge the length of similar objects solely by their magnitudes, we will inevitably conclude that the object on the left is larger than the object on the right (property 3).
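For readers who want to inspect these magnitudes themselves, here is a minimal sketch that draws the two figures and computes their centered magnitudes. It does not reproduce the animation, and it does not by itself establish the conclusion above. The canvas size, shaft length, fin length, the 45-degree fin angle, and the helper names are all assumptions of this sketch.

```python
import numpy as np

def draw_line(image, y0, x0, y1, x1):
    """Rasterize a straight one-pixel line between two integer points."""
    n = int(max(abs(y1 - y0), abs(x1 - x0))) + 1
    ys = np.round(np.linspace(y0, y1, n)).astype(int)
    xs = np.round(np.linspace(x0, x1, n)).astype(int)
    image[ys, xs] = 1.0

def muller_lyer(direction, size=512, shaft=120, fin=30):
    """direction=+1: fins extend beyond the shaft; -1: fins fold back
    over the shaft; 0: bare shaft without fins."""
    image = np.zeros((size, size))
    c = size // 2
    left, right = c - shaft // 2, c + shaft // 2
    draw_line(image, c, left, c, right)            # the shaft
    if direction != 0:
        for dy in (-fin, fin):                     # upper and lower fins
            draw_line(image, c, left, c + dy, left - direction * fin)
            draw_line(image, c, right, c + dy, right + direction * fin)
    return image

def centered_magnitude(image):
    return np.fft.fftshift(np.abs(np.fft.fft2(image)))

def horizontal_extent(image):
    """Number of columns between the leftmost and rightmost lit pixels."""
    cols = np.flatnonzero(image.any(axis=0))
    return int(cols[-1] - cols[0] + 1)

fins_out, fins_in, bare = muller_lyer(+1), muller_lyer(-1), muller_lyer(0)
mag_out, mag_in, mag_bare = (centered_magnitude(f) for f in (fins_out, fins_in, bare))
```

The shaft is identical in all three images; only the fins differ. The arrays `mag_out` and `mag_in` can then be normalized and displayed side by side, and compared with `mag_bare`, to see how each type of arrowhead changes the magnitude.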
2. Vertical-horizontal illusion
To illustrate the universal nature of the described mechanism, let’s examine the “vertical-horizontal illusion.”
In the left picture, the vertical segment seems slightly longer than the horizontal segment, although the two are equal in length.
Here (from left to right):
- The object of study.
- Its magnitude spectrum in grayscale format.
- The same magnitude spectrum after a substantial contrast enhancement.
Since the magnitude is more compressed in the vertical direction than in the horizontal one, we erroneously conclude that the vertical segment is larger than the horizontal one (property 3). The picture below explains our statement.
When you shift an object or any part of it, you change the Fourier spectrum of the image. We assume that the visual cortex uses incomplete information about this spectrum, i.e., it analyzes the magnitude and ignores the phase, which sometimes leads to wrong conclusions.
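The difference between shifting a whole object and shifting only a part of it can be sketched in NumPy. The figure used here, a horizontal segment with a vertical segment of equal length standing on it, is a hypothetical configuration chosen for illustration, as are the sizes and the helper names.

```python
import numpy as np

def figure(vertical_offset, size=256, length=60):
    """A horizontal segment with an equally long vertical segment standing on
    it, `vertical_offset` pixels from the left end of the horizontal one."""
    image = np.zeros((size, size))
    c = size // 2
    left = c - length // 2
    image[c, left:left + length] = 1.0                  # horizontal segment
    image[c - length:c, left + vertical_offset] = 1.0   # vertical segment
    return image

def magnitude(image):
    return np.abs(np.fft.fft2(image))

inverted_t = figure(vertical_offset=30)   # vertical segment at the middle
l_shape = figure(vertical_offset=0)       # the same segment moved to the end

# Shifting the whole figure changes only the phase of the spectrum.
moved_whole = np.roll(inverted_t, shift=(20, -35), axis=(0, 1))
whole_shift_keeps_magnitude = np.allclose(magnitude(moved_whole),
                                          magnitude(inverted_t))

# Shifting one part relative to the other changes the magnitude as well.
part_shift_keeps_magnitude = np.allclose(magnitude(l_shape),
                                         magnitude(inverted_t))

print(whole_shift_keeps_magnitude, part_shift_keeps_magnitude)
```

Both figures contain the same two segments, yet their magnitudes differ. A system that reads only the magnitude therefore sees two different objects, even though no segment has changed its length.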