Computational Gestalts
Partial gestalts
In the last few decades many attempts have been made to formulate the Gestalt
principles in a more precise way, using mathematical tools that were not at the
disposal of the first Gestalt psychologists at the time of the founding of the
Berlin school: from information theory (Attneave, 1954, 1959; Garner, 1962)
to synergetics (Haken & Stadler, 1990; Kelso, 1995) and other nonlinear
dynamic approaches (van Leeuwen, 2007). One apparently very promising
attempt has been undertaken, in the last ten years or so, by a group of French
mathematicians mainly interested in computer vision (Morel, Cao, Almansa,
among others); the leading figure today appears to be Agnès Desolneux (for a
comprehensive review, see Desolneux, Moisan, & Morel, 2006). The theory of
computational Gestalts that they are building rests on three basic principles:
1. Shannon–Nyquist definition of signals and images: any image or
signal, including a noisy one, is a band-limited function sampled on
a bounded, periodic grid.
2. Wertheimer's contrast invariance principle: image interpretation does
not depend upon the actual values of the stimulus intensities, but only
on their relative values.
3. Helmholtz principle, actually first stated by D. Lowe (1985): Gestalts
are sets of points whose (geometrically regular) spatial arrangement
could not occur in noise.
This means that, given the discreteness of the visual field (first
principle), and given the prevalence of the relative over the absolute values of
the stimuli (second principle), it is possible to determine a probability value ε
such that all the stimuli whose probability of arising by chance is less than ε
tend to group together (third principle).
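The second principle can be made concrete with a very small numerical check. The sketch below is only an illustration, assuming a toy grayscale image stored as a NumPy array and a gamma-like curve as an example of a monotone contrast change; it shows that such a change preserves the ordering of the pixel intensities, and hence every upper level set, which is exactly the sense in which only relative values matter.

```python
import numpy as np

# A toy grayscale "image": values in (0, 1), all distinct with probability 1.
rng = np.random.default_rng(0)
u = rng.random((64, 64))

# A strictly increasing contrast change g (here an example gamma-like curve).
def g(x):
    return x ** 0.4

v = g(u)

# The rank order of the pixels is preserved by the contrast change ...
assert np.array_equal(np.argsort(u, axis=None), np.argsort(v, axis=None))

# ... and so is every upper level set {x : u(x) >= lambda}.
lam = 0.5
assert np.array_equal(u >= lam, v >= g(lam))

print("A monotone contrast change leaves relative values, and level sets, intact.")
```

Any detection built only from such relative (level-set) information is therefore blind to monotone contrast changes, which is one of the three basic principles the a contrario detections discussed below rest on.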
The name of Helmholtz may sound a little odd in this context to
psychologists. As a matter of fact, neither in his Handbuch (1867) nor in any
other paper concerning perceptual theory did Helmholtz ever state anything of
the kind. In general, however, such attributions of psychological ideas to earlier
authors are not to be taken too literally. In our analysis we will concentrate
above all on the Helmholtz principle.
Before examining the theory, however, some introductory remarks are
useful. In this approach, the starting point is the attempt made by the Gestaltists
(above all Wertheimer) to find the basic laws that contribute to the formation of
shapes on the basis of several common properties. These properties, the partial
gestalts (Desolneux, Moisan, & Morel, 2001), correspond at least in part to the
“regularization” constraints used in computer vision, for instance the so-called
piecewise smoothness (see Black & Anandan, 1996; Brox et al., 2004; Amiaz &
Kiryati, 2006).
Helmholtz principle
Let us return to the model under discussion. As we said, the so-called
Helmholtz principle was introduced by Lowe (1985). In very general terms, we
can state the principle as follows: we are able to detect any configuration that
has a very low probability of occurring by chance alone; conversely, every such
improbable configuration is perceptually relevant. Lowe stated the principle in
this way: “we need to determine the probability that each relation in the image
could have arisen by accident, p(A). Naturally, the smaller that this value is, the
more likely the relation is to have a causal interpretation.” A more formal
statement of this principle was first given by Desolneux, Moisan and Morel
(2000): “We say that an event of type ‘such configuration of points has such
property’ is ε-meaningful if the expectation in an image of the number of
occurrences of this event is less than ε”.
What does ε-meaningfulness mean? It can be restated by assuming that n
objects (parts, regions) are present in an image. Now, if k of them share a common
feature, we must decide whether this is happening by chance or not. To answer this
question, we make the following mental experiment: we assume that the
considered quality has been randomly and uniformly distributed over all objects
O1, . . . , On. Notice that this quality may be spatial (e.g., position, orientation).
Then we (mentally) assume that the observed position of objects in the image
is a random realization of this uniform process, and ask the question: is the
observed distribution probable or not? The Helmholtz principle states that if the
expectation in the image of the observed configuration O1, . . . , Ok is very small,
then the grouping of these objects makes sense, that is, it is a Gestalt (see Desolneux,
Moisan and Morel, 2003). The Helmholtz principle can be illustrated by the
psychophysical experiment of Figure 1. On the left, we display roughly 400
segments whose directional accuracy (computed as the width–length ratio) is
about 12 degrees. Assuming that the directions and the positions of the
segments are independent, uniformly distributed, we can compute the
expectation of the number of alignments of four segments or more. (We say
that segments are aligned if they belong to the same line, up to the given
accuracy.) The expectation of such alignments in this case is about 2.5. Thus,
we can expect two or three such alignments of four segments and we found
them by computer. Do you see them? On the right, we performed the same
experiment with about 30 segments, with accuracy (width–length ratio) equal
to 7 degrees. The expectation of a group of four aligned segments is 1/250.
Most observers detect them immediately.
More formally, if each of the n objects has probability p of displaying the
considered feature, the probability that at least k among them display it is

$$B(p,n,k) = \sum_{i=k}^{n} \binom{n}{i}\, p^{i}\,(1-p)^{n-i},$$
i.e. the tail of the binomial distribution. The independence assumption is not
realistic, but it is an a contrario assumption. In order to get an upper bound of
the number of false alarms, i.e. the expectation of the geometric event
happening by pure chance, we can simply multiply the above probability by the
number of tests we perform on the image. Let us call N_T the number of tests.
Then, in most of the cases we shall consider in the next subsections, an event
will be defined as ε-meaningful if

$$N_T \, B(p,n,k) \le \varepsilon.$$

In what follows we call the left-hand member of this inequality the
“number of false alarms” (NFA).
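To make the test concrete, here is a minimal sketch in Python using SciPy; the values of p, n, k, N_T and ε below are hypothetical, chosen only to show how the quantity N_T · B(p, n, k) is computed and compared with ε, and are not the numbers of the segment experiment above.

```python
from scipy.stats import binom

def binomial_tail(p: float, n: int, k: int) -> float:
    """B(p, n, k): probability that at least k out of n independent trials succeed."""
    # sf(k - 1) = P(X >= k) for X ~ Binomial(n, p)
    return binom.sf(k - 1, n, p)

def nfa(n_tests: int, p: float, n: int, k: int) -> float:
    """Number of false alarms: expected number of chance occurrences over all tests."""
    return n_tests * binomial_tail(p, n, k)

# Hypothetical a contrario model: n = 100 objects, each sharing the feature with
# probability p = 0.01 under the noise assumption, N_T = 10_000 tests performed.
p, n, k, n_tests, eps = 0.01, 100, 10, 10_000, 1.0
value = nfa(n_tests, p, n, k)
print(f"NFA = {value:.3g}", "-> eps-meaningful" if value <= eps else "-> not meaningful")
```

With ε = 1, a group is declared meaningful when such a concentration of the feature is expected to occur by chance less than once over all the tests performed, which is the precise sense in which it “could not occur in noise”.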
If this expected number is very low, then the group should be
considered meaningful, since it cannot be due to chance alone. This means
that we reject, a contrario, the independence hypothesis. Under an
independence assumption, probabilities are obtained as products of more
elementary probabilities. Therefore, it is often possible to prove (it will be the
case in what follows) that the minimal size of a meaningful group depends
on the logarithm of the allowed number of false alarms. We shall see that,
experimentally, we can take this number equal to 1, since modifying its value
does not much change the results. We have to pay attention to the fact that the
a contrario events we define must not depend in any way on the observation.
When ε = 1, we simply talk about meaningful events. This seems to contradict
the desired notion of a parameter-less theory. In fact it does not, since the ε-
dependency of meaningfulness will be low (it will in fact be a log ε
dependency). The probability that a meaningful event is observed by accident
will be very small. In such a case, our perception is liable to see the event, no
matter whether it is “true” or not. Our term ε-meaningful is related to the
classical p-significance in statistics; as we shall see further on, we must use
expectations in our estimates, and not probabilities.
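The logarithmic dependence on ε can also be checked numerically. The sketch below (again Python with SciPy; the values of p, n and N_T are hypothetical, as before) computes the minimal group size k that is ε-meaningful while ε varies over six orders of magnitude, and shows that k barely moves.

```python
from scipy.stats import binom

def min_meaningful_k(n_tests: int, p: float, n: int, eps: float) -> int:
    """Smallest k such that N_T * B(p, n, k) <= eps, i.e. the minimal eps-meaningful group size."""
    for k in range(n + 1):
        if n_tests * binom.sf(k - 1, n, p) <= eps:
            return k
    return n + 1  # no group size is meaningful at this eps

# Same hypothetical a contrario model as above.
p, n, n_tests = 0.01, 100, 10_000
for eps in (1.0, 1e-2, 1e-4, 1e-6):
    print(f"eps = {eps:g}: minimal meaningful group size k = {min_meaningful_k(n_tests, p, n, eps)}")
```

In this toy setting, dividing ε by a factor of one million raises the required group size by only a handful of objects, which is the practical reason why setting ε = 1 is harmless.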
References
Almansa, A., Desolneux, A., & Vamech, S. (2003). Vanishing point detection
	without any a priori information. IEEE Transactions on Pattern Analysis
	and Machine Intelligence, 25(4), 502–507.
Alvarez, L., Morales, F. (1997). Affine morphological multiscale analysis of
corners and multiple junctions. International Journal of Computer
Vision, 25 (2), 95–107.
Amiaz, T., & Kiryati, N. (2006). Piecewise-smooth dense optical flow via level
	sets. International Journal of Computer Vision, 68, 111–124.
Amiaz, T., Lubetzky, E., & Kiryati, N. (2007). Coarse to over-fine optical
	flow estimation. Pattern Recognition, 40, 2496–2503.
Arnheim, R. (1987). Prägnanz and its discontents. Gestalt Theory, 9, 102–107.
Attneave, F. (1954). Some informational aspects of visual perception.
Psychological Review, 61, 183–193.
Attneave, F. (1959). Applications of Information Theory to Psychology. New
	York, NY: Holt.
Black, M.J. & Anandan, P. (1996). The robust estimation of multiple motions:
parametric and piecewise-smooth flow fields. Computer Vision Image
Understanding, 63, 75–104.
Brox, T., Bruhn, A., Papenberg, N., & Weickert, J. (2004). High accuracy
optical flow estimation based on a theory for warping, in: Eighth
European Conference on Computer Vision (ECCV04), vol. IV, 25-36.
Lisani, J.L., Monasse, P., & Rudin, L. (2001). Fast shape extraction and
	application. Preprint 16, CMLA, ENS-Cachan.
Lowe, D. (1985). Perceptual Organization and Visual Recognition. London:
	Kluwer.
Marr, D. (1982). Vision. San Francisco, CA: Freeman.
Montanari, U. (1971). On the optimal detection of curves in noisy pictures.
	Communications of the ACM, 14, 335–345.
Mumford, D., & Shah, J. (1989). Optimal approximation by piecewise smooth
	functions and associated variational problems. Communications on Pure
	and Applied Mathematics, XLII(4).
Nitzberg, M., & Shiota, T. (1992). Nonlinear image filtering with edge and corner
	enhancement. IEEE Transactions on Pattern Analysis and Machine
	Intelligence, 14, 826–833.
Pao, H., Geiger, D., & Rubin, N. (1999). Measuring convexity for figure/ground
	separation. In: International Conference on Computer Vision (ICCV '99),
	Vol. 2, 948–955.
Prytulak, L. S. (1974). Good continuation revisited. Journal of Experimental
	Psychology, 102, 773–777.
Rausch, E. (1952). Struktur und Metrik figural-optischer Wahrnehmung.
	Frankfurt: Kramer.
Runeson, S. (1977). On the possibility of “smart” perceptual mechanisms.
	Scandinavian Journal of Psychology, 18, 172–179.
Sojka, E. (2001). A new algorithm for detecting corners in digital images. In:
	8th Spring Conference on Computer Graphics.
van Leeuwen, C. (2007). What needs to emerge to make you conscious?
	Journal of Consciousness Studies, 14, 115–136.
Wertheimer, M. (1923). Untersuchungen zur Lehre von der Gestalt, II.
	Psychologische Forschung, 4, 301–350.
Zhu, S.C. (1999). Embedding Gestalt laws in Markov random fields. IEEE
	Transactions on Pattern Analysis and Machine Intelligence, 21(11),
	1170–1187.