Underlying Theory


The prominent founders of Gestalt theory are Max Wertheimer, Wolfgang Köhler and Kurt Koffka. Kurt Koffka joined the Psychological Institute in Frankfurt am Main in 1910, where he met Wolfgang Köhler, and together they joined Max Wertheimer in his laboratory studying motion perception. The three became lifelong partners and together established the foundation for Gestalt theory. Although all three studied motion, the majority of the motion study was carried out by Wertheimer, with his discovery of phi movement in 1912. Media theorist and perceptual psychologist Rudolf Arnheim (who studied with Wertheimer in Berlin in the 1920s) later extrapolated many of the Gestalt principles to create a new understanding of film theory.

The most important notion in Gestalt is that the whole carries a different and altogether greater meaning than its individual components. Just imagine how a film is so much more than the sum of its parts: shots, scenes, montages, sound, music, dialogue, actors, film stock, projector and light. In viewing the whole, a cognitive process takes place – the mind makes a leap from comprehending the parts to a meta-realisation of the whole.

We visually and psychologically attempt to make order out of chaos, to create harmony or structure from seemingly disconnected bits of information. How we best receive and arrange this information is governed by a law fundamental to Gestalt theory: the law of Prägnanz. Prägnanz is a German word that translates roughly as salience, incisiveness, conciseness, impressiveness or orderliness. Often translated in English design terms as the law of good form, it encompasses many ideas at the heart of neuroaesthetics and can be summed up in Edward Tufte’s observation of good design: “When principles of design replicate principles of thought, the act of arranging information becomes an act of insight.”

Gestalt theory offers us several other laws or principles that give insight into how we perceive and organise information in our minds.

These principles are:

Figure/Ground articulation

This principle denotes our perceptual tendency to separate whole figures from their backgrounds based on one or more of a number of possible variables, such as contrast, color, size or movement. The opposite of this articulation is the camouflage adopted by many animals in natural settings. Everything that is not figure is ground, or negative space. We can switch between multiple figures, and some art brings the ground to the fore; much traditional Japanese art, for instance, confers as much weight to the negative space as to the figure.

Similarity Principle

Things which share visual characteristics such as shape, size, color, texture or value will be seen as belonging together, or grouped, in the viewer’s mind. Repetition of forms or colors in a composition is pleasing in much the same way that rhythm is pleasing in music. The forms need not be entirely identical – there may be variety within the repetition, yet the correspondence will still be discernible – and similarity or repetition in an image often brings connotations of harmony, rhythm and movement. Use of similarity in composition can impart meaning to the viewer that is independent of the subject matter of the image.

Common Fate Principle

Elements tend to be perceived as being grouped together if they move together. This is an extrapolation of the visual grouping of the similarity principle to movement, and so is often left out of design books that are only interested in visuals, not dynamics. Interesting versions of this can be taken beyond the pleasure we find in lines of chorus girls or dancers moving simultaneously, to the battle scenes of Kurosawa and the wind in the fields of wheat at the start of Witness. Viewpoint is crucial to good interpretation of common fate – for instance, many big stage numbers in Busby Berkeley musicals would be meaningless without the overhead, wide-angle top shot.
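
The grouping described above can be loosely sketched in code. The following is only an illustration, not a model of perception: it assumes each element is reduced to a two-dimensional velocity, and it treats elements as "moving together" when their velocities differ by no more than a chosen tolerance. The function name, the tolerance value and the greedy grouping strategy are all assumptions made for this example.

```python
from math import hypot

def group_by_common_fate(velocities, tolerance=0.5):
    """Greedily group elements whose velocities are close to each other.

    velocities: list of (vx, vy) pairs, one per element (hypothetical input).
    tolerance:  largest velocity difference still counted as 'moving together'
                (an arbitrary illustrative threshold).
    Returns a list of groups, each a list of element indices.
    """
    groups = []  # each entry: (reference_velocity, [member indices])
    for index, (vx, vy) in enumerate(velocities):
        for reference, members in groups:
            # Join the first group whose reference velocity is close enough.
            if hypot(vx - reference[0], vy - reference[1]) <= tolerance:
                members.append(index)
                break
        else:
            # No existing group matches: this element starts a new group.
            groups.append(((vx, vy), [index]))
    return [members for _, members in groups]

# Three elements drifting right and two drifting left form two groups.
motion = [(1.0, 0.0), (1.1, 0.0), (-2.0, 0.0), (1.0, 0.2), (-2.1, 0.1)]
print(group_by_common_fate(motion))
```

Note that position plays no part in this sketch: elements scattered across the frame still fall into one group if they share a direction and speed, which is the point of the principle.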

Closure Principle

Closure is the satisfying effect of recognising a pattern. Our brains are so drawn to patterns that we tend to see complete figures even when part of the information is missing. Closure occurs when elements in a composition are aligned in such a way that the viewer perceives that the information could be connected, and the eye understands something as being part of the composition even though there is nothing there. The most famous example is the Kanizsa effect, where a square is defined by absent quarters in smaller circles.

Good Continuation Principle

The continuity principle governs how oriented units or groups tend to be integrated into perceptual wholes if they are aligned with each other. We tend to continue shapes and lines beyond their ending points so that they meet up with other shapes or lines, particularly if the path followed by our eyes is smooth.

Past Experience (familiarity) Principle

In some cases the visual input is organized according to familiarity: elements tend to be grouped together if they were often together in the past experience of the observer. This is particularly true for interpreting text.

(Other principles sometimes invoked are the symmetry, convexity and common region principles.)

These principles are not limited to static visuals; they include motion and auditory Gestalt too.

An article in Scholarpedia indicates that there is some criticism of Gestalt principles as being merely descriptive rather than providing a model of perceptual processing. E.g.: Koffka, K. (1922). Perception: An introduction to the Gestalt-theory. Psychological Bulletin, 19, 531-585. Multistable perception is the tendency of ambiguous perceptual experiences to pop back and forth unstably between two or more alternative interpretations. Gestalt theory does not explain how images appear multistable; it only remarks that the phenomenon exists.

As formulated by Wertheimer, Gestalt principles involve a ceteris paribus (all other things being equal) clause. That is, each principle is supposed to apply given that the other principles do not apply or are being held constant. If two (or more) principles apply to the same input and they favor the same grouping, that grouping will tend to be strengthened; however, if they disagree, usually one wins or the organization of the percept is unclear. Several examples of the domination of one principle over another are presented above. The significant theoretical problem of how to predict which principle will win in which circumstances remains to be worked out in detail.

Gestalt principles are usually illustrated with rather simple drawings. Ideally it should be possible to apply them to an arbitrarily complex image and as a result produce a hierarchical parsing of its content that corresponds to our perception of its wholes and sub-wholes. This ambitious goal is yet to be accomplished.


Motion in animation cannot be discussed without referring to the twelve animation principles first espoused by the Disney studios in the 1930s and still going strong today. An updating of the principles, translating their application to 3D computer animation, was explored by John Lasseter in his much-cited 1987 SIGGRAPH paper.

The first, and generally acknowledged as the most important, principle is that of ‘Squash and Stretch’. Lasseter discussed this principle not just as a desirable, indeed essential, principle for animating facial movement, but also for the way squash, stretch and overlap can be used by an animator to relieve the disturbing effect of strobing that sometimes happens in depicting very fast motion. Strobing occurs if the distance an object moves between frames is so great that there is no overlap between its successive positions, and the eye begins to perceive separate images. There are a number of ways an animator would deal with this problem: in computer animation we would add blur; in model animation we would never move an object beyond its previous silhouette; and, as Lasseter indicates, in drawn animation we would stretch the figure.
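
The overlap condition behind strobing can be made concrete with a little arithmetic. The sketch below is an illustration under simplifying assumptions, not a production rule: the object is treated as a simple extent of a given width along its direction of travel, it moves a fixed distance per frame, and the function names and the default overlap fraction are invented for this example.

```python
def frames_overlap(width, distance_per_frame):
    """Return True if the object's position in one frame overlaps the next.

    width:              extent of the object along the direction of motion.
    distance_per_frame: how far the object travels between two frames.
    """
    return abs(distance_per_frame) < width

def stretched_length(width, distance_per_frame, overlap_fraction=0.25):
    """Length the object must be stretched to so successive frames overlap.

    overlap_fraction is the share of the original width that two successive
    positions should have in common (an arbitrary illustrative choice).
    The object is never made shorter than its original width.
    """
    needed = abs(distance_per_frame) + overlap_fraction * width
    return max(width, needed)

# A 20-unit-wide ball moving 12 units per frame still overlaps itself.
print(frames_overlap(20, 12))
# At 50 units per frame there is a gap, so strobing becomes a risk.
print(frames_overlap(20, 50))
# Stretching the ball along its path restores the overlap.
print(stretched_length(20, 50))
```

The same check explains the rule of thumb in model animation: keeping each move within the previous silhouette is just another way of keeping the distance per frame below the object's width.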

The fact that the twelve animation principles have stood the test of time, and have been adapted successfully to new methods of animating such as 3D CG, means that they underpin most animators’ work today every bit as much as they did the animation on Disney’s first feature, Snow White and the Seven Dwarfs, a lifetime ago. The twelve are:

1. Squash and Stretch

2. Timing

3. Anticipation

4. Staging

5. Follow Through and Overlapping Action

6. Straight Ahead Action and Pose-To-Pose Action

7. Slow In and Out

8. Arcs

9. Exaggeration

10. Secondary Action

11. Appeal

12. Personality

Looked at from a psychophysical angle, there are good arguments for at least one other principle: that of ‘Isolation’. Isolation refers to centralizing a character and their movement. If one character is gesticulating wildly in a scene, you don’t want another to join in, or the viewer’s eyes will shift away from the main action. We are hard-wired to detect movement as a survival instinct. If something moves on the periphery of our vision, we need to know: ‘Is it food, or does it think I am?’ The edge of the human retina is sensitive only to movement, and detailed visual information can only be obtained by the concentrated nerve endings in the centre of the retina. An animator who works extra hard moving a lot of characters around at the edge of the screen is therefore wasting his or her time; worse still, they may be distracting from the main action. (Claymation or model animation, where more than one animator works on a scene at a time, is particularly susceptible to this; cf. Aardman’s Claymation film Chicken Run.)

Viewers of animated films not only accept that still images move; they also happily accept that elephants can fly, giant apes scale the Empire State Building and sponges have personality. They unconsciously help animators create the illusion of life by looking for narrative and for characters they can empathize with. As with film, audiences can immerse themselves in a two-dimensional experience enhanced only by sound. They can, in short, suspend their disbelief. Is it not the very essence of illusion that it should be incomplete? To quote the film theorist and perceptual psychologist Rudolf Arnheim:

‘In order to gain a full impression it is not necessary for a film to be complete in a naturalistic sense – all kinds of things can be left out which would be present in real life so long as what is shown contains the essentials.’

The example Arnheim gives is of viewing a black and white film: if all the color were to be drained from our world we would be shocked, yet audiences have happily participated in and enjoyed the spectacle of black and white films such as Citizen Kane and the 1933 film of King Kong. (They also managed to empathize with heroes and heroines of the golden age of silent movies, such as Chaplin’s Tramp or The Perils of Pauline.)

There is a strong overlap between a live action film and an animated film; in fact, it could be argued that animation is a more tangibly creative and formative medium than film: it is closer to fine art, and it deserves to be treated with the same respect as an art form. A work of art is not simply an imitation or selective duplication of reality but a translation of observed characteristics into the terms of the medium. Perhaps those animation principles are there for a reason.

The term ‘neuro-aesthetics’ was coined by Professor Semir Zeki, who set up the Institute Vislab at University College London to study the biological foundations of aesthetics. The institute explores visual art in relation to the known physiology of the visual brain and advocates three suppositions:
1) that all visual art must obey the laws of the visual brain whether in conception or in execution or in appreciation;
2) that visual art has an overall function which is an extension of the function of the visual brain to acquire knowledge;
3) that artists are in a sense neurologists who study the capacities of the visual brain with techniques that are unique to them. [Zek93]
There is a useful extrapolation to be made from Zeki’s essentially constructivist aesthetic suppositions: that the closer visual art comes to echoing the physiology of the visual cortex, the ‘better’ it is. This fits well with the Gestalt theory propounded by Wertheimer et al., which advocates design principles of perception such as Figure/Ground distinction and Closure, principles that have been much used in all design since. The theory of neuro-aesthetics is fundamental to two hypotheses held by the artists:
1) that artistic models that echo the psychophysical architecture of the mind are best for depth of communication and qualia of experience;
2) that commonality of experience and response to stimuli is significantly greater than we thought, because audiences are exegetes who share cognitive neural architectures and tend to take the same perceptual shortcuts.
[Zek93] ZEKI S.: A Vision of the Brain. Blackwell Science, 1993.