& the Arts
M.C. Escher (1898 - 1972), Bond Of Union, 1956.
Computational theories of
David Marr's model
"Vision can be
understood as an information
which converts a numerical image representation into a symbolic
Marr [Marr:Vision:1982] proposed three different levels for the
understanding of information
processing systems (having vision systems as the target example):
1. computational theory;
2. representation and algorithm; and
3. hardware implementation.
One of Marr's most important contributions was made
at the level of representation and algorithm, when he proposed a
representational framework for vision. He concentrated on the vision
task of deriving shape
information from images.
After D. Marr, in "Vision," 1982.
From S. Lehar.
The intensities perceived by any
visual system are a
function of four main factors:
1. the geometry
(meaning shape and relative placement);
2. the reflectance and absorption properties of
the visible surfaces (physical properties);
3. the illumination (light sources); and
4. the camera
The detection of intensity changes, the representation and analysis
of local geometric structures and the detection of illumination
effects take place in the process of generation of the primal
sketch, where independent spatial organizations of the viewed
intensities in a scene
reflect the structure of the visible surfaces. Marr proposed to
capture these organizations by using a set of "place tokens," or low
level features, which correspond to:
1. oriented edges,
2. bars,
3. ends and
4. blobs.
Each of these was represented by a 5-tuple: type,
position, orientation, scale, contrast.
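The 5-tuple above maps naturally onto a small record type. The following is a minimal sketch, not Marr's own formulation: the class names, the units, and the `strong_tokens` helper are assumptions made for illustration, and only two primitive kinds are listed.

```python
from dataclasses import dataclass
from enum import Enum


class TokenType(Enum):
    # Two primitive kinds, for illustration only; extend as needed.
    EDGE = "edge"
    END = "end"


@dataclass(frozen=True)
class PlaceToken:
    """One place token: (type, position, orientation, scale, contrast)."""
    type: TokenType
    position: tuple      # (x, y) in image coordinates (assumed convention)
    orientation: float   # radians (assumed unit)
    scale: float         # e.g. width of the detecting filter, in pixels (assumed)
    contrast: float      # signed intensity difference across the feature (assumed)


def strong_tokens(tokens, min_contrast):
    """Hypothetical helper: keep tokens whose absolute contrast reaches min_contrast."""
    return [t for t in tokens if abs(t.contrast) >= min_contrast]


tokens = [
    PlaceToken(TokenType.EDGE, (12.0, 40.5), 1.57, 2.0, 0.8),
    PlaceToken(TokenType.END, (30.0, 8.0), 0.0, 1.0, -0.1),
]
kept = strong_tokens(tokens, 0.5)
```

Keeping the token immutable (`frozen=True`) reflects the idea that a token is a measurement attached to a place in the image, rather than an object that later stages modify.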
A set of examples: http://www.cs.ucla.edu/~cguo/primal_sketch.htm
The 2.5D sketch
The 2.5D sketch is intended to represent the orientation
and depth of the
visible surfaces as well as discontinuities. It is composed of
local surface orientation primitives, distance from the viewer and
discontinuities in depth and surface orientation and, as in the
previous representation, it is specified in a viewer-centered coordinate frame.
It also takes into account visual information of motion, shading, shape
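As a rough illustration of the primitives listed above (distance from the viewer, local surface orientation, and depth discontinuities), the sketch below derives the last two from a small depth map. It rests on assumptions not made in the text: depth is given as a grid of distances along the viewing direction (+z), orientation is estimated by central differences, and a discontinuity is any jump between neighbouring pixels larger than a chosen threshold. The function names are hypothetical.

```python
import math


def surface_normal(depth, x, y):
    """Unit normal at interior pixel (x, y), pointing back toward the viewer.

    Assumes the viewer looks along +z and depth[y][x] is distance along z.
    """
    dzdx = (depth[y][x + 1] - depth[y][x - 1]) / 2.0
    dzdy = (depth[y + 1][x] - depth[y - 1][x]) / 2.0
    norm = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
    return (dzdx / norm, dzdy / norm, -1.0 / norm)


def depth_discontinuities(depth, threshold):
    """Pixels whose depth differs from the right or lower neighbour by more than threshold."""
    height, width = len(depth), len(depth[0])
    marks = set()
    for y in range(height):
        for x in range(width):
            if x + 1 < width and abs(depth[y][x + 1] - depth[y][x]) > threshold:
                marks.add((x, y))
            if y + 1 < height and abs(depth[y + 1][x] - depth[y][x]) > threshold:
                marks.add((x, y))
    return marks


# A flat near surface (depth 1) on the left and a flat far surface (depth 5) on the right.
depth = [[1.0, 1.0, 1.0, 5.0, 5.0] for _ in range(3)]
normal = surface_normal(depth, 1, 1)
edges = depth_discontinuities(depth, 2.0)
```

Everything here is expressed in image (viewer-centered) coordinates: moving the viewer changes the depth map, and with it every normal and discontinuity.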
What is in the ".5"? This
is not about a fractal
dimension, but rather a metaphor for the
claim/concept that, in reality, we do not see all of
our surroundings. For example, consider someone with her
back turned to you. You can only
see half of her body, although you assume there is some front part to
her body (with a face, etc.).
The point Marr is making here is that we
are not actually aware of all our surroundings and so
details to fill in the gaps.
From Benjamin Kimia, Brown University.
The 3D Model
- Describe shapes and their organization using a modular and hierarchical organization of volumetric
and surface primitives.
The recognition process
uses a catalogue of 3-D models,
which is a collection of stored 3-D model descriptions and various
indices into the collection that allow the association of a new
description with the appropriate one in the collection.
All 3-D model descriptions
can be organized in a hierarchy
according to the specificity of information they carry. The top level
of such a hierarchy is a model which does not have a component
decomposition and describes the model's principal axis. At the next
level in the hierarchy more details are added to the model, like the
number and distribution of subcomponent axes along the principal axis.
At the lower levels each individual object's model receives more
precise descriptions, and they can now be distinguished by the angles
and lengths of their components.
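The coarse-to-fine organization described above can be sketched as a tree that is descended for as long as a more specific entry still matches a new description. This is only an illustration under stated assumptions: the entry names, the component counts, the relative lengths, and the tolerance are invented for the example, and the matching rule is a simple stand-in for the indexing scheme.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogueEntry:
    name: str
    n_components: Optional[int] = None         # None: no component decomposition yet
    component_lengths: Optional[tuple] = None  # None: lengths not specified at this level
    children: list = field(default_factory=list)

    def matches(self, n_components, component_lengths, tol=0.2):
        """True if the new description is compatible with what this entry specifies."""
        if self.n_components is not None and self.n_components != n_components:
            return False
        if self.component_lengths is not None:
            return all(abs(a - b) <= tol
                       for a, b in zip(self.component_lengths, component_lengths))
        return True


def recognise(entry, n_components, component_lengths):
    """Descend from the coarsest model to the most specific one that still matches."""
    for child in entry.children:
        if child.matches(n_components, component_lengths):
            return recognise(child, n_components, component_lengths)
    return entry.name


# Hypothetical catalogue; lengths are (neck, leg) relative to the principal axis.
catalogue = CatalogueEntry("cylinder", children=[
    CatalogueEntry("biped", n_components=5),
    CatalogueEntry("quadruped", n_components=6, children=[
        CatalogueEntry("horse", 6, (0.3, 0.5)),
        CatalogueEntry("giraffe", 6, (0.9, 0.8)),
    ]),
])

specific = recognise(catalogue, 6, (0.85, 0.75))
generic = recognise(catalogue, 6, (0.6, 0.6))
unknown = recognise(catalogue, 3, (0.5, 0.5))
```

When no specific entry fits, the search stops at the last level that did, so an unfamiliar six-component animal is still recognised as a quadruped.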
From David Marr's book: Vision, 1982.
Another famous example of related ideas in human cognition
(primitive-based representation of objects) is the geons of Biederman et al.:
- The description of the object (shape) is relative to the object;
in particular, a coordinate frame is attached to a center (e.g., of mass) for the object.
- Examples: Constructive
Solid Geometry (CSG), boundary-based representation (B-rep, such as
NURBS), generalized cylinders, medial axis.
Observer-based (or view-based)
- The description of the object is dependent on the camera
parameters (essentially its field of view) and attached to the image
space for the selected viewpoint.
- Example: Curvature description for a given outline (projection).
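To make the view-based example concrete, the sketch below describes a projected outline by the turning angle at each vertex of a closed polygon, a common discrete stand-in for curvature. It assumes the outline is already available as an ordered list of image points; the function name is hypothetical.

```python
import math


def turning_angles(outline):
    """Signed exterior angle (radians) at each vertex of a closed polygonal outline.

    Positive values are left turns, so a counterclockwise outline sums to +2*pi.
    """
    n = len(outline)
    angles = []
    for i in range(n):
        x0, y0 = outline[i - 1]           # previous vertex (wraps around)
        x1, y1 = outline[i]               # current vertex
        x2, y2 = outline[(i + 1) % n]     # next vertex (wraps around)
        ax, ay = x1 - x0, y1 - y0         # incoming edge
        bx, by = x2 - x1, y2 - y1         # outgoing edge
        angles.append(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
    return angles


# Outline of a unit square, listed counterclockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
angles = turning_angles(square)
```

The description is tied to the chosen viewpoint: the same object seen from another direction projects to a different outline and therefore to a different sequence of angles.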
, Zhu, S.-C, &
Wu, Y., "A Mathematical Theory of Primal Sketch and Sketchability,"
Proc. of the Int. Conf. on Computer Vision (ICCV), pp. 1228-1235,
Nice, France, 2003.
Marr, D., Vision: A
computational investigation into the human representation and
processing of visual information, W.H. Freeman publ., 1982.
Watt, R.J., Visual Processing: Computational,
Psychophysical and Cognitive Research, Lawrence Erlbaum publ.,
Last update: Oct. 10, 2006.