Paul van Diepen - Research Overview
Scene Perception Research, October 1992 - April 1999
Laboratory for Experimental Psychology
University of Leuven, Belgium
While viewing a scene (a real-world situation, such as a kitchen or a playground),
people mainly fixate information-rich areas, such as object locations.
When objects are viewed with the foveal part of the retina, identification
is most efficient. Visual acuity rapidly decreases at higher eccentricities.
Saccades are ballistic movements that bring a selected part of the visual
stimulus to the fovea. The main goal of a fixation is to identify the fixated
object. This local process requires the allocation of selective visual
attention to the foveal stimulus. It is assumed that the presence of peripheral
stimulus information is not required for local object identification. Other,
global processes utilize the peripheral stimulus, for example to select
a saccade target.
To dissociate foveal and peripheral information processing during free
scene exploration, eye-contingent display-change techniques can be used: the
presence and quality of foveal or peripheral information is manipulated
as a function of the time elapsed since the onset of fixations. Well-known
examples of eye-contingent display-change paradigms are the so-called moving
mask and moving window paradigms, developed in the context of
reading research. In moving mask experiments, foveal information is replaced
by a mask after a preset delay from the beginning of each fixation. By
manipulating the mask onset delay, it can be determined how long a foveal
stimulus has to be present for accurate identification. In moving window
experiments, peripheral information outside of a window, centered on the
fixation position, is masked. By manipulating the size of the window, the
useful field of vision can be estimated. More refined display-change techniques
replace the foveal or peripheral image by a manipulated version of the
stimulus, such as a low- or highpass filtered version.
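For illustration, the core of a moving-window manipulation is the masking step itself. The following is a minimal NumPy sketch with a hypothetical function name, not the display code used in these experiments (which ran on dedicated graphics hardware at much lower latency):

```python
import numpy as np

def apply_moving_window(image, gaze_xy, radius, mask_value=255):
    """Mask all pixels outside a circular window centered on the gaze.

    image: 2-D grayscale array; gaze_xy: (x, y) fixation position in
    pixels; radius: window radius in pixels. A real eye-contingent
    display redraws within a few milliseconds of each gaze sample;
    here only the masking itself is illustrated.
    """
    h, w = image.shape
    ys, xs = np.ogrid[:h, :w]
    outside = (xs - gaze_xy[0]) ** 2 + (ys - gaze_xy[1]) ** 2 > radius ** 2
    out = image.copy()
    out[outside] = mask_value  # e.g. blank to white, or substitute noise
    return out
```

A moving mask is the complement: mask the pixels inside the window (the foveal region) and leave the periphery intact.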
In our experiments, participants freely explore scenes, in the context
of an object-search task. Scenes are black-on-white line drawings, or computer-generated
images (black-on-white or full-color). Participants' task is to count the
number of non-existing objects (non-objects),
as quickly and accurately as possible. They self-terminate the scene presentation
by a button-press, as soon as they feel confident about the number of non-objects
in the scene. Eye movements are recorded and used on-line to achieve eye-contingent
display changes. The trial duration (i.e., the scene inspection time) is
considered a measure of overall task difficulty. The effect of task difficulty
on the number of errors is minimized by instructions and feedback after
each trial (the number of errors is defined as the absolute deviation of
the counted number of non-objects from the correct number). Eye-movement
statistics that are analyzed include the mean fixation duration and the
mean saccadic amplitude.
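For illustration, these two summary statistics can be computed from parsed fixation records. This is a minimal Python sketch with a hypothetical record format, not the lab's actual analysis software:

```python
import math

def eye_movement_stats(fixations):
    """Mean fixation duration and mean saccadic amplitude.

    fixations: list of (x, y, duration_ms) tuples in temporal order.
    The amplitude of a saccade is taken as the Euclidean distance
    between consecutive fixation positions; for reporting, pixel
    distances would be converted to degrees of visual angle given
    the viewing distance.
    """
    mean_dur = sum(f[2] for f in fixations) / len(fixations)
    amps = [math.hypot(b[0] - a[0], b[1] - a[1])
            for a, b in zip(fixations, fixations[1:])]
    mean_amp = sum(amps) / len(amps) if amps else 0.0
    return mean_dur, mean_amp
```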
Chronometry of foveal information extraction
during scene perception.
Foveal information within an ovoid was masked by pixels with random grey-levels,
after a delay of 15, 45, 75, or 120 ms from the onset of fixations. In
a control condition, no masking occurred. It was found that in the 15-ms
condition, more time was required to complete the non-object search task,
compared with the other conditions. The control condition and the 75-
and 120-ms conditions hardly differed. It was concluded that for object identification,
sufficient foveal information could be encoded within the first 45 to 75
ms of fixations.
Foveal stimulus degradation
during scene perception.
In the above experiment, it was observed that fixation durations did not
vary as a function of the mask onset delay. This was attributed to the fact
that the noise mask completely terminated foveal encoding. In this study,
foveal image manipulations were included that degraded the foveal image,
without completely removing it. Four experiments replicated the finding
that foveal information can be extracted early during fixations. Furthermore,
fixation durations increased as foveal information was degraded earlier during fixations.
Early foveal and peripheral information processing in scene perception.
The first experiment of this study was an extended version of the first
moving mask experiment (see above). It largely replicated earlier findings.
The following four experiments delayed the appearance of the foveal or
peripheral information at the beginning of fixations. During the later
part of fixations, both foveal and peripheral information were presented.
It was found that both foveal and peripheral information are used from
the onset of fixations. Delaying the foveal image resulted in longer fixation
durations. When the peripheral image was delayed, fixation durations increased
as well, but to a lesser extent. Saccade target selection was disturbed
only when the peripheral image was delayed.
The use of coarse and fine peripheral information
during scene perception
In co-operation with Martien Wampers, several studies have been performed regarding the use of coarse
and fine peripheral information during scene perception:
The use of coarse and fine peripheral
information during scene perception.
Peripheral information was lowpass or highpass filtered, whereas foveally
the normal stimulus was presented. In a control condition the normal stimulus
was presented both foveally and peripherally, but a white ellipse outlined
the foveal area. High-frequency (fine) peripheral information seemed to
be more useful than low-frequency (coarse) information.
Scene exploration with Fourier
filtered peripheral information.
Peripheral information was degraded during the initial 150 ms of fixations,
while participants explored full-color scenes,
in the context of a non-object search task. Degradations included lowpass,
highpass, and bandpass Fourier filtering, and blanking of the peripheral
image. Images were filtered with a Matlab filter
program. In a no-change control condition, the stimulus was undegraded
throughout fixations. A second control condition reduced the stimulus luminance.
This induced a stimulus-change during fixations, without changing the spatial
frequency content of the peripheral image. Scene inspection times increased
in the degradation conditions, relative to the control conditions, but
no differences were found among the degradation types. This indicates that,
to some extent, peripheral information is used during the initial 150 ms
of fixations, but that there is no preference for a specific spatial-frequency range.
Tachistoscopic presentation of hybrid scenes (I)
Recognition of hybrid scenes (with conflicting central and peripheral information)
was compared to the recognition of normal scenes, and scenes with no peripheral
information, as a function of practice and instruction.
Tachistoscopic presentation of hybrid scenes (II)
In this follow-up experiment we decreased the size of the central part
of stimuli. We also included stimuli that had no central information.
Exploratory experiment regarding overshoots at the end of saccades, as
measured by DPI (Dual Purkinje Image) eye-trackers. The overshoots are
presumably caused by movement of the eye lens relative to the eyeball.
Eye-contingent display-change techniques
The moving overlay technique
A technique using two ATVista Video Graphics Adapters to produce moving
masks and moving windows for high-resolution graphical images.
A pixel-resolution video switcher
Three synchronized graphics boards and a custom-built video switcher enable
moving windows without limitations on the graphical content inside or outside
the window.
Online Categorization of Saccades and Fixations
Fast eye-contingent display changes require fast online categorization
of raw eye position sample data as saccades or fixations. We use an algorithm
that was developed by Andreas De Troy.
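De Troy's algorithm itself is documented in the lab's technical reports. Purely as an illustration of the kind of online computation involved, a simple velocity-threshold classifier (an assumed scheme, not the actual algorithm) might look like:

```python
def classify_samples(samples, dt_ms, velocity_threshold=0.05):
    """Label each raw gaze sample 'saccade' or 'fixation' online.

    samples: iterable of (x, y) positions in pixels; dt_ms: sampling
    interval in milliseconds. A sample is called a saccade when the
    instantaneous speed exceeds velocity_threshold (pixels/ms). This
    is a generic velocity-threshold sketch, not De Troy's algorithm.
    """
    labels = []
    prev = None
    for x, y in samples:
        if prev is None:
            labels.append('fixation')  # no velocity for the first sample
        else:
            speed = ((x - prev[0]) ** 2 + (y - prev[1]) ** 2) ** 0.5 / dt_ms
            labels.append('saccade' if speed > velocity_threshold else 'fixation')
        prev = (x, y)
    return labels
```

In an online setting, each sample would be classified as it arrives, and a display change would be triggered as soon as the label switches from saccade back to fixation.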
Filtering of full-color images
Explanation of full-color filtering, as was used in several of our experiments.
Includes a demonstration program that shows how colored gif-images can
be low-, band-, and highpass filtered by Matlab.
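The Matlab program itself is not reproduced here, but the same kind of Fourier filtering can be sketched in Python with NumPy, applied per color channel for full-color images. The function name and the radius-based frequency cutoffs are illustrative assumptions, not the original implementation:

```python
import numpy as np

def fourier_filter(channel, low=None, high=None):
    """Low-, high-, or bandpass filter one image channel via the FFT.

    channel: 2-D float array. Frequencies are expressed as the radial
    distance from the center of the shifted spectrum. Pass only `high`
    for a lowpass filter (keep radii <= high), only `low` for a
    highpass filter (keep radii >= low), and both for a bandpass
    filter. Sketch only; the experiments used a Matlab filter program.
    """
    f = np.fft.fftshift(np.fft.fft2(channel))
    h, w = channel.shape
    ys, xs = np.ogrid[:h, :w]
    radius = np.hypot(ys - h / 2, xs - w / 2)
    keep = np.ones_like(radius, dtype=bool)
    if high is not None:
        keep &= radius <= high   # lowpass: discard fine detail
    if low is not None:
        keep &= radius >= low    # highpass: discard coarse structure
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * keep)))
```

A sharp (ideal) frequency cutoff like this produces ringing artifacts; a smooth transition band, as in typical Gaussian or Butterworth filters, avoids them.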
Extension of the STAGE library for the ATVista Video Graphics Adapter.
van Diepen, P. M. J. (1997). PvDSTAGE Version 1.0 Reference Manual. An
extension of the STAGE library for the ATVista (Psych. Rep. No. 215).
Laboratory of Experimental Psychology, University of Leuven, Belgium.
Software notes on the use of the ATVista in visual perception research.
van Diepen, P. M. J. (1993). Use of the ATVista
Videographics Adapter in visual perception research (Psych. Rep. No.
154). Laboratory of Experimental Psychology, University of Leuven, Belgium.
Currently, our laboratory has at its disposal thirty scene backgrounds
and several hundred objects and non-objects that can be assembled into
realistic scenes (The image library and associated
software are available via FTP).
An example shows a scene background
with several objects and non-objects in it. If your browser supports GIF
animation, you will see a moving window with degraded contrast, following
an imaginary scanpath.
A library of 3D models was created using 3D-Studio. It contains models
of 12 scene backgrounds, 112 objects, and 40 non-objects. An example
of a rendered scene shows a kitchen background, with several objects and non-objects.
The non-objects that we use are meaningless figures with a part-structure
and size-range comparable to that of real objects. We use non-objects as
targets for the search task to evoke fixations that are long enough to ensure
object identification, without the necessity to actually name the object.
Here are some examples of non-objects:
Paul M. J. van Diepen - October 2002