Paul van Diepen - Research Overview

Scene Perception Research, October 1992 - April 1999

Laboratory for Experimental Psychology
University of Leuven, Belgium


While viewing a scene (a real-world situation, such as a kitchen or a playground), people mainly fixate information-rich areas, such as object locations. When objects are viewed with the foveal part of the retina, identification is most efficient. Visual acuity rapidly decreases at higher eccentricities. Saccades are ballistic movements that bring a selected part of the visual stimulus to the fovea. The main goal of a fixation is to identify the fixated object. This local process requires the allocation of selective visual attention to the foveal stimulus. It is assumed that the presence of peripheral stimulus information is not required for local object identification. Other, global processes utilize the peripheral stimulus, for example to select a saccade target.

To dissociate foveal and peripheral information processing during free scene exploration, eye-contingent display-change techniques can be used: the presence and quality of foveal or peripheral information is manipulated as a function of the time elapsed since the onset of each fixation. Well-known examples of eye-contingent display-change paradigms are the so-called moving mask and moving window paradigms, developed in the context of reading research. In moving mask experiments, foveal information is replaced by a mask after a preset delay from the beginning of each fixation. By manipulating the mask onset delay, it can be determined how long a foveal stimulus has to be present for accurate identification. In moving window experiments, peripheral information outside of a window, centered on the fixation position, is masked. By manipulating the size of the window, the useful field of vision can be estimated. More refined display-change techniques replace the foveal or peripheral image with a manipulated version of the stimulus, such as a low- or highpass filtered version.
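The timing logic of such a display change can be sketched as follows. This is a hypothetical illustration of the general technique, not our actual lab software; the function names (read_gaze_sample, show_image) are placeholders.

```python
# Sketch of an eye-contingent display-change loop (moving mask variant):
# at fixation onset the normal image is shown, and after mask_delay_ms
# the masked version is swapped in. All names are illustrative.

def run_fixation(read_gaze_sample, show_image, normal, masked,
                 mask_delay_ms, sample_interval_ms=1):
    """Run one fixation; return its duration in ms.

    read_gaze_sample: returns one eye-position sample, with an
                      is_saccade flag set by the online categorizer.
    show_image:       displays an image buffer on the screen.
    """
    elapsed = 0
    show_image(normal)              # fixation starts with the normal image
    swapped = False
    while True:
        sample = read_gaze_sample()
        if sample.is_saccade:       # fixation ended; stop timing
            return elapsed
        elapsed += sample_interval_ms
        if not swapped and elapsed >= mask_delay_ms:
            show_image(masked)      # e.g. random grey-level noise
            swapped = True
```

A moving window condition follows the same structure, with the swapped image masking the periphery instead of the fovea.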

In our experiments, participants freely explore scenes, in the context of an object-search task. Scenes are black-on-white line drawings, or computer-generated images (black-on-white or full-color). Participants' task is to count the number of non-existing objects (non-objects), as quickly and accurately as possible. They self-terminate the scene presentation by a button-press, as soon as they feel confident about the number of non-objects in the scene. Eye movements are recorded and used on-line to achieve eye-contingent display changes. The trial duration (i.e., the scene inspection time) is considered a measure of overall task difficulty. The effect of task difficulty on the number of errors is minimized by instructions and feedback after each trial (the number of errors is defined as the absolute deviation of the counted number of non-objects from the correct number). Eye-movement statistics that are analyzed include the mean fixation duration and the mean saccadic amplitude.

Chronometry of information encoding during scene perception

Chronometry of foveal information extraction during scene perception.
Foveal information within an ovoid was masked by pixels with random grey-levels, after a delay of 15, 45, 75, or 120 ms from the onset of fixations. In a control condition, no masking occurred. It was found that in the 15-ms condition, more time was required to complete the non-object search task, compared with the other conditions. The control condition and the 75- and 120-ms conditions hardly differed. It was concluded that for object identification, sufficient foveal information could be encoded within the first 45 to 75 ms of fixations.
Foveal stimulus degradation during scene perception.
In the above experiment, it was observed that fixation durations did not vary as a function of the mask onset delay. This was attributed to the fact that the noise mask completely terminated foveal encoding. In this study, foveal image manipulations were included that degraded the foveal image, without completely removing it. Four experiments replicated the finding that foveal information can be extracted early during fixations. Furthermore, fixation durations increased as foveal information was degraded earlier during fixations.
Peripheral versus foveal information processing in scene perception.
The first experiment of this study was an extended version of the first moving mask experiment (see above). It largely replicated earlier findings. The following four experiments delayed the appearance of the foveal or peripheral information at the beginning of fixations. During the later part of fixations, both foveal and peripheral information were presented. It was found that both foveal and peripheral information are used from the onset of fixations. Delaying the foveal image resulted in longer fixation durations. When the peripheral image was delayed, fixation durations increased as well, but to a lesser extent. Saccade target selection was disturbed only when the peripheral image was delayed.

The use of coarse and fine peripheral information during scene perception

In co-operation with Martien Wampers, several studies have been performed regarding the use of coarse and fine peripheral information during scene perception:
The use of coarse and fine peripheral information during scene perception.
Peripheral information was lowpass or highpass filtered, while the normal stimulus was presented foveally. In a control condition the normal stimulus was presented both foveally and peripherally, but a white ellipse outlined the foveal area. High-frequency (fine) peripheral information seemed to be more useful compared with low-frequency (coarse) information.
Scene exploration with Fourier filtered peripheral information.
Peripheral information was degraded during the initial 150 ms of fixations, while participants explored full-color scenes, in the context of a non-object search task. Degradations included lowpass, highpass, and bandpass Fourier filtering, and blanking of the peripheral image. Images were filtered with a Matlab filter program. In a no-change control condition, the stimulus was undegraded throughout fixations. A second control condition reduced the stimulus luminance. This induced a stimulus-change during fixations, without changing the spatial frequency content of the peripheral image. Scene inspection times increased in the degradation conditions, relative to the control conditions, but no differences were found among the degradation types. This indicates that, to some extent, peripheral information is used during the initial 150 ms of fixations, but that no preference for a specific spatial-frequency range is present.

Miscellaneous studies

Tachistoscopic presentation of hybrid scenes (I)
Recognition of hybrid scenes (with conflicting central and peripheral information) was compared to the recognition of normal scenes, and scenes with no peripheral information, as a function of practice and instruction.
Tachistoscopic presentation of hybrid scenes (II)
In this follow-up experiment we decreased the size of the central part of stimuli. We also included stimuli that had no central information.
Eye-lens movement
Exploratory experiment regarding overshoots at the end of saccades, as measured by DPI (Dual Purkinje Image) eye-trackers. The overshoots are presumably caused by movement of the eye-lens, relative to the eye-ball.

Eye-contingent display-change techniques

The moving overlay technique
A technique using two ATVista Video Graphics Adapters to produce moving masks and moving windows for high-resolution graphical images.
A pixel-resolution video switcher
Three synchronized graphics boards and a custom-built video switcher enable moving windows without limitations on the graphical content inside or outside the window.

Methodology Miscellaneous

Online Categorization of Saccades and Fixations
Fast eye-contingent display changes require fast online categorization of raw eye position sample data as saccades or fixations. We use an algorithm that was developed by Andreas De Troy.
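For illustration, a generic velocity-threshold categorizer is sketched below. This is a textbook approach shown only to convey the idea; it is not De Troy's algorithm, and the sample rate and threshold values are placeholders.

```python
# Generic velocity-threshold categorization of eye-position samples.
# Samples whose point-to-point velocity exceeds the threshold are
# labelled saccades; all others are labelled fixations.

def categorize(samples, sample_rate_hz=1000, threshold_deg_s=30.0):
    """Label each (x, y) sample (in degrees of visual angle) as
    'saccade' or 'fixation' based on sample-to-sample velocity."""
    labels = []
    prev = None
    dt = 1.0 / sample_rate_hz            # time between samples, in s
    for x, y in samples:
        if prev is None:
            labels.append('fixation')    # first sample: no velocity yet
        else:
            dist = ((x - prev[0]) ** 2 + (y - prev[1]) ** 2) ** 0.5
            vel = dist / dt              # deg/s
            labels.append('saccade' if vel > threshold_deg_s else 'fixation')
        prev = (x, y)
    return labels
```

An online version applies the same test to each incoming sample, so that a display change can be triggered within a few samples of the categorization.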
Filtering of full-color images
Explanation of full-color filtering, as was used in several of our experiments. Includes a demonstration program that shows how colored gif-images can be low-, band-, and highpass filtered by Matlab.
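The basic frequency-domain operation behind such filtering can be sketched as follows. This is an illustrative NumPy version of the general technique, not the Matlab program mentioned above; the cutoff values and ideal (sharp-edged) filter shape are placeholder assumptions.

```python
# Illustrative Fourier filtering of one image channel: keep only the
# spatial frequencies between low_cpd and high_cpd (cycles per image).
import numpy as np

def fourier_filter(channel, low_cpd=None, high_cpd=None):
    """Lowpass (high_cpd only), highpass (low_cpd only), or bandpass
    (both) filter a 2-D array with an ideal annular frequency mask."""
    F = np.fft.fftshift(np.fft.fft2(channel))
    h, w = channel.shape
    # Frequency of each coefficient, in cycles per image
    fy = np.fft.fftshift(np.fft.fftfreq(h)) * h
    fx = np.fft.fftshift(np.fft.fftfreq(w)) * w
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = np.ones_like(radius, dtype=bool)
    if low_cpd is not None:
        mask &= radius >= low_cpd     # remove coarse (low) frequencies
    if high_cpd is not None:
        mask &= radius <= high_cpd    # remove fine (high) frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```

For a full-color image, the filter would be applied to each color channel (or to a luminance channel) separately, then the channels recombined.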
Extension of the STAGE library for the ATVista Video Graphics Adapter.
van Diepen, P. M. J. (1997) PvDSTAGE Version 1.0 Reference Manual. An extension of the STAGE library for the ATVista (Psych. Rep. No. 215). Laboratory of Experimental Psychology, University of Leuven, Belgium.
Software notes on the use of the ATVista in visual perception research.
van Diepen, P. M. J. (1993). Use of the ATVista Videographics Adapter on visual perception research (Psych. Rep. No. 154). Laboratory of Experimental Psychology, University of Leuven, Belgium.

Stimulus material

Line drawings
Currently, our laboratory has at its disposal thirty scene backgrounds and several hundred objects and non-objects that can be assembled into realistic scenes (the image library and associated software are available via FTP). An example shows a scene background with several objects and non-objects in it. If your browser supports GIF animation, you will see a moving window with degraded contrast, following an imaginary scanpath.
Computer-generated images
A library of 3D models was created using 3D-Studio. It contains models of 12 scene backgrounds, 112 objects, and 40 non-objects. An example of a rendered scene shows a kitchen background, with several objects and one non-object.
The non-objects that we use are meaningless figures with a part-structure and size-range comparable to that of real objects. We use non-objects as targets for the search task to evoke fixations that are long enough to ensure object identification, without the necessity to actually name the object. Here are some examples of non-objects:

Paul M. J. van Diepen - October 2002