Supplement to Mental Imagery
The Problem of Demand Characteristics in Imagery Experiments
In the field of human experimental psychology it is now widely recognized that an experimenter's hopes and expectations about the outcome of an experiment can have considerable (usually unintended) effects on the actual outcome, sometimes to the point of invalidating the results (Rosenthal, 2002). One of the most important ways this can occur is through the effect of what are called experimental demand characteristics on the performance of the human subjects (Orne, 1962; Rosnow, 2002).
In a seminal article, Orne (1962) defined the demand characteristics of the experimental situation as “the totality of cues which convey an experimental hypothesis to the subject,” and pointed out that such cues may include:
the rumors or campus scuttlebutt about the research, the information conveyed during the original solicitation, the person of the experimenter, and the setting of the laboratory, as well as all explicit and implicit communications during the experiment proper. A frequently overlooked, but nonetheless very significant source of cues for the subject lies in the experimental procedure itself, viewed in the light of the subject's previous knowledge and experience. For example, if a test is given twice with some intervening treatment, even the dullest college student is aware that some change is expected, particularly if the test is in some obvious way related to the treatment. (Orne, 1962)
Orne argued (and the point is now generally accepted by experimental psychologists) that in most experimental situations involving human subjects, the subjects are strongly motivated to “help” the experimenter by providing the sorts of results that they think are expected or hoped for. Even though they will not normally be told what the experimenter is hoping or expecting to find before they take part, most experimental subjects will do their best to infer or guess these expectations or hopes from the demand characteristics of the situation, and will often be fairly successful. (The problem is exacerbated if, as is so often the case, the subjects are psychology undergraduates, more than averagely motivated to please their professors and to further the goals of psychological science, and better placed than most to make a good guess as to what an experiment might be about.) Unfortunately, these guesses (whether right or wrong) can become a significant determinant of a subject's actual performance in the experiment, and can overshadow the contribution of the factors that the experiment was intended to investigate. If these “demand” effects are not kept under control, experimental results may reflect the subjects' desire to please much more than they provide any insight into the real mechanisms of cognition.
The degree to which this is a serious problem depends in large part on the ways particular experiments are constructed. For example, in an experiment to see whether the use of an imagery mnemonic can improve memory recall, we can probably be confident that, whether or not the subject guesses what the experiment is about, any actual improvement in memory performance is indeed due to the mnemonic. After all, simply wanting to please the experimenter will not actually give someone a better memory.
Other experiments, however, including many of the most interesting experiments on the spatial properties of imagery, depend at crucial junctures upon the subjective judgement of the subjects (as to whether they are doing the experimental task correctly, or have successfully completed it, for example). There is good evidence that experimental demand can have significant distorting effects on findings in experiments of this sort. Richman and his collaborators (Richman et al., 1979a, 1979b; Mitchell & Richman, 1980; Goldston et al., 1985) have shown experimentally that the results of mental scanning experiments can be markedly affected by demand characteristics; Sheehan & Neisser (1969) and DiVesta, Ingersoll, & Sunshine (1971) have shown that self-ratings of imagery vividness are probably so affected; and Intons-Peterson (1983) has shown that quite a range of cognitive imagery processes are surprisingly susceptible even to unintentionally and very subtly conveyed experimenter expectations.
Intons-Peterson (1983) tested the demand susceptibility of three different experimental imagery tasks: a task used by Finke & Kurtzman (1981a, 1981b) in their work on “measuring the visual angle of the mind's eye”; an image scanning task based upon the well-known experiment of Kosslyn, Ball & Reiser (1978), in which subjects were asked to scan across a mental image of a map; and a mental rotation task based on an experiment by Cooper & Shepard (1975). Intons-Peterson required her subjects to do two versions of each task, one which called for the use of mental imagery, and another where the relevant stimuli were actually perceptually present.
However, she did not “run” the experimental subjects herself. Instead she used a number of “assistant” experimenters to instruct the subjects in their tasks, and to collect the results. Half of these assistants were led to believe that their subjects would perform better (more quickly or more accurately) in the perceptual condition of the experiments, and the others were led to believe that their subjects would perform better in the imagery condition. (Plausible rationales were cooked up to make each type of expectation seem believable.) The assistant experimenters were warned not to reveal these expectations to the subjects, and, indeed, they were not allowed to use their own words in instructing the subjects, but had to read the instructions from written scripts (which were the same regardless of the experimenters' induced expectations). Despite this, the subjects' performances, on all the tasks, turned out to be significantly affected by their experimenter's expectations. The subjects performed better in the condition (imagery or perceptual) in which their experimenter expected them to perform better.
When videotapes of the experimental sessions were carefully examined, it eventually became apparent what was going on. The assistant experimenters were reading out the instructions for the condition where they expected better performance more slowly and clearly. This wholly unintentional non-verbal cue turned out to be enough to affect the subjects' performances.
Intons-Peterson's experiment does not necessarily invalidate the claims of Kosslyn, Shepard, and others about the inherently spatial, analog nature of visual imagery. After all, as Jolicoeur & Kosslyn (1985) point out, the mental scanning effect demonstrated by Kosslyn, Ball & Reiser (1978) (i.e., a linear relationship between the distance scanned over an image and the time taken) was still apparent in Intons-Peterson's findings, quite regardless of her experimenters' expectations about image versus perceptual conditions. (The same can be said for the mental rotation effect.) Furthermore, the image scanning and image rotation effects have been demonstrated in experiments where no explicit instructions to scan or rotate (or even to form a mental image) were given (Finke & Pinker, 1983; Pinker, Choate, & Finke, 1984; Shepard & Metzler, 1971).[1] Nevertheless, Intons-Peterson's findings should give us considerable pause. It is clear that many imagery experiments may be significantly affected even by subtle, unintentionally conveyed, and unconsciously absorbed demand characteristics. The literature of the cognitive psychology of imagery is full of arguments and counter-arguments over whether one or another experimental result is meaningful, or can be dismissed as an effect of demand. Such criticisms deserve to be taken seriously. They can only be evaluated by giving attention to the specifics of particular experimental designs.
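The logical point attributed above to Jolicoeur & Kosslyn (1985) can be made concrete with a small numerical sketch. An expectancy effect that makes subjects faster or slower overall need not disturb the linear dependence of scanning time on distance. The figures below are entirely hypothetical and are not drawn from any of the studies cited. In particular, the assumption that demand shifts only the baseline response time, and not the rate of scanning, is made purely for illustration.

```python
# Illustrative sketch only. Every number here is hypothetical; none is taken
# from Intons-Peterson (1983) or Kosslyn, Ball & Reiser (1978).

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scan_times(distances_cm, base_ms, ms_per_cm):
    """Noise-free response times that grow linearly with scanned distance."""
    return [base_ms + ms_per_cm * d for d in distances_cm]


distances = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical distances on the imagined map

# ASSUMPTION: the experimenter's expectation changes only the baseline time.
# Subjects in the "favoured" condition respond 200 ms faster overall, but
# both groups scan at the same hypothetical rate of 50 ms per cm.
favoured = scan_times(distances, base_ms=900.0, ms_per_cm=50.0)
unfavoured = scan_times(distances, base_ms=1100.0, ms_per_cm=50.0)

slope_f, intercept_f = fit_line(distances, favoured)
slope_u, intercept_u = fit_line(distances, unfavoured)

print(f"favoured:   slope = {slope_f:.1f} ms/cm, intercept = {intercept_f:.1f} ms")
print(f"unfavoured: slope = {slope_u:.1f} ms/cm, intercept = {intercept_u:.1f} ms")
```

On these assumptions the two groups differ in their intercepts (overall speed) while sharing the same slope (the scanning effect itself). Whether real demand effects behave in this tidy way is an empirical question that the sketch does not settle. It shows only that the two kinds of effect are separable in principle.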
Unfortunately, and perhaps especially when one is interested in a “subjective” phenomenon like imagery, it can sometimes be difficult or impossible to design an experiment that both addresses the question of interest and precludes the possibility that demand characteristics have affected the outcome. In such circumstances, one possible recourse that has been used quite frequently in imagery experimentation (particularly by Kosslyn) is to conduct a post-experimental interview with each of the subjects, and to ask them what they think the experimental hypothesis and the variables of interest might have been. Data from subjects who guess anything like correctly can then be excluded from the final data analysis. However, as Orne (1962) pointed out, although they probably help, techniques of this sort are far from fully satisfactory. If, as may often be the case, a subject intuits the purpose of an experiment only in a vague, inarticulate way, then, even though this might be sufficient to affect their experimental performance, they might not be able, or confident enough, to explain their guess to the experimenter. Furthermore, post-experimental interviews can themselves be affected by demand characteristics. Inasmuch as a subject realizes that the experimenter hopes that they have not understood the purpose of the experiment, they may be inclined to downplay how much they have, in fact, guessed.