In 1991, in this Journal, I published a paper on the problems of winning acceptance of a diagnosis of mental retardation in death penalty cases in the face of hostile prosecutors and a skeptical judge (Baroff, 1991). The concern, then as now, is the potential exposure of a convicted defendant to execution. In 1991, only a few of the 38 states with the death penalty had specifically proscribed its application to persons with mental retardation. In the year 2002, however, the U.S. Supreme Court declared the use of the death penalty unconstitutional in the case of persons with mental retardation, meaning that in no state can a convicted defendant who has mental retardation be executed (Atkins v. Virginia, 2002). Indeed, such punishment has been virtually eliminated throughout the world.
Given the importance to defendants in capital cases of being adjudicated as “mentally retarded,” we can expect that attorneys will turn to psychologists for assistance in the case of defendants whose mental status suggests cognitive impairment. The psychologist will have the responsibility for marshalling the relevant evidence in terms of the three criteria commonly employed in this diagnosis: general intelligence (IQ), adaptive behavior, and age at manifestation. The psychologist will then have the responsibility for justifying the diagnosis in an adversarial court hearing designed for this purpose. The major obstacle to the acceptance of the diagnosis is likely to be the defendant's IQ as it relates to the state's statutory definition of mental retardation. Where the defendant's previous IQs have commonly been in the “borderline intelligence” range, a current score(s) falling in the retarded range, 70 or below, will be challenged. The defendant may be viewed as deliberately lowering his or her score, so-called “malingering” and, in effect, deceiving the examiner. Indeed, one of the dissenting justices in the recent Supreme Court decision (Atkins v. Virginia, 2002) expressed the view that one can “feign” mental retardation. On more than one occasion, I have had reason to subsequently question the score that a defendant obtained!
The “adaptive behavior” aspect of the diagnosis may, paradoxically, be less problematic because its measurement is more subjective. It lacks the “objective” and quantitative status accorded the IQ and is, therefore, although more open to interpretation and argument, perhaps given less weight than the more objective and venerable intelligence test. The defendant's IQ is going to be compared to a specific number, 69 or 70, and the diagnosis is going to rest heavily on the apparent validity of that IQ in the eyes of the court. Parenthetically, it is the judge and not the psychologist who ultimately decides whether the defendant will be recognized as having mental retardation.
Reconciling IQ Differences
In my 1991 paper, I focused on potential differences in IQs between the Revised Beta, a group intelligence test, and the then-current Wechsler Adult Intelligence Scale-Revised (WAIS-R; Wechsler, 1981). I pointed out that the Beta was, and probably still is, widely used as a screening test for inmates entering the correctional system. Low scores on the Beta would then lead to subsequent evaluation on an individually administered test, such as the WAIS-R. The psychologist would rely on the latter test for the ultimate diagnosis. The concern at this time, however, is with score differences (a) between two or more administrations of the same test, (b) between different editions of the same test (e.g., WAIS-R and WAIS-III), and (c) between two different tests (e.g., WAIS-III and Stanford-Binet). With respect to a comparison between different tests, the latest version of each, the WAIS-III (Wechsler, 1997) and the Stanford-Binet: Fourth Edition (SB-IV; Thorndike, Hagen, & Sattler, 1986), are considered, although there will be reference to the earlier Binet edition, the venerable Form L-M.
IQ 70: An Artificial Boundary
Before discussing differences between IQs, it is useful to recognize that the score of 70, the ceiling for one intelligence range, may also be the floor for the one above it (if 70 rather than 69 is set as the upper limit for the mental retardation range). The distinction between the ranges of “mental retardation” and “borderline” intelligence is wholly arbitrary and based entirely on the statistics of the “normal curve,” with IQ 70 or 69 simply representing that point on the normal curve that is 2 standard deviations (SDs) below the mean. This quality of arbitrariness is inherent in drawing distinctions on any trait that exists on a continuum; in psychological terms, a continuous rather than discrete variable. There is no discernible behavioral difference between individuals scoring at the ceiling of the mentally retarded range and the floor of its next higher neighbor, the borderline level. The difference between these neighboring points at IQs 69 or 70 and 71, for example, has no more behavioral significance than the difference between IQs 79 and 80 or any other neighboring points on the distribution. Indeed, by virtue of the standard error of intelligence tests, scores that differ by only 2 IQ points are not truly different anyway. Another way of illustrating the artificiality of IQ 70, or any other single IQ, is to compare the distribution of IQs to the color spectrum. The latter ranges its colors in a series of very small steps, such that the transition from one color to another (e.g., yellow to orange) is so gradual that it is only at the extreme of one color that its difference from the other is obvious. The point is that we are asked to draw a diagnostic distinction that treats mental retardation as distinctly different from borderline intelligence. In fact, like the color spectrum, they really overlap.
The point of this discussion is that rendering a decision as to mental retardation by IQ alone would be even more arbitrary if the scores are either at the ceiling of one range or at the floor of the next. Of course, the diagnosis is also based on adaptive behavior, and it can be expected that this criterion will assume added significance when scores are at the boundary of the IQ ranges.
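The statistical points in this section can be made concrete with a short sketch. It assumes the conventional Wechsler metric (mean 100, SD 15), which is not stated explicitly above, and a purely illustrative standard error of measurement of 2.5 points; the function names are hypothetical. Under those assumptions, the point 2 SDs below the mean is 70, and the approximate 95% bands around obtained scores of 69 and 71 overlap.

```python
from typing import Tuple

# Illustrative values only: mean and SD follow the conventional Wechsler
# metric, and SEM = 2.5 is an assumed figure, not one taken from a manual.
MEAN = 100.0
SD = 15.0
SEM = 2.5      # assumed standard error of measurement, in IQ points
Z_95 = 1.96    # two-sided 95% critical value of the normal curve


def cutoff(sds_below_mean: float) -> float:
    """IQ lying the given number of SDs below the mean."""
    return MEAN - sds_below_mean * SD


def band(score: float) -> Tuple[float, float]:
    """Approximate 95% band around an obtained score."""
    half_width = Z_95 * SEM
    return (score - half_width, score + half_width)


def bands_overlap(a: float, b: float) -> bool:
    """True when the 95% bands around two obtained scores intersect."""
    lo_a, hi_a = band(a)
    lo_b, hi_b = band(b)
    return lo_a <= hi_b and lo_b <= hi_a


if __name__ == "__main__":
    print(cutoff(2))               # 70.0
    print(bands_overlap(69, 71))   # True
```

A different assumed SEM changes the width of the bands but not the underlying point: scores on either side of the boundary are not reliably distinguishable.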
Differences Between Administrations of the Same Test
Apart from variation in test scores tied to the inherent error in any measurement and reflected in the reliability of a test, its standard error, and its confidence intervals, the concern here is with a test result that seems inconsistent with either previous scores or clinical history. With respect to IQ and clinical history, particular attention should be given to the defendant's school record, which provides a history of scholastic achievement uncontaminated by the possibility of a later wish to represent oneself as having mental retardation. Few students, if any, deliberately seek to lower their grades! School achievement is always adversely affected by mental retardation, and it is in the school record that one is most likely to find evidence of mental retardation. School records will also permit the establishment of mental retardation prior to age 18, the upper limit set by the American Association on Mental Retardation (AAMR) and, presumably, state guidelines.
An Elevated Score
Apart from motivational or personality factors, the score on a test may be elevated because of the defendant's familiarity with it. Over a period of time, and in capital cases this can be a decade or more, the cognitively limited defendant may have been tested several times and, often, with the same test. In a recent case of mine, three WAIS-Rs were administered in the period from 1987 to 1995, each by a different examiner. The effect of test repetition is, of course, to increasingly familiarize the examinee with the content of the test and, thus, risk a score elevated by practice. In the standardization of the WAIS-III, a study was made of the stability of test scores in a representative sample of 394 subjects under conditions of test–retest over a mean interval of little more than one month (34.6 days). As reported in the WAIS-III Technical Manual, the mean retest scores were higher, a difference attributed “mainly to practice” (p. 57). The effect of test repetition appears to be greater on the nonverbal or performance tests than on the verbal ones. Verbal IQs tended to increase by 2 to 3 points; Performance IQs, by 3 to 8 points; and Full Scale IQs, by 2 to 3 points. Although the Full Scale IQ change is relatively modest and approximates the standard error of the test, it nevertheless can push a score that had been in the high 60s, and in the mentally retarded range, above the IQ 70 ceiling and into the borderline level. Judges will, hopefully, have some discretion in applying statutory guidelines. The obvious implication is to avoid repeating the same test within such a short time interval. If there is no alternative, and the second test score is significantly elevated, especially on the Performance Scale, the psychologist can offer a practice effect as an explanation. The option of using a different test (e.g., the Stanford-Binet: Fourth Edition) is discussed later.
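As a rough illustration of how a modest retest gain can change the classification, the sketch below adds the Full Scale gain range quoted above (2 to 3 points) to a hypothetical first score. The cutoff of 70, the example scores, and the function names are assumptions made for the example; the gain for any individual defendant could be smaller or larger than the group averages.

```python
from typing import Tuple

CUTOFF = 70  # assumed statutory ceiling of the mental retardation range


def retest_range(first_score: int,
                 gain: Tuple[int, int] = (2, 3)) -> Tuple[int, int]:
    """Retest scores expected if the gain falls within the given range."""
    low_gain, high_gain = gain
    return (first_score + low_gain, first_score + high_gain)


def may_cross_cutoff(first_score: int,
                     gain: Tuple[int, int] = (2, 3)) -> bool:
    """True when a score at or below the cutoff could exceed it on retest."""
    _, highest = retest_range(first_score, gain)
    return first_score <= CUTOFF < highest


if __name__ == "__main__":
    print(retest_range(68))       # (70, 71)
    print(may_cross_cutoff(68))   # True
    print(may_cross_cutoff(65))   # False
```

Passing `gain=(3, 8)` applies the larger range reported for Performance IQs instead.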
A Depressed Score
Here the concern is with malingering, the deliberate lowering of a test score in order to gain legal benefits. The elimination of the death penalty for defendants with mental retardation was probably not desired by many prosecutors(!), who would prefer to pursue a case with the widest range of options open to them. One of these options is to offer a plea bargain through which a capital defendant escapes the possibility of the death penalty in return for a confession that implicates codefendants. The removal of that threat deprives the prosecution, that is, the state, of a powerful incentive for inducing defendant cooperation. Nor is such plea-bargaining unusual. It is likely that the great majority of criminal cases are settled by a plea bargain, which avoids the necessity of a trial and a potentially more severe sentence if one is convicted. In any case, we can expect the state to challenge the defendant's assertion of mental retardation by conducting its own examination. This determination will commonly be conducted with the same test, thereby increasing the likelihood of a higher score just because of practice!
Can we expect some capital defendants to malinger? The answer is yes, but it is more likely to be in the form of a pretended psychiatric disorder. Being perceived as having mental retardation is no more desirable in the prison population than in the general community, and the choice of a Machiavellian strategy is least likely in those with intellectual limitations! Nor are defense lawyers likely to encourage their clients to deliberately lie. Nevertheless, one needs to be alert to test results inconsistent with the clinical history, paying particular attention to the school record.
Within the examination itself, the psychologist will be especially attentive to the defendant's level of motivation; to the degree of persistence, especially on more difficult items; to unexpected failures, that is, to missing items that individuals of this general level of intelligence typically pass—perhaps, more often on the Information subtest or on the Performance scale. Performance scale tasks afford greater opportunity of directly observing the effort to succeed.
Malingering in cognitively impaired individuals is likely to take the form of an apparent unwillingness to extend effort. Presenting such a defendant with the Copying test from the Stanford-Binet provides the opportunity to judge his or her cooperation on a series of drawings that are age-graded and range from the extremely simple to the moderately complex. Only individuals with the most severe impairments will be unable to reproduce the simplest drawings (e.g., a straight line), and in the absence of an obvious motor or visual impairment, failure on items presumed to be well within an individual's ability range raises the question of cooperativeness and malingering. A defendant should then be informed that his or her degree of effort is questioned and that the examination cannot proceed under such conditions. The reasons for the psychologist's presence are reiterated. The psychologist is there at the request of the defendant's lawyer, and the intent is to try to be helpful, but this cannot be done without cooperation.
Except where there is also a psychiatric disorder present, defendants with mental retardation generally show good motivation. Admittedly, encouraging best efforts might lead to a slightly higher score than under other conditions, but the function of the examiner is to try to ascertain the defendant's “true” abilities, not to allow them to be disguised. The psychologist is not an advocate; that role is left to the attorney. As one lawyer put it, “Try to tell us the truth; if you don't, someone else (the state) will!”
Differences Between Two Editions of the Same Test: WAIS-III vs. WAIS-R
With capital cases commonly carried over several years from the time of conviction to the date of an execution, as noted earlier, the impaired defendant is likely to have had multiple examinations, and the psychologist may be confronted with the need to explain a current intelligence test score in relation to earlier ones. Given the inherent error in any test score, it would be surprising if all tests had the same result. It is, of course, major differences in IQ that create difficulties, especially if they result in a change of diagnostic classifications (e.g., from borderline intelligence to mental retardation). Death row defendants in much of the 1990s were likely to have been examined on the WAIS-R; the current edition, the WAIS-III, first came into use in 1997. It is comparisons between these two tests that are likely to create problems. The WAIS-III, the intelligence test most likely to be employed at present, can be expected to yield a score that will fall below that of the WAIS-R. In a study of 197 adults, ages 16 to 74 (mean age = 44), the Full-Scale IQ difference was almost 3 points (2.9), with Verbal and Performance scale differences of 1.2 and 4.8 points, respectively (WAIS-III Technical Manual, 1997).
Differences in IQ between current and preceding versions of a test are addressed in some detail in the Manual, in which the author(s) point out that later versions commonly produce a lower score, presumably at least initially. Research also suggests that scores on any given version of a test will tend to increase over time (Flynn, 1984, 1987; Matarazzo, 1972), an increase that is viewed as inflationary rather than as real. The effect is to raise all scores over time such that the individual's IQ will be elevated relative to that of the population at the time that the test was standardized. This constitutes an unwarranted increase in IQ. Interestingly, it is estimated that the rate of increase is about one third of an IQ point per year. Thus, over the 16 years that the WAIS-R was employed, from 1981 to 1997, an individual's test score can be expected to have increased, on the average, by a little more than 5 points. Parenthetically, the cause of the increase is variously attributed to such phenomena as improved nutrition and health, better education, and a wider dissemination of information. In fact, the population has gained modestly in general intelligence, but the obtained IQ, at any point in time, is interpreted in terms of the population at the time that the test was standardized; hence, an artificial elevation of that IQ. One of the reasons for publishing a new edition of a test is to correct for this inflationary effect (WAIS-III Technical Manual, 1997). In terms of WAIS-III and WAIS-R comparisons, one can anticipate that the greatest difference will be between test scores obtained later in the life cycle of the WAIS-R.
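The arithmetic behind this estimate is simple enough to sketch. The example below multiplies the cited rate of about one third of an IQ point per year by the number of years since a test was normed, using the 1981 and 1997 dates given above. The function name and the choice of example years are hypothetical, and the rate is treated as constant only for illustration.

```python
RATE_PER_YEAR = 1.0 / 3.0  # estimated gain in IQ points per year, as cited


def norm_inflation(norm_year: int, test_year: int,
                   rate: float = RATE_PER_YEAR) -> float:
    """Expected score inflation, in IQ points, due to aging norms."""
    if test_year < norm_year:
        raise ValueError("test_year must not precede norm_year")
    return (test_year - norm_year) * rate


if __name__ == "__main__":
    # Inflation over the full period the WAIS-R was in use (1981 to 1997).
    print(round(norm_inflation(1981, 1997), 1))   # 5.3
    # Inflation grows with the age of the norms.
    for year in (1985, 1990, 1995):
        print(year, round(norm_inflation(1981, year), 1))
```

The output shows why scores obtained late in the life cycle of a test are the ones most likely to diverge from scores on its successor.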
A further confounding factor in comparing the WAIS-R and WAIS-III may be a practice effect on the WAIS-R, if it has been given several times. Because this, too, represents an artificially increased score, the net effect is to potentially further magnify the difference between later WAIS-R scores and the WAIS-III! In sum, the psychologist must be sensitive to both the inflationary and practice effects in comparing WAIS-R and WAIS-III IQs and be prepared to explain them to a skeptical judge and prosecutor, which is no easy task! The use in the courtroom of a color spectrum overhead should be helpful!
Differences Between Tests: WAIS-III and the Stanford-Binet: Fourth Edition
Differences between the WAIS-R and the WAIS-III tend to be dwarfed by those between the Stanford-Binet and the earlier Wechsler tests. When comparisons were made of the WAIS and WAIS-R in relation to the Stanford-Binet Form L-M, investigators found much higher scores on the Wechsler scales, at least for institutionalized adolescents and young adults tested on both scales (Ring, 1985; Spitz, 1986). Mean WAIS and WAIS-R IQs tend to average 12 to 15 points higher than those of the Stanford-Binet L-M. A similar finding with mentally retarded populations has been reported with respect to the WAIS-R and the SB-IV, the current version of the Stanford-Binet (Thorndike et al., 1986). In a mentally retarded sample studied in connection with the standardization of the SB-IV, Thorndike et al. found that the mean IQs on the two scales were 63.8 (SB-IV) and 73.1 (WAIS-R). The 9-point difference was attributed to the lower floor of the Stanford-Binet—IQ 38 on the latter and 45 on the WAIS-R. In a later study of a mentally retarded sample, Spruill (1991) found an average difference of 15 points between the SB-IV and the WAIS-R, a magnitude comparable to those of the earlier WAIS and WAIS-R studies. To the degree that the difference between the two tests is tied to the lower floor of the Stanford-Binet, with the floor of the WAIS-III unchanged from WAIS-R, still at 45, we can continue to expect higher IQs on the WAIS-III than on the SB-IV.
As the test of first choice for evaluating defendants for possible mental retardation, the SB-IV has two disadvantages. First, the population on which it was standardized extends only up to age 23 (no IQ equivalents are available beyond that age), and, second, the Wechsler tests have been the choice for the general adult population for more than a half century. They have precedent on their side with clinical populations. I make regular use of selected subtests of the SB-IV in mental retardation evaluations, irrespective of the defendant's age, notably the subtests Absurdities and Comprehension. Absurdities in particular has powerful “face” validity because all of the items are presented in a visual mode. The quality of the person's logical thinking is open to the observer. Moreover, both subtests provide age-equivalent scores, an invaluable aid in explaining how the defendant's thinking compares to that of the typically developing individual. Further, as noted earlier, the Copying subtest of the SB-IV is useful in determining possible malingering.
In summary, although the use of the SB-IV in evaluating defendants under age 24 would appear likely to produce lower IQs than would the WAIS-III because of its lower floor, it is my view that initial testing should be done with the WAIS-III. The SB-IV should be reserved for situations where proximity to an earlier WAIS-III would raise the likelihood of a practice effect on a repeat WAIS-III. On the other hand, the use of selected SB-IV subtests is entirely appropriate for defendants of any age because the subtest scores provide age-equivalents. If tests are chosen simply because of the expectation that they will produce a lower IQ, a mockery is made of the law and of those who must carry it out. Whatever our views toward capital punishment, the obligation of the psychologist is to apply the most commonly used tests in the evaluation of defendants, reserving other tests for special situations. This is the psychologist's ethical obligation (see Section 2.02 of Ethical Principles of Psychologists and Code of Conduct, American Psychological Association, 1992). To violate it is to ensure that the criminal justice system will reject psychologists as objective evaluators of criminal defendants.
Author: George S. Baroff, PhD, 417 Granville Rd., Chapel Hill, NC 27514