Secular change in the B/W IQ gap (in the US)

[I updated this chart. A description of the study types was added. And some minor corrections were made.]

I’m in the process of updating Lynn’s 2006 compendium of African American IQ scores. If anyone is aware of any recent standardization samples that I have missed in which scores are decomposed by race, let me know.

Excel file.

Yerkes, R. M. (Ed.). (1921). Psychological examining in the U.S. Army: Memoirs of the National Academy of Sciences (Vol. 15). Washington, DC: U.S. Government Printing Office.

Shuey, A. M. (1966). The Testing of Negro Intelligence. New York: Social Science Press.

Gottfredson, L. S. (2005). Implications of cognitive differences for schooling within diverse societies. Pages 517-554 in C. L. Frisby & C. R. Reynolds (Eds.), Comprehensive Handbook of Multicultural School Psychology. New York: Wiley.

Loehlin, J. C., Lindzey, G., and Spuhler, J. N. (1975). Race Differences in Intelligence. San Francisco, CA: Freeman.

Coleman, J. S. (1966). Equality of Educational Opportunity. Washington, D.C.: U. S. Office of Education.

Osborne, R. T., and McGurk, F. C. (1982). The Testing of Negro Intelligence. Athens, GA: Foundation for Human Understanding.

Broman, S. H., Nichols, P. L., and Kennedy, W. A. (1975). Preschool IQ. New York: J. Wiley.

Jensen, A. R. (1973). Educability and Group Differences. New York: Harper and Row.

Roth, P. L., Bevier, C. A., Bobko, P., Switzer, F. S., and Tyler, P. (2001). Ethnic group differences in cognitive ability in employment and educational settings: a meta-analysis. Personnel Psychology, 54, 297-330.

Dickens, W. T., and Flynn, J. R. (2006). Black Americans reduce the racial IQ gap: Evidence from standardization samples. Psychological Science, 17, 913–920.

Kaufman, A. S., and Doppelt, J. E. (1976). Analysis of WISC-R standardization data in terms of the stratification variables. Child Development, 47, 165-171.

Murray, C. (2007). The magnitude and components of change in the black–white IQ difference from 1920 to 1991: A birth cohort analysis of the Woodcock–Johnson standardizations. Intelligence, 35, 305–318.

Avolio, B. J., and Waldman, D. A. (1994). Variations in cognitive, perceptual and psychomotor abilities across the working life span: examining the effects of race, sex, experience, education and occupational type. Psychology and Aging, 9, 430-442.

Mercer, J. R., and Lewis, J. F. (1984). System of Multicultural Pluralistic Assessment: Manual. San Antonio, TX: Psychological Corporation.

Reynolds, C. R., Chastain, R. L., Kaufman, A. S., and McLean, J. E. (1987). Demographic characteristics and IQ among adults: analysis of WAIS-R standardization sample as a function of the stratification variables. Journal of School Psychology, 25, 323-342.

Herrnstein, R. J., and Murray, C. (1994). The Bell Curve: Intelligence and Class Structure in American Life. New York: Free Press.

Dunn, L. M. (1988). Bilingual Hispanic Children on the U. S. Mainland. Honolulu: Dunn Educational Services.

Thorndike, R. L., Hagen, E. P., and Sattler, J. M. (1986). Stanford-Binet Intelligence Scale: Fourth Edition Manual. Chicago: Riverside.

Nyborg, H., and Jensen, A. R. (2000). Black-white differences on various psychometric tests: Spearman’s hypothesis tested on American armed services veterans. Personality and Individual Differences, 28, 593–599.

Murray, C. (2006). Changes over time in the Black–White difference on mental tests: Evidence from the children of the 1979 cohort of the National Longitudinal Survey of Youth. Intelligence, 34, 527–540.

Prifitera, A., Lawrence, L. G., and Saklofske, D. H. (1998). The WISC-III in context. In A. Prifitera and D. H. Saklofske (Eds.), WISC-III Clinical Use and Interpretation. San Diego, CA: Academic.

Kaufman, J. C., McLean, J. E., Kaufman, A. S., and Kaufman, N. L. (1994). White-black and white-Hispanic differences on fluid and crystallized abilities by age across the 11 to 94 year range. Psychological Reports, 75, 1279-1288.

Kramer, R. A., Allen, L., and Gergen, P. J. (1995). Health and social characteristics and children’s cognitive functioning: results from a national cohort. American Journal of Public Health, 85, 312-31

Rowe, D. C. (2002). IQ, birth weight, and number of sexual partners in white, African American, and mixed race adolescents. Population and Environment, 23, 513-524.

Weiss, L. G. (2010). WAIS-IV clinical use and interpretation: Scientist-practitioner perspectives. London: Academic.

24 thoughts on “Secular change in the B/W IQ gap (in the US)”

  1. Quite a trend line …

    Seriously, is there anything else in the human sciences that is this stable? For example, I don’t think height differences among populations are quite as stable as IQ differences.

  2. “The fundamental constant of sociology” was coined, so far as I know, by La Griffe du Lion. I usually amend it to “the fundamental constant of American sociology” because I don’t want to presume about the rest of the world, and risk finding out later that it doesn’t apply in, say, Eritrea.

  4. Your chart shows black IQ, not the black/white gap. Up to three IQ numbers in some rows. What do they mean? Thanks for your research.

    • Thanks for the comments.

      The multiple IQ scores were based on different subtests (e.g., Verbal IQ). The new chart now shows only FSIQs or gQs when reported. The numbers are relative to a White mean of 100, so the chart also shows the B/W gap.

  6. So given the apparent fact that Black IQ scores have remained stable, why have some claimed that the gap is closing or that the Flynn effect is more prevalent in blacks?

    • Flynn argued:

      AFQT79 82 --> AFQT97 85
      SBIV 86 --> SBV 88
      WISC3 85 --> WISC4 88

      = narrowing. I agree that there has been some. The samples have been getting more and more representative and you see fewer and fewer low scores. The age 7-18 gap is probably ~12 points now (versus 15 in the early 1900s) and the age 18-26 gap is probably ~15 points (versus 18 points in the early 1900s). So that would represent a 3 point narrowing.

      • So the three point narrowing is mostly the result of including more samples from the 7-18 age group? Does this mean that the change in the average Black IQ is mostly a change in who is being tested and included in the data rather than an actual change in cognitive ability?

        Can’t emphasize enough how useful this blog post has been, by the way! I plan on writing a document defending “The Bell Curve” from the common criticisms that come up all the time, as well as giving some updated figures for as many of the charts and stats as possible.

        Specifically I’m looking to defend the six points Herrnstein and Murray mention in pages 22-23 of the paperback edition.

        • I updated the figure above again.

          “So the three point narrowing…”

          It’s difficult to tell.

          For one, it’s not clear what magnitude the gap was before the 1960s. Before the ’60s, there were no nationally representative surveys; there are only reviews of unrepresentative samples. And even those reviews are not without ambiguity. For example, Jensen 1998 p 376 reports a 10 point gap between Black and White high school students between 1922 and 1965 based on Shuey’s 1965 review. But others, looking at the same data, derive a high school gap of 1 SD. Here, for example, were Linda Gottfredson’s estimates:

          Part of the difference between Jensen’s and Gottfredson’s estimates is due to the selection criteria used and part is due to the method of calculating the standardized difference. The point is that one can get different values by using different methodologies.

          As for the 1960s on, one confounding issue is age, as there is an age x gap effect (as seen in longitudinal — not just cross sectional — analyses). For background, read section A1 here: If you look at adult samples you see something somewhat different than if you look at child samples. But, of course, it’s rather difficult to disentangle age effects from secular effects and to disentangle both from sampling effects.

          To give an example, here were the WAIS IV results:

          No different than the WAIS-R results 30 years ago. But, before you jump to a conclusion, here were the cohort breakdowns:

          But, before you jump to a conclusion, here was Murray’s (2007) discussion of the WAIS-III:

          “The argument for an unchanging B–W difference in IQ over the course of the 20th century is not refuted. The defenders of the Shuey studies and of the Army Alpha and Beta estimates can mount counter-arguments. They can also call upon the WAIS standardization data, which show a B–W difference of only 0.99σ for the 41 blacks born from 1904 to 1923.”

          The data is riddled with ambiguity. And this allows people to selectively cite evidence in support of a position.

          (1) There is an age x gap effect and this seems to be larger now than before
          (2) There probably was some secular narrowing in IQ scores (which doesn’t necessarily mean that there was secular narrowing in latent abilities)
          (3) It’s not clear how much because it’s not clear what the gap was AND what the gap now is taking age into account.
          (4) The gap now (2012) for the age 25-30 population is no less than 1 SD, and we know this because the age 12-16 AFQT gap was 1 SD in 1997. (Dickens and Flynn argued that the AFQT showed a secular decrease from 1980 to 1997, but the issue is not clear because the mean age of the samples cited differed by 4 years and because the tests were given in different formats.)
          (5) The 30+ gap as of now might be larger. A larger gap shows up in some samples but not in others. See table 1 here:

          The above is a meta-analysis based on 105 studies from 1970 to 1998. If the 30+ gap were much larger than 1 SD now, it would need to be explained why the meta-analytic gap in industry was only 1 SD in this study, given the average age of workers (about 40).

          (6) It’s not clear to me what the contemporaneous < 25-30 gap is.


          "Specifically I’m looking to defend the six points Herrnstein and Murray mention in pages 22-23 of the paperback edition"

          If you need help let me know. What are the 6 points? As for this blog, take my estimates with a grain of salt and double check the sources cited.

  7. The 1980 AFQT given to the NLSY79 sample had a huge W-B gap of something like 18.6 points, and bigger among males. An analysis in 1995 showed that lots of low scoring black males had apparently given up part way through this 105 page test, making their scores even lower. The 1997 AFQT was made more adaptive so it would be less humiliating to low performers. I believe the gap was then 14.7 points.

    In general, test designers in recent decades (generations?) have been sensitive to issues that would enlarge the W-B gap.

    • Steve,

      Blacks do better than Whites on computer adapted versions of tests per se, so I imagine that that change had an effect on the d. Here’s an excerpt from a NAEP technical paper:

      “Perhaps the most comprehensive study of the comparability of delivery modes for population groups is that of Gallagher, Bridgeman, and Cahalan (2000), who addressed the issue with large samples of examinees taking a variety of admissions and licensure tests. The tests were the Graduate Record Examinations (GRE®) General Test, Graduate Management Admission Test (GMAT®), SAT I: Reasoning Test, Praxis: Professional Assessment for Beginning Teachers, and Test of English as a Foreign Language (TOEFL®). These investigators discovered that delivery mode consistently changed the size of the differences between focal- and reference-group performance for some groups on both verbal and mathematical tests, but only by small amounts. Of particular interest to the current study is that for Black students and Hispanic students the difference in mathematical performance relative to White students was smaller on computer-based tests than on paper tests. From one mode to the other, the difference in performance between groups changed by up to .24 standard deviation units, depending upon the test. Also, the difference on mathematical tests between White female students and White male students was smaller on the paper versions than on the online editions. This difference changed as a function of delivery mode by up to .12 standard deviations, again depending upon the particular test.”

      You can see the same CAT (computerized adaptive test) effect with regard to the NAEP. Basically, switching to CATs shaved 0.1 SD off the B-W gap but increased the M-F gap by 0.1. The effect shows up across tests. I imagine, though, that most researchers chalk this effect up to psychometric bias (decreasing it in the case of Blacks and Whites and increasing it in the case of Males and Females). But my guess would be that it has something to do with differences in visual spatial processing. (Refer to JL’s comment on the “Race realism classics” post.) Generally, by finagling with tests enough, one can decrease gaps. Imagine an indefinite number of tests with different formats on which Blacks and Whites commonly differ due to g differences but also differ to varying degrees for idiosyncratic reasons. Test designers in the US tend to pick the formats that produce lower B-W ds even when these formats increase other group ds.

  8. Chuck : “What are the 6 points?”

    He refers to Herrnstein & Murray’s 1994 book, “The Bell Curve”.
    I have the book. What they say, pages 22-23, is :

    1. There is such a thing as a general factor of cognitive ability on which human beings differ.
    2. All standardized tests of academic aptitude or achievement measure this general factor to some degree, but IQ tests expressly designed for that purpose measure it most accurately.
    3. IQ scores match, to a first degree, whatever it is that people mean when they use the word intelligent or smart in ordinary language.
    4. IQ scores are stable, although not perfectly so, over much of a person’s life.
    5. Properly administered IQ tests are not demonstrably biased against social, economic, ethnic, or racial groups.
    6. Cognitive ability is substantially heritable, apparently no less than 40 percent and no more than 80 percent.

    In other words, nothing new.

    • Thanks.

      It’s telling that any of these now well established points are felt to need defending.

      You don’t have a PDF version of the book do you?

      • Glad a PDF is available to give this information more coverage. The paperback edition also has an excellent afterword that seems VERY prophetic about how the book would be handled in the years to come.

        Two copies of the 1996 paperback can be bought for under $30 (I think Amazon sells about fifty copies a week), and for even more coverage this PDF will come in handy.

        People can create an unnecessary uproar about the book all they want – it’s just going to get the book more readers. ;-)

  9. OT, but: Is this true about the Minnesota Transracial Adoption Study? I checked the wiki article and saw that somebody had added this in, but I had never seen it before:

    “The study found that age at adoption was significantly associated with measured IQs:

    Children’s background            Age 7 IQ   Age 17 IQ
    Early placed adopted children    111        99
    Late placed adopted children     97         92”

  10. thanks Chuck, it’s great that we’re making all of these resources available for everybody. i needed access to the original papers.

