Claudia Krenz, Ph.D. (datafriend @ gmail-.-com)
This is a web page about a logistic regression, the first one interpreted [1] in The Bell Curve (Herrnstein & Murray, New York: The Free Press, 1994). There has been much discussion of the book but no examination of this or any other of its published analyses. The Bell Curve continues to generate discussion and influence U.S. social and educational policy. Its conclusions were based on the authors' interpretations of the results of 100 separate statistical analyses (logistic regressions). Although printouts from all analyses were included in one of the book's appendices, the authors were criticized for publishing them without having earlier submitted them to peer review.
Despite these--and many other--objections, there has been no investigation of the published statistical analyses on which the book's conclusions were based. The same simple statistical model was used throughout: 3 variables--scores on the Armed Forces Qualification Test (AFQT), a socioeconomic status index (SES), and AGE. This model was used to predict 100 different outcomes for different subgroups of the Bureau of Labor Statistics' venerable National Longitudinal Survey of Youth (NLSY). HM argue that AFQT scores are a measure of intelligence. Almost all their analyses were logistic regressions, including the one examined here, the first discussed in the book.
This page examines the book's statistical output (Appendix 4, p. 622) and the authors' interpretation (Chapter 5, p. 127). In this analysis, HM used their 3-variable model to predict whether white NLSY cases lived above or below the POVERTY level. HM took AFQT and SES scores from the beginning of the NLSY and used them, along with their AGE covariate, to predict POVERTY a decade later. Murray made the book's data public long ago: Anyone with an internet connection can download them. The sample reported in the book's first analysis is obtained by excluding, from the higher-income NLSY X-Sectional subsample, all but white non-students with no missing values on any of the 4 analysis variables--AFQT, SES, AGE, and POVERTY--which leaves N=3367. While HM used JMP, any commercially available statistical package can be used. Replication of the first analysis with Stata yielded numbers identical to those published. HM did not, however, examine their residuals; I did.
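For readers who want the model written out, a logistic regression with these three predictors has the general form below. This is only a sketch: the coding of POVERTY (1 = below the poverty level) and the coefficient symbols are notational assumptions made here for illustration, not taken from the book's printout.

```latex
\[
\ln\frac{P(\text{POVERTY}=1)}{1 - P(\text{POVERTY}=1)}
  = \beta_0 + \beta_1\,\text{AFQT} + \beta_2\,\text{SES} + \beta_3\,\text{AGE}
\]
```

The left side is the log-odds of the outcome; the fitted coefficients are what a statistical package reports for such a model.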
HM knew not to interpret their data analysis without examining the fit of their model--here, the 3 above mentioned variables. HM wrote that "the usual measure of goodness of fit for multiple regressions [is] R-square" (p. 617). Based on the R-square statistic, HM concluded their model adequately fit their data. Murray, however, subsequently learned otherwise, later referring to R-square as "ersatz and unsatisfactory" in this context (1995). Another way to examine the fit of a statistical model in a logistic regression is the classification table, an intuitively obvious cross-tabulation of the predicted status of cases by their actual status: here, a cross-tabulation of whether cases predicted by HM's 3-variable model to be above or below the poverty level actually were or were not.
After obtaining results identical to those published (down to four significant digits), I addressed the fit of HM's model by examining its classification table: HM's model predicted none of the cases living below the poverty level correctly (N=244); all were incorrectly predicted to be living above it. Interpreting the results of this first analysis is meaningless, because the model predicts none of its cases of interest correctly. The Analysis section below shows my replication of HM's published analysis (and its corresponding classification table). The Subjects section describes the NLSY and the Data section, the variables used in the analysis.
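The classification table described above can be sketched in a few lines of Python. The two counts (N=3367 cases, 244 of them below the poverty level) come from the text; everything else is an assumption for illustration: the function names are hypothetical, and the predictions are simply set to "above" for every case, which is what the table would look like if the model flagged no one as below the poverty level. It is not HM's or my actual output.

```python
from collections import Counter


def classification_table(actual, predicted):
    """Cross-tabulate actual status by predicted status.

    Both arguments are sequences of 0/1 flags, where 1 means
    "below the poverty level" (the outcome of interest).
    Returns a dict keyed by (actual, predicted) holding case counts.
    """
    counts = Counter(zip(actual, predicted))
    return {(a, p): counts.get((a, p), 0) for a in (0, 1) for p in (0, 1)}


def summarize(table):
    """Overall accuracy, and the share of below-poverty cases predicted as such."""
    total = sum(table.values())
    correct = table[(0, 0)] + table[(1, 1)]
    below = table[(1, 0)] + table[(1, 1)]
    return {
        "accuracy": correct / total,
        "sensitivity": table[(1, 1)] / below if below else float("nan"),
    }


# Counts reported in the text.
N_TOTAL, N_BELOW = 3367, 244
actual = [1] * N_BELOW + [0] * (N_TOTAL - N_BELOW)

# ASSUMPTION (illustration only): the model predicts "above" for every case.
predicted = [0] * N_TOTAL

table = classification_table(actual, predicted)
summary = summarize(table)

print("actual below, predicted below:", table[(1, 1)])
print("actual below, predicted above:", table[(1, 0)])
print("overall accuracy: %.3f" % summary["accuracy"])
print("sensitivity: %.3f" % summary["sensitivity"])
```

Under that assumption the overall accuracy comes out at about 93% even though not one below-poverty case is predicted correctly, which is why a single summary number can hide a model that says nothing about its cases of interest.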
To partially address whether this result was peculiar to the published sample, the same logistic regression described here was repeated on an independent group of subjects, the lower-income NLSY Supplemental sample (N=1067). As with the published analysis, the rate of false positives was high (around 90%): that's high enough to shake the statistical moorings of The Bell Curve's conclusions.
Links to the 5 sections of this web page are highlighted and listed below: What variables did HM use? Which subjects? Did their model fit? Me (this web page being no less "theory laden" than any other human cognitive activity). Documentation (described in the left column below)?
You can do exactly as I did: download the data and analyze them. The data were made available by Murray and stumbled upon by me years ago (downloaded on the spot--3.5 MB was a commitment back then--from a still viable link). The internet has given us a new way of making knowledge public--data even more than software want to be free, the "entry fee" being in the present example a functioning statistical package and unfettered public internet access (not even an industrial strength search engine is needed because of URLs, links). Setting the level of access to knowledge low is in the public interest, e.g., the empirical history of how the microsoft corporation behaved is a matter of public record (for this you will need that search engine): the words later used to characterize events don't alter their having occurred. Anyone anywhere anytime can search on a phrase like "E pur si muove": "sharing data" in the decentralized public domain is in the public interest.
Setting the bar for access to knowledge low is in the public interest, as is setting the bar for what constitutes knowledge high, i.e., more rigorous, a point Platt (1964) made strongly--now online.
HM did misinterpret their data: You can establish this for yourself as easily--publicly and empirically--as the numerical replication of the first source table in their Appendix 4 discussed above. Just so, that Radio Netherlands observed that "radio tikrit" included an astrological forecast is a matter of fact, public record--but whether that meant that the U.S. was trying to convert those within its broadcast range to astrology would be a matter of interpretation.
HM were criticized for not having submitted their work for "peer review" before publication--not sharing it with other academic scientists, "peers" in the sense of substance not of being employed by U.S. academic institutions--as were Pons and Fleischmann, who announced "cold fusion" to some journalists in Salt Lake City, Utah [living 200 miles north at the time, I heard it said on local news: it's been an exciting life I've led]. Would HM's misinterpretation have been caught by the "peer review" process? I'd guess it'd depend on the journal--as Platt noted, a "lifetime of achievement" in one discipline being equivalent to just a few years in one more rigorous.
Given the prevalence of uncertainty--and being at least as smart as squirrels--humans have instituted epistemological processes like "peer review." An example is a medical journal editor noticing that the manuscript s/he's reviewing reports more "degrees of freedom" in the results section than it had "subjects" earlier and deciding against publication. Varmus studied "peer review"--operationalized by extramural review--as an empirical process, using scores aggregated over reviewers and committees as empirical data. Sometimes "peer review" fails us, e.g., the methodologically flawed racist articles published in U.S. academic journals in the 1920s (or worse, in the former totalitarian u.s.s.r., when publication meant "biologist" Lysenko agreed with you--and if he didn't you'd be lucky to be just in jail). To the extent we forget that "peer review" is an empirical process, we risk "a return to the dogmatism of science of the middle ages and of a number of religions today" (Robertson, 1999). And I think the issue of public math and "peer review" is unavoidable: the first interactive site I found online let anyone specify levels of variables like "prevalence" and "fertility" to compute outcomes like number of orphaned children.
By and large, though, epistemological behaviors like "peer review" serve us well, errors typically being, as in Student's title, "Errors of Routine Analysis"--positively skewed in terms of impact, i.e., most such errors are mundane. We must still, though, remain cognizant of the fact that a "peer reviewed" anything is a tautology, something reviewed by someone someone else calls "peer." Galileo's example underscores the importance of who constitutes such committees--putting the pet theory of Urban VIII, a man with the power to imprison him, into the words of "Simplicio," the simpleton in a three-way conversation, does not strike me as accidental--perhaps Galileo questioned the authority of "belief" (requiring only "subjective certainty") as a litmus test for knowledge, what he had observed: "E pur si muove," nevertheless it moves.
The public internet has given us a new never-before-in-our-species form of decentralized countryless knowledge. That this page has been cited in books with titles like "quantitative genetics" and "multivariate analyses" suggests that others have used the same data I link to and gotten the same results I did (great textbook example: rare to get an unmessy number like 0). It's now less necessary to take anyone's word for anything--and that can't help but facilitate the growth of knowledge. I got numbers identical to those HM published in The Bell Curve and, on examination, noticed that they meant nothing. You can follow the links to your right and get exactly the same results as I (and HM before). It is beyond the scope of the present page to address questions like whether The Bell Curve should be classified as an "urban legend" and about poverty (for that I refer the reader to Real Change, one of Seattle, WA's better papers, available online and on the street). It is the purpose of the present page to empirically examine one analysis, the first, the POVERTY analysis.
1 Although the subjects in this first analysis were whites only, the book's fame-notoriety came from its positing a relationship between "ethnicity" and intelligence ... Were we to substitute "nationality" for "ethnicity," the argument--keep in mind that test-score differences can be discussed without positing unnecessary constructs like intelligence--would go like this:
To illustrate, average American standardized math test scores appear on the blue side of the multi-colored "bell curve" shown to the right (which is a picture of the cover of The Bell Curve, now out-of-print but listed at Amazon, this facsimile's source).
... two years earlier, 30% of AK's 8th graders--compared to 21% of TX's and 17% of CA's--scored "proficient" on the National Assessment of Educational Progress' math test, the closest the U.S. had to a "gold standard." Why didn't that leg notice that AK students were scoring higher than those in the home states of the whining oil execs? Provincialism? Perhaps--but the opinions of business [heretofore usually men] have influenced U.S. public education since its inception (Callahan, 1962; Cremin, 1961; Nasaw, 1979; Wise, 1979).
And there's also the tale, apocryphal-sounding but documented, of the space mission run awry because one set of engineers was using metric units while another wasn't (Mars Climate Orbiter, launched 1998 and lost in 1999).
On the other hand, I once read--don't know if it's true--that early 20th century Turkey switched from Arabic to Roman numerals (if it did, it probably didn't take much XVII × MDCLXI ÷ DMing to switch back).
Was the U.S. invasion of Iraq the triumph of ideology or stupidity? Ignoring the ethical issues screaming from the invasion and ensuing "reconstruction"--which is impossible--what went wrong? Is it fascism or dimwittedness? One forthcoming article suggests that some involved with post-tsunami reconstruction are working to dispossess those left who lived on the coasts in favor of hotels for tourists and commercial fish farms. This is an awesomely stupid combination (what tourist would want to go swimming in fish farm waste?). The greed is not surprising--"every need's got an ego to feed" (Bob Marley). Also not surprising is the dearth of human empathy and compassion ("I'm so glad it happened," etc.). What surprises me is the stupidity of the plan: tourists will notice that going for a swim is followed by coming out lice infested (based on studies of salmon fry whose migratory paths cross commercial fish farms--research sponsored by the wild fishing industry). As to the media, I quit watching TV on hearing CNN's trenchant comment on sodamned hussein's initial change of venue: "he's lost weight" (web sites that start off on an ad hoc basis like blackboxvoting.org and buzzflash.com show that the internet sometimes provides the only alternative to information pollution, the ever rising kipple tide--which is not to say it's not mostly crap). What is going wrong?
Everyone's inner ethnostatistician is piqued by reading that more and more Americans are engaging in unsafe low carb high fat diets while more and more medical associations are warning against them (that the warnings by and large are available by subscription only is probably just a byproduct of copyright law). Lack of access to knowledge cannot, though, explain public spectacles like the U.S. Secretary of Education writing a letter to the Utah state legislature comparing results from two different achievement tests whose definitions of proficiency are of unknown relationship to each other (but expected to be different, 4/11/05). Yet another example was the state of Alaska's Attorney General: he resigned 2/05, because someone had noticed he was promoting to the Taiwanese, from his dais as AG and while on the public dime, a coal cleaning technology in which he had significant shares (and had then, in his words, "accidentally" erased emails pertaining to his portfolio upon learning that he was under investigation); before resigning, he "reasoned" that the governor anointing his daughter U.S. senator--a very succinct example of that inbreeding problem--was OK because sovereignty resided in the legislature, something to do with the 17th Amendment for which he could find no citations (one would not accept such work from students; why accept it from adults like state attorneys general?). The examples, at least in the U.S., are seemingly endless. Questions about who is making the bad decisions are publicly answered only by creating scapegoats then stuffed into the gaping decision making holes. As to the future, the only thing the first generations of students tested annually to produce "accountability" scores are certain to have instilled in them is the ability to follow directions or orders.
But, again, to conclude that Americans are less intelligent than other humans, we'd need to establish, were we being rational, that standardized math test scores were a good measure of intelligence (and dismiss repeated instances of doing dumb things with numbers as pathological) ... As with discussions of "nationality" and intelligence, so with discussions of "ethnicity" and intelligence: a link between the variable and the construct of interest must be established before interpreting one as an indicator of the other. This web page isn't about that debate. It's an empirical examination of a statistical analysis published in The Bell Curve, the first one.
3 Examining the impact of Kant's writings on all subsequent thinkers--including Kuhn, 1962, to whom all research experience-in-the-world is but a footnote--illustrates the "peer review" process. Kant credited Hume for awakening him --he'd been full prof for a decade--from his "dogmatic slumbers:" Kant synthesized Leibniz's rationalism and Newton's empiricism by placing the INDIVIDUAL human knower at the center of all knowledge. He outlined a model whereby a knowing mind might construe a world, arguing that understanding is a product of our perceptions and "reason" working together: The mind experiences nothing without perceptions--and has nothing to think about without "reason." In the preface to the second edition of his first Critique, Kant noted that "Reason approaches nature ... to be taught by it ... not in the character of a pupil who listens to everything that the teacher chooses to say, but of an appointed judge who compels the witnesses to answer questions which he has himself formulated ... (p. 20)."
He agreed with the rationalists that what is known through the senses is merely appearance and that reason plays a critical role in the knowledge chain, a point on which he disagreed with the empiricists. With the empiricists he agreed that human knowledge is grounded in the perceptions of our physical senses and that the physical world--"the thing-in-itself"--is unknowable in the sense of unprovable (the problem of induction).
Kant separated "knowledge" from "belief" and "opinion": the first requiring both objective and subjective certainty; the second, subjective certainty; and the third, neither. In so doing, his "critical philosophy" separated science and religion and placed ethics beyond revelation, convention, and outside authority.
Prominent rationalist and empiricist academic journal "peer reviews" of Kant's Critique of Pure Reason (1781; second edition 1787) were scathing--several university towns banning his book as "subversive" ... But in less than a decade--it took just months for researchers to discredit "cold fusion" by communicating on USENET their many unsuccessful attempts to replicate P & F's results in their own labs--Kant's "critical philosophy"--what Palmquist (1996) called his "Copernican turn"--was taught throughout his area: students and recruiters from rival universities flocked to him, and some regarded him as a seer on matters irrelevant. As far as the latter, there are always those who set the bar of explanatory relevance so low that they believe in astrology, etc. As far as the former, Kant's peers had not been targeted and befuddled by advertising campaigns; no opinion polls had been taken; it wasn't because he was employed by the University of Königsberg or because enron or u-haul math had occurred--and there are no questions of authenticity or verisimilitude; nothing magical or celestial happened: Kant's "peers" came to agree with him--"free will"?--because they thought him correct. In the centuries since--whether quantitatively measured in MB or qualitatively reflected in text like the title of one of Karatani's essays "On the Thing-in-Itself" and Heidegger's term "being-in-the-world"--more has been written about him than he wrote. The insight of his Critiques--yes, three: Kant spoke for human equality: revolutions in America and France could not have escaped his notice--is that the everyday world consists of what experience is like as it happens, whether the experience is of someone conducting a statistical analysis or someone deciding which cabbage to pick. The corollary is that the reality we perceive is the only reality of which we can speak with certainty.