Introduction

The following interdisciplinary teaching resource provides an historically contextualized outline of various 19th and 20th century ability testing technologies ostensibly designed to assess both animal and human mentality.  Along the way, critical evaluations are made of the motives, methods, and consequences of past research traditions (including: early animal studies, intelligence testing trials, school tracking practices, vocational guidance programs, university entrance exams, and school accountability initiatives).  Proposals regarding the applicability of "transformative" interpretations of this past ability testing data are also put forward.

As the title Psychology, Society, and Ability Testing indicates, this work utilizes a wider unit of analysis than any of the previous accounts of the ability testing area.  The history of biological and anthropological science, as well as the history of American public education, ethnic or class demographics, and politics are used to contextualize the past uses of ability testing technologies.  The societal role of successive vocational guidance and ameliorative educational programs, in particular, is also mobilized to stress the likely content and procedural aspects of transformative assessment in the 21st century school, university, or workplace.

Stated briefly, a transformative approach to ability testing will have to do what the other two standard approaches (i.e., Mental Darwinism and Interactionism) have never been designed to do.  It will have to deal specifically with the typical threefold transformation of human mentality -the transformation from a strictly individual, biologically mediated, mentality; into a group, socially mediated mentality; into a cultural, societally mediated mentality.  In other words, contrary to the shared innate capacity and social relations assumptions of the other two approaches to human intellect, the very capacity for paying attention is recognized as being transformed phylogenetically, ontogenetically, and socio-historically.

The empirical/theoretical consequence of making such a shift in our assessment methodology would be that we must consistently adopt a social and societal unit of analysis when considering the normal mental development of human beings.  Similarly, the practical/technical consequence of this shift would be that: (1) the available historical (largely individualized) data on animal mentality would have to be reinterpreted to include a social aspect; and (2) that our contemporary (largely interactionist) human ability testing technologies would have to be replaced by (or augmented with) a new set of assessment tools dealing specifically with the characteristic threefold transformation of human mentality.

Historiographic Rationale, Influences, and Definitions

In an attempt to live up to Michael Sokal's (1984) demand that psychological testing histories be clear about which tests or apparatus are being discussed, I have used the historiographic tactic of presenting a series of representative photographs obtained from psychology textbooks and test training manuals dating from 1911 to 2002.  The overall theoretical strategy or underlying motives for carrying out the research, however, are threefold: (1) to stress the inadequacy of what I call mental Darwinism in the early years of animal and human ability testing; (2) to make a similar assessment of the long-standing interactionist account of human mentality (including its human relations, cold war, equality era, and accountability era manifestations); and (3) to introduce an alternative transformative account which resolves the systematic difficulties produced by the two former disciplinary approaches.

The resulting historical narrative, therefore, is not solely a descriptive end in itself, but is intended as a means of popularizing what I view as a viable rationale for the production, application, and interpretation of future ability testing technologies.  It will be demonstrated that the transformative approach is viable not only in systematic/contentual terms (i.e., as an explanation of the development of animal and human mentality) but also in practical/procedural terms (i.e., as a means around the past ethical, legal, and political pitfalls of traditional assessment).

In order to move beyond both 19th and 20th century accounts of mental ability, early-21st century psychology must explicitly adopt an historically responsible, transformative, approach to the mentality of ape, primitive, child, and adult.  By historically responsible I mean simply that the assumptions and empirical tools used by the two predominant assessment methodologies (Mental Darwinism and interactionism) must be recognized as part of (and partial to) a previous era's societal ethos -one that is completely incompatible with the observable realities of contemporary society. 

Regarding the past disciplinary role of interactionism

It should be mentioned up front, however, that the following critique of the current limitations of interactionism is not intended to downplay its past disciplinary role in attempting to displace the explicitly Mental Darwinist era of psychometric testing (1859-1947). With regard to different forms of interactionism (1947-2002), I have therefore distinguished between traditional, revised, and dynamic interactionism to reflect those efforts. The argument presented here is that the Mental Darwinist tradition was not sufficiently displaced by 20th century interactionism.

Acknowledging the historical embeddedness of the typical pattern of transformation of human mentality (i.e., which is created by way of culturally provided tools), in particular, has special importance when we attempt to design or implement ameliorative educational programs; assess current standards of public education and university entrance examinations; or produce equitable standards for workplace selection and retraining initiatives.

Influences

Three book-length treatments of psychological history or theory have figured prominently in my initial choice of topic and theoretical approach.  The first work is largely theoretical in nature.  Charles Tolman's Psychology, Society, and Subjectivity (1994) argues that only by making a societal unit of analysis available to contemporary psychologists can the discipline progress from its typically decontextualized or descriptive analysis of mentality to a more explanatory account.  I will apply his premise to the particular subdiscipline of ability testing, but other applications (e.g., perception, learning, motivation, personality, etc.) are equally possible given the proper funding and time allowances.

Unfortunately, Tolman's book utilizes objectionable terminology such as "bourgeois psychology" (for those psychologies that originate from and tend to address purely middle class values), and "variable psychology" (for the all-too-common blind reliance on the "independent variable - dependent variable model" of empirical research methods).  This rhetoric, while factually accurate, makes his book unlikely to receive a friendly reception from psychologists who utilize the experimental or correlational methods in their daily practice.  While embracing Tolman's fundamental premise, I will do so in a way that will hopefully appeal to open-minded graduate students, clinical practitioners, or educational researchers who are already skeptical about the rather ahistorical theoretical basis of contemporary testing procedures.

The present (rather pedantic) historical coverage of past testing practice is intended as a common reference point both for those schooled in traditional statistical procedures and for (largely anti-empirical) so-called "critical" psychologists or educators.  In other words, both "Educational Testing Service types" and "past test deniers" should gain something from this work.

Ironically, the second work that influenced me is the very embodiment of the bourgeois psychology that Tolman has argued against.  It is co-authored by R. M. Thorndike.  He is both the grandson of E. L. Thorndike (one of the initial builders of the now entrenched associationist mental testing industry) and son of R. L. Thorndike (a prominent proponent of mid-to-late 20th century ability testing).  In that sense, not only the invention, application, and interpretation of American ability testing, but also its official history has become an in-house family affair.

Thorndike & Lohman's book, A Century of Ability Testing (1990), is predominantly a straightforward chronology of which tests were developed at what time and by whom.  Unfortunately, like DuBois (1970), it is far too brief to contain much critical history.  Nevertheless, the authors do attempt a tentative "theoretical synthesis" of recent testing techniques and, in doing so, treat us to a rare glimpse of the best of what liberal interactionist concepts and additive methods have to offer.  It should be understood, though, that one of the recurring themes of the present work is why (and how) such in-house attempts to revise (rather than surmount) 20th century interactionist accounts fall short of the desired explanatory and democratic mark.

Finally, of importance to my choice of ability testing as a topic is Raymond Fancher's The Intelligence Men: Makers of the IQ controversy (1985).  While I will corroborate Fancher's argument that the conceptual tools used in today's testing debates have remained largely unchanged for a century, I will also provide a specific set of historical examples that indicate how an alternative approach to both conservative mental Darwinism and liberal interactionist accounts of testing is attainable in principle.

Subsequent books were published during the production of this work and while they do not constitute direct intellectual influences, they have been helpful in promoting my efforts to contextualize the current state of subdisciplinary debates, legal considerations, and federal or state political initiatives up to 2002.  They include: Contemporary Intellectual Assessment (Flanagan, et al., 1997); The Big Test (Lemann, 1999); Law and School Reform (Heubert, 1999); Standardized Minds (Sacks, 1999); and Creating Equal: My fight against race preferences (Connerly, 2000).  All of these, however, lacked one fundamental ingredient which has served to keep me going: an explicitly stated, viable, methodology to overcome present difficulties.

My great hope is that by providing a socio-culturally informed sampling of past ability testing endeavors (the good, the bad, and the ugly), the present work will make it easier for future proponents of testing to adopt de facto an alternative transformative approach.  The utilization of both "primary" and secondary (text and media) source material here should not therefore be surprising.  In fact, this work can be considered as one example of the kind of tertiary history which ostensibly pure historians (and experimentalists alike) often ridicule.  But such metahistory, I would argue, is occasionally useful to those sincerely seeking to understand the historical rationale for past, present, or possible standards of research in their discipline.

Definitions

Although past research in the area of ability testing has tended to play both ends of the phylogenetic, ontogenetic, and socio-historical mental ladder against the middle, the transformative approach (which I will elaborate) will avoid denigrating either human or animal intellectual abilities in such a manner. This, of course, is by no means the first common sense attempt to do so (see Figure 1).

Figure 1 The Struggle of Intelligence through 'Four Kingdoms of Nature' (a bronze sculpture by J. O. Schweizer, photo from Huntley, 1897/1906). This evocative image originally appeared in the eighth reprinting of a transcendentalist "human betterment" textbook. Florence Huntley's common-sense critique of Darwin's Descent of Man (1871) was that he portrayed human mentality as if it were "merely" that of "an improved ape" (p. 165). Her equally cogent assessment of moral philosopher Henry Drummond's work Ascent of Man (1894) was that he tended to conflate ethics and morals. She concludes that we must take an ethically guided "natural scientific approach" rather than a purely moral or physicalist approach to the study of mental evolution. For her, this entails recognizing a "spiritual side" to human intelligence (p. 127), as something which accompanies the "material, animal, and individual" aspects of human existence. We may confidently reject the mystical spiritualist details of Huntley's approach but we cannot fault her main premise: "that modern intelligence has not as yet correctly interpreted its own array of facts" (p. 142). This accusation still applies to contemporary general psychology, which has been built largely on the basis of a now outmoded 19th-century Neo-Darwinian continuity view of mental evolution.

In more modern terminology, the historiography of ability testing can be defined somewhat broadly as a history of our study of: (1) comparative animal and human mentality (through philosophical or experimental methods); (2) typical human mental development, readiness to learn, or individual differences (through observational, correlational, or longitudinal methods -including so-called intelligence or aptitude assessment, achievement tests, entrance exams, or intervention programs); and (3) vocational guidance, assessment, or prediction (through interest questionnaires, vocational test batteries, situational tests, employee opinion polls, or training programs).

In the realm of specifically human ability testing, various terms like "individual, social, societal, and cultural" take on overwhelming importance -especially when considering the disciplinary implications of past and present testing practices for related issues such as school tracking, cognitive assessment (or recovery) after brain damage, or for funding appropriate ameliorative educational programs (such as Head Start or for promoting universal daycare).  Important as these terms may be, their actual subdisciplinary usage varies widely and a considerable confusion of levels of analysis exists in the historical ability testing literature.

Along the lines of Danziger (1990), I suggest that the very standards of usage within a given researcher's career (or across various research groups) are indicative of the wider disciplinary methodologies to which those researchers respectively subscribe.  In other words, knowing "what you're looking at" when one views a set of testing results (or when one considers the effectiveness of an educational intervention program) can only become clearer once one has literally had a look at the respective historical results of the three main kinds of past methodological approaches to ability testing research: mental Darwinist, interactionist, and transformative.

Chapter Overview

In chapters 1-2, the issue of how best to describe the qualitative difference and quantitative continuity between individual and social intelligence in animals, and societal intelligence in human beings will be introduced.  In the standard historical accounts of our discipline, it is often pointed out that the main influence of Darwin's theory of natural selection (on psychology) has been that human mental functions can no longer be viewed as isolated from their animal counterparts (Boring, 1950; Fancher, 1990; Watson & Evans, 1991; Benjafield, 1996; Schultz & Schultz, 1996).

In that sense, Darwin's tentative ventures into the realm of mind (1871; 1872) helped establish what has come to be known as evolutionary, comparative, or general psychology.  Ever since the end of the 19th century, therefore, the discipline of psychology has attempted to provide a scientific/empirical (rather than a spiritual/anecdotal) account of both lower and higher manifestations of mentality.  Ability testing, whether in the form of experimental animal research or in the predominantly correlational methods applied to human subjects, has been part of this endeavor.

Without detracting from the historically supportive role of mental Darwinism on early research, it is argued that the widespread acceptance of Darwin's continuity view of mental evolution made it unwarrantably difficult for psychologists to produce a convincing developmental account of the structural and functional discontinuities between animal and human mentality.  The disciplinary history of this methodological difficulty is then traced, through successive chapters, by highlighting the application and interpretation of ability testing technologies designed to investigate both the mental abilities of various animal species and the intellectual abilities of school children, soldiers, the industrial working class, and (of course) university students.

Under the logic of mental Darwinism, all qualitative discontinuities of mind (whether phylogenetic, ontogenetic, or socio-historical) are routinely converted into quantitative continuities.  Fortunately, as Cambridge science historian Robert M. Young (1968, 1970, 1973, 1985) and others have pointed out, there have been many kinds of evolutionary theories at work in the history of psychology, and alternatives to mental Darwinism do exist (see Gruber, 1980; Tolman, 1987a, 1987b, 1990, 1994).

The goal of chapters 1-4 (which cover a period from 1859-1932), therefore, is to demonstrate the lamentable historical pervasiveness of the continuity view of mental evolution in both early American animal psychology and the initial Eugenics, Military, or Mental Efficiency based manifestations of testing in America.  An important distinction is made between organic evolution (which entails physiological and anatomical comparisons); mental evolution (which entails comparisons of individual or group intellectual abilities); and cultural evolution (which entails comparisons between human preliterate, illiterate, and literate individuals, groups, or societies).

It will be shown that many of the important figures in the formation of Western evolutionary and psychological discourse (including Darwin, Romanes, Loeb, Jennings, Watson, Yerkes, and Terman) have chosen either a continuity or a discontinuity position on one or another of the above three fronts.  Their approaches to organic and mental evolution will be highlighted in order to provide historical examples of evolutionary psychologizing prior to, during, and after the formal establishment of the testing subdiscipline.

Chapters 4-6 draw from past critiques of individual and group testing traditions in order to lay bare the contemporaneous additive and discriminatory assumptions about human intelligence between 1918-1963.  It will be demonstrated that the shared continuity assumptions (i.e., between the animal and human realms of inquiry) were present in the W.W.I, Great Depression era, W.W.II, and Cold War era applications of ability testing.  In particular, during the Cold War era of "meritocracy" (1948-1963), the standardized testing industry (which served a central administrative/personnel sorting role during the rise of a military-industrial complex) was afforded a vicarious respectability that was not well deserved.

Chapter 7 indicates that the mid-1960s to late-1970s manifestations of the so-called liberal interactionist position (which emerged during the "era of equality" in America 1964-1980, and which was often touted as our last best defense against past scientific racism) contained the same methodological weaknesses as prior mental testing traditions; namely, the "rectangular capacity metaphor" of human intellect and an appeal to IQ scores as an empirical assessment method.  In other words, traditional and liberal interactionism were both updated forms of mental Darwinism and thus failed to recognize the potentially discernible (and testable) societal aspects of human mentality for what they are -distinct (but not separate) products of our ever-present cultural, economic, political, and ideological existence.

Chapter 7 also provides details on how the lack of fundamental change within the testing industry was initially challenged during the 1970s by way of theoretical, historical, or political critiques which questioned the past ideology (and predictive validity) of testing.  Further, important litigation which initially slowed the previous expansion of the testing industry in the areas of vocational and public school assessment is covered.  It is pointed out, however, that the continued use of a largely unquestioned interactionist platform for both democratic social reform (and for empirical program assessment) during this era of equality allowed the proactive Great Society and Head Start ameliorative programs to remain vulnerable (at least initially) to a resurgence of biogenic accounts of human mentality.

Chapter 8 first indicates that the predominant late-20th century disciplinary "testing malaise" -in which all the basic assumptions of traditional testing approaches had already been successively questioned (within academic circles) but where no clear alternative approach was forthcoming (from those using or providing ability tests)- can be characterized as a tactical, somewhat disingenuous posture of a profession that was expecting a profound downsizing of educational test use to occur.  Under the ensuing federally mandated "era of accountability" (1981-2002), however, the testing industry was again expanded (largely unchanged) into all areas of the American educational system.  It is then argued that the attained outcome of this unprecedented ubiquity of testing in the schools has not lived up to its original rationale of promoting "Excellence in Education."

Also, in chapter 8, the exciting possibility of producing animal tests that include a social (i.e., group) dimension to their analysis and of producing human ability tests that include a societal dimension to their analysis will be raised.  This is done by way of: (1) a constructive criticism of both so-called dynamic interactionist and neo-Vygotskian varieties of research; and (2) reference to both a programmatic categorical diagram and to selected examples of transformational research from periods covered throughout this work.

It is then concluded that the structural and functional distinctions allowed by this transformative approach not only help us reinterpret older animal data, but also allow us to recognize and explain the typical developmental pattern of emergence of higher mental processes in human beings.  To acknowledge this point in its historically rich detail is to: (1) avoid the historical pseudo-arguments (or ethical/legal pitfalls) of the past ability testing traditions; (2) face the contemporary significance of previously collected data (in animal or applied psychology); and (3) open up new vistas of endeavor for psychology, society, and ability testing.

Stellar historical examples covered

A carefully elaborated set of guiding lights (historical figures or research traditions) which approximated the transformative view of mentality is given throughout this work.  In chapter 1 the stars are: (1) Darwin (1859) on the continuity and discontinuity of organic evolution (and other assorted jaw-first theorists regarding human evolution -including E. Dubois, R. Dart); and (2) George Romanes (1888) on the social aspects of animal mentality (who raised the question of what sort of empirical practices might be developed to expose or reflect these qualitative aspects of mental evolution).  Similarly, chapter 2 highlights Oskar Pfungst's experimental investigation of the horse called "Clever Hans" between 1904 and 1907. Pfungst's interpretation of the obtained data stressed both the evolutionary basis of Hans' abilities and the social aspects of his particular upbringing.

After outlining the rise of the American public school system (and its administrative infrastructure) up to W.W.I in chapter 3, chapter 4 then provides two important historical exemplars.  Firstly, Walter Dill Scott (the first American professor of Applied Psychology) who objected to the W.W.I "intelligence tests" of the Yerkes group and developed both "Trade tests" (to assess the familiarity of recruits with military work) and a "peer review" questionnaire for assessing Officer Cadets, is mentioned.  Secondly, Bird T. Baldwin's Iowa based research on Rural vs. Urban students (1924-1927) which included the comparison of the available teaching resources and civic attitudes toward education in those divergent educational contexts, is outlined and contrasted with Terman's Stanford group of generalized mental testers. 

In chapter 5, Depression era youth programs (including the CCC and NYA) which both: (1) eventually attained a near universal access status and (2) provided important ameliorative educational programs are highlighted.  Similarly, the "Eight-Year Study" (in which college entrance exams for selected schools were waived between 1933 and 1941, thereby providing an argument for increased college access to those who may not have passed standardized entrance examinations) is elaborated. Both the W.W.II era Armed Forces University (which provided educational opportunities for personal advancement of military personnel during the war) and the GI Bill of Rights (designed to give returning veterans a leg up into the middle class) are then pointed out as extensions of this egalitarian trend which stand in marked contrast to the selection emphasis of contemporary refinements to the educational and vocational ability testing subdisciplines.

In chapter 6 (which describes the Cold War context and societal role of ability testing technologies from 1947-1963), the educational aspects of the GI Bill (which provided unprecedented higher educational access for returning W.W.II veterans) are outlined. The Cold War ideologies behind not adopting or extending this open model of higher educational access past that exceptional student cohort are also discussed in some detail. 

Chapter 7 first describes the context of progressive Great Society legislation and then highlights the Head Start program (which addressed both sociological circumstances and motivational issues as a half-step out of traditional interactionism). James Lawler's (1978) use of the groundbreaking distinction between "inheritance" and "heritability" of IQ scores (as one part of his socio-historically informed means around the long-standing biogenic myopia of past approaches to racial differences in standardized test performance) is then emphasized. The contemporaneous disciplinary and technopolitical reasons behind the lack of recognition of these methodological breakthroughs are also briefly described.

In chapter 8, after outlining successive revised interactionist attempts to deal with the historicity of testing data, various Neo-Vygotskian approaches to mental assessment (which take seriously Vygotsky's distinction between crystallized knowledge and ongoing development in the extra-individual "Zone of Proximal Development") are highlighted. Some discussion regarding the subdisciplinary penetration of this approach (to date) is provided and it is argued that while the older technopolitical biases against its widespread adoption by 21st century ability testers have been overcome, much work remains to be done (both within and outside that tradition).

In particular, the more thoroughly transformative views of: (1) Vygotsky & Luria (1930) -which stressed the socio-historical aspects of culturally mediated tool use in the production of higher mental functions; and (2) A.N. Leontiev's (1978; 1981) "Activity Theory" approach to animal and human intellect, are both mentioned as fruitful sources for the required methodological improvements. The utility of their transformative approach to human intellect for explaining the observed historical pattern of test score changes (often called the Flynn effect) is then demonstrated. Finally, a categorical hierarchy of the "Transformative Structure of Animal and Human Intellect" (as I see it) is provided and the implications for future testing (including the realization that social and societal intellect do not reside in the head of an individual) are offered up for posterity.

Together, these historical examples provide the heritage of successful approximations of a new transformative approach to intellectual assessment (i.e., an approach in which a general and specific account of the mechanisms by which higher societal mental processes are produced from lower biological, individual, and social forms becomes possible). Stated more plainly, this series of theoretically informative "one-offs" and either unrecognized or actively ignored research efforts is highly indicative of the kind of disciplinary approach which 21st century ability testing needs to adopt if it is to live up to its intended democratic role in American society.