Eugene J. Webb, Donald T. Campbell, Richard D. Schwartz, Lee Sechrest - Unobtrusive Measures - Nonreactive Research in The Social Sciences - Rand Mcnally (1966) PDF


UNOBTRUSIVE MEASURES

Nonreactive Research in the Social Sciences

EUGENE J. WEBB
DONALD T. CAMPBELL
RICHARD D. SCHWARTZ
LEE SECHREST
Northwestern University

RAND McNALLY & COMPANY, CHICAGO

EDGAR F. BORGATTA,
Advisory Editor

Alford, Party and Society
Borgatta and Crowther, A Workbook for the Study of Social Interaction Processes
Christensen, ed., Handbook of Marriage and the Family
Demerath, Social Class in American Protestantism
Faris, ed., Handbook of Modern Sociology
Glock and Stark, Religion and Society in Tension
Hadden and Borgatta, American Cities: Their Social Characteristics
Kaplan, ed., Science and Society
March, ed., Handbook of Organizations
Nye and Hoffman, The Employed Mother in America
Scott, Values and Organizations: A Study of Fraternities and Sororities
Warren, The Community in America
Warren, ed., Perspectives on the American Community
Webb, Campbell, Schwartz, and Sechrest, Unobtrusive Measures

Copyright © 1966 by Rand McNally & Company
All rights reserved
Printed in U.S.A. by Rand McNally & Company
Library of Congress Catalog Card Number: 66-10806

Preface

This monograph has had a series of working titles, and we should identify them for the benefit of our friends who shared early drafts. To some, this is The Bullfighter's Beard - a provocative, if uncommunicative, title drawn from the observation that toreros' beards are longer on the day of the fight than on any other day. No one seems to know if the torero's beard really grows faster that day because of anxiety or if he simply stands further away from the blade, shaking razor in hand. Either way, there were not enough American aficionados to get the point, so we added . . . and Other Nonreactive Measures.
This title lasted for a while, but the occasionally bizarre content of the material shifted the working title to Oddball Research, Oddball Measures, and the like. Most of our friends have known the manuscript under one of the "oddball" labels, and it is only a fear of librarians that has caused us to drop it. In this day of explicit indexing, we feared that the book would nestle on a shelf between Notes for the T Quarterback and Putting Hints for Beginning Golfers. As much as we might enjoy the company of an Arnold Palmer, we prefer it outside the library. A widely circulated version used the non-title Other Measures. The list of titles we have specifically decided not to use is even longer and less descriptively adequate.
In presenting these novel methods, we have purposely avoided consideration of the ethical issues which they raise. We have done so because we feel that this is a matter for separate consideration. Some readers will find none of the methods objectionable, others may find virtually all of them open to question. Each school is welcome to use this compilation to buttress its position - either to illustrate the harmless ingenuity of social scientists or to marshal a parade of horribles. Although the authors vary

in moral boiling points, we are all between these positions. Some of the methods described strike us as possibly unethical; their inclusion is not intended as a warrant for their use. But we vary among ourselves in criteria and application. We do not feel able at this point to prepare a compelling ethical resolution of these complex issues. Nonetheless, we recognize the need of such a resolution and hope that our compilation will, among other things, stimulate and expedite thoughtful debate on these matters.
Perhaps the most extreme position on this matter has been stated by Edward Shils (1959). He asserts that all social science activity should be disciplined by careful attention to the problem of privacy. He would rule out any "observations of private behavior, however technically feasible, without the explicit and fully informed permission of the person to be observed." His concern on this issue would lead him to recommend that questionnaire and interview studies be sharply limited by ethical considerations. Among the practices he deplores are (1) the simulation of warmth by the interviewer to insure rapport and (2) giving the appearance of agreement to answers on controversial questions to encourage the expression of unpopular attitudes.
He would have the interviewer not only avoid such practices but also disclose, presumably in advance, his purpose in asking the question. This disclosure should include not only a statement of the researcher's "personal goal, e.g., to complete a thesis" but also his "cognitive intention." Groups or types of questions "ought to be justified by the explanation of what the answers will contribute to the clarification of the problem being investigated." Even the technique of participant observation seems to Shils "morally obnoxious . . . manipulation" unless the observer discloses at the outset his intention of conducting a social scientific investigation.
Most social scientists would find this position too extreme. If it were adopted, it would add enormously to the problems of reactivity with which this monograph is primarily concerned. Nevertheless, Shils' position specifies some of the dangers to the citizen and social science of an unconscionable invasion of privacy. Few would deny that social scientists can go too far in intruding on privacy. Recording deliberations in a jury room or hiding under beds to record pillow talk are techniques which have led to moral revulsion on the part of large numbers of professionals. Manipulations aimed at the arousal of anxiety or extreme aggression could conceivably produce lasting damage to the psychological health of experimental subjects.
What is needed is a set of criteria by which various research techniques can be appraised morally. Each of the social sciences has attempted to develop a code of ethics for guidance in these matters. So far, however, these have suffered from the absence of a careful analysis of the problem. We need a specification of the multiple interests potentially threatened by social science research: the privacy of the individual, his freedom from manipulation, the protection of the aura of trust on which the society depends, and, by no means least in importance, the good reputation of social science.
The multiple methods presented here may do more than raise these questions for discussion. They may provide alternatives by which ethical criteria can be met without impinging on important interests of the research subjects. Some of the methods described here, such as the use of archival records and trace measures, may serve to avoid the problems of invasion of privacy by permitting the researcher to gain valuable information without ever identifying the individual actors or in any way manipulating them. If ethical considerations lead us to avoid participant observation, interviews, or eavesdropping in given circumstances, the novel methods described in this monograph may be of value not only in improving and supplementing our information but also in permitting ethically scrupulous social scientists to do their work effectively and to sleep better at night.
We received notable aid from the following of our associates: Howard S. Becker, James H. Crouse, Kay C. Kujala, Irene E. Nolte, Michael L. Ray, Jerry R. Salancik, Carole R. Siegman, Gerald Solomon, and Susan H. Stocking. The acute eye and sensitive pen of Rand McNally's Lucia Boyden we bow before.
This study was supported in part by Project C-998, Contract 3-20-001, with the Media Research Branch, Office of Education, U. S. Department of Health, Education and Welfare, under provisions of Title VII of the National Defense Education Act.
Grateful acknowledgment is made to the following for permission to use copyrighted material: American Journal of Psychology; Annual Review of Psychology; Atheneum Publishers; Basic Books; Cambridge University Press; Criterion Books; Doubleday; Free Press of Glencoe and Macmillan; Harcourt, Brace and World; Holt, Rinehart and Winston; Human Organization and the Society for Applied Anthropology; Little, Brown and Company; Public Opinion Quarterly; Oxford University Press; Simon & Schuster; Speech Monographs and the Speech Association of America; University of Chicago, Graduate School of Business; and Yale University Press.
The errors which have penetrated the perimeter of our friends we acknowledge. Surely every reader will think of studies which could have been included, yet were heinously omitted. If such studies are sent to the senior author, at Northwestern, an amended bibliography will be prepared and distributed - either in another edition of this book or separately.

E.J.W.
D.T.C.
R.D.S.
L.S.

Evanston, Illinois
June, 1965

To the memory of Sir Francis Galton

Table of Contents

PREFACE

1. APPROXIMATIONS TO KNOWLEDGE
Operationism and Multiple Operations - Interpretable Comparisons and Plausible Rival Hypotheses - Internal and External Validity - Sources of Invalidity of Measures - Reactive Measurement Effect: Error from the Respondent - Error from the Investigator - Varieties of Sampling Error - The Access to Content - Operating Ease and Validity Checks

2. PHYSICAL TRACES: EROSION AND ACCRETION
Natural Erosion Measures - Natural Accretion Measures - Controlled Erosion Measures - Controlled Accretion Measures - Transforming the Data: Corrections and Index Numbers - An Over-All Evaluation of Physical Evidence Data

3. ARCHIVES I: THE RUNNING RECORD
Actuarial Records - Political and Judicial Records - Other Government Records - The Mass Media - Data Transformations and Indices of the Running Records - Over-All Evaluation of Running Records

4. ARCHIVES II: THE EPISODIC AND PRIVATE RECORD
Sales Records - Industrial and Institutional Records - Written Documents - A Concluding Note

5. SIMPLE OBSERVATION
Exterior Physical Signs - Expressive Movement - Physical Location - Observation of Language Behavior: Conversation Sampling - Time Duration - Time Sampling and Observation - Over-All Comments on Simple Observation

6. CONTRIVED OBSERVATION: HIDDEN HARDWARE AND CONTROL
Hardware: Avoiding Human Instrument Error - Hardware: Physical Supplanting of the Observer - The Intervening Observer - Entrapment - Petitions and Volunteering - An Over-All Appraisal on Hidden Hardware and Control

7. A FINAL NOTE

8. A STATISTICIAN ON METHOD

9. CARDINAL NEWMAN'S EPITAPH

REFERENCES

INDEX

Approximations to Knowledge

This survey directs attention to social science research data
not obtained by interview or questionnaire. Some may think this
exclusion does not leave much. It does. Many innovations in
research method are to be found scattered throughout the social
science literature. Their use, however, is unsystematic, their
importance understated. Our review of this material is intended to
broaden the social scientist's currently narrow range of utilized
methodologies and to encourage creative and opportunistic exploi-
tation of unique measurement possibilities.
Today, some 90 per cent of social science research is based
upon interviews and questionnaires. We lament this overdepen-
dence upon a single, fallible method. Interviews and questionnaires
intrude as a foreign element into the social setting they would
describe, they create as well as measure attitudes, they elicit
atypical roles and responses, they are limited to those who are
accessible and will cooperate, and the responses obtained are
produced in part by dimensions of individual differences irrelevant
to the topic at hand.
But the principal objection is that they are used alone. No
research method is without bias. Interviews and questionnaires
must be supplemented by methods testing the same social science
variables but having different methodological weaknesses.
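The point that a second method helps only when its weaknesses differ from the first can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a procedure from the monograph: the true value, the fixed bias of each method, and the noise level are invented numbers, and simple averaging stands in for whatever combination rule a study would actually justify.

```python
import random
import statistics

def method_means(true_value, biases, noise_sd, n, seed):
    """Mean reading of each hypothetical method over n noisy trials.

    Each method reads true_value + its own fixed bias + random noise.
    """
    rng = random.Random(seed)
    means = []
    for bias in biases:
        readings = [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]
        means.append(statistics.fmean(readings))
    return means

def combined_error(true_value, biases, noise_sd=1.0, n=2000, seed=0):
    """Absolute error of the simple average across methods (an assumed rule)."""
    means = method_means(true_value, biases, noise_sd, n, seed)
    return abs(statistics.fmean(means) - true_value)

# Two methods sharing the same weakness: the bias survives averaging.
shared = combined_error(10.0, [2.0, 2.0])
# Two methods with opposite weaknesses: the biases largely cancel.
divergent = combined_error(10.0, [2.0, -2.0])
print(f"shared bias error:    {shared:.2f}")
print(f"divergent bias error: {divergent:.2f}")
```

In this toy case the biases cancel exactly because they were chosen to be equal and opposite. Real methods rarely oblige so neatly, which is why the weaknesses of each method still have to be examined one by one.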
In sampling the range of alternative approaches, we examine
their weaknesses, too. The flaws are serious and give insight into
why we do depend so much upon the interview. But the issue is not
choosing among individual methods. Rather it is the necessity for a
multiple operationism, a collection of methods combined to avoid
sharing the same weaknesses. The goal of this monograph is not to replace the interview but to supplement and cross-validate it with measures that do not require the cooperation of a respondent and that do not themselves contaminate the response.
Here are some samples of the kinds of methods we will be surveying in Chapters 2 through 6 of this monograph:

The floor tiles around the hatching-chick exhibit at Chicago's Museum of Science and Industry must be replaced every six weeks. Tiles in other parts of the museum need not be replaced for years. The selective erosion of tiles, indexed by the replacement rate, is a measure of the relative popularity of exhibits.
The accretion rate is another measure. One investigator wanted to learn the level of whisky consumption in a town which was officially "dry." He did so by counting empty bottles in ashcans.
The degree of fear induced by a ghost-story-telling session can be measured by noting the shrinking diameter of a circle of seated children.
Chinese jade dealers have used the pupil dilation of their customers as a measure of the client's interest in particular stones, and Darwin in 1872 noted this same variable as an index of fear.
Library withdrawals were used to demonstrate the effect of the introduction of television into a community. Fiction titles dropped, nonfiction titles were unaffected.
The role of rate of interaction in managerial recruitment is shown by the overrepresentation of baseball managers who were infielders or catchers (high-interaction positions) during their playing days.
Sir Francis Galton employed surveying hardware to estimate the bodily dimensions of African women whose language he did not speak.
The child's interest in Christmas was demonstrated by distortions in the size of Santa Claus drawings.
Racial attitudes in two colleges were compared by noting the degree of clustering of Negroes and whites in lecture halls.

These methods have been grouped into chapters by the characteristic of the data: physical traces, archives, observations.
Before making a detailed examination of such methods, it is well to present a closer argument for the use of multiple methods and to present a methodological framework within which both the traditional and the more novel methods can be evaluated.
The reader may skip directly to Sherlock Holmes and the opening of Chapter 2 if he elects, infer the criteria in a piece of detection himself, and then return for a validity check.

OPERATIONISM AND MULTIPLE OPERATIONS

The social sciences are just emerging from a period in which the precision of carefully specified operations was confused with operationism by definitional fiat - an effort now increasingly recognized as an unworkable model for science. We wish to retain and augment the precision without bowing to the fiat.
The mistaken belief in the operational definition of theoretical terms has permitted social scientists a complacent and self-defeating dependence upon single classes of measurement - usually the interview or questionnaire. Yet the operational implication of the inevitable theoretical complexity of every measure is exactly opposite: it calls for a multiple operationism, that is, for multiple measures which are hypothesized to share in the theoretically relevant components but have different patterns of irrelevant components (e.g., Garner, 1954; Garner, Hake, & Eriksen, 1956; Campbell & Fiske, 1959; Campbell, 1960; Humphreys, 1960).
Once a proposition has been confirmed by two or more independent measurement processes, the uncertainty of its interpretation is greatly reduced. The most persuasive evidence comes through a triangulation of measurement processes. If a proposition can survive the onslaught of a series of imperfect measures, with all their irrelevant error, confidence should be placed in it. Of course, this confidence is increased by minimizing error in each instrument and by a reasonable belief in the different and divergent effects of the sources of error.
A consideration of the laws of physics, as they are seen in that science's measuring instruments, demonstrates that no theoretical parameter is ever measured independently of other physical parameters and other physical laws. Thus, a typical galvanometer responds in its operational measurement of voltage not only according to the laws of electricity but also to the laws of gravitation, inertia, and friction. By reducing the mass of the galvanometer needle, by orienting the needle's motion at right angles to gravity, by setting the needle's axis in jeweled bearings, by counterweighting the needle point, and by other refinements, the instrument designer attempts to minimize the most important of the irrelevant physical forces for his measurement purposes. As a result, the galvanometer reading may reflect, almost purely, the single parameter of voltage (or amperage, etc.).
Yet from a theoretical point of view, the movement of the needle is always a complex product of many physical forces and laws. The adequacy with which the needle measures the conceptually defined variable is a matter for investigation; the operation itself is not the ultimate basis for defining the variable. Excellent illustrations of the specific imperfections of measuring instruments are provided by Wilson (1952).
Starting with this example from physics and the construction of meters, we can see that no meter ever perfectly measures a single theoretical parameter; all series of meter readings are imperfect estimates of the theoretical parameters they are intended to measure.
Truisms perhaps, yet they belie the mistaken concept of the "operational definition" of theoretical constructs which continues to be popular in the social sciences. The inappropriateness is accentuated in the social sciences because we have no measuring devices as carefully compensated to control all irrelevancies as is the galvanometer. There simply are no social science devices designed with so perfect a knowledge of all the major relevant sources of variation. In physics, the instruments we think of as "definitional" reflect magnificently successful theoretical achievements and themselves embody classical experiments in their very operation. In the social sciences, our measures lack intelligence. They tap multiple processes and sources of variance of which we are as yet unaware. At such a stage of development, the theoretical impurity and factorial complexity of every measure are not niceties for pedantic quibbling but are overwhelmingly and centrally relevant in all measurement applications which involve inference and generalization.
Efforts in the social sciences at multiple confirmation often yield disappointing and inconsistent results. Awkward to write up and difficult to publish, such results confirm the gravity of the problem and the risk of false confidence that comes with dependence upon single methods (Vidich & Shapiro, 1955; Campbell, 1957; Campbell & McCormack, 1957; Campbell & Fiske, 1959; Kendall, 1963; Cook & Selltiz, 1964). When multiple operations provide consistent results, the possibility of slippage between conceptual definition and operational specification is diminished greatly.
This is not to suggest that all components of a multimethod approach should be weighted equally. Prosser (1964) has observed: ". . . but there is still no man who would not accept dog tracks in the mud against the sworn testimony of a hundred eye-witnesses that no dog had passed by" (p. 216). Components ideally should be weighted according to the amount of extraneous variation each is known to have and, taken in combination, according to their independence from similar sources of bias.

INTERPRETABLE COMPARISONS AND PLAUSIBLE RIVAL HYPOTHESES

In this monograph we deal with methods of measurement appropriate to a wide range of social science studies. Some of these studies are comparisons of a single group or unit at two or more points in time; others compare several groups or units at one time; others purport to measure but a single unit at a single point in time; and, to close the circle, some compare several groups at two or more points in time. In this discussion, we assume that the goal of the social scientist is always to achieve interpretable comparisons, and that the goal of methodology is to rule out those plausible rival hypotheses which make comparisons ambiguous and tentative.
Often it seems that absolute measurement is involved, and that a social instance is being described in its splendid isolation, not for comparative purposes. But a closer look shows that absolute, isolated measurement is meaningless. In all useful measurement, an implicit comparison exists when an explicit one is not visible. "Absolute" measurement is a convenient fiction and usually is nothing more than a shorthand summary in settings where

plausible rival hypotheses are either unimportant or so few, specific, and well known as to be taken into account habitually. Thus, when we report a length "absolutely" in meters or feet, we immediately imply comparisons with numerous familiar objects of known length, as well as comparisons with a standard preserved in some Paris or Washington sanctuary.
If measurement is regarded always as a comparison, there are three classes of approaches which have come to be used in achieving interpretable comparisons. First, and most satisfactory, is experimental design. Through deliberate randomization, the ceteris of the pious ceteris paribus prayer can be made paribus. This may require randomization of respondents, occasions, or stimulus objects. In any event, the randomization strips of plausibility many of the otherwise available explanations of the difference in question. It is a sad truth that randomized experimental design is possible for only a portion of the settings in which social scientists make measurements and seek interpretable comparisons. The number of opportunities for its use may not be staggering, but, where possible, experimental design should by all means be exploited. Many more opportunities exist than are used.
Second, a quite different and historically isolated tradition of comparison is that of index numbers. Here, sources of variance known to be irrelevant are controlled by transformations of raw data and weighted aggregates. This is analogous to the compensated and counterbalanced meters of physical science which also control irrelevant sources of variance. The goal of this old and currently neglected social science tradition is to provide measures for meaningful comparisons across wide spans of time and social space. Real wages, intelligence quotients, and net reproductive rates are examples, but an effort in this direction is made even when a percentage, a per capita, or an annual rate is computed. Index numbers cannot be used uncritically because the imperfect knowledge of the laws invoked in any such measurement situation precludes computing any effective all-purpose measures.
Furthermore, the use of complex compensated indices in the assurance that they measure what they are devised for has in many instances proved quite misleading. A notable example is found in the definitional confusion surrounding the labor force concept (Jaffe & Stewart, 1951; W. E. Moore, 1953). Often a relationship established between an over-all index and external variables is found due to only one component of the index. Cronbach (1958) has described this problem well in his discussion of dyadic scores of interpersonal perception. In the older methodological literature, the problem is raised under the term index correlations (e.g., Stouffer, 1934; Guilford, 1954; Campbell, 1955).
Despite these limitations, the problem of index numbers, which once loomed large in sociology and economics, deserves to be reactivated and integrated into modern social science methodology. The tradition is relevant in two ways for the problems of this monograph. Many of the sources of data suggested here, particularly secondary records, require a transformation of the raw data if they are to be interpretable in any but truly experimental situations. Such transformations should be performed with the wisdom accumulated within the older tradition, as well as with a regard for the precautionary literature just cited. Properly done, such transformations often improve interpretability even if they fall far short of some ideal (cf. Bernstein, 1935).
A second value of the literature on index numbers lies in an examination of the types of irrelevant variation which the index computation sought to exclude. The construction of index numbers is usually a response to criticisms of less sophisticated indices. They thus embody a summary of the often unrecorded criticisms of prior measures. In the criticisms and the corrections are clues to implicit or explicit plausible rival interpretations of differences, the viable threats to valid interpretation.
Take so simple a measure as an index on unemployment or of retail sales. The gross number of the unemployed or the gross total dollar level of sales is useless if one wants to make comparisons within a single year. Some of the objections to the gross figures are reflected in the seasonal corrections applied to time-series data. If we look at only the last quarter of the year, we can see that the effect of weather must be considered. Systematically, winter depresses the number of employed construction workers, for example, and increases the unemployment level. Less systematically, spells of bad weather keep people in their homes and reduce the amount of retail shopping. Both periodic and aperiodic elements of the weather should be considered if one wants a more stable and interpretable measure of unemployment or sales. So,

too, our custom of giving gifts at Christmas spurs December sales, as does the coinciding custom of Christmas bonuses to employees. All of these are accounted for, crudely, by a correction applied to the gross levels for either December or the final quarter of the year.
Some of these sources of invalidity are too specific to a single setting to be generalized usefully; others are too obvious to be catalogued. But some contribute to a general enumeration of recurrent threats to valid interpretation in social science measures.
The technical problems of index-number construction are heroic. "The index number should give consistent results for different base periods and also with its counterpart price or quantity index. No reasonably simple formula satisfies both of these consistency requirements" (Ekelblad, 1962, p. 726). The consistency problem is usually met by substituting a geometric mean for an arithmetic one, but then other problems arise. With complex indices of many components, there is the issue of getting an index that will yield consistent scores across all the different levels and times of the components.
In his important work on economic cycles, Hansen (1921) wrote, "Here is a heterogeneous group of statistical series all of which are related in a causal way, somehow or another, to the cycle of prosperity and depression" (p. 21). The search for a metric to relate these different components consistently, to be able to reverse factors without chaos, makes index construction a difficult task. But the payoff is great, and the best approximation to solving both the base-reversal and factor-reversal issues is a weighted aggregate with time-averaged weights. For good introductory statements of these and other index-number issues, see Yule & Kendall (1950), Zeisel (1957), and Ekelblad (1962). More detailed treatments can be found in Mitchell (1921), Fisher (1923), Mills (1927), and Mudgett (1951).
The third general approach to comparison may be called that of "plausible rival hypotheses." It is the most general and least formal of the three and is applicable to the other two. Given a comparison which a social scientist wishes to interpret, this approach asks what other plausible interpretations are allowed by the research setting and the measurement processes. The more of these, and the more plausible each is, the less validly interpretable is the comparison. Platt (1964) and Hafner and Presswood (1965) have discussed this approach with a focus in the physical sciences.
A social scientist may reduce the number of plausible rival hypotheses in many ways. Experimental methods and adequate indices serve as useful devices for eliminating some rival interpretations. A checklist of commonly relevant threats to validity may point to other ways of limiting the number of viable alternative hypotheses. For some major threats, it is often possible to provide supplementary analyses or to assemble additional data which can rule out a source of possible invalidity.
Backstopping the individual scientist is the critical reaction of his fellow scientists. Where he misses a plausible rival hypothesis, he can expect his colleagues to propose alternative interpretations. This resource is available even in disciplines which are not avowedly scientific. J. H. Wigmore, a distinguished legal scholar, showed an awareness of the criteria of other plausible explanations of data:

    If the potential defect of Inductive Evidence is that the fact offered as the basis of the conclusion may be open to one or more other explanations or inferences, the failure to exclude a single other rational inference would be, from the standpoint of Proof, a fatal defect; and yet, if only that single other inference were open, there might still be an extremely high degree of probability for the Inference desired. . . . The provisional test, then, from the point of view valuing the Inference, would be something like this: Does the evidentiary fact point to the desired conclusion . . . as the inference . . . most plausible or most natural out of the various ones that are conceivable? [1937, p. 25].

The culture of science seeks, however, to systematize the production of rival plausible hypotheses and to extend them to every generalization proposed. While this may be implicit in a field such as law, scientific epistemology requires that the original and competing hypotheses be explicitly and generally stated.
Such a commitment could lead to rampant uncertainty unless some criterion of plausibility were adopted before the rival hypothesis was taken as a serious alternative. Accordingly, each rival hypothesis is a threat only if we can give it the status of a law at least as creditable as the law we seek to demonstrate. If it falls short of that credibility, it is not thereby "plausible" and can be ignored.
APPROXIMATIONS T O KNOWLEDGE 11
10 UNOBTRUSIVE MEASURES

In some logical sense, even in a "true" experimental comparison, an infinite number of potential laws could predict this result. We do not let this logical state of affairs prevent us from interpreting the results. Instead, uncertainty comes only from those unexcluded hypotheses to which we, in the current state of our science, are willing to give the status of established laws: these are the plausible rival hypotheses. While the north-south orientation of planaria may have something to do with conditioning, no interview studies report on the directional orientation of interviewer and interviewee. And they should not.

For those plausible rival hypotheses to which we give the status of laws, the conditions under which they would explain our obtained result also imply specific outcomes for other sets of data. Tests in other settings, attempting to verify these laws, may enable us to rule them out. In a similar fashion, the theory we seek to test has many implications other than that involved in the specific comparison, and the exploration of these is likewise demanded. The more numerous and complex the manifestations of the law, the fewer singular plausible rival hypotheses are available, and the more parsimony favors the law under study.

Our longing is for data that prove and certify theory, but such is not to be our lot. Some comfort may come from the observation that this is not an existential predicament unique to social science. The replacement of Newtonian theory by relativity and quantum mechanics shows us that even the best of physical science experimentation probes theory rather than proves it. Modern philosophies of science as presented by Popper (1935; 1959; 1962), Quine (1953), Hanson (1958), Kuhn (1962), and Campbell (196% make this point clear.

Before discussing a list of some common sources of invalidity, a distinction must be drawn between internal and external validity. Internal validity asks whether a difference exists at all in any given comparison. It asks whether or not an apparent difference can be explained away as some measurement artifact. For true experiments, this question is usually not salient, but even there, the happy vagaries of random sample selection occasionally delude one and spuriously produce the appearance of a difference where in fact none exists. For the rival hypothesis of chance, we fortunately have an elaborated theoretical model which evaluates its plausibility. A p-value describes the darkness of the ever present shadow of doubt. But for index-number comparisons not embedded in a formal experiment, and for the plausible-rival-hypothesis strategy more generally, the threats to internal validity - the argument that even the appearance of a difference is spurious - is a serious problem and the one that has first priority.

External validity is the problem of interpreting the difference, the problem of generalization. To what other populations, occasions, stimulus objects, and measures may the obtained results be applied? The distinction between internal and external validity can be illustrated in two uses of randomization. When the experimentalist in psychology randomly assigns a sample of persons into two or more experimental groups, he is concerned entirely with internal validity - with making it implausible that the luck of the draw produced the resulting differences. When a sociologist carefully randomizes the selection of respondents so that his sample represents a larger population, representativeness or external validity is involved.

The psychologist may be extremely confident that a difference is traceable to an experimental treatment, but whether it would hold up with another set of subjects or in a different setting may be quite equivocal. He has achieved internal validity by his random assignment but not addressed the external validity issue by the chance allocation of subjects.

The sociologist, similarly, has not met all the validity concerns by simply drawing a random sample. Conceding that he has taken a necessary step toward achieving external validity and generalization of his differences, the internal validity problem remains.

Random assignment is only one method of reaching toward internal validity. Experimental-design control, exclusive of randomization, is another. Consider the case of a pretest-posttest field experiment on the effect of a persuasive communication. Randomly choosing those who participate, the social scientist properly wards off some major threats to external validity. But we also know of other validity threats. The first interview in a two-stage study may set into motion attitude change and clarification

processes which would otherwise not have occurred (e.g., Crespi, 1948). If such processes did occur, the comparison of a first and second measure on the same person is internally invalid, for the shift is a measurement-produced artifact.

Even when a measured control group is used, and a persuasive communication produces a greater change in an experimental group, the persuasive effect may be internally valid but externally invalid. There is the substantial risk that the effect occurs only with pretested populations and might be absent in populations lacking the pretest (cf. Schanck & Goodman, 1939; Hovland, Lumsdaine, & Sheffield, 1949; Solomon, 1949). For more extensive discussions of internal and external validity, see Campbell (1957) and Campbell and Stanley (1963).

The distinction between internal and external validity is often murky. In this work, we have considered the two classes of threat jointly, although occasionally detailing the risks separately. The reason for this is that the factors which are a risk for internal validity are often the same as those threatening external validity. While for one scientist the representative sampling of cities is a method to achieve generalization to the United States population, for another it may be an effort to give an internally valid comparison across cities.

In this section, we review frequent threats to the valid interpretation of a difference - common plausible rival hypotheses. They are broadly divided into three groups: error that may be traced to those being studied, error that comes from the investigator, and error associated with sampling imperfections. This section is the only one in which we draw illustrations mainly from the most popular methods of current social science. For that reason, particular attention is paid to those weaknesses which create the need for multiple and alternate methods.

In addition, some other criteria such as the efficiency of the research instrument are mentioned. These are independent of validity, but important for the practical research decisions which must be made.

Reactive Measurement Effect: Error from the Respondent

The most understated risk to valid interpretation is the error produced by the respondent. Even when he is well intentioned and cooperative, the research subject's knowledge that he is participating in a scholarly search may confound the investigator's data. Four classes of this error are discussed here: awareness of being tested, role selection, measurement as a change agent, and response sets.

1. The guinea pig effect - awareness of being tested. Selltiz and her associates (1959) make the observation:

The measurement process used in the experiment may itself affect the outcome. If people feel that they are "guinea pigs" being experimented with, or if they feel that they are being "tested" and must make a good impression, or if the method of data collection suggests responses or stimulates an interest the subject did not previously feel, the measuring process may distort the experimental results [p. 97].

These effects have been called "reactive effect of measurement" and "reactive arrangement" bias (Campbell, 1957; Campbell & Stanley, 1963). It is important to note early that the awareness of testing need not, by itself, contaminate responses. It is a question of probabilities, but the probability of bias is high in any study in which a respondent is aware of his subject status.

Although the methods to be reviewed here do not involve "respondents," comparable reactive effects on the population may often occur. Consider, for example, a potentially nonreactive instrument such as the movie camera. If it is conspicuously placed, its lack of ability to talk to the subjects doesn't help us much. The visible presence of the camera undoubtedly changes behavior, and does so differentially depending upon the labeling involved. The response is likely to vary if the camera has printed on its side "Los Angeles Police Department" or "NBC" or "Foundation Project on Crowd Behavior." Similarly, an Englishman's presence at a wedding in Africa exerts a much more reactive effect on the proceedings than it would on the Sussex Downs.

A specific illustration may be of value. In the summer of 1952, some graduate students in the social sciences at the University of

Chicago were employed to observe the numbers of Negroes and whites in stores, restaurants, bars, theaters, and so on on a south side Chicago street intersecting the Negro-white boundary (East 63rd). This, presumably, should have been a nonreactive process, particularly at the predominantly white end of the street. No questions were asked, no persons stopped. Yet, in spite of this hopefully inconspicuous activity, two merchants were agitated and persistent enough to place calls to the university which somehow got through to the investigators; how many others tried and failed cannot be known. The two calls were from a store operator and the manager of a currency exchange, both of whom wanted assurance that this was some university nosiness and not a professional casing for subsequent robbery (Campbell & Mack, in preparation). An intrusion conspicuous enough to arouse such an energetic reaction may also have been conspicuous enough to change behavior; for observations other than simple enumerations the bias would have been great. But even with the simple act of nose-counting, there is the risk that the area would be differentially avoided. The research mistake was in providing observers with clipboards and log sheets, but their appearance might have been still more sinister had they operated Veeder counters with hands jammed in pockets.

The present monograph argues strongly for the use of archival records. Thinking, perhaps, of musty files of bound annual reports of some prior century, one might regard such a method as totally immune to reactive effects. However, were one to make use of precinct police blotters, going around to copy off data once each month, the quality and nature of the records would almost certainly change. In actual fact, archives are kept indifferently, as a low-priority task, by understaffed bureaucracies. Conscientiousness is often low because of the lack of utilization of the records. The presence of a user can revitalize the process - as well as create anxieties over potentially damaging data (Campbell, 1963a). When records are seen as sources of vulnerability, they may be altered systematically. Accounts thought likely to enter into tax audits are an obvious case (Schwartz, 1961), but administrative records (Blau, 1955) and criminal statistics (Kadish, 1964) are equally amenable to this source of distortion. The selective and wholesale rifling of records by ousted political administrations sets an example of potential reactive effects, self-consciousness, and dissembling on the part of archivists.

These reactive effects may threaten both internal and external validity, depending upon the conditions. If it seems plausible that the reactivity was equal in both measures of a comparison, then the threat is to external validity or generalizability, not to internal validity. If the reactive effect is plausibly differential, then it may generate a pseudo-difference. Thus, in a study (Campbell & McCormack, 1957) showing a reduction in authoritarian attitudes over the course of one year's military training, the initial testing was done in conjunction with an official testing program, while the subsequent testing was clearly under external university research auspices. As French (1955) pointed out in another connection, this difference provides a plausible reactive threat jeopardizing the conclusion that any reduction has taken place even for this one group, quite apart from the external validity problems of explanation and generalization. In many interview and questionnaire studies, increased or decreased rapport and increased awareness of the researcher's goals or decreased fear provide plausible alternative explanations of the apparent change recorded.

The common device of guaranteeing anonymity demonstrates concern for the reactive bias, but this concern may lead to validity threats. For example, some test constructors have collected normative data under conditions of anonymity, while the test is likely to be used with the respondent's name signed. Making a response public, or guaranteeing to hide one, will influence the nature of the response. This has been seen for persuasive communications, in the validity of reports of brands purchased, and for the level of antisocial responses. There is a clear link between awareness of being tested and the biases associated with a tendency to answer with socially desirable responses.

The considerations outlined above suggest that reactivity may be selectively troublesome within trials or tests of the experiment. Training trials may accommodate the subject to the task, but a practice effect may exist that either enhances or inhibits the reactive bias. Early responses may be contaminated, later ones not, or vice versa (Underwood, 1957).

Ultimately, the determination of reactive effect depends on validating studies - few examples of which are currently available.
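The earlier distinction between equal and differential reactivity can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name, the scores, and the sizes of the reactive biases are all hypothetical and are not taken from any study cited here, and the additive model of bias is itself an assumption.

```python
def observed_change(true_first, true_second, bias_first, bias_second):
    """Observed second-minus-first difference when each measurement
    carries its own additive reactive bias (an assumed, simplified model)."""
    return (true_second + bias_second) - (true_first + bias_first)


# Hypothetical attitude scores with no true change between testings.
true_first, true_second = 50.0, 50.0

# Equal reactivity on both occasions: the biases cancel in the comparison.
equal = observed_change(true_first, true_second, bias_first=5.0, bias_second=5.0)

# Differential reactivity: only the first testing is inflated,
# so a pseudo-difference appears although nothing changed.
differential = observed_change(true_first, true_second, bias_first=5.0, bias_second=0.0)

print(equal)         # 0.0
print(differential)  # -5.0
```

Under this additive assumption, equal reactivity leaves the comparison itself intact, so the open question is whether the inflated levels generalize (external validity). Differential reactivity manufactures an apparent five-point reduction where none exists (internal validity).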
Behavior observed under nonreactive conditions must be compared with corresponding behavior in which various potentially reactive conditions are introduced. Where no difference in direction of relationship occurs, the reactivity factor can be discounted.

In the absence of systematic data of this kind, we have little basis for determining what is and what is not reactive. Existing techniques consist of asking subjects in a posttest interview whether they were affected by the test, were aware of the deception in the experiment, and so forth. While these may sometimes demonstrate a method to be reactive, they may fail to detect many instances in which reactivity is a serious contaminant. Subjects who consciously dissemble during an experiment may do so afterward for the same reasons. And those who are unaware of the effects on them at the time of the research may hardly be counted on for valid reports afterwards.

The types of measures surveyed in this monograph have a double importance in overcoming reactivity. In the absence of validation for verbal measures, nonreactive techniques of the kind surveyed here provide ways of avoiding the serious problems faced by more conventional techniques. Given the limiting properties of these "other measures," however, their greatest utility may inhere in their capacity to provide validation for the more conventional measures.

2. Role selection. Another way in which the respondent's awareness of the research process produces differential reaction involves not so much inaccuracy, defense, or dishonesty, but rather a specialized selection from among the many "true" selves or "proper" behaviors available in any respondent.

By singling out an individual to be tested (assuming that being tested is not a normal condition), the experimenter forces upon the subject a role-defining decision - What kind of a person should I be as I answer these questions or do these tasks? In many of the "natural" situations to which the findings are generalized, the subject may not be forced to define his role relative to the behavior. For other situations, he may. Validity decreases as the role assumed in the research setting varies from the usual role present in comparable behavior beyond the research setting. Orne and his colleagues have provided compelling demonstrations of the magnitude of this variable's effect (Orne, 1959; Orne, 1962; Orne & Scheibe, 1964; Orne & Evans, 1965). Orne has noted:

The experimental situation is one which takes place within context of an explicit agreement of the subject to participate in a special form of social interaction known as "taking part in an experiment." Within the context of our culture the roles of subject and experimenter are well understood and carry with them well-defined mutual role expectations [1962, p. 777].

Looking at all the cues available to the respondent attempting to puzzle out an appropriate set of roles or behavior, Orne labeled the total of all such cues the "demand characteristics of the experimental situation." The recent study by Orne & Evans (1965) showed that the alleged antisocial effects induced by hypnosis can be accounted for by the demand characteristics of the research setting. Subjects who were not hypnotized engaged in "antisocial" activities as well as did those who were hypnotized. The behavior of those not hypnotized is traced to social cues that attend the experimental situation and are unrelated to the experimental variable.

The probability of this confounding role assumption varies from one research study to another, of course. The novelty of a test-taking role may be selectively biasing for subjects of different educational levels. Less familiar and comfortable with testing, those with little formal schooling are more likely to produce nonrepresentative behavior. The act of being tested is "more different." The same sort of distortion risk occurs when subject matter is unusual or novel. Subject matter with which the respondent is unfamiliar may produce uncertainty of which role to select. A role-playing choice is more likely with such new or unexpected material.

Lack of familiarity with tests or with testing materials can influence response in different ways. Responses may be depressed because of a lack of training with the materials. Or the response level may be distorted as the subject perceives himself in the rare role of expert.

Both unfamiliarity and "expertness" can influence the character as well as the level of response. It is common to find experimental procedures which augment the experting bias. The instruction which reads, "You have been selected as part of a scientifically

selected sample . . . it is important that you answer the questions . . ." underlines in what a special situation and what a special person the respondent is. The empirical test of the experting hypothesis in field research is the extent of "don't know" replies. One should predict that a set of instructions stressing the importance of the respondent as a member of a "scientifically selected sample" will produce significantly fewer "don't knows" than an instruction set that does not stress the individual's importance.

Although the "special person" set of instructions may increase participation in the project, and thus reduce some concern on the sampling level, it concurrently increases the risk of reactive bias. In science as everywhere else, one seldom gets something for nothing. The critical question for the researcher must be whether or not the resultant sampling gain offsets the risk of deviation from "true" responses produced by the experting role.

Not only does interviewing result in role selection, but the problem or its analogues may exist for any measure. Thus, in a study utilizing conversation sampling with totally hidden microphones, each social setting elicits a different role selection. Conversation samples might thus differ between two cities, not because of any true differences, but rather because of subtle differences in role elicitation of the differing settings employed.

3. Measurement as change agent. With all the respondent candor possible, and with complete role representativeness, there can still be an important class of reactive effects - those in which the initial measurement activity introduces real changes in what is being measured. The change may be real enough in these instances, but be invalidly attributed to any of the intervening events, and be invalidly generalized to other settings not involving a pretest. This process has been deliberately demonstrated by Schanck and Goodman (1939) in a classic study involving information-test taking as a disguised persuasive process. Research by Roper (cited by Crespi, 1948) shows that the well-established "preamble effect" (Cantril, 1944) is not merely a technical flaw in determining the response to the question at hand, but that it also creates attitudes which persist and which are measurable on subsequent unbiased questions. Crespi reports additional research of his own confirming that even for those who initially say "don't know," processes leading to opinion development are initiated.

The effect has been long established in the social sciences. In psychology, early research in transfer of training encountered the threat to internal validity called "practice effects": the exercise provided by the pretest accounted for the gain shown on the posttest. Such research led to the introduction of control groups in studies that had earlier neglected to include them. Similarly, research in intelligence testing showed that dependable gains in test-passing ability could be traced to experience with previous tests even where no knowledge of results had been provided. (See Cane & Heim, 1950, and Anastasi, 1958, pp. 190-191, for reviews of this literature.) Similar gains have been shown in personal "adjustment" scores (Windle, 1954).

While such effects are obviously limited to intrusive measurement methods such as this review seeks to avoid, the possibility of analogous artifacts must be considered. Suppose one were interested in measuring the weight of women in a secretarial pool, and their weights were to be the dependent variable in a study on the effects of a change from an all-female staff to one including men. One might for this purpose put free weight scales in the women's restroom, with an automatic recording device inside. However, the recurrent availability of knowledge of one's own weight in a semisocial situation would probably act as a greater change agent for weight than would any experimental treatment that might be under investigation. A floor-panel treadle would be better, recording weights without providing feedback to the participant, possibly disguised as an automatic door-opener.

4. Response sets. The critical literature on questionnaire methodology has demonstrated the presence of several irrelevant but lawful sources of variance. Most of these are probably applicable to interviews also, although this has been less elaborately demonstrated to date. Cronbach (1946) has summarized this literature, and evidence continues to show its importance (e.g., Jackson & Messick, 1957; Chapman & Bock, 1958).

Respondents will more frequently endorse a statement than disagree with its opposite (Sletto, 1937). This tendency differs widely and consistently among individuals, generating the reliable source of variance known as acquiescence response set. Rorer (1965) has recently entered a dissent from this point of view. He validly notes the evidence indicating that acquiescence or yea-

saying is not a totally general personality trait elicitable by items of any content. He fails to note that, even so, the evidence clearly indicates the methodological problem that direction of wording lawfully enhances the correlation between two measures when shared, and depresses the correlation when running counter to the direction of the correlation of the content (Campbell, 1965b).

Another idiosyncrasy, dependably demonstrated over varied multiple-choice content, is the preference for strong statements versus moderate or indecisive ones. Sequences of questions asked in very similar format produce stereotyped responses, such as a tendency to endorse the righthand or the lefthand response, or to alternate in some simple fashion. Furthermore, decreasing attention produces reliable biases from the order of item presentation.

Response biases can occur not only for questionnaires or public opinion polls, but also for archival records such as votes (Bain & Hecock, 1957). Still more esoteric observational or erosion measures face similar problems. Take the example of a traffic study.

Suppose one wanted to obtain a nonreactive measure of the relative attractiveness of paintings in an art museum. He might employ an erosion method such as the relative degree of carpet or floor-tile wear in front of each painting. Or, more elaborately, he might install invisible photoelectric timers and counters. Such an approach must also take into account irrelevant habits which affect traffic flow. There is, for example, a general right-turn bias upon entering a building or room. When this is combined with time deadlines and fatigue (Do people drag their feet more by the time they get to the paintings on the left side of the building?), there probably is a predictably biased response tendency. The design of museums tends to be systematic, and this, too, can bias the measures. The placement of an exit door will consistently bias the traffic flow and thus confound any erosion measure unless it is controlled. (For imaginative and provocative observational studies on museum behavior see Robinson, 1928; Melton, 1933a; Melton, 1933b; Melton, 1935; Melton, 1936; Melton, Feldman, & Mason, 1936.)

Each of these four types of reactive error can be reduced by employing research measures which do not require the cooperation of the respondent and which are "blind" to him. Although we urge more methodological research to make known the degree of error that may be traced to reactivity, our inclination now is to urge the use of compensating measures which do not contain the reactive risk.

Error from the Investigator

To some degree, error from the investigator was implicit in the reactive error effects. After all, the investigator is an important source of cues to the respondent, and he helps to structure the demand characteristics of the interview. However, in these previous points, interviewer character was unspecified. Here we deal with effects that vary systematically with interviewer characteristics, and with instrument errors totally independent of respondents.

5. Interviewer effects. It is old news that the characteristics of the interviewer can contribute a substantial amount of variance to a set of findings. Interviewees respond differentially to visible cues provided by the interviewer. Within any single study, this variance can produce a spurious difference. The work of Katz (1942) and Cantril (1944) early demonstrated the differential effect of the race of the interviewer, and that bias has been more recently shown by Athey and his associates (1960). Riesman and Ehrlich (1961) reported that the age of the interviewer produced a bias, with the number of "unacceptable" (to the experimenter) answers higher when questions were posed by younger interviewers. Religion of the interviewer is a possible contaminant (Robinson & Rohde, 1946; Hyman et al., 1954), as is his social class (Riesman, 1956; Lenski & Leggett, 1960). Benney, Riesman, and Star (1956) showed that one should consider not only main effects, but also interactions. In their study of age and sex variables they report: "Male interviewers obtain fewer responses than female, and fewest of all from males, while female interviewers obtain their highest responses from men, except for young women talking to young men" (p. 143).

The evidence is overwhelming that a substantial number of biases are introduced by the interviewer (see Hyman et al., 1954; Kahn & Cannell, 1957). Some of the major biases, such as race, are easily controllable; other biases, such as the interaction of age

and sex, are less easily handled. If we heeded all the known biases, without considering our ignorance of major interactions, there could no longer be a simple survey. The understandable action by most researchers has been to ignore these biases and to assume them away. The biases are lawful and consistent, and all research employing face-to-face interviewing or questionnaire administration is subject to them. Rather than flee by assumptions, the experimenter may use alternative methodologies that let him flee by circumvention.

6. Change in the research instrument. The measuring (data-gathering) instrument is frequently an interviewer, whose characteristics, we have just shown, may alter responses. In panel studies, or those using the same interviewer at two or more points in time, it is essential to ask: To what degree is the interviewer or experimenter the same research instrument at all points of the research?

Just as a spring scale becomes fatigued with use, reading "heavier" a second time, an interviewer may also measure differently at different times. His skill may increase. He may be better able to establish rapport. He may have learned necessary vocabulary. He may loaf or become bored. He may have increasingly strong expectations of what a respondent "means" and code differently with practice. Some errors relate to recording accuracy, while others are linked to the nature of the interviewer's interpretation of what transpired. Either way, there is always the risk that the interviewer will be a variable filter over time and experience.

Even when the interviewer becomes more competent, there is potential trouble. Although we usually think of difficulty only when the instrument weakens, a difference in competence between two waves of interviewing, either increasing or decreasing, can yield spurious effects. The source of error is not limited to interviewers, and every class of measurement is vulnerable to wavering calibration. Suicides in Prussia jumped 20 per cent between 1882 and 1883. This clearly reflected a change in record-keeping, not a massive increase in depression. Until 1883 the records were kept by the police, but in that year the job was transferred to the civil service (Halbwachs, 1930; cited in Selltiz et al., 1959). Archivists undoubtedly drift in recording standards, with occasional administrative reforms in conscientiousness altering the output of the "instrument" (Kitsuse & Cicourel, 1963).

Where human observers are used, they have fluctuating adaptation levels and response thresholds (Holmes, 1958; Campbell, 1961). Rosenthal, in an impressive series of commentary and research, has focused on errors traceable to the experimenter himself. Of particular interest is his work on the influence of early data returns upon analysis of subsequent data (Rosenthal et al., 1963. See also Rosenthal, 1963; Rosenthal & Fode, 1963; Rosenthal & Lawson, 1963; Rosenthal, 1964; Kintz et al., 1965).

Varieties of Sampling Error

Historically, social science has examined sampling errors as a problem in the selection of respondents. The person or group has been the critical unit, and our thinking has been focused on a universe of people. Often a sample of time or space can provide a practical substitute for a sample of persons. Novel methods should be examined for their potential in this regard. For example, a study of the viewing of bus advertisements used a time-stratified, random triggering of an automatic camera pointed out a window over the bus ad (Politz, 1959). One could similarly take a photographic sample of bus passengers modulated by door entries as counted by a photo cell. A photo could be taken one minute after the entry of every twentieth passenger. For some methods, such as the erosion methods, total population records are no more costly than partial ones. For some archives, temporal samples or agency samples are possible. For voting records, precincts may be sampled. But for any one method, the possibilities should be examined.

We look at sampling in this section from the point of view of restrictions on reaching people associated with various methods and the stability of populations over time and areas.

7. Population restrictions. In the public-opinion-polling tradition, one conceptualizes a "universe" from which a representative sample is drawn. This model gives little or no formal attention to the fact that only certain universes are possible for any given method. A method-respondent interaction exists - one that gives each method a different set of defining boundaries for its universe. One reason so little attention is given to this fact is that,
as methods go, public opinion polling is relatively unrestricted. Yet even here there is definite universe rigidity, with definite restrictions on the size and character of the population able to be sampled.

In the earliest days of polling, people were questioned in public places, probably excluding some 80 per cent of the total population. Shifting to in-home interviewing with quota controls and no callbacks still excluded some 60 per cent: perhaps 5 per cent unaccessible in homes under any conditions, 25 per cent not at home, 25 per cent refusals, and 5 per cent through interviewers' reluctance to approach homes of extreme wealth or poverty and a tendency to avoid fourth-floor walkups.

Under modern probability sampling with callbacks and household designation, perhaps only 15 per cent of the population is excluded: 5 per cent are totally inaccessible in private residences (e.g., those institutionalized, hospitalized, homeless, transient, in the military, mentally incompetent, and so forth), another 10 per cent refuse to answer, are unavailable after three callbacks, or have moved to no known address. A 20 per cent figure was found in the model Elmira study in its first wave (Williams, 1950), although other studies have reported much lower figures. Ross (1963) has written a general statement on the problem of inaccessibility, and Stephan and McCarthy (1958), in their literature survey, show from 3 to 14 per cent of sample populations of residences inaccessible.

Also to be considered in population restriction is the degree to which the accessible universe deviates in important parameters from the excluded population. This bias is probably minimal in probability sampling with adequate callbacks, but great with catch-as-catch-can and quota samples. Much survey research has centered on household behavior, and the great mass of probability approaches employ a prelisted household as the terminal sampling unit. This frequently requires the enlistment of a household member as a reporter on the behavior of others. Since those who answer doorbells overrepresent the old, the young, and women, this can be a confounding error.

When we come to more demanding verbal techniques, the universe rigidity is much greater. What proportion of the population is available for self-administered questionnaires? Payment for filling out the questionnaire reduces the limitations a bit, but a money reward is selectively attractive, at least at the rates most researchers pay. A considerable proportion of the populace is functionally illiterate for personality and attitude tests developed on college populations.

Not only does task-demandingness create population restrictions, differential volunteering provides similar effects, interacting in a particularly biasing way when knowledge of the nature of the task is involved (Capra & Dittes, 1962). Baumrind (1964) writes of the motivation of volunteers and notes, "The dependent attitude of most subjects toward the experimenter is an artifact of the experimental situation as well as an expression of some subjects' personal need systems at the time they volunteer" (p. 421).

The curious, the exhibitionistic, and the succorant are likely to overpopulate any sample of volunteers. How secure a base can volunteers be with such groups overrepresented and the shy, suspicious, and inhibited underrepresented? The only defensible position is a probability sample of the units to which the findings will be generalized. Even conscripting sophomores may be better than relying on volunteers.

Returning to the rigidity of sampling, what proportion of the total population is available for the studio test audiences used in advertising and television program evaluation? Perhaps 2 per cent. For mailed questionnaires, the population available for addressing might be 95 per cent of the total in the United States, but low-cost, convenient mailing lists probably cover no more than 70 per cent of the families through automobile registration and telephone directories. The exclusion is, again, highly selective. If, however, we consider the volunteering feature, where 10 per cent returns are typical, the effective population is a biased 7 per cent selection of the total. The nature of this selective-return bias, according to a recent study (Vincent, 1964), includes a skewing of the sample in favor of lower-middle-class individuals drawn from unusually stable, "happy" families.

There are more households with television in the United States than there are households with telephones (or baths). In any given city, one is likely to find more than 15 per cent of the households excluded in a telephone subscription list, and most of these are at the bottom of the socioeconomic scale. Among subscribers, as many as 15 per cent in some areas do not list their
number, and an estimate of 5 per cent over all is conservative. Cooper (1964) found an over-all level of 6 per cent deliberately not listed and an additional 12 per cent not in the directory because of recent installations. The unlisted problem can be defeated by a system of random-digit dialing, but this increases the cost at least tenfold and requires a prior study of the distribution of exchanges. Among a sample of known numbers, some 50 per cent of dialings are met with busy signals and "not-at-homes." Thus, for a survey without callbacks, the accessible population of 80 per cent (listed-phone households) reduces to 40 per cent. If individuals are the unit of analysis, the effective sampling rate, without callbacks, may drop to 20 per cent. Random-digit dialing will help; so, too, will at least three callbacks, but precision can be achieved only at a high price. The telephone is not so cheap a research instrument as it first looks.

Sampling problems of this sort are even more acute for the research methods considered in the present monograph. Although a few have the full population access of public opinion surveys, most have much more restricted populations. Consider, for example, the sampling of natural conversations. What are the proportions of men and women whose conversations are accessible in public places and on public transport? What is the representativeness of social class or role?

8. Population stability over time. Just as internal validity is more important than external validity, so, too, is the stability of a population restriction more important than the magnitude of the restriction. Examine conversation sampling on a bus or streetcar. The population represented differs on dry days and snowy days, in winter and spring, as well as by day of the week. These shifts would in many instances provide plausible rival explanations of shifts in topics of conversation. Sampling from a much narrower universe would be preferable if the population were more stable over time, as, say, conversation samples from an employees' restroom in an office building. Comparisons of interview survey results over time periods are more troubled by population instability than is generally realized, because of seasonal layoffs in many fields of employment, plus status-differentiated patterns of summer and winter vacations. An extended discussion of time sampling has been provided by Brookover and Back (1965).

9. Population stability over areas. Similarly, research populations available to a given method may vary from region to region, providing a more serious problem than a population restriction common to both. Thus, for a comparison of attitudes between New York and Los Angeles, conversation sampling in buses and commuter trains would tap such different segments of the communities as to be scarcely worth doing. Again, a comparison of employees' washrooms in comparable office buildings would provide a more interpretable comparison. Through the advantage of background data to check on some dimensions of representativeness, public opinion surveys again have an advantage in this regard.

Any enumeration of sources of invalidity is bound to be incomplete. Some threats are too highly specific to a given setting and method to be generalized, as are some opportunities for ingenious measurement and control. This list contains a long series of general threats that apply to a broad body of research method and content. It does not say that additional problems cannot be found.

The population restrictions discussed here are apt to seem so severe as to traumatize the researcher and to lead to the abandonment of the method. This is particularly so for one approaching social science with the goal of complete description. Such trauma is, of course, far from our intention. While discussion of these restrictions is a necessary background to their intelligent use and correction, there is need here for a parenthesis forestalling excessive pessimism.

First, it can be noted that a theory predicting a change in civic opinion, due to an event and occurring between two time periods, might be such that this opinion shift could be predicted for many partially overlapping populations. One might predict changes on public opinion polls within that universe, changes in sampled conversation on commuter trains for a much smaller segment,
changes in letters mailed to editors and the still more limited letters published by editors, changes in purchase rates of books on relevant subjects by that minute universe, and so on. In such an instance, the occurrence of the predicted shift on any one of these meters is confirmatory and its absence discouraging. If the effect is found on only one measure, it probably reflects more on the method than on the theory (e.g., Burwen & Campbell, 1957; Campbell & Fiske, 1959). A more complicated theory might well predict differential shifts for different meters, and, again, the evidence of each is relevant to the validity of the theory. The joint confirmation between pollings of high-income populations and commuter-train conversations is much more validating than either taken alone, just because of the difference between the methods in irrelevant components.

The "outcropping" model from geology may be used more generally. Any given theory has innumerable implications and makes innumerable predictions which are unaccessible to available measures at any given time. The testing of the theory can only be done at the available outcroppings, those points where theoretical predictions and available instrumentation meet. Any one such outcropping is equivocal, and all types available should be checked. The more remote or independent such checks, the more confirmatory their agreement.

Within this model, science opportunistically exploits the available points of observation. As long as nature abhorred a vacuum up to 33 feet of water, little research was feasible. When manufacturing skills made it possible to represent the same abhorrence by 76 centimeters of mercury in a glass tube, a whole new outcropping for the checking of theory was made available. The telescope in Galileo's hands, the microscope, the induction coil, the photographic emulsion of silver nitrate, and the cloud chamber all represent partial new outcroppings available for the verification of theory. Even where several of these are relevant to the same theory, their mode of relevance is quite different and short of a complete overlap. Analogously, social science methods with individually restricted and nonidentical universes can provide collectively valuable outcroppings for the testing of theory.

The goal of complete description in science is particularly misleading when it is assumed that raw data provide complete description. Theory is necessarily abstract, for any given event is so complex that its complete description may demand many more theories than are actually brought to bear on it, or than are even known at any given stage of development. But theories are more complete descriptions than obtained data, since they describe processes and entities in their unobserved as well as in their observed states. The scintillation counter notes but a small and nonrepresentative segment of a meson's course. The visual data of an ordinary object are literally superficial. Perceiving an object as solid or vaporous, persistent or transient, involves theory going far beyond the data given. The raw data, observations, field notes, tape recordings, and sound movies of a social event are but transient superficial outcroppings of events and objects much more continuously and completely (even if abstractly) described in the social scientist's theory. Tycho Brahe and Kepler's observations provided Kepler with only small fragments of the orbit of Mars, for a biased and narrow sampling of times of day, days, and years. From these he constructed a complete description through theory. The fragments provided outcroppings sufficiently stubborn to force Kepler to reject his preferred theory. The data were even sufficient to cause the rejection of Newton's later theory had Einstein's better-fitting theory then been available.

So if the restraints on validity sometimes seem demoralizing, they remain so only as long as one set of data, one type of method, is considered separately. Viewed in consort with other methods, matched against the available outcroppings for theory testing, there can be strength in converging weakness.

Often a choice among methods is delimited by the relative ability of different classes of measurement to penetrate into content areas of research interest. In the simplest instance, this is not so much a question of validity as it is a limitation on the utility of the measure. Each class of research method, be it the questionnaire or hidden observation, has rigidities on the content it can cover. These rigidities can be divided, as were population restrictions, into those linked to an interaction between method and materials, those associated with time, and those with physical area.

10. Restrictions on content. If we adopt the research strategy of combining different classes of measurement, it becomes important to understand what content is and is not feasible or practical for each overlapping approach.

Observational methods can be used to yield an index of Negro-white amicability by computing the degree of "aggregation" or nonrandom clustering among mixed groups of Negroes and whites. This method could also be used to study male-female relations, or army-navy relations in wartime when uniforms are worn on liberty. But these indices of aggregation would be largely unavailable for Catholic-Protestant relations or for Jewish-Christian relations. Door-to-door solicitation of funds for causes relevant to attitudes is obviously plausible, but available for only a limited range of topics. For public opinion surveys, there are perhaps tabooed topics (although research on birth control and venereal disease has shown these to be fewer than might have been expected). More importantly, there are topics on which people are unable to report but which a social scientist can reliably observe.

Examples of this can be seen in the literature on verbal reinforcers in speech and in interviews. (For a review of this literature, see Krasner, 1958, as well as Hildum & Brown, 1956; Matarazzo, 1962a). A graphic display of opportunistic exploitation of an "outcropping" was displayed recently by Matarazzo and his associates (1964). They took tapes of the speech of astronauts and ground-communicators for two space flights and studied the duration of the ground-communicator's unit of speech to the astronauts. The data supported their expectations and confirmed findings from the laboratory. We are not sure if an orbital flight should be considered a "natural setting" or not, but certainly the astronaut and his colleagues were not overly sensitive to the duration of individual speech units. The observational method has consistently produced findings on the effect of verbal reinforcers unattainable by direct questioning.

It is obvious that secondary records and physical evidence are high in their content rigidity. The researcher cannot go out and generate a new set of historical records. He may discover a new set, but he is always restrained by what is available. We cite examples later which demonstrate that this weakness is not so great as is frequently thought, but it would be naive to suggest that it is not present.

11. Stability of content over time. The restrictions on content just mentioned are often questions of convenience. The instability of content, however, is a serious concern for validity. Consider conversation sampling again: if one is attending to the amount of comment on race relations, for example, the occurrence of extremely bad weather may so completely dominate all conversations as to cause a meaningless drop in racial comments. This is a typical problem for index-making. In such an instance, one would probably prefer some index such as the proportion of all race comments that were favorable. In specific studies of content variability over time, personnel-evaluation studies have employed time sampling with considerable success. Observation during a random sample of a worker's laboring minutes efficiently does much to describe both the job and the worker (R. L. Thorndike, 1949; Ghiselli & Brown, 1955; Whisler & Harper, 1962).

Public opinion surveys have obvious limitations in this regard which have led to the utilization of telephone interviews and built-in dialing recorders for television and radio audience surveys (Lucas & Britt, 1950; Lucas & Britt, 1963). By what means other than a recorder could one get a reasonable estimate of the number of people who watch The Late Show?

12. Stability of content over area. Where regional comparisons are being made, cross-sectional stability in the kinds of contents elicited by a given method is desirable.

Take the measurement of interservice rivalry as a research question. As suggested earlier, one could study the degree of mingling among men in uniform, or study the number of barroom fights among men dressed in different uniforms. To have a valid regional comparison, one must assume the same incidence of men wearing uniforms in public places when at liberty. Such an assumption is probably not justified, partly because of past experience in a given area, partly because of proximity to urban centers. If a
cluster of military bases are close to a large city, only a selective group wear uniforms off duty, and they are more likely to be the belligerent ones. Another comparison region may have the same level of behavior, but be less visible.

The effect of peace is to reduce the influence of the total level of the observed response, since mufti is more common. But if all the comparisons are made in peacetime, it is not an issue. The problem occurs only if one elected to study the problem by a time-series design which cut across war and peace. To the foot-on-rail researcher, the number of outcroppings may vary because of war, but this is no necessary threat to internal validity.

Sampling of locations, such as bus routes, waiting rooms, shop windows, and so forth, needs to be developed to expand access to both content and populations. Obviously, different methods present different opportunities and problems in this regard. Among the few studies which have seriously attempted this type of sampling, the problem of enumerating the universe of such locations has proved extremely difficult (James, 1951). Location sampling has, of course, been practiced more systematically with pre-established enumerated units such as blocks, census tracts, and incorporated areas.

There are differences among methods which have nothing to do with the interpretation of a single piece of research. These are familiar issues to working researchers, and are important ones for the selection of procedures. Choosing between two different methods which promise to yield equally valid data, the researcher is likely to reject the more time-consuming or costly method. Also, there is an inclination toward those methods which have sufficient flexibility to allow repetition if something unforeseen goes wrong, and which further hold potential for producing internal checks on validity or sampling errors.

13. Dross rate. In any given interview, a part of the conversation is irrelevant to the topic at hand. This proportion is the dross rate. It is greater in open-ended, general, free-response interviewing than it is in structured interviews with fixed-answer categories; by the same token, the latter are potentially the more reactive. But in all such procedures, the great advantage is the interviewer's power to introduce and reintroduce certain topics. This ability allows a greater density of relevant data. At the other extreme is unobserved conversation sampling, which is low-grade ore. If one elected to measure attitudes toward Russia by sampling conversations on public transportation, a major share of experimental effort could be spent in listening to comparisons of hairdressers or discussions of the Yankees' one-time dominance of the American League. For a specific problem, conversation sampling provides low-grade ore. The price one must pay for this ore, in order to get a naturally occurring response, may be too high for the experimenter's resources.

14. Access to descriptive cues. In evaluating methods, one should consider their potential for generating associated validity checks, as well as the differences in the universes they tap. Looking at alternative measures, what other data can they produce that give descriptive cues on the specific nature of the method's population? Internal evidence from early opinion polls showed their population biases when answers about prior voting and education did not match known election results and census data.

On this criterion, survey research methods have great advantages, for they permit the researcher to build in controls with ease. Observational procedures can check restrictions only for such gross and visible variables as sex, approximate age, and conspicuous ethnicity. Trace methods such as the relative wear of floor tiles offer no such intrinsic possibility. However, it is possible in many instances to introduce interview methods in conjunction with other methods for the purpose of ascertaining population characteristics. Thus, commuter-train passengers, window shoppers, and waiting-room conversationalists can, on a sample of times of day, days of the week, and so on, be interviewed on background data, probably without creating any serious reactive effects for measures taken on other occasions.

15. Ability to replicate. The questionnaire and the interview are particularly good methods because they permit the investigator to replicate his own or someone else's research. There is a tolerance for error when one is producing new data that does not exist when working with old. If a confounding event occurs or materials are spoiled, one can start another survey repeating the procedure. Archives and physical evidence are more restricted, with only a fixed amount of data available. This may be a large amount, allowing split-sample replication, but it may also be a one-shot occurrence that permits only a single analysis. In the latter case, there is no second chance, and the materials may be completely consumed methodologically.

The one-sample problem is not an issue if data are used in a clearcut test of theory. If the physical evidence or secondary records are an outcropping where the theory can be probed, the inability to produce another equivalent body of information is secondary. The greater latitude of the questionnaire and interview, however, permits the same statement and provides in addition a margin for error.

So long as we maintain, as social scientists, an approach to comparisons that considers compensating error and converging corroboration from individually contaminated outcroppings, there is no cause for concern. It is only when we naively place faith in a single measure that the massive problems of social research vitiate the validity of our comparisons. We have argued strongly in this chapter for a conceptualization of method that demands multiple measurement of the same phenomenon or comparison. Overreliance on questionnaires and interviews is dangerous because it does not give us enough points in conceptual space to triangulate. We are urging the employment of novel, sometimes "oddball" methods to give those points in space. The chapters that follow illustrate some of these methods, their strengths and weaknesses, and their promise for imaginative research.

Physical Traces: Erosion and Accretion

The fog had probably just cleared. The singular Sherlock Holmes had been reunited with his old friend, Dr. Watson (after one of Watson's marriages), and both walked to Watson's newly acquired office. The practice was located in a duplex of two physicians' suites, both of which had been for sale. No doubt sucking on his calabash, Holmes summarily told Watson that he had made a wise choice in purchasing the practice that he did, rather than the one on the other side of the duplex. The data? The steps were more worn on Watson's side than on his competitor's.

In this chapter we look at research methods geared to the study of physical traces surviving from past behavior. Physical evidence is probably the social scientist's least-used source of data, yet because of its ubiquity, it holds flexible and broad-gauged potential.

It is reasonable to start a chapter on physical evidence by talking of Sherlock Holmes. He and his paperbacked colleagues could teach us much. Consider that the detective, like the social scientist, faces the task of inferring the nature of past behavior (Who did the Lord of the Manor in?) by the careful generation and evaluation of current evidence. Some evidence he engineers (by questioning), some he observes (Does the witness develop a tic?), some he develops from extant physical evidence (Did the murderer leave his eyeglasses behind?). From the weighing of several different types of hopefully converging evidence, he makes a decision on the plausibility of several rival hypotheses. For example:

H1: The butler did it.
H2: It was the blacksheep brother.
H3: He really committed suicide.
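
The weighing of converging evidence against rival hypotheses can be sketched numerically. What follows is only an illustration under stated assumptions, not a procedure the text prescribes: it assumes the weighing is done by Bayes' rule, that the three types of evidence are independent of one another given each hypothesis, and that the prior and likelihood figures (all invented here) are reasonable. The function names `update` and `weigh` are hypothetical.

```python
# Hypothetical sketch: revising the plausibility of rival hypotheses as
# several independent pieces of evidence are weighed. All numbers are invented.

def update(prior, likelihood):
    """Return revised plausibilities via Bayes' rule.

    prior:      dict mapping hypothesis -> current plausibility (sums to 1)
    likelihood: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is impossible under every hypothesis")
    return {h: value / total for h, value in unnormalized.items()}


def weigh(prior, evidence):
    """Apply each piece of evidence in turn (assumes independence)."""
    plausibility = dict(prior)
    for likelihood in evidence:
        plausibility = update(plausibility, likelihood)
    return plausibility


# Equal initial plausibility for the three rival hypotheses.
prior = {"H1 butler": 1 / 3, "H2 brother": 1 / 3, "H3 suicide": 1 / 3}

# One assumed likelihood table per type of evidence.
evidence = [
    {"H1 butler": 0.6, "H2 brother": 0.3, "H3 suicide": 0.1},  # engineered (questioning)
    {"H1 butler": 0.5, "H2 brother": 0.4, "H3 suicide": 0.1},  # observed (a witness's tic)
    {"H1 butler": 0.7, "H2 brother": 0.2, "H3 suicide": 0.1},  # physical (eyeglasses left behind)
]

posterior = weigh(prior, evidence)
for hypothesis, p in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{hypothesis}: {p:.3f}")
```

Under these invented figures no single piece of evidence is decisive, yet their agreement moves the plausibility of one hypothesis well beyond what any one of them achieves alone, which is the sense in which converging evidence outweighs a single measure.
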
This chapter discusses only the physical evidence, those pieces of data not specifically produced for the purpose of comparison and inference, but available to be exploited opportunistically by the alert investigator. It should be emphasized that physical evidence has greatest utility in consort with other methodological approaches. Because there are easily visible population and content restrictions associated with physical evidence, such data have largely been ignored. It is difficult even to consider a patently weak source of data when research strategy is based on single measures and definitional operationism. The visibly stronger questionnaire or interview looks to be more valid, and it may be if only one measure is taken. In a multimethod strategy, however, one does not have to exclude data of any class or degree solely because it is weak. If the weaknesses are known and considered, the data are usable.

It may be helpful to discriminate between two broad classes of physical evidence, a discrimination similar to that between the intaglio and the cameo. On one hand, there are the erosion measures, where the degree of selective wear on some material yields the measure. Holmes's solution of the stairs on the duplex is an example. On the other hand, there are accretion measures, where the research evidence is some deposit of materials. Immediately one thinks of anthropologists working with refuse piles and pottery shards. The trace measures could be further subdivided according to the number and pattern of units of evidence. We might have two subclasses: remnants, where there is only one or a few indicators of the past behavior available, and series, where there is an accumulative body of evidence with more units, possibly deposited over a longer period of time. For purposes of simplicity, it is easier to consider just the two main divisions of erosion and accretion.

Let us look first at some erosion measures. A committee was formed to set up a psychological exhibit at Chicago's Museum of Science and Industry. The committee learned that the vinyl tiles around the exhibit containing live, hatching chicks had to be replaced every six weeks or so; tiles in other areas of the museum went for years without replacement (Duncan, 1963). A comparative study of the rate of tile replacement around the various museum exhibits could give a rough ordering of the popularity of the exhibits. Note that although erosion is the measure, the knowledge of the erosion rate comes from a check of the records of the museum's maintenance department.

In addition to this erosion measure, unobtrusive observation studies showed that people stand before the chick display longer than they stand before any of the other exhibits. With this additional piece of evidence, the question becomes whether or not the erosion is a simple result of people standing in one location and shuffling their feet, or whether it really does indicate a greater frequency of different people viewing the exhibit. Clearly an empirical question. The observation and the tile erosion are two partially overlapping measures, each of which can serve as a check on the other. The observation material is more textured for studies of current behavior, because it can provide information on both the number of viewers and how long each views the display. The erosion data cannot index the duration of individual viewing, but they permit an analysis of popularity over time, and do so with economy and efficiency.

Those readers who have attended American Psychological Association meetings have doubtless observed the popularity of conditioning exhibits displaying a live pigeon or monkey (a Skinner-boxed baby has also done well in recent years). This observation offers independent evidence for the general principle that dynamic exhibits draw more viewers than static ones. The hypothesis could be tested further by more careful comparison of tile wear about dynamic and static exhibits in the museum, making corrections for their positional distribution. At least part of the correction would be drawn from the previously mentioned research by Melton (1936) on response sets systematically present in museum traffic flow.

The wear on library books, particularly on the corners where the page is turned, offers an example of a possible approach that illustrates a useful overlap measure. One of the most direct and obvious ways to learn the popularity of books is to check how many times each of a series of titles has been removed from a library. This is an excellent measure and uses the records already maintained for other purposes. But it is only an indirect measure for the investigator who wants to know the relative amount of reading a series of books get. They may be removed from the library, but not read. It is easy to establish whether or not there is a close relationship between degree of wear and degree of checkouts from the library. If this relationship is positive and high, the hypothesis that books are taken out but selectively not read is accounted for. Note that the erosion measure also allows one to study the relative use of titles which are outside the span of the library-withdrawal measure. Titles placed on reserve, for example, are typically not

examples of behavior traces which were laid down "naturally," without the intervention of the social scientist.

The detective-story literature, again, is instructive. In a favorite example (Barzun, 1961), a case hinged on determining where a car came from. It was solved (naturally) by studying the frequencies to which the car's radio buttons were tuned. By triangulation of the frequencies, from a known population of commercial-station frequencies, the geographic source of the car was learned. Here was a remnant of past behavior (someone setting the buttons originally) that included several component elements collectively
noted for individual use by library bookkeeping. An alternative
considered to reach a solution. Unimaginatively, most detective
accretion measure is to note the amount of dust that has accumu-
fiction considers much simpler and less elegant solutions - such as
lated on the books studied.
determining how fast a car was going by noting the degree to which
Mosteller (1955) conducted a shrewd and creative study on insects are splattered on the windshield.
the degree to which different sections of the International En- Modern police techniques include many trace methods, for
cyclopedia of th.e Social Sciences were read. He measured the example, making complex analyses of soil from shoes and clothing
wear and tear on separate sections by noting dirty edges of pages to establish a suspect's probable presence at the scene of a crime.
as markers, and observed the frequency of dirt smudges, finger One scientist (Forshufvud, 1961) uncovered the historic murder of
markings, and underlining on pages. In some cases of very heavy Napoleon in 1821 on the basis of arsenic traces in remains of his
use, ". . . dirt had noticeably changed the color of the page so that hair.
[some articles] are immediately distinguishable from the rest of Radio-dial settings are being used in a continuing audience-
the volume" (p. 171). Mosteller studied volumes at both Harvard measurement study, with mechanics in an automotive service
and the University of Chicago, and went to three libraries at each department the data-gatherers (Anonymous, 1962). A Chicago
institution. He even used the Encyclopaedia Britannica as a automobile dealer, Z. Frank, estimates the popularity of different
control. radio stations by having mechanics record the position of the dial
A variation of the erosion method has been suggested by in all cars brought in for service. More than 50,000 dials a year are
Brown (1960) for studying the food intake of institutionalized c,hecked, with less than 20 per cent duplication of dials. These data
patients-frequently a difficult task. If the question is one of over- are then used to select radio stations to carry the dealer's advertis-
all food consumption of some administrative unit (say, a ward ing. The generalization of these findings is sound if (1)the goal of
under special treatment conditions compared with a control ward), the radio propaganda is to reach the same type of audience which
Brown makes the engagingly simple suggestion of weighing food now comes to the dealership, and (2) a significant number of cars
trucks that enter and garbage trucks that leave. The unit could be have radios. If many of the cars are without radios, then a partial
varied to be an individual tray of food, the aggregate consignment and possibly biased estimate of the universe is obtained. It is
to a floor or ward, or the total input and output of the hospital. reported, "We find a high degree of correlation between what the
rating people report and our own dial setting research" (p. 83).
The same approach could be used to study the selective
appeal of different radio stations. Knowing that various shopping
There are large numbers of useful natural remnants of past centers draw customers from quite discrete economic populations,
behavior that can be exploited. We can examine now a few one could observe dial settings in cars parked in the shopping
centers and compare them. As a validation check on the discrimination among the centers, one could (in metropolitan areas) note local tax stickers affixed to the automobiles and compare these with the economic data reported for tax areas by the United States Census.

Dial checking is difficult in public areas, because one cannot easily enter the car and make a close observation. And the locking of cars is a selective phenomenon, even if one would risk entering an unlocked car. Sechrest (1965b) has reported that a significantly larger proportion of college women lock their cars than do college men. He learned this by checking doors of automobiles parked adjacent to men's and women's dormitories.

DuBois (1963) reports on a 1934 study which estimated an advertisement's readership level by analyzing the number of different fingerprints on the page. The set of prints was a valid remnant, and the analysis revealed a resourceful researcher. Compare this with the anthropologist's device of estimating the prior population of an archeological site by noting the size of floor areas (Naroll, 1962). Among the consistently detectable elements in a site are good indicators of the floor areas of residences. When these can be keyed to knowledge of the residential and familial patterns of the group, these partial data, these remnants, serve as excellent population predictors.

Other remnants can provide evidence on the physical characteristics of populations no longer available for study. Suits of armor, for example, are indicators of the height of medieval knights.

The estimable study of McClelland (1961), The Achieving Society, displays a fertile use of historical evidence. Most of the data come from documentary materials such as records of births and deaths, coal imports, shipping levels, electric-power consumption, and remaining examples of literature, folk tales, and children's stories. We consider such materials in our discussion of archival records, but they are, in one sense, a special case of trace analysis. McClelland further reports on achievement-level estimates derived from ceramic designs on urns, and he indexes the geographic boundaries of Greek trade by archeological finds of vases. Sensitive to the potential error in such estimates, McClelland writes,

So, rough though it is, the measure of the economic rise and fall of classical Greece was taken to be the area with which she traded, in millions of square miles, as determined by the location of vases unearthed in which her chief export commodities were transported [p. 117].

This measure was related to the need-for-achievement level of classical Greece, estimated from a content analysis of Greek writings.

Following the anthropological tradition of refuse study, two recent reports demonstrate that refuse may be used for contemporary as well as historical research.

Hughes (1958) observes:

. . . it is by the garbage that the janitor judges, and, as it were, gets power over the tenants who high-hat him. Janitors know about hidden love-affairs by bits of torn-up letter paper; of impending financial disaster or of financial four-flushing by the presence of many unopened letters in the waste. Or they may stall off demands for immediate service by an unreasonable woman of whom they know from the garbage that she, as the janitors put it, "has the rag on." The garbage gives the janitor the makings of a kind of magical power over that pretentious villain, the tenant. I say a kind of magical power, for there appears to be no thought of betraying any individual and thus turning this knowledge into overt power. He protects the tenant, but, at least among Chicago janitors, it is not a loving protection [p. 51].

Sawyer (1961) recounts the problem of estimating liquor sales in Wellesley, Massachusetts. In a city without package stores, the usual devices of observation of purchase or study of sales records are of no help. Sawyer solved the problem by studying the trash carted from Wellesley homes and counting the number of empty liquor bottles.

The duration of the sampling period is a consideration in studying traces of any product in which consumption of a visible unit takes a long time. The study must cover a large enough span to guarantee that a trace of the behavior will appear if, in fact, it did occur. With estimation of whisky consumption, there is the further demand that account be taken of such possibly confounding elements as holidays, birthdays, discount sales in nearby retail stores, and unusual weather. This is particularly true if estimates
are being made of the relative consumption of specific types of liquor. A heat wave produces a substantial increase in the consumption of gin, vodka, and rum, while depressing consumption of scotch, brandy, and blended whiskies. Depending upon the area, an unusually high level of entertaining produces consumption of either more expensive or less expensive whisky than usual. The temporal stability of many common products that could be used to measure behavior is quite low.

Kinsey and his associates note the study of another trace measure - inscriptions in toilets. "With the collaboration of a number of other persons, we have accumulated some hundreds of wall inscriptions from public toilets" (Kinsey et al., 1953, p. 673). Their findings show a significant difference between men's and women's toilets in the incidence of erotic inscriptions, either writings or drawings. Sechrest (1965a), studying inscriptions in Philippine and United States toilets, also found a difference between frequencies of male and female inscriptions - although female inscriptions seemed relatively more frequent in the Philippines. A widely circulated United States joke runs, "When a girl can see the handwriting on the wall, she's in the wrong restroom." Sechrest also found indications of greater sexual and homosexual preoccupation in the United States sample.

Some accretion data are built up quickly, and the problem of deciding on the appropriate period of study is negligible. Take the debris accumulation of a ticker-tape parade. The New York Sanitary Commission regularly notes how many tons of paper float down onto the streets during a parade. One might use this as material in estimating the enthusiasm of response for some popular hero. Because the Sanitary Commission has been reporting on how hard it works for years, it is possible to employ a control level of tonnages showered down upon other heroes. Did John Glenn get a more or less enthusiastic response from the New Yorkers of his day than did Charles Lindbergh from the New Yorkers of his? At best these data are suggestive, and the demise of the ticker-tape machine has meant a confounding of data for long-term historical analysis. While at one time ticker-tape parades had a dominance of ticker tape in the air, today it is confetti. One can make corrections, of course, but they must be tenuous.

Litter can also serve as a measure of conformity to restrictions. One can measure with a direct criterion the effectiveness of antilitter posters which vary in severity or style.

The methods discussed so far have all been ones in which the social scientist has taken the data as they come and not intervened in any way to influence the frequency or character of the indicator material. There are conditions under which the social scientist can intervene in the data-production process without destroying the nonreactive gains characteristic of trace and erosion data. He might want to do this, for example, to speed up the incidence of critical responses - a sometimes nagging annoyance with slowly eroding or accreting materials. Or he might want to guarantee that the materials under study were in fact equivalent or equal before they were modified by the critical responses. The essential point is that his intervention should not impair the nonreactivity of the erosion and trace measures by permitting the subjects to become aware of his testing.

John Wallace, our former colleague, once noted that it would be possible to estimate the activity level of children by measuring the rate at which they wear out shoes. It is theoretically possible to start at any point in time with the shoes that children are wearing, measure the degree of wear, and then later remeasure. The difference between the two scores might be a measure of the effect of some experimental variable. If the measurements were surreptitious, the experimenter would merely be noting a naturally occurring event and not involving himself with the materials.

Schulman and Reisman (1959) indexed the activity level of children by having them wear self-winding wristwatches which were adapted to record the child's amount of movement. Schulman, Kasper, and Throne (1965) have validated the "actometer" data against children's oxygen consumption.

Still another way to improve the data-gathering process is to
manipulate the recording material. In some cases, one might treat the material to allow it to provide a more stable base. With floor tile, for example, surfaces are often coated to resist wear. Once the coating is worn through, erosion proceeds at a faster rate. For research purposes, it would be desirable to lay uncoated tiles and accelerate the speed with which information is produced.

In other cases, coating of materials may be desirable, either to provide a more permanent measure or to allow one where it otherwise would be impossible. The wear on public statues, reliefs, and so forth may provide an example. Throughout Europe, one may note with interest shiny bronze spots on religious figures and scenes. The rubbing which produced these traces is selective and becomes most visible in group scenes in which only one or two figures are shiny. The "Doors of Paradise" at the baptistry of San Giovanni in Florence demonstrate this particularly well.

A careful investigator might choose to work another improvement on the floor-erosion approach. An important bias is that each footfall is not necessarily an independent event. Once a groove on a stair becomes visible, for example, those who walk on that stair are more likely to conform to the position of the groove than are those who walked before it became visible. This is partly due to the physical condition of the stair, which tends to slide the person's foot into the groove, and also possibly due to a response tendency to follow in the footsteps of others. This may be partially controlled in newer settings by placing mats on the steps and noting their wear. The mats could also hide the already eroded grooves.

Just as with erosion measures, it is sometimes desirable for the researcher to tamper with materials pertinent to an accretion comparison. Noted earlier was a fingerprint study of advertising exposure (DuBois, 1963). Another procedure to test advertising exposure is the "glue-seal record" (Politz, 1958). Between each pair of pages in a magazine, a small glue spot was placed close to the binding and made inconspicuous enough so that it was difficult to detect visually or tactually. The glue was so composed that it would not readhere once the seal was broken. After the magazines had been read, exposure was determined by noting whether or not the seal was intact for each pair of pages, and a cumulative measure of advertising exposure was obtained by noting the total number of breaks in the sample issue. This method was developed because of a pervasive response-set tendency among questionnaire respondents to claim falsely the viewing or reading of advertisements. This particular measure was valuable in establishing the degree to which there was a spurious inflation of recall of advertisements. It was not used alone, but in consort with more standard interviewing practices to provide a validity check.

The content restrictions of this method are substantial. It does not provide data for a single page or advertisement, but instead only indicates whether or not a pair of pages were exposed to the person's eye. There is no direct evidence whether or not the person even looked at advertisements which appeared on this pair of pages. Nor is the method sensitive to how many people may have been exposed to a given pair of pages. One or more openings yields the same response.

The fingerprint method suffers from fewer restrictions, and it, too, could be improved by an unobtrusive move of the investigator. It is possible to select special paper which more faithfully receives fingerprints, thereby reducing the risk that the level of exposure will be underestimated. The greater fidelity of a selected paper would also improve the ability to discriminate among different fingerprints on the page. It is clearly impractical and unwise to base a complete study of advertising exposure on fingerprints; it is equally unwise not to consider coincidental methods which yield, as the glue seals do, independent validation data. Clearly, the greater the risk that awareness, response set, role evocation, and other variables present to valid comparisons, the greater the demand for independent, nonreactive, and coincidental measures.

From fingerprints to noseprints - and back to the museum for a final example. The relative popularity of exhibits with glass fronts could be compared by examining the number of noseprints deposited on the glass each day (or on some sample of time, day, month, and so forth). This requires that the glass be dusted for noseprints each night and then wiped clean for the next day's viewers to smudge. The noseprint measure has fewer content restrictions than most of the trace techniques, for the age of viewers can be estimated as well as the total number of prints on
each exhibit. Age is determined by plotting a frequency distribution of the heights of the smudges from the floor, and relating these data to normative heights by age (minus, of course, the nose-to-top-of-head correction).1

1 The authors were told of such a research project, but have been unable to locate the source. If the study is not apocryphal, we should like to learn the source and give proper credit to so imaginative an investigator.

The examples provided suggest that physical-evidence data are best suited for measures of incidence, frequency, attendance, and the like. There are exceptions. In a closely worked-out theory, for example, the presence or absence of a trace could provide a critical test or comparison. But such critical tests are rare compared to the times when the physical evidence - be it deposit or erosion - is one part of a series of tests.

When dealing with frequency data, particularly when they are in time-series form, it is essential to ask whether or not there are any corrections which may be applied to remove extraneous sources of variance and improve the validity of comparisons. More so than most classes of data, the type of frequency data yielded by physical materials is subject to influences which can be known (and corrected for) without substantial marginal research effort.

The museum measures of noseprint deposits or tile erosion can serve as examples. We can use these data to answer questions about the popularity of a given exhibit over time. Are the hatching chicks as popular now as they once were? Is there a boom in viewing of giant panda exhibits? The answers to these questions might be of interest in themselves, or we might want them to evaluate the effect of some other variable. We could conceive of a study estimating the effect of newspaper stories on public behavior. Did a story on the birth of a baby leopard in the zoo increase the number of zoo visitors and the number of viewers of a leopard exhibit in the natural history museum? The effect may be too transitory for the erosion measure to pick up, but the noseprint deposits could index it. Or do accounts of trouble in a far-off spot increase the number of persons showing interest in museum collections from that area? The museum data might be one more outcropping of an effect that could be tested and used in consort with other measures to evaluate an effect.

In looking at the museum data, we could consider the newspaper-story question as a problem in the effect of an exogenous variable on a time series, and we are compelled to look for other sources of variation besides the story. We know something of the pool from which the critical responses come. Museum attendance varies seasonally (highest in summer, lowest in winter), varies cyclically (up on holidays, weekends, and school vacations), and has had a strong secular movement upward over time. All of these known influences on museum attendance are independent of the newspaper story. They may be partialled out of the total variance of the time series or, in descriptive statistics, be controlled in index numbers. Such corrections, however, are less critical for comparisons across areas. To the degree that these scores can be accounted for - in either inferential or descriptive terms - we achieve more sensitive research.

Auxiliary intelligence exists for most applications of physical-evidence data. It may be contained in records kept for other purposes, or come from prior knowledge. Consider the problem of estimating advertising exposure by the glue-seal method. In studying a single pair of pages, our only measure is the proportion of pages on which seals are broken. Of course, there are a number of variables known to influence the degree to which magazine pages are opened. The number of pages in the magazine, for example, or the magazine's policy of either clustering or dispersing advertisements throughout its pages will alter responses. Finally, readers of some magazines systematically and predictably read ads more than do readers of other magazines.

Each of these elements should be considered in evaluating the single medium under study. These factors might be combined into a baseline index which states the reasonable exposure expectation for an ad appearing in a very thin issue of Magazine A, with a dispersion strategy of ad placement. The observed score for the given pair of pages can thus be transformed into a number which is in some way related to the expected value.

This approach to "description" is the one argued in Chapter 1 - description and all research inferences are comparisons. When
the control is not developed within the data-gathering of the study (as it is not in most of the possible physical-evidence measures), it can be generated by analysis of other available intelligence.

With more elaborate comparisons than evaluating a single museum exhibit, the problems of extraneous variables and the need for their control in both inference and description become magnified. Keeping with the museum, take the problem of comparing two exhibits located in different sections of the museum. The same information used earlier - the known variation by season, holidays, and so on - is again core intelligence for comparison of the observed physical evidence. The concern becomes one of interactions. At those predictably high and low attendance times, we can expect a significant interaction between the accessibility of the exhibit and the over-all level of attendance. The interaction could bias measures of both the number of viewers and the duration of time spent viewing.

We should expect the more accessible exhibit to have a significantly higher marginal lead on noseprints during high-attendance periods than during low-attendance periods. Some of the interactive difference may come from the greater individual fatigue with large crowds, which might restrict the length of time each visitor spends in the museum. Or the size of the crowds may slow movements so that a number of people with fixed time periods to spend in the museum do not get around as much. Or there may be a population characteristic such that a larger share of peak-time visitors are indifferent viewers who lack a compelling interest in the over-all content of the museum. They may view either more casually or more erratically.

Corrections must be made for both the main effects and the interactions in such cases, and the easiest way to make them is to prepare corrections based on known population levels and response tendencies. We should speak of statue rubbings per thousand bypassers, or the rate of floor-tile replacement in a specific display area per thousand summer visitors. These figures are first-line transformations that are valuable. They can become more valuable if enough is known to consider them as the numerator of an index fraction. With the study of past behavior giving the denominator, indices can be produced which account for irrelevant variance and make for better comparisons.

Up to this point, we have been talking only of corrections applied to a single measure. Following in the index-number tradition, we can also consider a piece of physical evidence either as one component of an over-all index composed of several different classes of data or as an element in a set of physical evidence combined to make an aggregate index.

One might want an over-all measure, extending over time, of the completeness with which an institution is being used. This could supplement information on the extent to which it is being attended. In a library, for example, the various types of physical evidence available (dust collection, page wear, card-catalogue wear) could be gathered together and each individual component weighted and then combined to produce the over-all index. Similarly for museums.

Assume that one wanted to note the effect of an anxiety-producing set of messages on the alleged link between cigarettes and lung cancer. One could, in the traditional way, employ a before-after questionnaire study which measured attitudes toward cigarettes and obtained self-reports of smoking. Or, one could observe. If the anxiety-producing message were embedded in a lecture, at some point toward the middle, it would be possible to observe the frequency of cigarettes lit before, during, and after the mention of the deleterious effects of smoking. Or one could wire a sample of the chairs in the lecture hall and record the amount of squirming exhibited by the auditors at various points. Or, following Galton's (1885) suggestion, one could observe the amount of gross body movement in the audience. Or one could note the subsequent sale of books about smoking - ideally framing the setting so that equally attractive titles were available that argued the issue pro and con.

All of these alternatives are viable approaches to studying the effect of the message, with physical evidence an important element because of its ability to measure long-term effects and to extend the physical area of investigation beyond the immediate experimental setting. Depending on the degree of knowledge one had about the messages, the audience, and past effects, it would be possible to construct an index with differential weights for the various component measures of effect. Each of the single measures may be attacked for weakness, but taken cumulatively - as separate manifestations of an hypothesized effect - they offer greater hope for validity than any single measure, regardless of its popularity.

The outstanding advantage of physical evidence data is its inconspicuousness. The stuff of analysis is material which is generated without the producer's knowledge of its use by the investigators. Just as with secondary records, one circumvents the problems of awareness of measurement, role selection, interviewer effects, and the bias that comes from the measurement itself taking on the role of a change agent. Thus, physical evidence is, for the most part, free of reactive measurement effects. It is still necessary to worry over possible response sets which influence the laying down of the data. With erosion measures, this might be so obvious a bias as the tendency for people to apply more pressure to stairs when going up than when going down, or the less obvious tendency to turn right.

With accretion measures, there is the question of whether the materials have selectively survived or been selectively deposited. Do some objects have a higher probability of being discarded in public places than others? Or, equally an issue, do some materials survive the intervening events of time better? Archeological research has always faced the problem of the selective survival of materials. Some of this selection comes from the physical characteristics of the material: clay survives; wood usually does not. Other selection comes from the potential value of the material which might be discarded. In writing of the small decorated stamps (seals) used by the ancient Mexican cultures, Enciso (1953) noted, "If any gold or silver was used, the stamps have yet to be found or have been melted long ago. Wood and bone have not survived the ravages of time. This may explain the abundant survival of clay stamps" (p. iv). In Naroll's (1956) phrase, the clay stamps are "durable artifacts." We also discuss this bias in our comments on archival and available records, but it is significant here for the restrictions it places on the content of physical-evidence data and thereby the ability to generalize findings.

Any single class of physical evidence is likely to have a strong population restriction, and all physical-evidence data are troubled by population problems in general. It is not, for example, easy to get descriptive access to the characteristics of a possible population restriction. One has the remnant of past behavior - a groove or a pile - and it says nothing by itself of those who produced the evidence.

We also must be cautious of physical data because they may vary selectively over time or across different geographical areas. It is possible to get some checks on the character of these restrictions by employing supplementary methods such as the interview or the questionnaire. Some inferences about the character of population bias may come also from a time-series analysis of the data, possibly linking the physical-evidence data to time series of other data hypothesized to be selectively contributing to the variance. In a systematic investigation, a careful sampling of both times and locations is possible, and internal comparison of the findings may offer some clues. But the assumption must be that any set of physical evidence is strongly subject to population restrictions, and supplementary information is always required.

Other methods also have population restrictions, and it may be possible to turn the fact of a restriction into an asset. There are, for example, certain subsets of the population virtually impossible to interview. In such situations, an enterprising investigator should ask whether the subject is leaving traces of behavior or material which offer some help in inferring the subject's critical behavior. In this type of problem, the physical evidence is the supplementary data, and is used to fill in the population restrictions of other methods used concurrently.

As for the content available to the reach of physical-evidence methods, there are substantial limitations. It is not often that an investigator tests a theory so precise in its predictions that the appearance or absence of a single trace is a critical test of the theory. Most of the time, physical evidence is more appropriate for indexing the extent to which an activity has taken place - the number of footfalls, the number of empties tossed aside. Because these activities are influenced by many other variables, we seldom have an absolutely clean expression of some state of being - thus the necessity for corrections and transformations. Yet if enough
information does exist, or can be produced, the content restrictions are controllable because they are knowable.

There is the positive gain that the amount of dross in physical evidence is low or negligible. Typically, what is measured is relatively uncontaminated by a body of other material which must be discarded as not pertinent to the research investigation. One can pinpoint the investigation closely enough to eliminate the dross - something not possible in the more amorphous method of conversation sampling, or of observation of "natural" behavior.

Compared with other classes of research methods, we have noted few examples of prior research using physical evidence. This is not through preference, but because we have been unable to find more. Physical-evidence data are off the main track for most psychological and sociological research. This is understandable, but still regrettable. The more visible weaknesses of physical evidence should preclude its use no more than should the less visible, but equally real, weaknesses of other methods. If physical evidence is used in consort with more traditional approaches, the population and content restrictions can be controlled, providing a novel and fruitful avoidance of the errors that come from reactivity.

Archives I: The Running Record

    Possibly a wife was more likely to get an inscribed tablet if she died before her husband than if she outlived him.

The tablet cited here is a tombstone, and the quotation is from Durand's (1960) study of life expectancy in ancient Rome and its provinces. Tombstones are but one of a plethora of archives available for the adventurous researcher, and all social scientists should now and then give thanks to those literate, record-keeping societies which systematically provide so much material appropriate to novel analysis.

The purpose of this chapter is to examine and evaluate some uses of data periodically produced for other than scholarly purposes, but which can be exploited by social scientists. These are the ongoing, continuing records of a society, and the potential source of varied scientific data, particularly useful for longitudinal studies. The next chapter looks at more discontinuous archives, but here the data are the actuarial records, the votes, the city budgets, and the communications media which are periodically produced, and paid for, by someone other than the researcher.

Besides the low cost of acquiring a massive amount of pertinent data, one common advantage of archival material is its nonreactivity. Although there may be substantial errors in the material, it is not usual to find masking or sensitivity because the producer of the data knows he is being studied by some social scientist. This gain by itself makes the use of archives attractive if one wants to compensate for the reactivity which riddles the interview and the questionnaire. The risks of error implicit in archival sources are not trivial, but, to repeat our litany, if they are recognized and accounted for by multiple measurement techniques, the errors need not preclude use of the data.
More than other scholars, archeologists, anthropologists, and historians have wrestled with the problems of archival data. Obviously, they frequently have little choice but to use what is available and then to apply corrections. Unlike the social scientist working with a contemporaneous problem, there is little chance to generate new data which will be pertinent to the problem and which will circumvent the singular weakness of the records being employed.

Naroll (1962) recently reviewed the methodological issues of archives in his book Data Quality Control. His central argument focuses on representative sampling. Does the archeologist with his thousand-year-old pottery shards, or the historian with a set of two-hundred-year-old memoirs, really have a representative body of data from which to draw conclusions? This is one part of "Croce's Problem." Either one is uncertain of the data when only a limited body exists, or uncertain of the sample when so much exists that selection is necessary.

Modern sampling methods obviate the second part of the problem. We can know, with a specified degree of error, the confidence we can place in a set of findings. But the first part of Croce's problem is not always solvable. Sometimes the running record is spotty, and we do not know if the missing parts can be adequately estimated by a study of the rest of the series. That is one issue. But even if the record is serially complete, the collection of the secondary sources impeccable, and the analysis inspired, the validity of the conclusions must rest on assumptions of the adequacy of the original material.

There are at least two major sources of bias in archival records - selective deposit and selective survival. They are the same two concerns one meets in dealing with physical-evidence data. Durand's study of the ancient Roman tombstones illustrates the selective-deposit concern. Does a study of a properly selected sample of tombstones tell us about the longevity of the ancient Romans, or only of a subset of that civilization? Durand, as noted, suggests that the timing of a wife's death may determine the chance of her datum (CCCI-CCCL) being included in his sample. It is not only the wives who die after their husbands who may be underrepresented. There is, too, a possible economic or social-class contaminant. Middle- and upper-class Romans were more likely to have tombstones (and particularly those that survived until now) than those in the lower reaches of Roman society. This bias is a risk to validity to the degree that mortality rates varied across economic or social classes - which they probably did. The more affluent were more likely to have access to physicians and drugs, which, given the state of medicine, may have either shortened or lengthened their lives. It is to Durand's credit that he carefully suggests potential biases in his data and properly interprets his findings within the framework of possible sampling error.

This same type of sampling error is possible when studying documents, whether letters to the editor or suicide notes. We know that systematic biases exist among editors. Some try to present a "balanced" picture on controversial topics regardless of how unbalanced the mail. With the study of suicide notes, the question must be asked whether suicides who do not write notes would have expressed the same type of thoughts had they taken pen in hand. Any inferences from suicide notes must be hedged by the realization that less than a quarter of all suicides write notes. Are both the writers and nonwriters drawn from the same population?

The demographer cannot get new Romans to live and die; the psychologist cannot precipitate suicides. And therein is the central problem of historical data. New and overlapping data are difficult to obtain from the same or equivalent samples. The reduction of error must come from a close internal analysis which usually means fragmenting the data into subclasses and making cross-checks.

An alternative approach is feasible when reports on the same phenomenon by different observers are available. By a comparative evaluation of the sources, based on their different qualifications, inferences may be drawn on the data's accuracy (Naroll, 1960; Naroll, 1961). In examining an extinct culture, for example, one can compare reports made by those who lived among the people for a long period of time with reports from casual visitors. Or there can be a comparison of the reports from those who learned the indigenous language and those who did not. For those items on which there is consensus, there is a higher probability that the item reported is indeed valid. This consensus test is one solution to discovery of selective deposit or editing of material. It does not eliminate the risk that all surviving records are biased in the same selective way; what it does do is reduce the plausibility of
such an objection. The greater the number of observers with different qualifications, the less plausible the hypothesis that the same systematic error exists.

Sometimes selective editing creeps in through an administrative practice. Columbus kept two logs - one for himself and one for the crew. Record-keepers may not keep two logs, but they may choose among alternative methods of recording or presenting the data. Sometimes this is innocent, sometimes it is to mask elements they consider deleterious. In economic records, bookkeeping practices may vary so much that close attention must be paid to which alternative record system was selected. The depreciation of physical equipment is an example. Often deliberate errors or record-keeping policy can be detected by the sophisticate. At other times, the data are lost forever (Morgenstern, 1963).

One more example may serve. A rich source of continuing data is the Congressional Record, that weighty but sometimes humorous document which records the speeches and activities of the Congress. A congressman may deliver a vituperative speech which looks, upon reflection, to be unflattering. Since proofs are submitted to the congressman, he can easily alter the speech to eliminate his peccadillos. A naive reader of the Record might be misled in an analysis of material which he thinks is spontaneous, but which is in fact studied.

A demurrer is entered. Even if the data were originally produced without any systematic bias that could threaten validity, the risk of their selective survival remains. It is no accident that archeologists are pottery experts. Baked clay is a "durable artifact" that cannot be digested and decays negligibly. Naroll (1956) comments that artifacts survive because they are not consumed in use, are indifferent to decay, and are not incorporated into some other artifact so as to become unidentifiable. Discrete and durable, they remain as clues, but partial clues; other evidence was eaten, rotted, or re-employed. Short of complete destruction, decay by itself is no problem. It only becomes one when the rate and distribution of decay is unknown. If known, it may become a profitable piece of evidence - as Libby's (1963) work with radiocarbon dating shows.

For the student of the present, as well as of the past, the selective destruction of records is a question. Particularly in the political area, the holes that exist in data series are suspect. Are records missing because knowledge of their contents would reflect in an untoward way on the administration? Have the files been rifled? If records are destroyed casually, as they often are during an office move, was there some biasing principle for the research comparison which determined what would be retained and what destroyed?

When estimating missing values in a statistical series, one is usually delighted if all but one or two values are present. This gives confidence when filling in the missing cells. If the one or two holes existing in the series have potential political significance, the student is less sanguine and more suspicious of his ability to estimate the missing data.

Birth, marriage, death. For each of these, societies maintain continuing records as normal procedure. Governments at various levels provide massive amounts of statistical data, ranging from the federal census to the simple entry of a wedding in a town-hall ledger. Such formal records have frequently been used in descriptive studies, but they offer promise for hypothesis-testing research as well.

Take Winston's (1932) research. He wanted to examine the preference for male offspring in upper-class families. He could have interviewed prospective mothers in affluent homes, or fathers in waiting rooms. Indeed, one could pay obstetricians to ask, "What would you like me to order?" Other measures, nonreactive ones, might be studies of adoption records, the sales of different layette colors (cutting the data by the class level of the store), or the incidence of "other sex" names - such as Marion, Shirley, Jean, Jerry, Jo.

But Winston went to the enormous data bank of birth records and manipulated them adroitly. He simply noted the sex of each child in each birth order. A preference for males was indicated, he hypothesized, if the male-female ratio of the last child born in families estimated to be complete was greater than that ratio for all children in the same families. With the detail present in birth records, he was able to segregate his upper-class sample of parents
by the peripheral data of occupation, and so forth. The same auxiliary data can be employed in any study to serve as a check on evident population restrictions - a decided plus for detailed archives.

This study also illustrates the time-sampling problem. For the period studied, and because of the limitation to upper-class families, Winston's measure is probably not contaminated by economic limitations on the absolute number of children, a variable that may operate independently of any family sex preference. Had his study covered only the 1930's, or were he making a time-series comparison of economically marginal families, the time factor could offer a substantial obstacle to valid comparison. The argument for the existence of such an economic variable would be supported if a study of the 1930's showed no sex difference among terminal children, but did show significant differences for children born in the 1940's.

Economic conditions are only one of the factors important to errors due to timing. Wars, depressions, and acts of God are all events which can pervasively influence the comparisons of social science data. The subjective probability of their influence may be awkward to assign, yet the ability to control that influence through index numbers and other data transformations is a reasonable and proper practice.

There are many demographic studies of fertility levels in different societies, but Middleton (1960) showed a shrewd understanding of archival sources in his work. He developed two sets of data: fertility values expressed in magazine fiction, and actuarial fertility levels at three different time periods. For 1916, 1936, and 1956, he estimated fertility values by noting the size of fictional families in eight American magazines. A comparison with the population data showed that shifts in the size of fictional families closely paralleled shifts in the true United States fertility level.

Middleton had a troublesome sampling problem. Since only a small number of magazines continued publication over the period from 1916 to 1956, was the group of eight long-term survivors a proper sample? This durable group may not have been representative, but it was quite proper. The very fact that these eight survived the social changes of the 40 years argues that they probably reflected the society's values (or those of a sufficiently large segment of the society to keep the magazine economically alive) more adequately than those which failed. The issue was not one of getting a representative sample of all magazines, but, instead, of magazines which printed material that would have recorded more faithfully the pertinent research information.

Christensen (1960) made a cross-cultural study of marriage and birth records to estimate the incidence of premarital sex relations in different societies. He simply checked off the time interval between marriage and birth of the first child - a procedure which showed marked differences in premarital conception, if not in activity among cultures. His study illustrates some of the problems in cross-cultural study. The rate of premature births may vary across societies, and it is necessary to test whether this hypothesis can explain differences. Data on the incidence of premature births of later-born children in each society permit this correction. A population problem to be guarded against in these cross-cultural studies, however, is the differential recording of births, marriages, and the like. There are many societies in which a substantial share of marriages are not formally entered in a record-keeping system, although the parties initially regard the alliance to be as binding as do those in other societies where records are more complete. The incidence in Mexico of "free-union" marriages is both extensive and selective - more prevalent among working classes than other groups (Lewis, 1961).

Simple marriage records alone were used by Burchinal and Kenkel (1962) and Burchinal and Chancellor (1962). The records were used as a handy source by Burchinal and Kenkel to study the association between religious identification and occupational status. The records provided a great body of data from which to work, but also posed a sampling question. Are men about to be grooms a good base for estimating the link between religion and occupation? The small cadre of confirmed bachelors is excluded from the sample universe, and depending upon the dates of the records studied, there can be an interaction between history and groomdom.

A later study by Burchinal and Chancellor (1963) took the complete marriage and divorce records of the Iowa Division of Vital Statistics for the years 1953 and 1959. From these records,
the authors compared marriages of same-religion and mixed-religion pairs for longevity. As might be expected, they found mixed marriages to be significantly shorter-lived than same-religion ones. Of the mixed marriages, those partners who described themselves as Protestants without naming a specific affiliation showed the highest divorce rate.

It might be well to note that such data may be contaminated by self-selection error. Persons entering mixed marriages may be more unstable or more quick to see divorce as a solution. Such people might not increase the chances of a durable marriage by choosing a mate of the same religion.

These same marriage records could be employed as tests of functional literacy. Taking a time series of marriage records, what is the proportion of people signing "X" at varying points in history?

Of all the marriage-record studies, probably none is more engaging than Galton's (1870) classic on hereditary genius. Galton used archival records to determine the eminence of subjects defined as "geniuses" and additional archives to note how their relatives fared on eminence. Few scientists have been so sensitive as Galton to possible error in drawing conclusions, and, in a section on occupations, he notes that many of the judges he studied postponed marriage until they were elevated to the bench. Even so, their issue of legitimate children was considerable. In Stein and Heinze's (1960) summary: "Galton points out that among English peers in general there is a preference for marrying heiresses, and these women have been peculiarly unprolific" (p. 87). And on the possible contaminant of the relative capacities of the male and female line to transmit ability:

    . . . the decidedly smaller number of transmissions along the female line suggests either an "inherent incapacity in the female line for transmitting the peculiar forms of ability we are now discussing," or possibly "the aunts, sisters and daughters of eminent men do not marry, on the average, so frequently as other women." He believes there is some evidence for this latter explanation [p. 89].

Galton (1872) even used longevity data to measure the efficacy of prayer. He argued that if prayer were efficacious, and if members of royal houses are the persons whose longevity is most widely and continuously prayed for, then they should live longer than others. Data showed the mean age at death of royalty to be 64.04 years, men of literature and science 67.55 years, and gentry 70.22 years.

Another pioneering study, Durkheim's Suicide (1951), shows an active exploration of archival source possibilities. He concluded that "the social suicide rate can be explained only sociologically" (p. 299) by relating suicide levels to religion, season of the year, time of day, race, sex, education, and marital status, doing all of this for different countries. All of these variables were obtained from available archives, and their systematic manipulation presaged the morass of cross-tabulations that were later to appear in sociological research.

Wechsler (1961) integrated three different classes of archival data in his correlational study of the relationships among suicide, depressive disorders, and community growth. He went to the census for data on population change, to mental-illness diagnoses in hospital records, and to the vital statistics of the state to get the suicide incidence.

Another study employing death records is Warner's (1959) work, The Living and the Dead. Death and its accoutrements in Yankee City were the subject of this multimethod research. Warner consulted official cemetery documents to establish a history of the dead and added interviewing, observation, and trace analysis as aids to his description of graveyards. "Their ground and burial lots were plotted and inventory was taken of the ownership of the various burial lots, and listings were made of the individuals and families buried in them" (p. 287).

His findings are of interest for what they say of response tendencies in the laying down of physical evidence. Here the response tendencies, and the way in which they vary across social-class groups, become the major clues to the analysis. Warner found the social structure of Yankee City mirrored (if this be the proper verb) in the cemetery; he found evidence on family organization, sex and age differentiation, and social mobility. For example, the father was most often buried in the center of the family plot, and headstones of males were larger than those of females. In some cases, Warner found that a family which had raised its social status moved the graves of their relatives from less prestigious cemeteries to more prestigious ones.

Tombstones would be an interesting source of data for com-
parative analysis of different cultures. In matriarchal societies, for example, is the matriarch's stone substantially larger than the husband's? Does the husband get a marker at all? What are the differences in societies with extended versus nuclear family structures?

Warner's findings tie in with Durand's (1960) study of ancient Rome. In both studies, the relative dominance of the male was demonstrated by the characteristics of the tombstones.

A more recent commentary on tombstones comes from Crowald (1964), who wandered through Moscow's Novo-Devich cemetery, noting the comparative treatment of old czarists and modern communists. After noting that over Chekhov's grave a cherry tree is appropriately blooming, he states:

    . . . the cemetery also tells a quieter, more dramatic tale. Climbing out of some weedy grass is the washboard-sized marker of Maxim Litvinov, once a Stalin foreign minister and the wartime Soviet envoy to America. His mite of a marker reminds what happened to those who fell from Stalin's favor [p. 12].

Just as in ancient Rome, the timing of a wife's death makes a difference in the nature of the tombstone. Here, this potential contaminant is used as a piece of evidence.

    Novo-Devich does show, too, that things have changed in Russia since Stalin. For example, there is the great marble monument to Rosa Kaganovich. She was the wife of Lazar Kaganovich, the Stalin lieutenant booted from power in 1957 by Premier Nikita S. Khrushchev. Kaganovich is in full disgrace, but he fell after Stalin died. So his wife, who died in 1961, still got her big place in the cemetery. Fresh flowers decorate her marble [p. 12].

These objects are just big and small pieces of stone to the uninformed, but to the investigator who possesses intelligence on those buried and relates it to the stones, the humble and grandiose memorials are significant evidence.

In Rogow and Lasswell's (1963) discussion of "game politicians," they note,

    . . . his relations with his immediate family were not close; indeed his wife and children saw less of him during his active life than certain key individuals in his political organization. As a result he is remembered less by his family than by the state which he dominated for so many years. His grave in the family plot is unattended, but his statue stands in front of the state capitol building [p. 48].¹

¹ Arnold A. Rogow and Harold D. Lasswell, Power, Corruption, and Rectitude, © 1963. Reprinted by permission of Prentice-Hall, Inc., Englewood Cliffs, New Jersey.

And for a novelistic treatment of what remains behind, there is Richard Stern's (1960) commentary on Poppa Hondorp.

    The obituaries were Poppa Hondorp's measure of human worth. "There's little they can add or subtract from you then," was his view. Poppa's eye had sharpened over the years so that he could weigh a two-and-a-half-inch column of ex-alderman against three-and-a-quarter inches of inorganic chemist and know at a glance their comparative worth. When his son had one day suggested that the exigencies of the printer and make-up man might in part account for the amount of space accorded a deceased, Poppa Hondorp had shivered with a rage his son knew he should never excite again. "Don't mess with credos," knew young Hondorp, so the obituaries were sacrosanct; the Times issued mysteriously from an immaculate source [p. 24].

Frequently, one has a choice among different archival sources, and a useful alternative are directories, whether of residents, association members, or locations. Ianni (1957-1958) elected to use city directories as the primary source of data in his study of residential mobility. An analysis of these directories over time allowed him to establish the rates of mobility, and then relate these mobility indices to the acculturation of ethnic groups.

It is obviously a tedious task to perform such an analysis, and the work includes a high amount of dross. If possible, such mobility levels might be more efficiently indexed by access to change-of-address forms in the post office. But the question here becomes one of population restriction. Is the gain in efficiency that comes from use of change-of-address forms worth the possible loss in completeness of sampling? The answer comes, of course, from a preliminary study evaluating the two sources of data for their selective characteristics.

For some studies, more selective directories are indicated, and the inclusion of a person in a directory serves as one element in the researcher's discriminations. Who's Who in America doesn't print everybody's name, nor does American Men of Science. W. H.
Clark (1955) used both of these sources in his "A Study of Some of the Factors Leading to Achievement and Creativity with Special Reference to Religious Skepticism and Belief." Boring and Boring (1948) used American Men of Science to choose the psychologists studied in their useful article on the intellectual genealogy of American psychologists. Fry (1933) had earlier used Who's Who in America in a study entitled "The Religious Affiliations of American Leaders." (See also Lehman & Witty, 1931.)

Fry's work showed that if one depends on the editors of such directories for selective inclusion, one must also rely on the individuals listed for complete reporting. All of the problems associated with self-report are present, for the individual has a choice of whether or not he will include all data, and whether he will report accurately. The archive serves as an inexpensive substitute for interviewing a large sample of subjects stratified along some known or unknown set of variables. Fry found that a 1926 religious census showed 3.6 per cent of the general population to be Jews, while only .75 per cent of the entries in Who's Who in America were listed as Jews. Does this mean that Jews are less distinguished, are discriminated against in being invited to appear in the directory, or is there selective reporting by the Jews of their religion? Fry gave a partial answer to the question by a check of another directory - Who's Who in American Jewry. He found 432 persons in this directory who had reported no Jewish affiliation in Who's Who in America, thereby raising the Jewish percentage to 2.2. By raising the question of another plausible hypothesis for a comparison (3.6 per cent of the population compared with .75 per cent in the directory), he structured a question which was testable by recourse to a second archival source highly pertinent to the hypothesis.

Babchuk and Bates (1962) employed the membership list of the American Sociological Association in their work, "Professor or Producer: The Two Faces of Academic Man." Membership lists of professional organizations vary widely in their value. Given that the ASA membership list includes an efficiently obtainable list of sociologists, how good a sample is it of the profession? There are sociologists who are not ASA members, and they are likely to be qualitatively different from those who are members - particularly on the professor versus producer dimension. And, a niggling point, are sociologists a good sample of academic man?

Kenneth Clark (1957) used the American Psychological Association's directory in his study of the psychological profession. For any study extending over a long time period, the APA directory can be frustrating. As the number of psychologists grew, the detail in the individual listings shrank. Thus, the number of items on which a complete time series could be produced were reduced as tighter and tighter editing took place. The measuring instrument was constant in its content for only a few pieces of information. The change in the number of available categories of information is a detectable shift in the quality of the measure. Other changes, such as increasingly difficult requirements for membership or individuals responding to the greater bulk of the directory by writing more truncated listings, may change the character of the instrument in a less visible way and produce significant differences which are, in fact, only recording artifacts.

Digging into the past, Marsh (1961) obtained the names of 1,047 Chinese government officials from the government directories of 1778 and 1831-1879. He then correlated the ranks of the officials with the time required to reach a particular rank and with other factors such as age and family background. If there was no differential recording, one may conclude with Marsh that the rich get there faster.

POLITICAL AND JUDICIAL RECORDS

An archival record - votes - is the dependent variable for office holders, the absolute criterion measure; but for the social scientist, it is only an indicator. Votes cast by the people determine the politician's most important piece of behavior (staying in office), and votes cast by the legislator are the definitive test of his position and alliances.

Dozens of studies are available that have evaluated voting statistics by party or by individual. The ad hoc rhetorical condemnations of an opponent's record are at one end of a scale that is anchored by sophisticated factor analysis on the other. In the interest of space, we have limited our examples to studies which have reported either unique or more convoluted manipulations of the essentially simple data.

The political slant of legislators has been a popular research topic. "Progressivism" in the United States Senate was assessed
by Gage and Shimberg (1949). They wanted a sample of votes which would measure progressivism and picked ten bills for one congress and eight for another, getting coefficients of reproducibility of .88 and .91, respectively. With these data they studied a series of questions: Are younger senators more progressive than older ones? (No); Do senators from the same state tend to vote the same? (No); Are regional differences significant? (Yes).

MacRae (1954b) also studied the same type of legislative tendency in his paper on the influence on voting of constituent pressure and congressional social grouping. He selected his sample of critical votes by consulting the roll calls published by the New Republic and the CIO News. His assumption was that these sources would only publish reports of votes on issues germane to their presumably liberal readers. From these, he obtained a "liberal index" depending upon the direction of the vote.

A finer breakdown was made by Dempsey (1962), who estimated conservative votes, but divided them along party and nonparty issues. His subject was party loyalty in the Senate, and he used the reports of roll-call votes on individual senators to provide a "Loyalty Shift Index."

Moving over to the House, and the employment of all roll-call votes, Riker and Niemi (1962) looked at the question of congressional coalitions. They took votes on 87 roll calls, noting whether a congressman (1) voted on the winning side, (2) voted on the losing side, (3) did not vote when eligible, (4) did not vote when ineligible. The rolls were classified into subsets, and finally an index of coalitions was produced.

Farris (1958) also used roll-call votes to study coalitions within the Congress. His article is of strong interest, because it details the methodological issues the political scientist faces in isolating ideological groups. Farris elected to use Guttman scaling techniques on a sample of roll-call votes from the House in the seventy-ninth Congress. His sampling of votes was studied, and he excluded 94 of the 231 possible roll calls because either there was no quorum, the vote was nearly unanimous, or sharp partisanship was shown. His scales included bills on "foreign policy" and "labor." ". . . it is possible to construct three-, four-, and five-position ideological groupings by cross-tabulating members' positions on the several analytic issues" (p. 328).

An elaborate set of analyses conducted by MacRae (1954a) illustrates a more complex analysis of multiple archival sources. MacRae asked how politicians with different seniority reduced the inherent insecurity of their elective jobs. The results are too extensive to report, but these are among the variables used:

    Seniority (number of consecutive terms of office)
    Number of representatives from each party elected in different years
    Rates of vote-getting performance (an index of the legislator's vote compared with a control of the gubernatorial vote by the candidate of the same party)
    Primary-election performance (ratio to nearest competitor)
    Guttman liberal-conservative scale of voting issues
    Voting behavior on key bills.

Only two classes of data are used - general election statistics and legislative voting - but MacRae's analysis indicates the richness of these archives for the venturesome student.

The political scientist must work with roll-call votes; the desultory ayes and nays on voice votes are never traceable to their sources. An empirical question (one on which we can find no substantial research study) is the difference between bills voted on by roll call or by voice. It is reasonable to expect that a difference exists and that some systematic criterion is at work. It may be that only the more significant votes get a roll call, but it is also possible that some significant bills go through on a voice vote because leaders of both parties choose to avoid the record on some sensitive issues.

Taking the roll calls with proper hedging, though, there is the choice of which bills yield the best evidence for a particular research question, as well as directional decisions determining the liberal or conservative stand on particular bills. The population of congressmen voting sometimes varies substantially also, for some decide to evade a vote and a public stand on the bill. The "pairing" system, in which an absent congressman may announce what his vote would have been had he been present, eliminates some of this absentee error, but so long as congressmen avoid a vote, some population restriction is present.

The content analysis of political speeches is another workaday practice of politicians and diplomats. In a study of group
tension, for example, Grace and Tandy (1957) studied 13 speeches made by Soviet delegates before the League of Nations. Political tension may be indexed in many ways (see Bugental, 1948), but one method is to search for archival evidence of activities designed to reduce tension and uncertainty.

For congressmen, one device is to study the degree to which they make use of their franking (free mail) privilege. The rate of mail sent from congressional offices varies systematically, in a pattern closely linked to the proximity of the election year. We are not aware of any such study, but it should be possible to get an indirect measure of a congressman's perception of his job security by evaluating the use of this privilege. There are many speculations, for example, on what defines a "safe" seat for re-election to the Congress. A journalistic benchmark is a classification based on the extent of the congressman's victory. If he was elected by more than 55 per cent of the vote, his seat is described as safe; if by less than 55 per cent, it is described as dangerous. It would be of interest to study the extent of mail sent out (correcting for different-size constituencies) by the margin of victory in the last election. Such an analysis, combined with other intelligence, may provide a more empirical definition of how congressmen themselves designate "safe" and "dangerous" seats.

The behavior of the congressman can also be used to study those outside the Congress. It is a common practice for a congressman to insert into the Congressional Record newspaper columns which reflect his point of view. In a study of political columnists, Webb (1962a; 1963) employed these data for an estimate of liberalism-conservatism among 12 Washington columnists. Individual members of Congress were assigned a liberal-conservative score by evaluations of their voting record published by two opposing groups - the conservative Americans for Constitutional Action and the liberal Committee on Political Action of the AFL-CIO. The two evaluations correlated -.75 for Senators and -.86 for House members. Columnists were then ordered on the mean score of the Congressmen who placed their articles into the Record.

It would be valuable to supplement such an analysis with a comparative study of how various writers treat the same event. Cogley (1963) suggested this in his verse on the interpretation of the Papal encyclical, Pacem in Terris:

    David Lawrence read it right
    Lippmann saw a Liberal light,
    William Buckley sounded coolish,
    Pearson's line was mostly foolish . . .
    To play the game you choose your snippet
    Of Peace on Earth and boldly clip it [p. 12].

In off-year and primary elections, only a small share of eligible voters cast ballots. But this selective behavior is not damaging to validity, since the critical variables - election or no election, margin of victories - are posited directly on this selectivity. Lustig (1962) studied pro-integrationist voting by matching aggregate data on demographic characteristics and pro-integrationist votes. Precinct votes in a southern campaign between a segregationist and an integrationist were compared with census information. This is an admittedly gross approach, but the sanctity of the voting booth precludes, usually, a direct study of individual voting behavior. One can go back to survey questioning and relate these findings to the actual voting records, but the error in self-report of voting behavior is so great that such a potentially reactive approach is highly questionable. Digman and Tuttle (1961) have provided one of the few pieces of research in which investigators were able to sample individual ballots randomly. For most archival studies, however, this degree of precision is unobtainable.

The voting records of the people can also be used to measure the effect of experimenter intervention in pre-election settings. Among the memorable studies in social science is Gosnell's (1927) field experiment on getting out the votes. Selecting 12 electoral precincts, Gosnell divided them into experimental and control conditions. The experimental precincts were sent a series of nonpartisan messages encouraging registration and voting in a forthcoming election, while the control precincts received only the normal pre-election stimulation - generally of a politically partisan nature. The effect of the mail effort was determined by an analysis of registration lists, pollbooks, and census material. Here Gosnell intervened in a "natural" way, established controls, and employed inexpensive archival records as his tellingly appropriate dependent variable. Hartmann's (1936) study of "emotional" and "rational" political campaign pieces also used votes.

Bain and Hecock (1957) further demonstrated the ability to test persuasion principles in a natural laboratory free from the reactive biases of the university research suites. They were interested in the effect of physical position on alternative choices, and found data in the aggregate voting statistics from Michigan elections. Michigan was chosen because of (1) the absence of a law requiring ballots to be burned after the election (an obvious impediment to archival studies of voting behavior) and (2) the systematic rotation of candidates' names on the Michigan ballot. This rotation is practiced in several states, and represents an assumption that position on the ballot does indeed have an effect. "Under [California] state law, the incumbent's name goes first on the ballot, and political handicappers give as much as a 20 per cent edge - greater than the margin of most senatorial victories - to this psychological primacy" (Anonymous, 1964d, p. 28).

Because of the Michigan system of rotation, Bain and Hecock could work an orthogonal analysis, establishing the vote of each candidate and each position on the ballot. The findings supported the assumptions of veteran political hands and the ballot constructors: there was a significant position effect. It would be a provocative study to take this and other naturally occurring possibilities for a position effect (e.g., the sale of goods in a supermarket when placed in different shelf positions), and compare the results with those derived from the traditional experimental laboratory.

Schwartz and Skolnick (1962a) proposed a study of positive and negative incentives in tax compliance using changes in taxpaying in experimental and control groups as the dependent variable measure. This study would depend on the cooperation of the Internal Revenue Service, which cannot legally disclose information concerning the returns of any individual taxpayer (Schwartz, 1961). The problem can be avoided by group comparisons in which the individual's identity is not revealed to the researcher. Another device is to study tax compliance in a state such as Wisconsin where individual returns are legally available for research purposes.

The votes of a judicial body provide data for other than obvious research. Kort (1957; 1958), Schubert (1959; 1963), Nagel (1962), and Ulmer (1963) have employed mathematical analyses of past voting behavior by United States Supreme Court justices to predict future votes - a more systematic attempt at the common game played by working lawyers and constitutional law experts.

The same body of information, Supreme Court decisions, was used by Snyder (1959) in a study of the degree of uncertainty in the whole United States judicial system. In one measure, uncertainty was defined by the number of reversals of lower-court decisions by the high court. With the precedent principle of stare decisis at work in our court system, there should be few reversals if certainty is high. Moreover, to the degree that there is certainty (predictability) of outcome upon appeal, there will be cases fought up through the inferior courts.

Green (1961) demonstrated the large store of judicial data available to the social scientist. He gathered a sample of 1,437 cases from 1956-1957 police and court records of Philadelphia, in order to study uniformity in sentencing and the criteria by which sentences were decided. Three sets of variables were isolated as sentencing criteria: legal factors, legally irrelevant factors, and factors in the criminal prosecution. The relative severity of different types of sentences was measured by the extent of deprivation of civil liberty. A series of nonparametric tests "provide assurance that the deliberations of the sentencing judge are not at the mercy of his passions or prejudices but comply with the mandate of the law" (p. 102).

There is a very real population restriction in using such data - one that has been differential over time. It is a highly plausible hypothesis that appellate cases brought before the United States Supreme Court are not representative of the body of cases appearing before the inferior courts. Historically, a substantial share of high-court cases have involved affluent litigants, for the cost of steering a case through the courts is demanding of both money and time. Some of the change in representativeness over time has come from the increasing affluence of our society, but more has come from the growth of well-financed vested-interest groups who will assume the costs of litigation for a party. The spurting growth within civil-rights groups of both legal talent and money, documented by Vose (1959) and Krislov (1963), has meant an increasing number of cases before the appellate courts that might have never been appealed fifty years ago. Thus, any comparison over time of the behavior of the court system relative to some legal issue must take account of this variation.

There may be some bias from the selection of cases (a content restriction), for only a portion of the cases submitted to the
Supreme Court are granted certiorari, i.e., accepted for ruling. In research on some issues, it may be that the Court has systematically excluded from consideration pertinent cases, or has excluded a critical subclass. As in all cases of archival analysis, it is necessary to determine whether or not confounding population restrictions exist. The advantage of legislative and judicial records is that one can learn something of the nature of these restrictions. It is a matter of record which legislators did not vote on roll calls, and which cases were rejected for consideration by the court.

Some government records are orthodox sources for the social scientist. The birth, death, and marriage archives, as we have already seen, can be used for straightforward descriptive work or for less direct applications. Other records have less visible, but equally fruitful, applications. In this section are examples of research which has used power failures, municipal water pressure, parking-meter collections, and the like as research data.

The weather is a reasonable start. Durkheim (1951), as noted above, used weather as one of the variables in his study of suicide. An early investigation by Lombroso (1891) also used archival analysis to note the effect of weather and time of year on scientific creativity. He drew a sample of 52 physical, chemical, and mathematical discoveries and noted the time of their occurrence. His evidence, shaky as it is, showed that 22 of the major discoveries occurred in the spring, 15 in the autumn, 10 in the summer, and 5 in winter.

There are studies, too, on the relationship between phases of the moon and mental disorders. One made during World War II showed psychosomatic illnesses to increase in the South Pacific with the fullness of the moon. It was subsequently discovered that Japanese bombing attacks followed the same pattern.

Of all the studies using available records, few can measure up to E. L. Thorndike's (1939) work on the goodness of cities. Aware that "only the impartial study of many significant facts about cities enables us to know them" (p. 147), Thorndike gathered 37 core pieces of information about each of 310 United States communities. To develop his "goodness scale," he combined these to produce 297 characteristics for each city. And this in the era before computers! Examples of Thorndike's measures are infant death rate, percentage of sixteen- and seventeen-year-olds attending school, average salary of high-school teachers, and per capita acreage of public parks.

Thorndike went further and gathered ratings of cities from various occupational groups. He noted that "thoughtful people realize that popular opinions about cities derived from brief visits and from what is heard and read about cities, are likely to err" (p. 142). How far they are likely to err (using his statistical data as criteria) is demonstrated by these findings: infrequency of extreme poverty correlates .69 with Thorndike's over-all goodness of cities, but -.18 for the judgments of clergymen and social workers; the infant death rate (reversed) correlates .82 with the aggregate statistical index, but .03 with businessmen's ratings of the cities. Few could read this report and not reap methodological profit.

Mindak, Neibergs, and Anderson (1963) took the ongoing records of parking-meter collections as one index of the effect of a Minneapolis newspaper strike. They hypothesized that one of the major effects of the strike would be a decrease in retail shopping. Since most of the parking meters were located in the downtown shopping area, revenue collection from them was a good piece of evidence on the strike's effect. The data showed marked decreases during the months of the strike, using a control of previous years.

City budgets were the stuff of Angell's (1951) study on the moral integration of American cities. He prepared a "welfare effort index" by computing local per capita expenditures for welfare and combined this with a "crime index" based on FBI data to get an "integration index."

Ross and Campbell (1965) showed that a close analysis of traffic fatalities discounted the claim that a crackdown on speeding in Connecticut had resulted in any significant decrease in the number of traffic fatalities.

A particularly interesting and novel use of data comes from the study of city water pressure as it relates to television viewing. For some time after the advent of television, there were anecdotal remarks about a new periodicity in water-pressure levels - a periodicity linked to the beginning and end of programs. As the television show ended, so the reports ran, a city's water pressure dropped, as drinks were obtained and toilets flushed. A graphic display of this hypothesis was provided by Mabley (1963), who published a chart showing the water pressure for the city of Chicago on January 1, 1963. This was the day of a particularly tense Rose Bowl football game, and the chart shows a vacillating plateau until the time the game ended, when the pressure level plummets downward.

Using this approach, one could study the relative popularity of different retirement times. Since a large amount of water is used by many people at bedtime, a comparison of the troughs at 9:00, 9:30, 10:00, and so on could be made. Similarly, a comparison could be made in the morning hours to estimate time of arising. Two problems arise. Do those who retire early use the same amount of water as those who retire late? It is, after all, possible that a smaller number of showers and baths would be taken by those who retire late - particularly in areas with a high number of apartment dwellers. Another difficulty of such a study is comparison across different time areas. Many people end the day by viewing television news. In the metropolitan Chicago area, for example, some three million people watch the 10:00 P.M. newscasts. But the last major television newscast in the Eastern areas is at 11:00 P.M. This one-hour variation might influence the times of people going to bed. The water-pressure index could help establish whether it did or not. Similarly, it could be used to study the relative amount of attendance paid to entertainment and commercial content of television. The critical point of study in this research would be the water-pressure levels at the times of mid-show commercials. The prior decision of viewers to turn the set off at a specified hour would influence the water-pressure index for commercials at the beginning and end of shows, but should be a minor rival hypothesis for those embedded within the entertainment content of the program.

A similar measure, a more catastrophic one, has been manifest in the United Kingdom. This measure, electric-power failures, gives plausibility to the hypothesis that it is commercials (and not earlier decisions to turn the set off) which influence the drops in water pressure. At the time of the introduction of the United Kingdom's commercial television channel, a series of power failures hit the island. Whole areas were blacked out, and it was noticed that the timing of the power failures coincided with the time of commercials on the new television channel. The explanation provided was that viewers left their sets to turn on electric water heaters to make tea. The resulting power surge, from so many heaters plugged in simultaneously, overloaded the capacity of a national power system unequipped to handle such peaks. The commercials remain; new power stations have been built. This measure is a more discontinuous one than water pressure, but it would be of value to compare the water-pressure levels with power demands. If the hypothesis is correct that the English were plugging in water heaters, one should find a higher correlation between the two measures in the United Kingdom (with a small time lag as water precedes power) than in the United States.

Another imaginative link between two time series of archival data was provided by DeCharms and Moeller (1962). They gathered the number of patents issued by the United States Patent Office from 1800 to 1950. Relating these to population figures, they prepared a patent production index for 20-year periods over the 150-year span. These data were then matched to findings from a content analysis of children's readers for the same period, with a prime focus on achievement imagery. The matching showed a strong relationship between the amount of achievement imagery in their sample of books and the number of patents per million population.

Among the most easily available and massive sources of continuing secondary data are the mass media. The variety, texture, and scope of this enormous data pool have been neglected for too long. In this section, we present a selected series of studies which show intelligent manipulation of the mass media. We have necessarily excluded most content analyses and focused on a few which illustrate particular points.*

* For general treatment of content analysis, see Berelson (1952); Pool (1959); North et al. (1963).

It is proper to start this section by citing Zipf, who sought order in diverse social phenomena by his inventive use of data that
few others would perceive as germane to scientific inquiry. In a model study, Zipf (1946) looked at the determinants of the circulation of information. His hypothesis was that the probability of message transfer between one person and another is inversely proportional to the distance between them. (See also Miller, 1947; Stewart, 1947; Zipf, 1949.) Without prejudice for content, he made use of the content of the mass media, as well as sales performance. How many and how long were out-of-town obituaries in the New York Times? How many out-of-town items appeared in the Chicago Tribune? Where did they originate? What was the sales level in cities besides New York and Chicago of the Times and Tribune? To this information from and about the mass media, Zipf added other archival sources. He asked the number of tons of goods moved by Railway Express between various points, and checked on the number of bus, railroad, and air passengers between pairs of cities. All of these were appropriate outcroppings for the test of his hypothesis on inverse proportionality, and in all cases the data conform, more or less closely, to his prediction.

Other investigators have used the continuing record of the newspaper for their data. Grusky (1963b) wanted to investigate the relationship between administrative succession and subsequent change in group performance. One could manipulate leaders in a small-group laboratory, but, in addition, one can go, as Grusky did, to the newspapers for more "natural" and less reactive intelligence. From the sports pages and associated records, Grusky learned the performance of various professional football and baseball teams, as well as the timing of changes in coaches and managers. Does changing a manager make a difference, or is it the meaningless machination of a front office looking for a scapegoat? It does make a difference, and this old sports-writer's question is a group-dynamics problem, phrased through the stating of two plausible rival hypotheses. In another study, Grusky (1963a) used baseball record books to study "The Effects of Formal Structures on Managerial Recruitment." He learned that former infielders and catchers (high-interaction personnel) were overrepresented among managers, while former pitchers and outfielders (low-interaction personnel) were underrepresented.

This public-record characteristic of the newspaper also allows linguistic analysis. If verbal behavior really is expressive, then one should be able to study a President's position on issues by studying the transcripts of his press conferences. Those answers on which a President stumbles in syntax, or which are prefaced by a string of evasive dependent clauses, may be symptomatic of trouble areas. Similarly, those questions which receive unusually long or short replies may reflect significant content areas.

Analysis of transcripts such as these can be very difficult, and often not enough substantive knowledge is available to rule out alternative hypotheses. A President is briefed on what are likely to be the topics of reporters' questions, and he has an opportunity to rehearse replies. The setting is not a nonreactive one, and the awareness of his visibility and the import of his answers may influence their content and form. One must also make each President his own control. The verbal styles of Eisenhower, Kennedy, and Johnson varied so greatly that any verbal index of syntax, glibness, or folksiness must be adjusted for the response tendencies of the individual President.

Less august reporting, that of the society news, served James (1958) as evidence of community structure. The reporting of social events is highly selective, of course, and most useful for studies of the upper class (cf. Coleman & Neugarten, in preparation). The court-tennis victory of a truckdriver is not reported, nor the visit of his wife to Dubuque for the weekend.

Comparison across different cities might be differentially affected by shifts in the selectivity of society editors. It is a good assumption that the size of the city in which the paper is printed is related to the selectivity of its social news: the larger the city, the greater the probability that a smaller segment of the city's population appears in the society pages.

Middleton (1960), mentioned earlier, conducted a longitudinal study of the fertility values in magazine fiction, linking them to the actuarial fertility figures. This research suggested that the media, if carefully selected, can serve as a mirror of the society's values - or at least of some selective elements within the society.

How are psychologists, psychiatrists, and other psychologically oriented personnel differentiated by the society? One can ask people of course, and one should. But of value, too, is a study of what the mass media contain on the question. Ehrle and Johnson (1961) plucked 4,760 cartoons, all of which pictured psychological
personnel, from six different consumer magazines. Their evidence
suggests no substantial differentiation among the groups. This
finding could be further tested by observing psychologists at
cocktail parties and noting how often they are asked, "Now how is
a psychologist different from a psychiatrist?" Or one could ask the
psychologists to relate their cocktail-party experience.

Ray (1965) has written of multiple confirmation through different
sources of published material, noting as an example, "values of
Hitler's Germany were compared with those of other countries by
content analyses of plays (McGranahan and Wayne, 1948), songbooks
(Sebald, 1962), handbooks for youth organizations (Lewin, 1947),
speeches (White, 1949) and the press (Lasswell, 1941)."

Most of these have been examples of partial evidence contained in
the mass media of attitudes or social structure. Even the most
vitriolic critics of television commercials will admit that the
media themselves are a force within the society for socialization of
the young and attitude change of the old. Thus, they justify study.
G. A. Steiner (1963), for example, demonstrated the salience of
television for a national United States sample by showing the
extreme alacrity with which sets are repaired. Before television, it
had been said that a cigarette was the only object so compelling
that deprivation would cause one to walk out in a snowstorm. This
energy allocation is also inferred in the advertising slogan, "I'd
walk a mile for a Camel."

Much work exists on the political bias of the media. A content
analysis of press bias during presidential campaigns is as
predictable as the campaigns themselves, but a fine, relatively
unused, source is photographs. The element of editorial selectivity
which is a contaminant in some other studies becomes the center of
analysis. Editors have a large pool of photographs of a candidate
from which to pick, and the one they eventually choose is a revealing
piece of intelligence. One of the writers has noted this in American
political campaigns, and Matthews (1957) has suggested that it is a
phenomenon that might be studied across societies. Writing of the
British press, he states:

    They [photographs] can be made to lie ... as Lord Northcliffe
    was one of the first to discover. When he was using the Daily
    Mail to try to get Asquith out as Prime Minister and Lloyd
    George in, he once issued this order: "Get a smiling picture of
    Lloyd George, and underneath put the caption 'Do It Now,' and
    get the worst possible picture of Asquith and label it: 'Wait and
    See' " [p. 165].

As in all these studies of the running record, there is the
opportunity for time-series analysis. One could learn if a medium
is changing in its posture to a candidate (or potential candidate),
and observe when. This last information, on the time of
modification, can help to validate other sources of data on the
medium's attitude.

The selective practice has been so prevalent over time that it
is likely to have little "instrument decay" to invalidate time-series
analysis. It continues now, and offers some possibilities for
interesting new research. A television interviewer told Malcolm X,
the late Black Nationalist leader, that he was surprised at how much
Malcolm smiled. The Negro leader said that newspapers refused to
print smiling pictures of him. For less extreme, but still marginal,
leaders, what is the pattern across time and regions? Of equal
interest would be the photographic strategy toward Richard Nixon
after his 1962 defeat for the California governorship. In a caustic
and venomous statement immediately after the campaign, Nixon
castigated the press for what he perceived as its anti-Nixon
stance. To casual observers, there was an immediate decrease in
the favorability of the Nixon photographs printed.

Tannenbaum and Noah (1959) studied press bias another way,
analyzing the verbs that appeared in sports headlines. They asked
how many runs in baseball equal a "romp" or how many points in
football equal "X rolls over Y." In addition to providing descriptive
information on the empirical limits of such verb usage, they
demonstrated a home-town bias. The one-run margin might yield either
"Sox Edged 8-7" or "Sox Bludgeon Yankees 8-7."

Winship and Allport (1943) were unconcerned with selling
newspapers, but they did want to know something of the effect of
positive and negative stimuli. Their study was conducted during
the early years of World War II, and they were opportunistic
enough to exploit the "victories" and "withdrawals" blazoned on
headlines. Do potential readers buy more or fewer papers when
positive headlines are used? For the measure of effect, they took
the street-stand sales of newspapers from four major cities,
ignoring the relatively invariant element of home-delivered
newspapers. No significant sales difference could be traced to
optimistic and pessimistic headlines.

Optimism is a central element in another study - Griffith's
(1949) original and adroit research on horse-race betters. The
newspapers supplied the odds, results, and payoffs for 1,386 horse
races run in the spring and summer of 1947. His hypothesis is
worth quoting:

    If the psychological odds equaled the a posteriori [odds] given
    by the reciprocal of the percentage [of] winners, the product of
    the number of winners and their odds would equal the number
    of entries at each odd group after correction of the odds had
    been made for loss due to breakage and take. If the product
    exceeds the number of entries, the psychological odds were too
    large; if the product is less than the number of entries, the odds
    were too small [p. 292].

The results, which should receive some distribution beyond
the archives of the American Journal of Psychology, suggest that
long shots and favorites are overbet, while not enough money is put
on horses with middle-range odds.

The DeCharms and Moeller (1962) study of patents and
achievement imagery in children's readers was mentioned above.
Others have worked with the content of books, attempting to
puzzle out the popular mystery of why some books sell and others
don't. Berreman (1940) gathered data on the sales of books as
reported by the New York Herald Tribune book-review section and
publicity outlays for individual titles, and performed a content
analysis: 60 titles were grouped into 12 classes by setting, theme,
and treatment of theme. He concluded, to oversimplify his complex
Ph.D. dissertation, that content was more important than
publicity. His findings suggest that content probably depressed the
sale of the well-promoted poor sellers and "made" best sellers
which had only feeble marketing efforts. (Cf. Kappel, 1948;
Harvey, 1953).

Sales data permit the testing of hypotheses on the way in
which items of popular culture rise and fall in favor. One
hypothesis now being tested by Eugene Webb and Robert Armstrong is
that sales of a book decline logarithmically once the sales peak has
been reached. Data collected so far show only weak support for
this hypothesis. Webb (1962b) demonstrated in a similar study that
ratings of television westerns declined logarithmically once they
started to slide in popularity.

Parker (1963) worked on the effect of the introduction of
television by studying library records. His topic was the
differential effect of television on the reading of books. A time
series was prepared of withdrawals, by type of book, from libraries
in a series of Illinois cities - both before and after television
came to town. In one of the findings that looks like common sense
(after the research is read), he learned that the withdrawals of
nonfiction titles were unaffected, but that there was a significant
drop in the withdrawal of fiction titles.

R. W. Jones (1960) used the extent of library facilities and
personnel as the "index of progressivism" of a group of 154 Illinois
towns. He corrected the raw figures by applying census data on
community size, social class, race, occupation and income,
population age, and rate of community growth.

Library withdrawals were also used by Parker (1964), who
showed how a radio book-review program influenced the circulation
of the books discussed.

Interest in library withdrawals was used by Vernon and
Brown (1963), who measured the "dynamic information seeking
process" among tuberculosis patients. They predicted (and found)
that the patients of uninformative doctors would be significantly
higher than patients of informative doctors in the degree to which
they sought out information relevant to their disease. They indexed
this by the proportion of patients in the two groups who took
"tuberculosis" and "non-tuberculosis" books from a hospital
library. In this study, the authors depended upon the reports of
patients to define informative and uninformative doctors.

Rashkis and Wallace (1959) demonstrated that the researcher
does not have to depend upon self-reports of the degree of attention
paid by medical personnel. They observed the notes made by
attending nurses on a patient's bedside record, notations that were
both informal and required. The attention paid to the patient was
measured by the frequency of these notes per patient. This helps
to circumvent the possibility that perception of how much attention
is being paid to one by medical personnel is heavily contaminated
by the patient's degree of illness. The same amount of
attention would probably be differentially perceived under
preoperative, postoperative, and about-to-be-released states. The
bedside chart may be more trustworthy.

Soviet writings on psychology could hardly be classed as mass
media, but the findings of O'Connor's (1961) research are of
substantive interest to any social scientist. O'Connor studied the
amount of partisan philosophical (as opposed to empirical) content
in Russian journal articles and notes "a tendency to move away
from philosophical prolegomena in journal articles and towards a
direct discussion of experimental material" (p. 14, cited in Brozek,
1964).

DATA TRANSFORMATIONS AND INDICES OF THE RUNNING RECORD

Of all the different classes of data treated in this monograph,
none has so great a need for transformation as those cited in this
chapter. Because the data are drawn from continuous records
which typically extend over long periods of time, all the extraneous
events of history are at work to threaten valid research
comparisons.

Perhaps the most obvious of these is the change in the size of
the population. The population increase has meant that the
absolute values of actuarial and allied data are relatively useless
for comparative purposes. In studies employing election records,
for instance, the absolute number of votes cast provides an
inadequate base for most research purposes. It gives Mr. Nixon
little comfort, we are sure, to know that he garnered more votes in
1960, as a loser, than did any preceding winning candidate except
Eisenhower. Similarly, the absolute number of entries associated
with population level has changed over time. This secular trend in
the data is often best removed. Thus, Ianni (1957-1958) had to
construct a relative index of residential mobility over time, and
DeCharms and Moeller (1962) transformed patent production to an
index tied to population.

Time also works its effect by a change in the composition of a
critical group. The number of congressmen in the House of
Representatives may stay relatively stable over a long time period,
but the characteristics of these congressmen change - and, in
changing, produce a set of rival hypotheses for some investigator's
explanation of a research comparison. A recent Supreme Court
ruling on reapportionment of the House (to reflect population
distribution more adequately) will mean substantial changes in the
aggregate voting behavior of the House, influencing the decisional
setting for all congressmen, both those there before the change
and the new members.

With known changes in composition, it may be necessary to
segregate research findings by time periods in which relatively
homogeneous external conditions held. This is a grosser correction
than the more continuous correction possible for data linked to
population. Even with population, though, the only thoroughly
reliable data - the census totals - are produced only once every
ten years. The accuracy of intervening estimates, whether from
the Census Bureau itself or the highly reliable Sales Management
magazine, is high but still imperfect.

The frailty of individual sets of records, which is discussed
below, has caused many investigators to employ indices which
combine several different types or units of information. The
adequacy of such combinations rests, of course, on the degree to
which the component elements are adequate outcroppings of the
research hypothesis, as well as the degree to which appropriate
weights can be assigned to the elements. Setting these questions
aside, however, it is apparent that combined indices must be
employed when an investigator lacks a theory so precise and
subtle as to predict a single critical test, or, when the theory's
precision is adequate, no data exist for the critical test. For E. L.
Thorndike's (1939) purpose in studying cities, there was no
acceptable alternative to transforming such data as park area and
property values into indices. And for MacRae (1954b) and Riker and
Niemi (1962), the unstable nature of a single vote by a congressman
forced the construction of indices of samples of votes, which
were hopefully a less ephemeral source for comparisons. MacRae
needed a "liberal index," Riker and Niemi, an "index of
coalitions." Because the individual unit was highly suspect as a
sampling of the critical behavior under study, the sampling had to be
expanded. There occur, too, the attendant questions of how the
units are to be stated, weighted, and combined.

One of the major gains of the running record, then, is the
capability to study a hypothesis as external conditions vary over

time. Such analysis demands that the investigator consider all
possible transformations before making comparisons, and also
decide whether indices will provide a more stable and valid base
for hypothesis testing. This requirement is not as pronounced in
the discontinuous archival records cited in the chapter that
follows nor among the observational and physical-evidence
methods.

It should be obvious that we prize the potential for historical
analysis contained in running records.

    The best fact is one that is set in a context, that is known in
    relation to other facts, that is perceived in part in the context of
    its past, that comes into understanding as an event which
    acquires significance because it belongs in a continuous dynamic
    sequence ... [Boring, 1963, p. 5].

If a research hypothesis, particularly for social behavior, can
survive the assaults of changing times and conditions, its
plausibility is far greater than if it were tested by a method which
strips away alien threats and evaluates the hypothesis in an
assumptive, one-time test. Validity can be inferred from a
hypothesis' robustness. If the events of time are vacillating, as
they usually are, then only the valid hypothesis has the intellectual
robustness to be sustained, while rival hypotheses expire.

One pays a price in such time-series analysis, the necessary
price of uncertainty. We again agree with that gentle stylist Boring
(1963): "The seats on the train of progress all face backwards; you
can see the past but only guess about the future" (p. 5). A
hypothesis might not hold for anything but the past, but if the
present is tested, and a new, possibly better, hypothesis produced,
those same running records are available, as economical as ever,
for restudy and new testing.

For all the gains, however, the gnawing reality remains that
archives have been produced for someone else and by someone
else. There must be a careful evaluation of the way in which the
records were produced, for the risk is high that one is getting a
cutrate version of another's errors. Udy (1964) wrote of
ethnographic data:

    Researchers who use secondary sources are always open to the
    charge that they are cavalier and uncritical in their use of
    source materials, and cross cultural analysis - particularly
    when large numbers of societies are used with information
    taken out of context - is particularly vulnerable to such
    criticism [p. 179].

At the beginning of this chapter, we detailed the operating
questions of selective deposit and selective survival of archives.
Both these contaminants can add significant restrictions to the
content and contributing populations of the archival materials. In
the discussion of individual research studies, we have noted how
roll-call votes, marriage records, reports of congressional speeches,
letters to the editor, crime reports, and other records are all
subject to substantial population or content restrictions in their
initial recording. To a lesser degree, the selective survival of
records can be a serious contaminant, and in certain areas, such as
politics, it is always a prime question.

Those contaminants which threaten the temporal and
cross-sectional stability of the data are controllable through data
transformation and indexing methods - if they can be known. Happily,
one of the more engaging attributes of many of these records is
that they contain a body of auxiliary data which allows the
investigator good access to knowledge of the population restrictions.
We have noted this for the absentee contaminant in congressional
voting and the selective choice of cases in judicial proceedings.
With the actuarial material on birth, marriage, and death, it is
often possible to find within the records, or in associated data
series such as the census, information which will provide checks
on the extent to which the research population is representative of
the universe to which the findings are to be generalized.

If the restrictions can be known, it is possible to consider the
alternative of randomly sampling from the body of records, with a
stratification control based on the knowledge of the population
restriction. This is feasible for many of the records we have
mentioned because of their massiveness. Indeed, even if no
substantial population contaminants exist, it is often advisable to
sample the data because of their unwieldy bulk. Since usually they
can be divided into convenient sampling units, and also frequently
classified in a form appropriate for stratification, the ability to

sample archival materials, particularly those in a continuous
series, is a decided advantage for this class of data. The sampling
of observations, or of traces of physical evidence, is markedly more
difficult.

The population restrictions are potentially controllable
through auxiliary intelligence; the content restrictions are more
awkward. For all the varied records available, there may still be no
single set, or combination of sets, that provides an appropriate test
of an hypothesis.

Something of this content rigidity is reflected in Walter
Lippmann's (1955) discussion of the "decline of the west." Lippmann
writes of the turn of the century when

    The public interest could be equated with that which was
    revealed in election returns, in sales reports, balance sheets,
    circulation figures, and statistics of expansion. As long as peace
    should be taken for granted, the public good could be thought of
    as being immanent in the aggregate of private transactions
    [p. 16].

Yet many of the studies reported in this chapter have revealed
the power of insightful minds to see appropriate data where
associates only see "someone else's" records. There is little
explicit in patent records, city water-pressure archives,
parking-meter collection records or children's readers to suggest
their research utility. It required imagination to perceive the
application, and a willingness to follow an unconventional line of
data collection. Imagination cannot, of course, provide data if none
are there. Our thesis is solely that the content limitations of
archival records are not as great as the social scientist bound by
orthodoxy thinks.

There is no easy way of knowing the degree to which reactive
measurement errors exist among running archival records. These
are secondhand measures, and many of them are contaminated by
reactive biases, while others are not. The politician voting on a bill
is well aware that his action will be noted by others; he may not be
aware that an observer in the gallery made a note of the tic in his
left eye when his name was called to vote. The records contributed
by the person or group studied - the votes, the speeches, the
entries written for directories - are produced with an awareness
that they may be interpreted as expressive behavior. Thus, those
errors that come from awareness of being tested, from role
elicitation, from response sets, and from the act of measurement as a
change agent are all potentially working to confound comparisons.
With other data, such as the reports of presidential press
conferences and census figures, the investigator has the additional
bias of possible interviewer error passed along.

For data collected by a second party, by someone other than
the producer (birth and death records, weather reports, power
failures, patents, and the like), the risk of awareness, role, or
interviewer contaminants is present but low. The main problem
becomes one of instrument decay. Has the record-keeping process
been constant or knowably variant over the period of study? As
cited earlier, suicides in Prussia jumped 20 per cent between 1882
and 1883. It may be that response sets on the part of the
record-keepers, or a change in administrative practice, threatens
valid comparisons across time periods or geographic areas. To know of
this variation is extremely difficult, and it represents one of the
major drawbacks to archival records.

In summary, the running archival records offer a large mass of
pertinent data for many substantive areas of research. They are
cheap to obtain, easy to sample, and the population restrictions
associated with them are often knowable and controllable through
data transformations and the construction of indices. But all
content is not amenable to study by archival records, and there is
an ever present risk that reactive or other elements in the
data-producing process will cause selective deposit or survival of
the material. Against this must be balanced the opportunity for
longitudinal studies over time, studies in which one may test a
hypothesis by subjecting it to the rigor of evaluation in multiple
settings and at multiple times.
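
The population-linked transformation discussed in this chapter (for example, patent production converted to an index tied to population) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the per-100,000 base, and every figure below are hypothetical and are not taken from any of the studies cited.

```python
def per_capita_index(counts, populations, per=100_000):
    """Convert absolute counts into a rate per `per` persons.

    `counts` and `populations` are parallel sequences, one entry
    per time period (for example, one entry per census year).
    """
    if len(counts) != len(populations):
        raise ValueError("counts and populations must be the same length")
    if any(p <= 0 for p in populations):
        raise ValueError("population values must be positive")
    return [c * per / p for c, p in zip(counts, populations)]


# Hypothetical figures: absolute counts double over three periods,
# but so does the population, so the index shows no real change.
patents = [1_000, 1_500, 2_000]
population = [10_000_000, 15_000_000, 20_000_000]

rates = per_capita_index(patents, population)
print(rates)
```

With these invented figures the absolute series rises steadily while the indexed series is flat, which is the kind of secular trend the transformation is meant to remove before comparisons are made.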

CHAPTER 4

Archives II: The Episodic and Private Record

In the preceding chapter, we outlined the joys and sorrows of
those archives on which there is typically a running time record.
Here we continue our discussion of archives, but center on those
which are more discontinuous and usually not a part of the public
record. Such data are more difficult to come upon than the public
records, unless the investigator is affiliated with some organization
producing the material. The insurance sales of a casualty company,
the nurse's record on a bedside clipboard, and last year's
suicide notes from Los Angeles are more available to the "inside"
investigator than they are to the curious outsider. But if these
records are more difficult and costly to acquire than public
records, they can often provide a gain in specificity of content. The
amount of irrelevant dross commonly declines as an investigation
is limited to a particular set of privately produced data.

We have already mentioned the risks to validity inherently
present in archival records. The main analytic difference between
the records mentioned in this and in the earlier chapter is the
common inability to make longitudinal analyses of the private
data. Sometimes security is the reason, sometimes the data are
stored for shorter periods, sometimes financial and labor costs
preclude an analysis over time. Whatever the cause, this is a major
loss. The best defense against it is to find a related set and
combine both - one continuous and the other discontinuous - for a
more textured series of comparisons.

Some of the data in this chapter are episodic in character, but
complete in reporting; many sources do maintain long and accurate
record-keeping systems. The military is one such source, and
Lodge (1963) has conducted a provocative correlational study with
United States Navy records. All those who have learned his
results on air crashes tend now, as passengers, to squint studiously
at their pilots' height. Dipping back into the Navy records, Lodge
collected reports of 680 jet plane accidents, and then searched
other records for the height of the pilots. He learned that men
exceeding the average height of 72 inches had significantly more
accidents than their shorter contemporaries. This may be traced
to the design of aircraft cockpits and the visual angle on
instrument panels.

We have divided the sources into three gross classes: sales
records, institutional records, and personal documents. All three
are potential substitutes for direct observation of behavior. This is
most obvious with personal documents, where the unavailability of
a source may force the investigator to use whatever alternatives
are available. But sales and institutional records may work in the
same way, and can broaden the scope of an investigation which is
primarily based on observation. They may fill in holes present in an
observational series, or be used to produce a broader sampling of
the behavior under study.

Chadwick Alger once suggested to us that it would be
profitable for a political scientist to sit in the Delegates' Lounge of
the United Nations and observe how much whisky was downed. By
keying the consumption rate to action before the UN, Alger felt
that an index of tension might be developed. In such a setting, the
sales records of the bar might be an even better measure, for they
are less amenable to instrument-decay errors and permit a closer
noting of type of drink ordered (Scotch, Canadian, Cuba libres,
and so forth).

Brown (1960) suggested that records of soap consumption be
substituted for ratings of the cleanliness of institutionalized
patients - ratings which are, after all, observation one step removed.
Brown points out two ways of measuring soap usage: a liquid soap
could be measured by reading the level of liquid in the dispenser
each day, or bar soap by a measure of the water displacement
at the beginning and end of the period studied.

Two other examples, both employing records of whisky
consumption, illustrate the substitution of sales records for
observation. Hotel and restaurant records could be employed in a
comparative study of occupations. One can observe members of an
occupation

and attribute traits to them, but valuable auxiliary information
might come from the records on drink consumption and petty
thievery in convention hotels. Do anthropologists take more soap
and towels away with them than do mechanical engineers? Such
an analysis is posited on the assumption that those who attend
conventions (and who stay, steal, and drink in convention hotels)
are a representative sample of the profession.

Hillebrandt (1962) sought data on the sale of alcoholic drinks
at Chicago airports in his study of passenger anxiety produced by
air crashes. He failed to use the proposed data because of the
insensitivity of the instrument. At that time, the major Chicago
airport had recently been completed, and the construction of bars
had not caught up with demand. Thus, there was negligible
variance from day to day in sales, and the small amount that
existed seemed to be based upon the bartenders' speed, not on
exterior factors.

In a society as oriented to marketing and record keeping as
ours, sales data abound for study of a varied body of content. As
noted above, Hillebrandt (1962) did not get to use bar sales as a
measure. He continued with his study, however, and used the
volume of air passengers. With a complete set of data over
time, he was able to transform the data, correcting for
systematic sources of variance irrelevant to his hypothesis. He
partialled out the seasonal variation in air traffic, for example, and
accounted for the secular changes in traffic level at Chicago's two
major airports. The residual material demonstrated that crashes
were only a very short-term depressant on travel. Just as with the
bar sales, there is some rigidity in his data which tends to blunt
comparisons. A certain number of people have no alternative to
flying, regardless of how dissettling was a major crash the day
before. Webb and Campbell plan to continue this analysis with a
complementary measure. With the same exogenous variable, they
hope to plot the number and dollar value of trip insurance policies
taken out by travelers at Chicago airports. Buying trip insurance is
a low-cost and simple behavior which should index anxiety of air
travelers. With the bar facilities now expanded to exceed demand,
it will be possible to perform an analysis containing three sales
components: bar sales, trip insurance sales, and ticket sales. Each
of these must be corrected for the systematic biases present, but
they should provide a more sensitive and nonreactive set of
possible outcroppings of anxiety than any single variable study,
particularly one based on the interviewing of travelers.

Insurance, a paid-for hedge against risk, is an admirable
measure of the effect of disaster. Just as one can examine trip
insurance sales and link them to crashes, one can examine the
timing of casualty insurance sales and link them to the occurrence
of hurricanes or tornadoes. Add to this sales of life insurance
(compared to the time of death of close friends or relatives), and
one has a three-way index of the general effect of disaster.

These same data could be used to test Zipf's (1946) hypothesis
further. Is the amount of casualty insurance taken out inversely
proportional to the distance from the disaster? How the mapping
of insurance underwriting compares to a meteorological map of
tornado probability would be a necessary control. It might be that
the hypothesis holds only in areas which have had tornado
experience. This would give support to Zipf's hypothesis, but even
greater support would come if a significant amount of insurance
were written in proximate areas with little or no tornado
experience. What, in brief, is the nature of the generalization of
effect?

How unlikely a source of research material is the sale of
peanuts! Yet, continuing in the vein of study of anxiety or tension,
peanut sales are a possibility that should be systematically
explored. An anecdotal report appeared in the Chicago Sun-Times
from the concessionaire in that city's baseball parks. He casually
observed that peanut sales after the seventh inning of a game are
significantly higher than earlier - but only during a tight game. If
the game is one-sided, there is no late-inning increase in peanut
purchasing. Is this a sound, nonreactive measure of involvement or
tension? It may be, but it illustrates the problems associated with
such archival measures - one must pay special heed to rival
hypotheses. One hypothesis is that fans, during the increasingly
tense moments of the late innings, absentmindedly lean over and
compulsively crunch their way through more peanuts than earlier.

But one should look at population restrictions. It may be that John Hancock
the finding (if it may be legitimatized as such) on the increase Winston Churchill
coming only in tight games is an artifact of selective attendance Napoleon I
and not tension. There is a hyperbolic curve of attendance in a one- Charles Dickens
sided game. A substantial number of fans usually arrive late, and Ralph Waldo Emerson
another substantial group leave early if the game appears to be César Franck
already decided. The population potential for peanut purchasing is Daniel Webster
thereby variable across innings, and the effect of a tight game Calvin Coolidge
should be to reduce the early departures and provide a larger base John Quincy Adams
for sales in later innings. For a finer test, a simple correction would Aldous Huxley
be to transform the peanut sales into unit sales per X thousand
fans per inning-a transformation possible by clocking turnstile The sales and pictorial content of stamps are a useful but
movement in and out. unused bit of expressive intelligence. An analysis of the illustra-
Another test of the hypothesis, using the same data, is possi- tions on stamps may give indications of the state of political
ble, although we can report no findings. The tension hypothesis opinion in the nation. What does an analysis of illustrations printed
would get strong support if, in a population of one-sided games, during the early years of the Fascist regimes in Germany and Italy
fans left in substantial numbers early and sales stayed stable from show? It has been suggested that stamp illustrations presage
the middle through the later innings. This would be reflected in an aggressive political action. Perhaps somewhere a philatelistic
ascending consumption curve if correction were made for those in psychologist has prepared this study, using illustrated sets of
attendance. stamp catalogues by year of issue.
But one other element of population restrictions remains to be A possible flaw in such analysis is a potential datum for
considered. We have examined only the issue of the absolute another topic of study. Each nation does not print its own stamps.
number of fans available in the park to buy the peanuts. Is there Many former colonial territories continue, as a cost-saving device,
any plausibility to the notion that those who leave early are more or to use secondhand engravings and the printing facilities of the
less devoted to peanuts? One might determine this by interview at former governing power. The stamp illustration may thereby re-
the exit gate, or by looking for traces of peanut shells in vacated flect an economic and not a political decision. But whether or not a
seats. former colonial territory still relies on the old power is itself a clue
Sales data can also be used to infer popularity and preference. to relations between the two. One could, for example, compare
The impact of Glenn's orbital flight was evidenced by record- former British and French colonies in Africa. Or compare new
breaking sales of the commemorative stamp issued to mark it. nations within what was a single colonial area. Guinea prints her
Similarly, the sale of commemorative Kennedy stamps and the own stamps; Mali buys hers from France.
great demand for Kennedy half-dollars after their issuance, as well The diffusion of information among physicians was the topic
as all the special books and the reappearance of ProJiles in of Coleman, Katz, and Menzel's (1957) research. Instead of the
Courage on the best-seller list, provide persuasive evidence, if any more standard, and reactive, tactic of interviewing doctors, they
were needed, of the man's influence on public thinking. elected to go to pharmacy records for information about which
Another measure of the popularity of a man is the value of his doctors prescribed what drugs when. Sampling at intervals over a
autograph in the commercial market. The supply level must be 15-month period, they related the physician's adoption of new
controlled, of course, but it is of some interest that the following drugs to his social network. Such hardnosed data can be a useful
prices held at the end of 1964: check on interviewing data, provided the effect of collecting such
records does not alter the behavior of the record-keepers. This is a
very implausible risk with drug prescriptions, but a reasonable one
when dealing with less legally controlled records. The danger is
not so much in masking information as it is in improving it. The
record-keeper may perform a more conscientious job because he
knows that his work is being put to some use. If the investigator
using such records stresses the greater glory to man that will come
from the record-keeper offering his cooperation, he may actually
be increasing the risk that the instrumentation process will change,
thereby threatening the validity of comparisons over time.

The social-network variable employed by Coleman, Katz, and
Menzel can be measured by other methods than the standard
interview or sociogram. To study the extent to which interaction
among different departments of a university took place,
one could use orthodox procedures. In addition, so humble a
document as a desk calendar might be checked. This record can
provide information on who lunched with whom, with what degree
of frequency, and across what departments. Not everyone notes
such engagements (a population restriction), and the desk calendar
is not a likely source for learning of other engagements, such as
social dinners or meetings so regular that they don't have to be
noted (content restrictions). Staying just with the lunch dates, one
route to learn the character of the restrictions would be to enlist
the aid of waiters in faculty clubs. The reader can conjure up
objections to this assistance.

Drug-sale records are in common use by pharmaceutical
houses to evaluate the sales effort of their detail men. Both the
houses and the detail men have learned that the verbal statement
or the observed enthusiasm of a doctor for a new drug is a highly
unstable predictor of what he will prescribe. If records are available
for checks on self-reports, they should by all means be used.
Such checks are particularly useful when the data are produced
continuously by the same subjects, for then a correction can be
applied to the self-reports. If the assumption can be made that the
character of the error is constant over time, it may not be necessary
to run both sets of data concurrently.

Something of an analogous correction can be seen with economic
forecasters. Firms which employ more than one forecaster
are said to compute the response-set characteristic of the
forecaster (optimism or pessimism) and apply a secret correction to his
periodic predictions. This is a rather interesting data transformation,
since its existence shows the set rigidity of the forecasters.
They, too, know whether they have been over- or underestimating,
yet that knowledge does not produce potent enough feedback to
overcome the response set. Morgenstern (1963) notes reports of the
same type of corrections applied by Soviet planners in the 1930's.

Boring (1961) has detailed how a similar type of response error
was an early cause célèbre in astronomy. Differences in reaction
times among astronomical observers became known, and the
phrase "personal equation" was coined to describe the bias. In an
evolving history, the contaminant existed but was unknown, became
known and the cause of study for the purpose of eliminating
its biasing effect, and then became the substantive material of a
large body of psychological research.

The sale of stocks was used by Ashley (1962) in a study
interesting for what it says of positive and negative reward. Isolating
firms which announced an unexpected dividend or earnings
statement, either up or down, Ashley traced the stock prices
following the announcement. An unexpectedly high dividend influenced
the price of the stock for about 15 days, while an unexpectedly
low dividend or earnings statement had an effect for only
about four days. There are other places to measure extinction than
in a Skinner box. See, for instance, Winship and Allport's (1943)
study on newspaper headlines, mentioned earlier, and Griffith's
(1949) research on horse-race odds, both studies of
optimism-pessimism. Hamilton (1942) has also conducted an interesting
content analysis on the rise of pessimism in widely circulated
Protestant sermons.

A more common use of sales records (but still surprisingly
uncommon) is as a measure of propaganda effectiveness. Within
advertising, particularly, there has been extensive writing on the
inadequacy of survey methodology to predict the advertiser's
major criterion, sales. An excellent annotated bibliography reviewing
advertising's effect on sales has been prepared by Krueger and
Ramond (1965).

Henry (1958) gives numerous examples of the discrepancies
between respondents' reports and sales figures, indicating that the
reasons a consumer gives for buying a product cannot be relied
upon. Reactive measures thus suspect, other methods of consumer
preference measurement must be found.

Henry mentions an example of a controlled sales experiment
on "shelf appeal." One brand of candy is sold in three types of
wrappers, with variables such as shelf position and number of
units displayed controlled. The dependent variable is the sales
level of the different packages.

There is an ideal experimental medium for advertising research
in direct-mail sales efforts. Lucas and Britt (1963) comment
upon a number of studies which have varied such elements as the
color of paper, inclusion of various incentives, and stamping a
return envelope by hand or by meter. They also cite the example of
a department store which might send out a monthly statement to
customers containing one of ten variants of an advertisement
selling a specific product. The average return per layout would
then be determined by subsequent sales.

The large mail-order houses (Sears, Ward, Spiegel and others)
regularly conduct controlled experiments on different thematic
appeals. This is easily performed by varying the content of the
appeal and simply counting the returns attributable to the different
sources. In a revealing finding, one of these houses discovered
that an advertisement describing a self-riding lawnmower as something
of an adult's toy dramatically outsold an appeal which argued
its superior functional merits.

Using sales results of a promotional campaign, Blomgren and
Scheuneman (1961) found a "scare" approach was less effective
for selling seat belts than one that featured a professional racing
driver and appealed to masculine control and relaxation.

Over thirty years ago, Jahoda-Lazarsfeld and Zeisel (1932)
studied the impact of the depression by noting the level of grocery
sales. The same measure is used now to evaluate the efficacy of
different sales themes. A before-after design was used by the
National Advertising Company (1963) to learn the effect of large
outdoor signs placed in the parking lots of three shopping centers.
They used store audits to determine the sales of products advertised
on the signs and then compared these data to sales from a
control sample of equivalent stores. Another method of "pre-testing"
advertising themes is to place treatments of the theme in
prominent places within a supermarket and then observe sales.
This has been done by comparing sales attributable to a single
theme against a control of past sales, and also by employing
multiple themes and comparing one against the other. In this
latter approach, close attention has been paid to time-sampling
problems, so as to protect the equivalence of the populations
exposed to each version.

One could similarly experiment with vending machines, although
we do not know of any such research. By random assignment
of the display of experimental cigarette packages to machines,
or by systematically varying exhortatory messages over
machines, one could employ lever-pulling as the effectiveness
measure.

Aside from the commercial applications of such research,
direct-mail advertising, vending machines, and the like offer a fine
natural laboratory for the study of persuasion. Mindak, Neibergs,
and Anderson (1963) used the sales of tickets to parades and to a
civic aquatennial as one of their several measures of the effect of a
newspaper strike. Roens (1961) reported on the use of different
combinations of media to carry the same propaganda theme (for
Scott paper), and Berreman's (1940) study of factors affecting the
sale of novels found that publicity elements had little effect.

DeFleur and Petranoff (1959) investigated subliminal persuasion
by sales measures. Since the alleged subliminal effect was
first reported (with no data) as the result of a filmed message, they
flashed "Buy Product X" on a screen subliminally for several
experimental weeks. The effect of this was measured by the
deviation from normal of a food wholesaler's orders. No significant
effect was observed.

The stuff of commercial persuasion, advertisements proper,
has been employed in a number of ways for other purposes. The
want-ad columns of newspapers have served as economic predictors
(Fowler, 1962). Contrariwise, they may provide the data for
historical analysis. The Security First National Bank of Los Angeles
compiles a regional index of want-ad frequency and reports
variance over time in this index coincidental with basic economic
data. A management consulting firm prepares an index based only
on advertisements for engineers and scientists. Although incomplete
data are reported by Fowler, she writes that the firm "likes
to consider its index as a leading indicator of how the economy
fares, rather than a coincident indicator" (p. 12). That optimistic
assessment illustrates the necessity for considering the
time-linked nature of a measure. The statement was true for a long
period of time, from the late forties possibly up to 1964. But since
the employment of engineers and scientists is intimately tied to
defense contracts, a major change in Defense Department policy
will throw off employment levels, thus the ads, thus the indicator.
Just such a change occurred during Secretary Robert McNamara's
administration, and the 1964 advertisement levels should be negatively
correlated with the state of the economy.

Advertisements have also been used to test theories of social
change. Assuming that ads reflect values, Dornbusch and Hickman
(1959) sampled 816 issues of the Ladies Home Journal and
analyzed the content of advertising to estimate the degree of
"other-directedness" displayed. These data tested Riesman's
hypotheses on the history of "other-directedness." Two classes of
indices were employed: (1) endorsements by persons or groups and
(2) claims that a product is related to satisfactions in interpersonal
relations.

Singh and Huang (1962) made a cross-cultural study of advertising,
comparing American and Indian advertising for similarity
and relating the findings to socioeconomic and cultural factors.
Since they used print advertising, it is important to take account of
differential literacy rates in such comparisons. If the advertisers
are addressing their messages to literate "prospects" and reflecting
the values of those "prospects," the differential literacy rate
may interact with magazine readership and prospect status to yield
differences between societies that are spurious. This is less a
concern with broadcast advertising, although there may be a
population restriction in that those who either own or are more
available to radio or television receivers vary systematically across
countries.

Among the finest work to be found on discussion of multiple
methods and the criterion problem is that of the industrial
psychologists. One rarely finds such attention to the relative merits of
ratings versus observation versus performance versus interviewing
versus questionnaires versus tests. Guion (1961) has offered an
excellent short treatment of this subject, as have Ghiselli and
Brown (1955) and Whisler and Harper (1962). But these have not
supplanted the singular statements on "criteria of criteria" by R.
L. Thorndike (1949).¹

¹ For other articles of methodological interest on criteria see Brogden and
Taylor (1950), Gordon (1950), Fiske (1951), Bass (1952), Severin (1952), Rush (1953),
MacKinney (1960), and Turner (1960).

The number of private records marshalled by the industrial
psychologists has been impressive. Amount and quality of output
are probably the most frequently used behavioral measures, and
are usually expressed in some transformed score: the number of
units produced by a worker or department per unit of time, the
amount of sales per unit of time, or the profitability of activity by
dollars invested by the firm. The known subjectivity of ratings by
supervisors or foremen increasingly moved many of the specialists
in this area to pure behavioral measures, but ratings remain
because of the difficulty in making behavioral measures comparable.

Private records have their difficulties, and Whisler and
Harper (1962) are helpful in illustrating problems in making comparisons
across departments or workers. The factors involved in
any such comparison of records are numerous and vary from one
situation to another: the type of work, physical working conditions,
work group cohesion, and the like. Ghiselli and Brown (1955)
struggle with the problem of appropriate controls for work behavior
and offer a series of possible work standards: group average
production, rates of selected individuals known to be "good"
workers (with some discounting then applied), experimentally
determined times for tasks, and "rational analysis" by people
familiar with the problem.

We review a body of studies with data drawn from the
institutional records of companies, schools, hospitals, and the
military. A good share of these come from industry, but the overlap
with other institutions is marked.

Most of the current writers argue for multidimensional criteria.
Ghiselli and Brown (1955) give the humble example of a
streetcar motorman, indicating a series of proficiency measures:
1. Number of collisions with pedestrians
2. Number of traffic violations
3. Number of commendations from public
4. Number of complaints from public
5. Number of times company rules broken
6. Number of sleepovers (tardiness)
7. Number of times schedules broken
8. Number of reprimands from inspectors
9. Ratings by inspectors
10. Errors reported by dispatchers.

As always with multiple measures such as these, the question
comes up of the advisability of combining the various measures
into a single composite score. Consider in the list above the
problem of weighting each of the ten variables. Ghiselli and Brown
suggest multiple cutoffs, establishing a minimum performance
level for each component element of the index. The way in which
the variables are combined, and the final score reached, can be
disastrously misleading if some minimum standard is not met on
each of the tasks of the job. They offer the highly reasonable
example that an airline pilot should be able to land a plane as well
as take one off, perhaps not so gracefully, but still with an
irreducible degree of proficiency (cf. Coombs, 1963).

Another multimeasure study, this one of ship effectiveness in
the Navy, was conducted by Campbell (1956). Rather than
rely solely on ratings by the captain or crew members, Campbell
examined the ship records for reports of ship inspections, torpedo
firings, re-enlistment rates of those aboard, requests for transfer,
and disciplinary actions. Snyder and Sechrest (1959) combined
behavioral violations and ratings in their work showing the positive
effect of directive therapy in defective delinquents.

Re-enlistment rates are a measure of job turnover, and Evan
(1963) explored this topic in research on student workers. Personnel
records were examined to find a possible relationship between
job turnover and departmental placement. He reasoned that a high
level of interaction on the job with other student workers would
have a stress-reducing effect and result in a lower rate of job
turnover. He supported this hypothesis by showing that the larger
the number of students with whom a worker could interact, the
lower the rate of quitting.

Knox (1961) added absenteeism to job turnover and correlated
both with age, seniority, and the distance of the worker's home
from the factory. Melbin (1961) also used absenteeism and job
turnover in research on psychiatric aids. He compared these data
to archival reports on work assignments. The correlational methods
limited his ability to establish cause-and-effect relationships,
but he was able to trace a double-directioned effect between
changes in work assignments and absences.

Job turnover is an ambiguous measure: sometimes it is an
administrative action, sometimes an action dictated by the individual
employee. R. L. Thorndike (1949) holds that administrative
actions should be considered as a discrete class of criteria. Along
with many others, Guilford (1956) used the record of pay increases
as his measure of the firm's appraisal of the individual. Weitz
(1958) suggests the imperfectly correlated measure of promotion
within the company, while Jay and Copes (1957) speak of job
survival as a criterion. Merely to stay on a job, without being fired,
is indicative of an administrative decision that the employee is not
too bad.

Whisler and Harper (1962) also speak of seniority as a criterion
and discuss the implications of this in the union-management
struggle over definition of criteria. They state that seniority has
appeal because of its qualities of objectivity and precision, and
that it is not a simple case of the union wanting seniority and the
management fighting it. Promotion from within the management
group and the high value placed on experience is evidence of the
use of seniority within the management tier.

One could also observe other management actions which
reflect the esteem in which an employee is held. The rug on the
floor, the drapes on the window, the white telephone, the second
secretary, and the corner office are all salient cues to an employee's
success with the firm.

Other industrial research with absentee records has included
Amthauer's (1963) work on social psychological settings and Bernberg's
(1952) study of departmental morale. Amthauer took absenteeism
as his dependent variable and considered a string of
independent variables: motivation, intelligence, persistence, and
stability, as well as the individual's relationship with apprentices
and with instructors. Note that although absenteeism is not implicitly
reactive, the other measures of individual psychological
characteristics were. So, too, for Bernberg, who measured departmental
morale by measures with high reactive risk, but matched
these against an elaborated set of absentee measures. He took four
variants of absenteeism as his variables: unexcused absence of
one day or longer, tardiness, absence of less than one day, and
trips to the medical unit not resulting from accident or disease. For
a review of studies focused on absenteeism in the industrial
setting, see Brayfield and Crockett (1955).

The unions keep books, too, and Stuart (1963) was canny
enough to use grievance records in a study of racial conflict. He
collected 364 verbatim records of the grievance board of a large
union in the textile industry, data extending over a seven-year
period. The complaints were analyzed to determine feelings, attitudes,
and actions against Negro and Spanish-speaking workers
who comprised a majority of the union. Their complaints are
important evidence, but there is the risk of the bias one notes in
studies of political speeches. Although the events occurred in the
past, and the investigator did not intrude in the production of the
data, the subjects were very much aware that their remarks were
"on the record." This perforce limits the degree of generalization
possible.

McGrath (1962) also used indirect data in his study of friendship
and group behavior. The dependent variable was the competitive
performance of rifle teams, plotted against whether the team
had given favorable or unfavorable ratings to former teammates and
how the team had been rated. I. D. Steiner (1964) comments on a
possible restriction on generalization: "Additional research is
needed to determine whether the individualistic orientation which
was a boon to rifle teams would also promote productivity in
situations calling for cooperative group action" (p. 434).

Hall and Willerman (1963) studied the effect of college roommates
on grade performance. They set up experimental combinations
of dormitory roommates with varying academic ability. Students
rooming with high-ability students obtained better grades
than those rooming with low-ability students; this condition held,
however, only when the roommate was later born in his family.
Hall and Willerman conclude that these results support Schachter's
(1959) thesis that first-born students are more susceptible to
influence and later-born ones are more influential.

Stouffer (Stouffer et al., 1949) was among those reporting on
sick-call data, drawing statistics from the Office of the Air Surgeon
on members of heavy bomber crews in the European theater during
June, 1944. Sick-call rate was correlated with self-evaluations
of physical condition. Stouffer suggests that the contrast between
attitudes toward one's physical condition and behavior with respect
to physical symptoms reflected the men's tendency to save up
complaints until their tour of 30 missions was completed. He speculates
that they might have been motivated by a fear of not completing
a tour of combat duty and thus running the risk of postponing a
return to the United States.

More military records were used by Fiedler and his associates
(Fiedler et al., 1958; Fiedler, 1962). These studies examined
"adjustment" in nonclinical populations: Army units and student
populations. For the Army personnel, they obtained sick-call data,
disciplinary-offense ratings, and court-martial records; for the
students, course grades, number of visits to the student health
center, and student counseling bureau visits. Little relationship
was found among these measures of "adjustment," underlining
the problem of combining pieces of evidence which look to be
similar in reflecting some characteristics.

In a more circumscribed investigation, Mechanic and Volkart
(1961) probed sick-call visits further, with a population of college
students. They showed the frequency of visits to be positively
related to the subject's degree of stress and his tendency to play
the sick role. Medical visits are not a pure measure of a single
trait, and few single archival measures are ever pure. The employment
of multiple measures of the same hypothesized "trait" is
always indicated (Campbell & Fiske, 1959). We stress this because
many of the researchers who have been venturesome enough to
employ nonreactive data have done so to define or classify subjects,
often then administering questionnaires and interviews to the
subjects. This is a highly appropriate device to combine research
methods, but its validity is posited on the accuracy of the trait
definition contributed by the initial record. A multiple-method
approach is the best hedge against error (cf. R. L. Thorndike,
1949).

Schwartz and Stanton (1950) suggest a study of the social
situation of the hospital combining observational and archival
materials. The observation is of negative incidents in a ward, and
the archive is a measure of incontinent behavior: the amount and
type of laundry done for that ward. In their exploratory observational
study, they kept complete records of the patients' daily
activities, and were able to determine connections between certain
types of negative incidents and incontinent behavior. They further
suggest that the laundry record could be useful in establishing the
effects of changes in therapeutic methods within a ward.

The last major class of more or less private archives is
personal documents. These have been more the bailiwick of the
historiographer than the behavioral scientist, but a number of
notable studies have been performed using personal documents.
Cox (1926), in Volume 2 of Terman's Genetic Studies of Genius,
used documentary evidence of all kinds, and, on the history
of science side, we have Terman's early (1917) study estimating
Galton's IQ. Centering on records of Galton's prowess between
the ages of three and eight (he could read any English-language
book by five and knew the Iliad and Odyssey by six), Terman
compared them with the ages at which other children are able to
accomplish the same or similar achievements and estimated
Galton's IQ to be not far from 200.

There have been important methodological works, such as G.
W. Allport's (1942) monograph, The Use of Personal Documents in
Psychological Science, and facilitating method papers, such as
Dollard and Mowrer's (1947) system to determine the amount of
tension in written documents by a "Discomfort Relief Quotient."
But for all this, written documents have been another of the
underdeveloped data resources of social science. In the examples
that follow, we cite some of the major studies using written
documents and illustrate some of the rival hypotheses coincident
with them.

One could not think of letters as a research source without
bringing to mind Thomas and Znaniecki's (1918) classic study of
the Polish peasant. Letters sent between Poland and the United
States were one of the major elements in a data pool that included
autobiographies, newspaper accounts, court proceedings, and the
records of social agencies. Rather by happenstance, Thomas
learned that there was an extensive correspondence between the
two countries and that many of the letters were being thrown
away. From this lead, advertisements were placed which offered to
pay for each letter produced.

There are, to be sure, substantial questions about the population
and content restrictions in the letters Thomas and Znaniecki
gathered; there are in any body of voluntarily produced (even for
pay) research materials. Typically, they only had one side of the
exchange, a common and frustrating condition often bemoaned by
biographers and historians. In a commentary on this study, Riley
(1963) states:

In all such instances, then, their data refer only to selected
members of each group (family) and cover only part of the
interaction. These gaps illustrate an important potential limitation
in the use of available data generally: not having been
assembled for the purpose of the investigation, the data may be
fragmentary or incomplete, thus depriving the researcher of
valuable information.

Another limitation is that such privately owned and spontaneously
produced materials may be rare or difficult to obtain.
Owners of letters, diaries, or other personal documents may
sometimes object to their use for research purposes. . . . Moreover,
situations producing appropriate materials may be rare.
The continuing exchange of letters, for example, seems to
depend upon long-term or frequent separation of the members,
as well as upon a custom of detailed letter writing. Nevertheless,
there are no doubt many instances in which similar data
are available for further research, as, for instance, when servicemen
are separated from their families [pp. 242-243].

Riley thus points out that the dross rate may be high ("situations
producing appropriate materials may be rare"), and that population
restrictions may be present (". . . exchange. . . seems to depend
upon long-term or frequent separation. . . owners may sometimes
object to their use"). Specifically for cross-cultural comparison
purposes, there is the question of differential literacy rates.
How many of the Polish peasants could write? If they could not,
and had letters written for them by others, say, village scribes, did
the presence of these intervening persons serve to alter the
content of the letters? On the voluntary supplying of the letters,
did the correspondent give up only a biased sample? A money
incentive might have to be prohibitively high to pry loose some love
letters, for example, or letters which detailed complaints about the
correspondent's frugality in sending money back home.

In Sunday feature articles, one sometimes reads another
group of one-way letters: those sent from children in summer
camp to their parents. By themselves, they are instructive of a
child's perception of the surrounding world. Salzinger (1958) got
the other end of this candid correspondence as well, and analyzed
the content of mail from children and parents, comparing the
letters for similarity on "wants," "demands," and "requests."

Janowitz (1958) dealt with letters and diaries captured from
German soldiers. His concern was the impact of propaganda on
these troops, and "when these letters dealt with the German writer
himself, or his small circle of friends, they contained testimony of
considerable value. Many made valuable propaganda documents,
especially captured undelivered mail" (p. 734). This last point is of
interest, for the undelivered mail is a subset of mail which is most
recent and most pertinent for evaluation of propaganda effect.
Letters captured on the person of troops may also be particularly
valuable, for they contain not only the most recent expressions of
feeling, but may also include letters the writer may have been
postponing mailing. Such uncertain writings may be prime indicators
of attitudes and morale, for the easy stereotypes of "I'm fine
Ma and the food's not bad" are more likely to be quickly dispatched.

Letters to political figures are another source of data. For
some magnificent examples of these, as well as a general treatment
of the topic, one can examine Dear F. D. R. (Sussman, 1963, an
earlier report of which is in Sussman, 1959. See also Dexter, 1963).

A particularly fine discussion of possible sources of error in
letters is presented in Dexter's (1964) chapter on letters to
congressmen. He notes that congressmen do not necessarily see any
cause for alarm in a barrage of negative mail.

    One "pro-labor" Senator, out of curiosity, had his staff check up
    the writers of 100 letters he received advocating support of a
    higher minimum wage. It was found that 75 writers were eligible
    to register, but of these only 33 actually were registered.
    Furthermore, the letters were advocating his support of a measure
    on which he had been particularly active, and the content of the
    mail showed no realization of the stand he had so publicly
    taken. The Senator could scarcely get excited about these letter
    writers as either a source of opposition or of support on the
    basis of that issue [p. 399].²

² Reprinted with permission of The Free Press of Glencoe from People,
Society and Mass Communications, edited by L. A. Dexter and D. M. White.
Copyright © 1964 by The Free Press of Glencoe, a division of The Macmillan
Company. Based upon "Congressmen and the People They Listen To,"
Massachusetts Institute of Technology, 1955, and American Business and
Public Policy, Atherton, New York, 1963.

If the only goal of the senator is to stay in office, the mail from
an ineligible voter is only so much dross. His greater concern is the
lack of any intelligence from the great mass of eligible voters who
don't write.

Dexter goes on to discuss why mail is important, but speaks of
the congressman's description of "genuine," "junk," and "stimulated"
mail. Of interest is the way in which they are discriminated.

    It [mail] is not believed if "junk," i.e., press releases or
    other broadcast mailings, nor if it be stimulated. Stimulated mail
    is not entirely easy to define. In its pure form it consists of
    virtually identical postcard messages written under the instigation
    of a single company, union or interest group. (One company
    even mailed the postcards for its workers, fearing that they
    would not know who their congressman was.) Congressmen look
    for signs of stimulation - similarity of phrasing ("They all used
    the same argument.") or even stationery ("They handed out the
    paper.") and time of mailing ("You could tell the hour or minute
    someone pushed the button.") . . . it is hard to fool a congressman
    as to when mail is stimulated. Some organizations urge
    their members to write in their own words, on their own
    stationery, and as personally as possible. Congressional assistants
    tell us that perhaps one in fifty persons who write such a
    letter will enclose the original printed notice from the organization
    urging an individualized apparently spontaneous letter
    [p. 403].

As for the extent of this false element in spontaneous mail,

    Most of the mail sent on the Reciprocal Trade Act was in some
    sense stimulated . . . [for] Eastern and Southern congressmen
    . . . Westinghouse, Dow, Monsanto, and Pittsburgh Plate
    Glass may have stimulated 40 per cent or more of all the mail
    received on the issue in 1954. . . . Mail in favor of reciprocal
    trade was equally stimulated and perhaps by even fewer prime
    movers. Our impression is that three-fourths of all antiprotectionist
    mail was stimulated directly or indirectly by the League
    of Women Voters [pp. 403-404].

A check on the true level of protest mail was made by the
Xerox Corporation. Flooded with negative letters after sponsorship
of a television series on the United Nations, they hired a group of
handwriting experts to examine the mail. "A total of 51,279 protests
had been received. The handwriting experts determined
that the letters were written by 12,785 persons. The latter figures
practically equalled the number of favorable letters" (Kupcinet,
1965). No mention is made of an equivalent analysis of the "pro"
letters.

This selective bias in the population mailing letters to
congressmen or others results in an invalid generalization on the state
of public opinion, but it can serve as evidence of how the major
pressure groups are responding. The bias itself is not fatal; only
not knowing of it is.

Earlier, we mentioned the problem of population-restriction
bias in suicide notes, observing that less than 25 per cent of all
suicides leave final notes. Osgood and Walker (1959) took this into
account in their study of motivation and language behavior.
Reasoning from behavioral principles, they predicted that the content
of suicide notes should differ significantly from control notes and
simulated suicide notes. Persons about to take their life should be
highly motivated (something of an understatement), and this
motivation should increase the dominant responses in their
hierarchies; a higher than normal level of stereotypy should be present.
Content analysis by six different stereotypy indices supported
their prediction. This study is a good example of relating "natural
phenomena" that exist in the outside world to principles derived
from laboratory experimentation. There are many tests for
theoretical postulates available in settings other than the laboratory,
and joint testing in the laboratory and outside may yield powerful
validity checks.

Spiegel and Neuringer (1963) also tested a specific hypothesis
by the employment of suicide notes. They examined the
proposition that inhibition of the experience of dread ordinarily
evoked by suicidal intention is a necessary condition for suicidal
action. They drew on the Los Angeles County Coroner's office for
their material, noting a control of "false" suicide notes. Other
suicide-note studies have been conducted by Gottschalk and
Gleser (1960) and by Schneidman and Farberow (1957).

Art work is another expressive personal document that may
provide data - as is shown by all the clinical psychologists who
look at Van Gogh's paintings and say, "That man was in trouble!"
An equally post hoc analysis, but with more analytic elegance, has
been contributed by Barry (1957). He studied the complexity of art
form as related to severity of socialization. From Whiting and
Child's 76 nonliterate societies on which socialization data are
available, he found 30 with at least ten extant examples of graphic
art - either displays in museums or illustrations in ethnographic
reports. There was a low-level association between complexity of
art form and degree of severity of socialization. The unknown
question is whether a higher or lower level of association would
have been detected had the data been available for more than 30
societies. Were those who were more gentle in socialization less
likely to produce art work which has survived to the present?
Note, too, that there is the selective screen of museum curators
and ethnographers. Materials might have survived physically from
the other 46 societies, but have been defined as of insufficient
artistic or scientific worth to display behind glass or on paper.

What might be considered an equally primitive art form was
studied by Solley and Haigh (1957) and by Craddick (1961). Both
investigations showed that the size of children's drawings of Santa
Claus was larger before Christmas than after. Sechrest and Wallace
(1964) asked whether the size of the Santa Claus drawing
might be traced to a generalized expansive euphoria associated
with the excitement of the season, and whether children might be
expected to draw almost any object larger during the Christmas
season. Their experimentation showed this was not the case, and
the Santa Claus was the only one of three objects drawn larger.
Craddick (1962) also found that the mean size of drawings of
witches decreased at Halloween time. Berger (1954), working from
doodles in the notebooks of college students, found a correlation of
.75 between graphic constriction in the doodles and neurotic
tendency.
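
As a rough illustration of the arithmetic behind a mail check like the Xerox one reported earlier in this section, the sketch below works out how far a raw letter count overstates the number of distinct writers. The two figures (51,279 protest letters, 12,785 writers) are taken from the Kupcinet quotation; the function name and the two summary measures are hypothetical conveniences introduced here, not anything the original investigators are reported to have computed.

```python
def mail_inflation(letters: int, writers: int) -> dict:
    """Summarize how far a raw letter count overstates distinct writers.

    Hypothetical helper for illustration only.
    """
    if writers <= 0 or letters < writers:
        raise ValueError("need at least one writer and letters >= writers")
    return {
        # Average number of letters attributed to each distinct writer.
        "letters_per_writer": letters / writers,
        # Share of all letters that exceed one letter per writer.
        "duplicate_share": (letters - writers) / letters,
    }


# Figures quoted in the text for the Xerox protest mail.
xerox = mail_inflation(letters=51_279, writers=12_785)
print(f"{xerox['letters_per_writer']:.2f} letters per writer")
print(f"{xerox['duplicate_share']:.1%} of letters beyond one per writer")
```

On these figures each writer accounts for about four letters, so roughly three-quarters of the protest mail adds volume without adding writers. The same check applied to the favorable mail would be needed before the two totals could be compared on equal terms.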

In this review of archival studies, we have seen the versatility
of the written record. Not only has the content of study varied, but
also the functions these data have served.

For some research purposes, there were few alternatives to
archives - not a particularly luminary recommendation, but
certainly a compelling one. With suicides, for example, there is no
choice but to wait until a population defines itself operationally.
Once this happens, one can go to farewell notes, biographical
material, and interviews with relatives; but one cannot go to the
subject. So, too, for the general student of the past. For one like
Terman (1917), who chose to study Galton, there was no easy
alternative to consulting the written record.

In a limited content area, the archival record provides the
dependent variable. Just as votes are the ultimate criterion for the
politician, sales and work performance are the ultimate criteria for
some applied social scientists. It has been of interest in the history
of research in both advertising and personnel that relatively direct
criterion variables have been ignored, while less pertinent ones
were labored over. (Measuring "willingness to buy" by questionnaire
methods is an example, although it does have some utility in
prediction studies.)

There are also a few studies in which records were used as a
medium through which theoretical principles could be tested.
Such studies are too few, but these records offer superb
opportunities to validate hypotheses generated in less natural and more
reactivity-prone settings. There are restrictions, but it should be
recognized that there are restrictions in any single class of
information. Berlyne (1964), in commenting on some highly controlled
experimental work, wrote:

    Skinner and his associates have concentrated on situations in
    which an animal can perform a particular kind of response
    repeatedly at a high rate. The findings yielded by this kind of
    experiment have been extrapolated without much hesitation,
    and not always with specific empirical warrant, to a diversity of
    human activities, including those on which the most important
    social problems hinge [pp. 115-116].

Osgood and Walker (1959) used suicide notes to study the
effect of heightened motivation on response hierarchies. Also
using records as a testing medium, Mosteller and Wallace (1963)
went to records of 1787-1788 for their comparative study of a
Bayesian procedure with a classical statistical approach. They
demonstrated that both procedures reached the same conclusion
on the disputed authorship of some of The Federalist papers.³

The great majority of these studies, however, have used the
archives for indirect evidence. Stuart's (1963) study of union
grievances and the state of race relations, Parker's (1963) study of
library withdrawals to show the effect of television, and the
measurement of the size of Santa Clauses (Solley & Haigh, 1957;
Craddick, 1961; Sechrest & Wallace, 1964) all reveal the inventive
unveiling of valuable evidence. But only partial evidence - for the
reasons traced in the preceding chapter show the need for care in
generalizing from such analyses. Here, as with the running public
records, there is a heavy demand for consideration of possible data
transformations and for the construction of multiple indices. If it is
agreed that the archives typically provide only partial evidence,
and if the desirable research strategy is to generate multiple
displays of overlapping evidence, then the way in which these
partial clues are pieced together is critical.

We should recognize that using the archival records
frequently means substituting someone else's selective filter for your
own. Although the investigator may not himself contaminate the
material, he may learn that the producer or repository already has.
A thoughtful consideration of the sources of invalidity may provide
intelligence on these, either by suggesting astute hedges or new
analyses to answer rival hypotheses. In any event, the Chinese
proverb still holds:

    The palest ink is clearer than the best memory.

³ For an excellent general treatment of "identifying the unknown
communicator," see Paisley (1964), where studies in painting, literature,
and music are reviewed.

Simple Observation

    Who could he be? He was evidently reserved, and melancholy.
    Was he a clergyman? - He danced too well. A barrister? - He
    was not called. He used very fine words, and said a great deal.
    Could he be a distinguished foreigner come to England for the
    purpose of describing the country, its manners and customs;
    and frequenting city balls and public dinners with the view of
    becoming acquainted with high life, polished etiquette, and
    English refinement? - No, he had not a foreign accent. Was
    he a surgeon, a contributor to the magazines, a writer of
    fashionable novels or an artist? - No: to each and all of these surmises
    there existed some valid objection. - "Then," said everybody,
    "he must be somebody." - "I should think he must be,"
    reasoned Mr. Malderton, with himself, "because he perceives
    our superiority, and pays us much attention."
    (Sketches from Boz)

Charles Dickens displayed a ready touch for observationally
scouring the behavior of this mysterious gentleman for evidence
with which to classify him - even going so far as to put out the
hypothesis that the man was a participant observer. In this
chapter, the first of two on observational methods, our interest is
focused on situations in which the observer has no control over the
behavior or sign in question, and plays an unobserved, passive,
and nonintrusive role in the research situation. The next chapter
details studies in which the observer has played an active role in
structuring the situation, but in which he is still unobserved by the
actors. Since we have limited our discussion to measures with low
risks of reactivity, the visible "research-observer" approach and
the participant-observation method have been minimized here.¹

¹ More standard treatments of research methods may be consulted for
extensive discussion of observational techniques with the observer visible.
See Goode & Hatt (1952), Festinger & Katz (1953), Good & Scates (1954),
Selltiz et al. (1959), Riley (1963), Kerlinger (1964), and Madge (1965).
These same works also contain material on analysis of documentary and
secondary source materials.

The patently visible observer can produce changes in
behavior that diminish the validity of comparisons. Arsenian (1943)
noted that the simple presence of an adult sitting near a door
seemed to lend assurance to a group of nursery-school children.
The opposite change was noted by Polansky and associates (1949)
in studying the effect of the presence of observers among young
boys at a summer camp. There the observers were a threat and
became objects of active aggression. Not only is change produced
which reduces the generalizability of findings, but if one were
comparing children in two settings varying in the visibility of
observers or the reaction to observers, internal validity would take
a blow.

The effect of the observer may erode over time, as Deutsch
(1949) has shown, and thereby produce a selective contaminant in
observational data series. The defense against this is to permit the
effect of the observer contaminant to wear off, and start analysis
with data subsequent to the time when the effect is negligible. This
is similar to experimental controls for practice effects in learning
experiments, and presumes that the effect will wear off quickly
enough not to waste too much data. And that in turn is based on
the researcher's ability to measure the independent effect of
observation in the series.

Bales (1950) tested whether different arrangements of
observers would selectively bias group behavior. Observers sat with the
group, or behind a one-way screen with the group aware they were
there, or behind the screen with the group unsure if they were
there. He found no difference in group behavior under these
conditions. All conditions were applied in a laboratory, however,
and all the groups knew they were being tested. These factors
might overpower the possibly weaker effects of the physical
position of the observer.

No matter how well integrated an observer becomes,
we feel he is still an element with potential to bias the
production of the critical data substantially. The bias may be a selective
one to jeopardize internal validity, or, perhaps more plausibly, it
may cripple the ability of the social scientist to generalize his
findings very far beyond his sample. A number of writers (cf. Bain,

1960; Gullahorn & Strauss, 1960; Gusfield, 1960; Wax, 1960) have
argued for the participant-observation method as a device to
circumvent some of the contaminations of studies employing an
"outside" observer. It may do that, but there is still a high risk of
contaminants surviving to invalidate comparisons.

Riley (1963) has suggested that the participant-observation
studies are subject to two classes of error - "control effect" and
"biased-viewpoint effect." The control effect is present when the
measurement process itself becomes an agent working for change:
"the difficulty with control effect in participant observation, and in
many other research designs, is that it is unsystematic . . ." (p. 71).
The biased-viewpoint effect includes what we have discussed
under the label of intra-instrument processes. The instrument (the
human observer) may selectively expose himself to the data, or
selectively perceive them, and, worse yet, shift over time the
calibration of his observation measures.

This has been suggested by Naroll and Naroll (1963), who
speak of the anthropologist's tendency to be disposed to "exotic
data." The observer is more likely to report on phenomena which
are different from those of his own society or subculture than he is
to report on phenomena common to both. When the participant
observer spends an extended period of time in a foreign culture (a
year among the Fulani or six months with a city gang), those
elements of the culture which first seemed notable because they
were alien may later acquire a more homey quality. His increased
familiarity with the culture alters him as an instrument.

Riley suggests that the control effect may be reduced by the
observer assuming an incognito role, even though ethical questions
are raised, but,

    on the other hand, the covert observer may find complete
    immersion in the system, and subsequent likelihood of a biased
    viewpoint, more difficult to avoid. Limited to his specified role,
    he may be cut off from valuable channels of information, unable
    to solicit information not normally accessible to his role without
    arousing suspicions [p. 72].

Associated with this class of observation is the use of the
informant, who is a participant observer one selective screen away
from the investigator. Back (1960) writes of the traits of the good
informant (knowledgeability, physical exposure, effective exposure,
perceptual abilities, availability of information, motivation)
and points out some of the difficulties of receiving valid and
appropriate data from informants.

Dalton (1964) gives an excellent pro-and-con analysis of
participant observation in his commentary on the methods used in
Men Who Manage. Dalton's pro list is longer than the con one, and
he employs the intriguing terminology of "established circulator"
and "peripheral formalist."

As a final note on participant observation, we cite Lang and
Lang's (1960) report, in which participant observers became
participants. Two scientific observers of audience behavior at a Billy
Graham Crusade in New York made their "Decision for Christ"
and left the fold of observers to walk down the aisle. This is in itself
an interesting measure. What a testimony to the Reverend Mr.
Graham's persuasive skills, when sociological observers are so
swayed that they leave their posts!

Stephen Leacock said, "Let me hear the jokes of a nation and
I will tell you what the people are like, how they are getting along,
and what is going to happen to them" (Manago, 1962). This may be
too haughty a claim for conclusions possible from one set of
observational data, but we note below studies which produce
impressive findings from the opportunistic use of observation of
events over which the investigator has no control.

These simple observation studies have been organized into
the following categories: exterior physical signs, expressive
movement, physical location, language behavior (conversation
sampling) and time duration. The breadth of these measures is notable,
and they are "simple" only in that the investigator does not
intervene in the production of the material.

Most of the exterior physical signs discussed are durable ones
that have been inferred to be expressive of current or past
behavior. A smaller number are portable and shorter-lived. The
bullfighter's beard is a case in point. Conrad (1958) reports that the

bullfighter's beard is longer on the day of the fight than on any
other day. There are supporting comments among matadors about
this phenomenon, yet can one measure the torero's anxiety by
noting the length of his beard? The physical task is rather difficult,
but not impossible in this day of sophisticated instrumentation. As
in all these uncontrolled measures, one must draw inferences
about the criterion behavior. Maybe it wasn't the anxiety at all.
Perhaps the bullfighter stands farther away from the razor on the
morning of the fight, or he may not have shaved that morning at all
(like baseball pitchers and boxers). And then there is the possible
intersubject contaminant that the more affluent matadors are
likely to be shaved, while the less prosperous shave themselves.

A less questionable measure is tattoos. Burma (1959) reports
on the observation of tattoos among some nine hundred inmates of
three different institutions. The research measure was the
proportion of inmates with tattoos: "significantly more delinquents than
nondelinquents tattoo themselves." Of course, one could hardly
reverse the findings and hold that tattooing can be employed as a
single measure of delinquency. Returning to the bull ring for a
moment, "There are many ordinary bullfighters, but ordinary
people do not fight bulls" (Lea, 1949, p. 40).

More formal classification cues are tribal markings and scars.
Doob (1961) reports on a walk he and an African companion took
through a Nigerian market.

    I casually pointed to a dozen men, one after the other, who had
    facial scars. My African friend in all instances named a society;
    then he and I politely verified the claim by speaking to the
    person and asking him to tell us the name of his tribe. In eleven
    instances out of twelve, he was correct. Certainly, however, he
    may have been responding simultaneously to other cues in the
    person's appearance, such as his clothing or his skin color
    [p. 83].

In a report whose authors choose to remain anonymous
(Anonymoi, 1953-1960), it was discovered that there is a strong
association between the methodological disposition of
psychologists and the length of their hair. The authors observed the hair
length of psychologists attending professional meetings and coded
the meetings by the probable appeal to those of different
methodological inclinations. Thus, in one example, the length of hair was
compared between those who attended an experimental set of
papers and those who attended a series on ego-identity formation.
The results are clear cut. The "tough-minded" psychologists have
shorter-cut hair than the long-haired psychologists. Symptomatic
interpretations, psychoanalytic inquiries as to what is cut about
the clean-cut young man, are not the only possibilities. The causal
ambiguity of the correlation was clarified when the "dehydration
hypothesis" (i.e., that lack of insulation caused the
hard-headedness) was rejected by the "bald-head control," i.e., examining the
distribution of baldheaded persons (who by the dehydration
hypothesis should be most hardheaded of all).

Clothes are an obvious indicator, and A. M. Rosenthal (1962)
wrote of "the wide variance between private manners and public
behavior" of the Japanese:

    Professor Enright [British lecturer in Japan] and just about
    every other foreigner who ever visited Japan have noted with
    varying degrees of astonishment that there is a direct
    relationship between the politeness of a Japanese and whether or not he
    is wearing shoes [p. 20].

It is quite likely that this relationship reflects the selective
distribution of shoes in the Japanese society more than any causal
element, an example of a population restriction. The economically
marginal members of the Japanese population should, one would
think, be more overt in expressing hostility to foreign visitors than
those who are economically stable - and possession of shoes is
more probably linked to affluence than it is to xenophobia.

Shoe styles, not their presence, have been used as the unit
of discrimination in the United States society where almost
everybody does wear shoes. Gearing (1952), in a study of subculture
awareness in south Chicago, observed shoe styles, finding features
of the shoe to correspond with certain patterns of living. In
general, the flashier shoe more often belonged to the more
culture-bound individual. Similar concern with feet was shown by the OSS
Assessment Staff (1948) when, because standard uniforms reduced
the number of indicators, they paid special attention to shoes and
socks as a prime indication "of taste and status."

Despite the general consensus on clothing as an indicator of
status, little controlled work has been done on the subject. Flugel
(1930) wrote a discursive book on clothing in general, and Webb

(1957) reported on class differences in attitudes toward clothes and
clothing stores. Another investigation shows many differences
between clothing worn by independent and fraternity-affiliated
college males. Within the fraternity groups, better grades are
made by the more neatly dressed (Sechrest, 1965b).

Kane (1958; 1959; 1962) observed the clothing worn by
outpatients to their interviews. He has considered pattern, color,
texture, and amount of clothing, relating these characteristics to
various moods, traits, and personality changes. In a more reactive
study, Green and Knapp (1959) associated preferences for different
types of tartans with need achievement; it would be of interest to
see if this preference pattern were supported in clothing
purchased or worn.

A southern chief of detectives has discussed using clothing
clues as predictor variables. In a series of suggestions to police
officers, he noted the importance of dress details. When Negroes
are planning a mass jail-in, "The women will wear dungarees as
they enter the meeting places" (Anonymous, 1965).

Jewelry and other ornamental objects can also be clues. Freud
gave his inner circle of six, after World War I, rings matching his
own. On another intellectual plane, observers have noted that in
some societies one can find illiterates who buy only the top of a pen
and then clip it to clothing as a suggestion of their writing prowess.
One could observe the frequency of such purchases in local stores,
or less arduously, examine sales records over time from the
manufacturer, considering the ratio of tops to bottoms for different
countries or regions. The observation method would have an
advantage in that one could make coincidental observations on the
appearance of those purchasing the tops alone, or isolate a sample
for interviewing. The archival record of top and bottom shipments
is infinitely more efficient, but more circumscribed in the content
available for study.

As part of their study of the social status of legislators and
their voting, MacRae and MacRae (1961) observed the houses lived
in by legislators and rated them along the lines suggested by
Warner (Warner, Meeker, & Eells, 1949). This house rating was
part of the over-all social-class index produced for each legislator.

Observation of any type of possession can be employed as an
index if the investigator knows that there is a clear relationship
between possession (ownership) of the object and a second
variable. Calluses, for example, can serve as an observable indicator of
certain classes of activity. Different sports make selective
demands on tissue, for example, and the calluses that result are
reliable indicators of whether one is a squash player or a golfer.
Some occupations may also be determined by similar physical
clues.

With these measures used alone, validity is often tenuous.
Phillips (1962) is unusual in giving multiple indicators of the
changes in Miami resulting from the influx of a hundred thousand
Cubans. Two years following the Castro revolution, he observed:

    Bilingual street signs (No Jaywalking; Cruce por la Zona
    para Peatones)
    "A visitor hears almost as much Spanish as English."
    Signs in windows saying "Se Habla Espanol"
    Stores with names like "Mi Botanica" and "Carniceria Latina"
    Latin-American foods on restaurant menus
    Supermarkets selling yucca, malanga, and platanos
    The manufacture of a Cuban type of cigarette
    Radio broadcasts in Spanish
    Spanish-language editorials in the English-language newspapers
    Services held in Spanish by 40 Miami churches.

Perhaps Phillips was overstating his case, but the marshalling
of so much, and so diverse, observational evidence is persuasive.

For a prime source in such studies of the unique character of
cities, and their changes, there is that eminent guide, the classified
telephone directory. It can yield a wide range of broad content
information on the economy, interests, and characteristics of a city and
its people. Isolating the major United States cities, which ones have
the highest numbers of palmists per thousand population?

The more plastic variables of body movement historically
have interested many observers. Charles Darwin's (1872) work on
the expression of emotions continues to be the landmark
commentary. His exposition of the measurement of frowning, the

uncovering of teeth, erection of the hair, and the like remains Something of the detail possible in such studies is shown
provocative reading. The more recent studies on expressive move- in Wolff's (1948; 1951) work on hands. In the first study, Wolff
ment and personality measurement are reviewed by Wolff and observed the gestures of mental patients at meals and at work,
Precker (1951). Of particular interest in their chapter is the concluding, "I found sufficient evidence that correlations exist (1)
emphasis on consistency among different types of expressive between emotional make-up and gesture, (2) between the degree of
movement. They review the relation between personality and the integration and gesture" (1948, p. 166). The second study was
following measures: facial expression, literary style, artistic style, anthropometric, and Wolff compared features of the handprints of
style of speech, gait, painting and drawing, and handwriting. Not schizophrenics, mental defectives, and normals. The hands were
all of these studies are nonreactive, since the central criterion for divided into three major types: (1) elementary, simple and regres-
this is that the subject is not aware of being measured. sive; (2) motor, fleshy and bony; and (3) small and large. On the
Examples of using expressive movement as a response to a basis of an individual's hand type, measurements, nails, crease
particular stimulus -i.e., stimulus-linked rather than subject- lines, and type of skin, she delineates the main characteristics of
linked-are ~rovidedin the work of Maurice Krout (1933; 1937; their personality, intelligence, vitality, and temperament.
1951; 1954a; 1954b). Although this work was done in a laboratory Without necessarily endorsing her conclusions, we report the
setting, it was under facade conditions. That is, subjects were finding of a confused crease-line pattern peculiar to the extreme of
unaware of the true purpose of the research, considering the mental deficiency. Other structural characteristics such as con-
experiment a purely verbal task. There is a good possibility for cave primary nails, "appeared to a greater or lesser degree in the
application of Krout's (1954a) approach in less reactive settings. hands of mental defectives.. . but were completely absent in the
He elicited autistic gestures through verbal-conflict situations, and hands of the control cases" (Wolff, 1951, p. 105).
his analysis deals primarily with digital-manual responses. An A journalistic account of the expressive behavior of hands has
example of his findings is the correlation between an attitude of been given by Gould (1951). Here is his description of Frank
fear and the gesture of placing hand to nose. Darwin (1872) Costello's appearance before the Kefauver crime hearings:
mentioned pupil dilation as a possible fear indicator. As he [Costello] sparred with Rudolph Halley, the committee's
Kinesics as a subject of study is relevant here, although as yet counsel, the movement of his fingers told their own emotional
large amounts of data are not available. Birdwhistell (1960; 1963) story. When the questions got rough, Costello crumpled a
has defined kinesics as being concerned with the communicational handkerchief in his hands. O r he rubbed his palms together. Or
he interlaced his fingers. Or he grasped a half-filled glass of
aspects of learned, patterned, body-motion behavior. This system of water. Or he beat a silent tattoo on the table top. Or he rolled a
nonverbal communication is felt to be inextricably linked with the little ball of paper between his thumb and index finger. Or he
verbal, and the aim of such study is to achieve a quantification of stroked the side piece of his glasses lying on the table. His was
the former which can be related to the latter. Some "motion video's first ballet of the hands [p. 11.'
qualifiers" have been identified, such as intensity, range, and It is of interest that conversations of male students with
velocity. Ruesch and Kees (1956) have presented a combination females have been found to be more frequently punctuated by
text-picture treatment in their book, Non-Verbal Communication. quick, jerky, "nervous" gestures than are conversations between
An example of the impressionistic style of observation is provided two males (Sechrest, 1965b).
by Murphy and Murphy (1962), who reported on the differences in Schubert (1959) has suggested that overt personal behavior
facial expressions between young and old Russians: "While faces could be used in the study of judicial behavior. In presenting a
of old people often seemed resigned, tired and sad, generally the psychometric model of the Supreme Court, he suggests that the
children seemed lively, friendly, confident and full of vitality" (p.
12). '01951 b y the New York Times Company. Reprinted b y permission.
speech, grimaces, and gestures of the judges when hearing oral arguments and when opinions are being delivered are rich sources of data for students of the Court.

On the other side of the legal fence, witnesses in Hindu courts are reported to give indications of the truth of their statements by the movement of their toes (Krout, 1951). The eminent American legal scholar J. H. Wigmore, in works on judicial proof and evidence (1935; 1937), speaks of the importance of peripheral expressive movements as clues to the validity of testimony.

That these cues can vary across societies is demonstrated by Sechrest and Flores (in press). They showed that "leg jiggling" is more frequent among Filipino than American males, and held that jiggling is a "nervous" behavior. As evidence of this, they found jiggling more frequent in coffee lounges than in cocktail lounges.

The superstitious behavior of baseball players is a possible area of study. Knocking dust off cleats, amount of preliminary bat swinging, tossing dust into the air, going to the resin bag, and wiping hands on shirts may be interpreted as expressive actions. One hypothesis is that the extent of such superstitious behavior is related to whether or not the player is in a slump or in the middle of a good streak. This study could be extended to other sports in which the central characters are relatively isolated and visible. It should be easier for golfers and basketball players, but more difficult for football players.

From a practical point of view, of course, coaches and scouts have long studied the overt behavior of opponents for clues to forthcoming actions. (It is known, for example, that most football teams are "right sided" and run a disproportionate number of plays to the right [Griffin, 1964].) Does the fullback indicate the direction of the play by which hand he puts on the ground? Does the linebacker rest on his heels if he is going to fall back on pass defense? Does the quarterback always look in the direction in which he is going to pass, or does he sometimes look the other way, knowing that the defense is focusing on his eyes?

A police officer reported eye movement as a "pickup" clue. A driver who repeatedly glances from side to side, then into the rearview mirror, then again from side to side may be abnormally cautious and perfectly blameless. But he may also be abnormally furtive and guilty of a crime. Another officer, in commenting on auto thefts, said, "We . . . look for clean cars with dirty license plates and dirty cars with clean plates," explaining that thieves frequently switch plates (Reddy, 1965).

In a validation study of self-reported levels of newspaper readership, eye movement was observed when people were reading newspapers in trains, buses, library reading rooms, and the street (Advertising Service Guild, 1949). A number of interesting eye movement and direction studies have been conducted in controlled laboratory settings. Discussion of them is contained in the following chapter on observational hardware.

The physical position of animals has been a favored measure of laboratory scientists, as well as of those in the field. Imanishi (1960), for example, described the social structure of Japanese macaques by reporting on their physical grouping patterns. The dominant macaques sit in the center of a series of concentric rings.

For people, there are the familiar newspaper accounts of who stood next to whom in Red Square reviewing the May Day parade. The proximity of a politician to the leader is a direct clue of his status in the power hierarchy. His physical position is interpreted as symptomatic of other behavior which gave him the status position befitting someone four men away from the Premier, and descriptive of that current status position. In this more casual journalistic report of observations, one often finds time-series analysis: Mr. B. has been demoted to the end of the dais, and Mr. L. has moved up close to the middle.

The clustering of Negroes and whites was used by Campbell, Kruskal, and Wallace (1965) in their study of seating aggregation as an index of attitude. Where seating in a classroom is voluntary, the degree to which the Negroes and whites present sit by themselves versus mixing randomly may be taken as a presumptive index of the degree to which acquaintance, friendship, and preference are strongly colored by race, as opposed to being distributed without regard to racial considerations. Classes in four schools were studied, and significant aggregation by race was found, varying in degree between schools. Aggregation by age, sex, and race has also been reported for elevated trains and lunch counters (Sechrest, 1965b).

Feshbach and Feshbach (1963) report on another type of
clustering. At a Halloween party, they induced fear in a group of boys, aged nine to twelve, by telling them ghost stories. The boys were then called out of the room and were administered questionnaires. The induction of the fear state was natural, but their dependent-variable measures were potentially reactive. What is of interest to us is a parenthetical statement made by the authors. After describing the ghost-story-telling situation, the Feshbachs offer evidence for the successful induction of fear: "Although the diameter of the circle was about eleven feet at the beginning of the story telling, by the time the last ghost story was completed, it had been spontaneously reduced to approximately three feet" (p. 499).

Gratiot-Alphandery (1951a; 1951b) and Herbinière-Lebert (1951) have both made observations of children's seating during informal film showings. How children from different age groups clustered was a measure used in work on developmental changes.

Sommer (1961) employed the position of chairs in a descriptive way, looking at "the distance for comfortable conversation." Normal subjects were used, but observations were made after the subjects had been on a tour of a large mental hospital. Distances among chairs in a lounge were systematically varied, and the people were brought into the lounge after the tour. They entered by pairs, and each pair was asked to go to a designated area and sit down. A simple record was made of the chairs selected.

The issue here is what one generalizes to. Just as the Feshbachs' subjects drew together during the narration of ghost stories, it would not be unrealistic to expect that normal adults coming from a tour of a mental hospital might also draw closer together than would be the case if they had not been on the tour. Their seating distance before the tour would be an interesting control. Do they huddle more, anticipating worse than will be seen, or less?

Sommer (1959; 1960; 1962) has conducted other studies of social distance and positioning, and in the 1959 study mentions a "waltz technique" to measure psychological distance. He learned that as he approached people, they would back away; when he moved backward during a conversation, the other person moved forward. The physical distance between two conversationalists also varies systematically by the nationality of the talkers, and there are substantial differences in distance between two Englishmen talking together and two Frenchmen in conversation. In a cross-cultural study, this would be a response-set characteristic to be accounted for.

Sommer's work inspired a study in Germany (Kaminski & Osterkamp, 1962), but unfortunately it is not a replication of Sommer's design. A paper-and-pencil test was substituted for the actual physical behavior, and 48 students were tested in three mock situations: classroom, U-shaped table, and park benches. Sechrest, Flores, and Arellano (1965) studied social distance in a Filipino sample and found considerably greater distance in opposite-sex pairs as compared with same-sex pairs. Other tests include measuring the distance subjects placed photographs away from themselves (Smith, 1958; Beloff & Beloff, 1961) and Werner and Wapner's (1953) research on measuring the amount of distance walked under conditions of danger.

Sommer (1960) noted how the physical location of group members influenced interactions. Most communication took place among neighbors, but the corner was the locus of most interaction. Whyte (1956) observed that air conditioners were dispersed in a nonrandom way in a Chicago suburban community, and Howells and Becker (1962) demonstrated that those who sat facing several others during a discussion received more leadership nominations than did those who sat side by side.

Leipold's (1963) dissertation carried the work further, paying special attention to the individual response-set variable of "personal space," the physical distance an organism customarily places between itself and other organisms. Leipold gathered personality-classification data on a group of 90 psychology students, divided them into two groups on the basis of introversion-extraversion, and administered stress, praise, or neutral conditions to a third of each group. He evaluated the effect of the conditions, and the tie to introversion-extraversion, by noting which of several available seats were taken by the subjects when they came in for a subsequent interview. The seats varied in the distance from the investigator. In one of his findings, he reports that introverted and high-anxious students, defined by questionnaire responses, kept a greater physical distance from the investigator (choosing a farther chair) than did extraverted and low-anxious students. Stress conditions also resulted in greater distance.
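The seating-aggregation idea described earlier in this section (the degree to which two groups sit by themselves versus mixing randomly) can be made concrete with a small sketch. What follows is only one simple way of expressing such an index, not the statistic actually used by Campbell, Kruskal, and Wallace: it assumes a single, fully occupied row of seats and exactly two groups, and the function names and the sample rows are hypothetical. It compares the observed number of neighboring pairs from different groups with the number expected if the same people had been seated at random.

```python
def mixed_adjacencies(row):
    """Count neighboring seat pairs whose occupants belong to different groups."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def expected_mixed_adjacencies(row):
    """Expected number of mixed neighboring pairs under random seating.

    Assumes exactly two groups in one fully occupied row. With n seats and
    m members of one group, each of the n - 1 neighboring pairs is mixed with
    probability 2 * m * (n - m) / (n * (n - 1)), giving 2 * m * (n - m) / n.
    """
    n = len(row)
    if n < 2:
        raise ValueError("need at least two occupied seats")
    m = sum(1 for person in row if person == row[0])
    return 2 * m * (n - m) / n


def aggregation_ratio(row):
    """Observed / expected mixed pairs: below 1 suggests clustering, above 1 suggests mixing."""
    expected = expected_mixed_adjacencies(row)
    if expected == 0:
        raise ValueError("row contains only one group")
    return mixed_adjacencies(row) / expected


# Hypothetical rows: "A" and "B" mark membership in two groups.
clustered = list("AAAABBBB")
alternating = list("ABABABAB")
print(aggregation_ratio(clustered))    # 1 observed pair / 4 expected
print(aggregation_ratio(alternating))  # 7 observed pairs / 4 expected
```

A ratio well below 1 is the pattern one would read as aggregation. Whether a given departure from 1 is larger than chance would still need a significance test, which this sketch does not attempt.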
That random assignment doesn't always work is shown in Grusky's (1959) work on organizational goals and informal leaders, research conducted in an experimental prison camp. He learned that informal leaders, despite a policy of random bed assignments, were more likely to attain the bottom bunk. Grusky also considered such archival measures as number of escapes, general transfers, and transfers for poor adjustment. On all of these measures, leaders differed significantly from nonleaders. It must be remembered that this was an experimental prison camp, and the artificiality of the research situation presents the risk that a "Hawthorne effect" may be present. What would be valuable would be another study of regular prison behavior to see if these findings hold in a nonexperimental setting.

On still another plane, the august chambers of the United Nations in New York, Alger (1965) observed representatives at the General Assembly. Sitting with a press card in the gallery, he recorded 3,322 interactions among representatives at sessions of the Administrative and Budgetary Committee. Each interaction was coded for location, initiator, presence or exchange of documents, apparent humor, duration, and so on. His interest was in defining the clusters of nations who typically interacted in the committee.

Using the same approach, it might be possible to get partial evidence on which nations are perceived as critical and uncertain during debate on a proposed piece of UN action. Could one define the marginal, "swing" countries by noting which ones were visited by both Western and Bloc countries during the course of the debate? Weak evidence, to be sure, for there is the heavy problem of spatial restriction. One can only observe in public places, and even expanding the investigation to lobbies, lounges, and other public meeting areas may exclude the locus of the truly critical interactions. This bias might be selective, for if an issue suddenly appeared without warning, the public areas might be a more solid sampling base than they would be for issues which had long been anticipated and which could be lobbied in private. That the outside observer must have a broad understanding of the phenomenon and parties he is observing is indicated in Alger's study. He comments on the high level of interaction with the Irish delegate, which was not a reflection of the political power of Ireland, but instead the result of the easy affability of the man. This affability might truly influence the power position of his country, and hence be an important datum in that sense, but it is more likely to confound comparisons if it is used as evidence on a nation.

Barch, Trumbo, and Nangle (1957) used the behavior of automobiles in their observational study of conformity to legal requirements. We are not sure if this is more properly coded under "expressive movement," but the "physical position" category seems more appropriate. They were interested in the degree to which turn-signalling was related to the turn-signalling behavior of a preceding car. For four weeks, they recorded this information:

1. Presence or absence of a turn signal
2. Direction of turn
3. Presence of another motor vehicle 100 feet or less behind the turning motor vehicle when it begins to turn
4. Sex of drivers.

Observers stood near the side of the road and were not easily visible to the motorists. There was the interesting finding that conforming behavior, as defined by signalling or not, varied with the direction of the turn. Moreover, a sex difference was noted. There was a strong positive correlation if model and follower were females, and also a high correlation if left turns were signalled. But on right turns, the correlation was low and positive. Why there is a high correlation for left turns and a low one for right turns is equivocal. The data, like so many simple observational data, don't offer the "why," but simply establish a relationship.

Several of the above findings have been verified and perturbingly elaborated by a finding that signalling is more erratic in bad weather and by drivers of expensive autos (Sechrest, 1965b). Blomgren, Scheuneman, and Wilkins (1963) also used turn signals as a dependent variable in a before-after study of the effect of a signalling safety poster. Exposure to the sign increased signalling about 6 per cent.

Language is a hoary subject for observation, with everything from phonemes to profanity legitimate game. Our interest here is
more circumscribed and centers on language samples collected unobtrusively. This means excluding much useful research, Mahl's (1956) study of patients' speech in psychotherapy sessions, for example. The incidence of stuttering, slips of the tongue, and the like is important data, but because the data were collected in a therapist-patient setting, they do not apply here.

We would be curious to read the findings of a nonreactive study which investigated slips of the typewriter as a measure. The employment of these regularly appearing slips somehow evaded Freud (1920) in his major work on the topic. Sechrest (1965b) has demonstrated a higher number of gross errors (skipping lines, poor spacing, and repositioning of hands) when subjects are copying erotic passages than when copying passages from a mineralogy text. Winick (1962) studied some sixty thousand messages written by passers-by on a typewriter outside a New York store, but his analysis centered on coding of content. The data are also amenable to study of spelling errors, spacing, and the like.

We have taken one area of language research, conversational sampling, and traced it historically to illustrate the methodological issues.

Dittmann and Wynne (1961) demonstrate a modern approach. They coded verbal behavior, with the source of language a radio program, the NBC show "Conversation." To study emotional expression, the authors examined "linguistic" phenomena (junctures, stress, pitch) and "paralinguistic" phenomena (voice set, voice quality, and vocalizations of three types). A problem comes from the possibility that a man's awareness of participation in a radio show (particularly the effects of nervousness on speech) could lead to conditions that bias the production of the critical responses.

Kramer (1963) has reviewed the literature on the nonverbal characteristics of speech, concentrating on personal characteristics and emotional correlates. In a later article (1964), he reports a methodological study of techniques to eliminate verbal cues. The three major methods are: a constant ambiguous set of words for various emotional expressions; filtering out the frequencies which permit word recognition; speech in a language unknown to the listener.

More satisfactory is language analysis which draws its samples from speech of subjects unaware of observation. One of the earliest mentions of conversation as a source of psychological data subject to quantification was made by Tarde (1901). Although he performed no studies on conversation himself, Tarde made several suggestions for potential areas of study, such as variation in speed of talking among cultures and categorization of topics by social-class differences.

For the first reported study of conversations, we can look at H. T. Moore's (1922) work on sex differences in conversation, a canny and delightful research that triggered a whole series of hidden-observer language studies.

Moore sought to prove that there was a definite mental differentiation between the sexes, regardless of what previous studies (to 1922) had shown. To test this, he argued for a content analysis of "easy conversation." Especially at the day's end, he held, conversation should provide significant clues to personal interest.

So Moore slowly walked up Broadway from 33rd Street to 55th Street about 7:30 every night for several weeks. He jotted down every bit of audible conversation and eventually collected 174 fragments. Each was coded by the sex of the speaker and by whether the company was mixed or of the same sex. It is not necessary to cite his findings at length, but one should not pass without attention: in male to male conversations Moore found 8 per cent in the category "persons of opposite sex"; for female to female conversations, this topic occupied 44 per cent of the language specimens.

Some of the limitations of conversation sampling are obvious. Moore could record only intelligible audible conversation. Speech that is muttered, mumbled, or whispered may contain significantly different content than loud and clear speech. The representative character of the speech samples is further questioned by the representativeness of the speakers. Walkers on Broadway are probably not even a good sample of Manhattan. In short, there is a strong risk of sampling rigidity in both the talkers and the talk.

We can look, chronologically, at the conversation-sampling studies that followed Moore's and note the efforts of other investigators to reduce error due to data-collecting procedure.

Landis and Burtt (1924) published the first study stimulated
by Moore's classic. They were sensitive to positional sampling biases and improved upon Moore's procedure by sampling a wider variety of places and situations. With an experimenter who "wore rubber heels and cultivated an unobtrusive manner," they gathered samples of conversation, adding an estimate of the social status of the speaker. The broadened locations included streetcars, campuses, railroad stations, athletic events, parties, department stores, theater and hotel lobbies, restaurants, barber shops, churches, and streets in both commercial and residential areas. After their analysis, Landis and Burtt concluded that the source of the collection was only a minor factor. Landis (1927) broadened the sampling base even further, reporting in an article entitled "National Differences in Conversation." He sampled conversations in Oxford and Regent streets in London and compared these results to the earlier Landis and Burtt (1924) findings from Columbus, Ohio.

The monitoring of telephone conversations was the device by which French, Carter, and Koenig (1930) measured the degree to which the most common words contributed to the total word usage of conversations. This study of the repetitiousness of language was later used as a control for the repetitiousness of speech of schizophrenics and college students (Fairbanks, 1944).

Stoke and West (1931) tried to limit the number of variables in their conversation-sampling study and restricted the sample to undergraduate college students, sampling from random bull sessions held at night in residence halls. The participant observers were 36 college students who worked with a checklist of probable topics and the date and number of people in the conversation. A limitation of this "observe - withdraw - record" approach is that the observer cannot hope to record adequately the duration of responses. The approach is also vulnerable to the criticism that the observers' reports are subject to bias, beyond the initial selective perception one, because of the gap between event and notation.

Moving away from the campus, and more to the Moore approach, Sleeper (1931) sampled conversations in the upper level of Grand Central Terminal in New York, during the rush hour from 5:00 P.M. to 6:30 P.M. Sleeper's procedure reflected the dross-rate problems of such data, as he added a recording variant by excluding all "environmentally stimulated" conversation.

Mabie (1931) and McCarthy (1929) employed free-play periods to sample the conversation of children. The visibility of the recorder is a great problem here, and the studies represent reactive methodology. It may be that, as Mabie claims, the children's awareness of her presence had no effect on their conversation. This is uncertain, and our inclination would be to consider that her presence, notebook in hand, would introduce a strong risk of biasing the character of the overheard statements. When asked by the children what she was doing, Mabie told them that she was writing down what the children liked to do during play periods. That response itself could predispose the children to verbalize evaluative comments more frequently.

Surreptitious observation is the only class which fits into what we would call nonreactive testing. Take the studied surreptitiousness of Henle and Hubble (1938). Students were again the subjects, and

The investigators took special precautions to keep subjects ignorant of the fact that their remarks were being recorded. To this end, they concealed themselves under beds in students' rooms where tea parties were being held, eavesdropped in dormitory smoking rooms and washrooms, and listened to telephone conversations [p. 230].

Without extending their explanation, Henle and Hubble report that "unwitting subjects were pursued in the streets, in department stores, and in the home."

Escaping from under the bed, Carlson, Cook, and Stromberg (1936) studied sex differences in conversation by monitoring lobby conversations at the intermissions of 13 regular concerts of the Minneapolis Symphony and at six University of Minnesota concerts. The self-selection of subjects may be a serious risk to external validity (Who goes to Minneapolis Symphony concerts and who doesn't?), but the whispering problem is not so great in a research setting like a crowded theater, where a premium is placed on loudness.

The size of the group is a clear influence on the degree to which the experimenter must mask himself. For observing two-person communication, it may well be necessary to hide under a bed. In a large public gathering, the problem of visibility is solved; the individual providing the conversation sample expects to find
unfamiliar people close to him, and the experimenter need not hide. Only the recording of the language need be hidden. But it is not as simple as that. Even though the presence of the observer may cause no surprise, the same situation which permits acceptance of the stranger may also have worked to inhibit a class or classes of verbal behavior. For some experiments this may be unimportant: those in which the difference between public and private utterances is negligible. For other experiments, it may be substantial. This is an empirical question for each experimenter to solve. It must be accepted as one of the possible content limitations of conversational sampling.

Watson, Breed, and Posman (1948) displayed their concern about the representativeness of college students by deliberately excluding them from their sample of New York talkers. No campus locations were used, and an attempt was made to eliminate "anyone distinguishable as a college student." Working at all times of day and night, they sampled uptown, midtown, and downtown Manhattan, including the following locales: business, amusement, and residential streets and parks; subways, buses, ferries, taxis, and railroad stations; lobbies of movie houses and hotels; stores, restaurants, bars, night clubs. Each observer recorded verbatim what he had heard as soon as possible after hearing it. The sampling of respondents was resolved as well by Watson, Breed, and Posman as it has been by anyone.

Contrast this with the participant-observation type of approach suggested by Perrine and Wessman (1954). The investigator posed as a stranger to the state, initiated casual conversation with subjects, and then directed conversation to political issues by commenting on recent newspaper headlines and the like. The conversation was recorded as soon as possible after leaving the subject, along with sex, race, location, and estimated age and socioeconomic class. The enormous methodological issue in this type of conversation recording is the 60 to 70 per cent rate of refusal. If nothing else, the eavesdropping approach reduces the problem of self-selection of the sample, at least that bias attributable to willingness to participate in a survey. Not everyone will chip into a conversation with some stranger who wants to talk about state politics. To use such data as the basis of inferring the state of public opinion is dubious.

Doob (1961) writes of a girl in an African market who "was carefully shadowed in the interest of scholarly research." In an approach described by Doob as "unsystematic eavesdropping," he notes:

She began talking, and listening, before she entered the market's gate. Within a period of ten minutes - the duration of the research - she spoke with more than twenty people: some she greeted perfunctorily, others she talked to for a few moments concerning relatives and friends. No political or cosmic thoughts were aired [p. 144].

Of interest is his point on a possible ethnocentric bias among foreign observers in Africa:

. . . whereas people in the West . . . are likely to keep themselves occupied and to avoid long periods of complete solitude or, in contact with others, of silence, it may be that many Africans are perfectly content to be unoccupied except by their own feelings and thoughts and sense of well being [p. 144].

All the studies of conversation reported here have relied on a content analysis of the conversational samples gathered. The essential problem has been the representativeness of the sample collected. The unobserved observer (secreted under a bed or among a crowd) must be sensitive to the limitations of self-selection of subjects, a problem of external validity, and the limitations of the probable partial character of public-conversation samples. Any public conversation may be constrained because of the "danger" of being overheard. Many of the inaudible comments in public are likely to be drawn from a different population of topics than those loudly registered. Moreover, as we noted earlier, the method requires a careful selection of both place- and time-sampling units to increase representativeness, and these controls will not be the same over different geographic locales. Sampling bus conversations in Los Angeles and in Chicago yields a population of very different subjects. Moreover, these data are typically loosely packed, and it takes a substantial investment in time and labor to produce a large enough residual pool of relevant data. For all these limitations, however, there are research problems for which private commentary is not a significant worry, for which the adroit selection of locales and times can circumvent selective population characteristics, and for which the issue is of sufficient currency in

the public mind to reduce the dross rate. For these situations, conversational sampling is a sensitive and faithful source of information.

The amount of attention paid by a person to an object has long been the source of inferences on interest. For research on infrahuman species, notoriously incompetent at filling out interest questionnaires, visual fixation has been a popular research variable, as in the recent work of Berkson and Fitz-Gerald (1963) on the effect of introducing novel stimuli into the visual world of infant chimps. With humans who can fill out questionnaires, time-duration measures have been less popular, but are not uncommon. Frequently, a duration variable is imbedded in a body of other measures. H. T. Moore (1917) measured anger, fear, and sex interests by giving a subject multiplication tasks and then exposing him to distraction of different types. The time taken to complete the tasks was the measure of interest: the longer the time, the greater the interest in the distracting content.

In a study of museum visitors, Melton (1935) hypothesized a positive relationship between the degree of interest shown by a visitor in the exhibits and the number and quality of the permanent educational results of the museum visit. Melton was very careful to study response-set biases and situational cues which would contaminate his measure of the duration of observed time spent in viewing an exhibit. He demonstrated the "right-turn" bias, and experimented with changing the number of exits and installing directional arrows, all elements which significantly affected the length of a visit.

In one finding, for example, he reports that the closer an exhibit is to an exit, the less time will be spent at it. He posits an "exit gradient." Going further, he talks of the number of paintings in a gallery, the proportion of applied or fine arts in the room, and comes up with findings on "museum satiation." As the number of paintings in a gallery increases, the average total time in the gallery also increases, the total number of paintings visited increases, and the time per painting visited does not decrease but increases. Melton's attention to these cues provides a model seldom followed in observational research.

Washburne (1928) reported on an experiment conducted in the Russian school system which conceivably could have used time for a measure. Each child in the school was given his own garden plot, and at the same time had joint responsibility for a common garden tended by all the children. It is reported that records were kept to show the relative amount of interest that each child had in the two types of work. Although no mention is made of the measure used to determine amount of interest, time might be an appropriate one. Because it can be assumed that for equal care a greater amount of time would have to be spent on the individual garden, adjustments would have to be made in comparing the times for the two gardens.

A number of theoretical variables may be linked to time duration and time perception. Cortes (cited in McClelland, 1961, p. 327) has shown that a significantly larger number of high-need achievers have watches that are fast than do low-need achievers. Do the high achievers also perceive time duration differentially?

The lack of general emphasis on time-duration methods is partly due to difficulty of measurement. For accurate observation, the hurly-burly conditions of a natural setting are damaging; the laboratory control over instrumentation is almost necessary if precise observations of small time units are to be reached.

Sometimes this can be circumvented by a measure in which time is scaled in grosser units than microseconds. Jacques (1956) defined "responsibility" by measuring how long a worker is allowed to commit the resources of his task without direct supervision. Observation yields "a time span of responsibility" and a descriptive measure of the worker. For a duration measure like this, it would be foolish to calibrate the measurement in seconds. Many researches demand ultrafine discrimination of time, and for them, natural observation is an awkward method. But where the unit is broader, observation in the natural setting becomes both feasible and desirable. Sometimes it is enough to say, "Professor X's interest in cutaneous sensation extended over a career of 38 years."

For the permanent physical clues of observation (the scars, tattoos, and houses owned) the timing of when an observation is

made may be relatively unimportant. It may be possible to conjure up conditions in which a tattoo may be so placed that it is differentially visible at various points in a day (with or without jacket, for example), but for the most part, the exterior signs are quite invulnerable to time-linked variance.

Many of the other simple observation materials (expressive movement, physical location, and language) are, however, subject to the objection that the critical behavior is variable over a day or some longer time period. The risk, of course, is that the timing of the data collection may be such that a selective population periodically appears before the observer, while another population, equally periodically, engages in the same behavior, but comes along only when the observer is absent. Similarly, the individual's behavior may shift as the hours or days of the week change. The best defense against this source of invalidity is to sample time units randomly.

Working in an industrial setting, Shepard and Blake (1962) observed employees and judged whether they were working or not. By a time-sampling design, they found a strong decline in percentage of workers working between 10:30 and 11:00, the time of a daily supervisors' meeting which drew them away from direct control over employees.

Hence, the composition of supervisors' meetings was changed so as to ensure continuous supervision in the shop . . . thus the managers are correct in their . . . conclusion that more consistent control and direction are needed to correct for their tendency to be irresponsible [pp. 88-89].

The technique has been extensively used in nursery settings, where there is a particular need for it because of the greater periodicity of behavior of infants and young children. Arrington (1943) has pointed out many of the factors which must be considered in assessing the results of time sampling recorded by the observer. The duration of the individual time sample must be chosen in accordance with behavior to be observed. Degree of sophistication, familiarity with the observer, previous experience in being observed, type of situation, and number of individuals in the situation are also thought to be factors contributing to "observation consciousness."

One of the important time-sampling studies of observation in a nursery setting was conducted by Thomas (1929). In recording the activities of the children, Thomas made use of a mapped floor plan and plotted movement against the plan. Olson (1929) used similar procedures, but concentrated on oral habits rather than movement patterns.

Barker and Wright (1951; 1954) have adopted an opposite strategy to time sampling. They sought to avoid the problem of selected behavior over time by a saturation method. Rather than sample behavior, they censused it. In their 1951 study, observations were made of one child for an entire day, with minute-by-minute notations. Eight observers were used in turn, each being wholly familiar with the child. For any child under ten, the authors feel the effects of observation are negligible. This may be subject to doubt, however, particularly in view of recorded statements detailing interaction between the observer and the child.

This strategy does solve a problem, but it provides other ones. It is practical for only relatively short periods of time (Imagine following a boy for a year!), and the method is predisposed to measurement of individuals, not groups. This latter point may be important for the probability of reactive effects creeping in, for, as we noted above, the size of the group may be an important factor in the degree of observation consciousness. A person tailing you about all day is quite different from one next to you in a theater lobby.

Yet these limitations are no more punishing than the limitations of other approaches, and the subtlety of links between behaviors can hardly be better described.³ Either way, sampling or censusing, a measure of control is added over a usually uncontrolled variable.

³ Edmond de Goncourt wrote of the goal of the Goncourt Journal: "What we have tried to do, then, is to bring our contemporaries to life for posterity in a speaking likeness, by means of the vivid stenography of a conversation, the physiological spontaneity of a gesture, those little signs of emotion that reveal a personality, those imponderabilia that render the intensity of existence, and, last of all, a touch of that fever which is the mark of the heady life of Paris" (p. xi). From The Goncourt Journals: 1851-1870, Edmond and Jules de Goncourt, translated by Lewis Galantiere. Copyright 1937 by Doubleday & Company, Inc. Reprinted by permission of the publisher.
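
The advice to sample time units randomly, and the kind of tabulation behind a finding such as Shepard and Blake's mid-morning decline, can be illustrated with a minimal sketch. It is not drawn from any of the studies cited; the function names, the 30-minute unit, and the working day used in the usage note are assumptions made only for illustration.

```python
import random
from datetime import datetime, timedelta


def draw_time_samples(start, end, unit_minutes, n_units, seed=None):
    """Pick n_units observation periods at random, without replacement,
    from all periods of unit_minutes length between start and end.
    (Hypothetical helper; the unit length is the investigator's choice.)"""
    rng = random.Random(seed)
    total_units = int((end - start) / timedelta(minutes=unit_minutes))
    if n_units > total_units:
        raise ValueError("more samples requested than time units available")
    chosen = sorted(rng.sample(range(total_units), n_units))
    return [start + timedelta(minutes=unit_minutes * i) for i in chosen]


def proportion_by_hour(observations):
    """observations: iterable of (timestamp, is_working) pairs.
    Returns {hour: share of observations in which the person was judged working}."""
    counts = {}
    for ts, working in observations:
        seen, yes = counts.get(ts.hour, (0, 0))
        counts[ts.hour] = (seen + 1, yes + int(working))
    return {hour: yes / seen for hour, (seen, yes) in sorted(counts.items())}
```

Calling `draw_time_samples(datetime(1962, 1, 1, 8), datetime(1962, 1, 1, 17), 30, 6)` would schedule six half-hour observation periods across an assumed working day; the judgments recorded in those periods can then be passed to `proportion_by_hour` to see whether the share of people working shifts with the hour.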
The emphasis of this chapter has been on research in which the observer is unobserved, and in settings where the investigator has had no part in structuring the situation. The secretive nature of the observer, whether hidden in a crowd or miles away before a television screen, protects the research from some of the reactive validity threats. The subject is not aware of being tested, there is thereby no concomitant role-playing associated with awareness, the measurement does not work as an agent of change, and the interviewer (observer) effects are not an issue.

Moreover, there is the great gain that comes from getting the data at first hand. In studies of archival records and in the examination of trace and erosion evidence, there is always the uncertainty that others who came between the data and the investigator, processing or pawing it, left their own indistinguishable marks.

The first-hand collection of the data, usually of a contemporaneous event, also allows the gathering of other information to reduce alternative hypotheses. One may note characteristics of the subjects which permit a testing of rival hypotheses about the selective composition of the sample. To be sure, these are mostly limited to visual cues, but they can be extremely helpful. Similarly, the ability to observe the subjects in the act permits one to designate the individual actors, either for follow-up observation, or for study with other instruments like the questionnaire or interview. Such follow-up of individuals is difficult or impossible with archival and trace measures.

It would be difficult to overestimate the value of this potential for follow-up. One of the singular gains of simple observation is that it is a procedure which allows opportunistic sampling of important phenomena. Because it is often opportunistic, there is the attendant risk that the population under observation is an atypical group, one unworthy to produce generalizations. The follow-up studies may protect against this risk, as can adroit use of locational and time sampling.

Against this impressive list of gains must be balanced some possible sources of loss. Prime among these is the danger that the data-gathering instrument, the human observer, will be variable over the course of his observations. He may become less conscientious as boredom sets in, or he may become more attentive as he learns the task and becomes involved with it. If there are any ambiguities about how the behavior is to be coded, the effect of time may be to reduce the variation of coding (increase intra-observer reliability) as he works out operating definitions of which behaviors go with which codes. All of these can work to produce spurious differences in comparisons.

Errors of the observer, however, are not random, but show systematic biases that can be predicted, and hence corrected for, from the observer's expectations of the experimental or field situation. Campbell (1959) has inventoried 21 systematic sources of error that apply to the human observer.

That this may apply to the principal investigator, as well as his aides, has been demonstrated by Rosenthal's (1963) "On the Social Psychology of the Psychological Experiment." The implication of these studies is the demand for a greater emphasis on the necessity for saturated training of observers, hopefully under a "blind" condition in which they do not know what a "good" result or behavior will be.

The one other significant issue under the label of reactive threats comes from response sets on those observed. To a large measure, these are knowable, either through application of research conducted in other settings, or through direct observation of behavior under different conditions. Hopefully, there will be enough variation in the settings available for sampling to examine whether any systematic response sets are at work, and whether these can be isolated from other possible sources of variance. This is awkward when one is not actively manipulating the environment, and becomes one of the strong arguments for the unobserved observer to alter the research environment surreptitiously and systematically in an undetectable way.

The populations available for observation fluctuate according to both time and location. Thus, some caution should be employed in generalizing from research which gathered observations at one time in one place. If generalizations about the subject matter of conversations are to all people, and content varies by age, then the "place" should be considered as a sampling universe including
varying locations which are likely to draw on different populations. When the concern is generalized to more limited settings, say, a study of the effect of different treatment conditions in prisons, then the place sample should be more than one prison for each condition hypothesized to have an effect.

It is not always possible to draw elaborate locational samples, but that should not deter observational research. If the setting is circumscribed by practical conditions, a proper defense is to employ time-sampling methods. Limited to a population of "tour" visitors to a mental hospital, one must bear the cross of a self-selected population unlikely to be representative of much. Imposing a time-sampling design, observing different groups who come on different days or in different months, for example, would markedly improve the solidity of a shaky base.

Both time and locational sampling should be employed if possible, for empirical research and introspection suggest that population variation is a substantial issue in observation. An added gain is that the investigator can also vary his observers over the sampling units and randomly assign them to different times and locations, thus adding a badly needed control. Not all research possibilities afford this chance, but it is a goal to be reached for.

McCarroll and Haddon (1961) took care to ensure that location and time factors would not affect their study of the differences between fatally injured drivers in automobile accidents and noninvolved drivers. At each accident site, a team consisting of the authors, medical students, and from one to eight police stopped noninvolved cars proceeding in the same direction as the accident car on the same day of the week and at the same time of the day.

The same time- and locational-sampling strategy will also help to counteract some of the risks in selective content. The population varies over place and time, and the content of their behavior similarly varies. If one can broaden the sampling base, he can expand the character of material available for study. It is not possible to know all about college students if observations are limited to afternoons in the fall; when these observations fall on Saturday, worse yet.

But all the finesse of the skillful investigator will not solve some content limitations. Much of behavior is precluded from public display and is available only through unethical action, elaborate instrumentation, or some titanic combination of both. This is potentially a variable problem across cultures, as one notes members of certain societies willing to display classes of behavior that are hidden or taboo in others. Cross-cultural observational studies are thereby threatened not only by the ethnocentric attribution of meaning, but also by the lack of equivalence in observable behavior across societies. As one increases the number of societies, of course, the probability is greater that an incomplete set of observations of public behavior will be available over all.

Finally, being on the scene often means a necessary exposure to a large body of irrelevant information. Because one cannot often predict when a critical event will be produced, it is necessary to wait around, observe, and complain about the high dross rate of such a procedure. The payoff is often high, as in the case of one patient observer who knew critical signs and was immortalized in the song, "My Lover Was a Logger." The waitress sings,

I can tell that you're a logger,
And not just a common bum.
'Cause nobody but a logger
Stirs his coffee with his thumb.
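
The suggestion made in this section, that observers be varied over the sampling units and randomly assigned to different times and locations, can be sketched as follows. This is only an illustration under stated assumptions: the function name, the balanced workload rule, and the example labels in the usage note are hypothetical and are not taken from any study cited here.

```python
import itertools
import random


def assign_observers(locations, time_units, observers, seed=None):
    """Randomly assign one observer to every location x time cell.

    Cells and observers are both shuffled, then observers are dealt out
    in rotation, so workloads differ by at most one cell.
    (Hypothetical helper; balancing workloads is an added assumption.)"""
    rng = random.Random(seed)
    cells = list(itertools.product(locations, time_units))
    rng.shuffle(cells)
    roster = list(observers)
    rng.shuffle(roster)
    return {cell: roster[i % len(roster)] for i, cell in enumerate(cells)}
```

For instance, `assign_observers(["lobby", "bus", "park"], ["Mon AM", "Mon PM", "Sat AM", "Sat PM"], ["A", "B", "C"])` would cover all twelve place-and-time cells, with each of the three observers taking four of them in an order fixed only by chance, so that observer idiosyncrasies are not confounded with any one place or hour.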

Contrived Observation:
Hidden Hardware and Control

This chapter discusses the investigator's intervention into the observational setting. In simple observational studies, research is often handicapped by the weaknesses of the human observer, by the unavailability of certain content, and by a cluster of variables over which the investigator has no control. To reduce these threats, a number of workers have elected to vary the setting actively or to substitute hardware devices for human record-keeping observers. We avoid here examples of the "speak clearly into the microphone, please" approach. The emphasis is on those investigations in which the scientist's intervention is not detectable by the subject and the naturalness of the situation is not violated.

HARDWARE: AVOIDING HUMAN INSTRUMENT ERROR

When the human observer is the recording agent, all the fallibilities of the organism operate to introduce extraneous variance into the data. People are low-fidelity observational instruments. We have already noted how recording and interpretation may be erratic over time, as the observer learns and responds to the research phenomena he observes.

The fluctuations of this instrument can be brought under some degree of control by random assignment of observers to locations and time units. Random assignment will not, however, create a capacity in the organism which is not there, nor eliminate response sets characteristic of all members of the society or subcultures from which the observers are drawn.

Osgood (1953) illustrates the capacity weakness of the human observer in his comments on studies of language behavior in the first four or five months of a human's life: ". . . from the total splurge of sounds made by an actively vocal infant, only the small sample that happens to strike the observer is recorded at all" (p. 684).

Not all observable behavior is so complex or so rapid, but there is enough to cause the consideration of a substitute mechanism for the observer. It might not be so bad if there were a random loss of material when the observer's perceptual system got overloaded. Unhappily, the nature of the material noted and not noted is likely to be a function of both the individual's idiosyncrasies and the systematic response sets learned in a given society. Again speaking of speech studies of early childhood, Osgood comments,

The inadequate recording methods employed in most of the early studies make the data of dubious validity. The typical procedure was merely to listen to the spontaneous vocalizations of an infant and write down what was heard. The selective factor of auditory perception - listeners "hear" most readily those sounds that correspond to the phonemes of their own language - was not considered [p. 684].

These same biases are at work with the recording of any language system that is unfamiliar to the observer, whether it be the occult language of a child, or the unfamiliar tones of a foreign language. Webb (in preparation) has noted this in his study of orthographies in African languages. His analysis was based on written records of the languages, many of which had been produced by missionaries, explorers, and other foreign nationals who came to Africa and learned the indigenous speech. In transcribing the sounds of these languages for others, there were selective approximations of the true sound, influenced by the tonal pattern of the characters in the observer's native language. Thus, German observers heard umlauts that evaded the British. There is some possibility of control over this particular bias, since for some of the languages there are written transcriptions of the same words by nationals of various countries; these can be matched against the known tonal characteristics of the European languages to correct for the selective hearing. When multi-national observations are

missing, the task is much more difficult, and one must make inferences about the effect of perceptual biases based on the sound characteristics of the observer's native language.

A major gain from hardware recording, of course, is the permanence of this complete record. It is not subject to selective decay and can provide the stuff of reliability checks. Further, the same content can be the base for new hypothesis-testing not considered at the time the data were collected. Or material that was originally viewed as dross may become prime ore. For example, Bryan employs taped interviews in his study of call girls. Among other things, these are coded for the frequency of telephone calls received during the period of the interview. Such information serves as a partial check on the girl's self-report of business activity (Bryan, 1965).

Hardware, of varying degrees of flexibility, has been used throughout the history of scientific observation. To reduce the risk of forgetting, if nothing else, permanent records were made of observed behavior. They may have noted less than the total behavior, but they did serve to reduce reliance on human memory. Boring (1961) writes of Galton, "an indefatigable measurer."

He used to carry a paper cross and a little needle point, arranged so that he could punch holes in the paper to keep count of whatever he was at that time observing. A hole at the head of the cross meant greater, on the arm equal, and on the bottom, less [p. 154].

Galton also contributed to that voluntary, self-descriptive reactive measure, the questionnaire, whose overuse William James anticipated: "Messrs. Darwin and Galton have set the example of circulars of questions sent out by the hundreds to those supposed able to reply. The custom has spread, and it will be well for us in the next generation if such circulars be not ranked among the common pests of life" (James, 1890, p. 194).

Evolving from such simple recording methods was the constriction of communication developed for work in small-group research in laboratory settings. Artificially, the participants were (or are) required to limit all communications to written notes, which are then saved by the investigator to provide a full record of all communication among participants. This is a very low-cost device, much cheaper than tape recording, but its stilted quality suggests a very high risk price for a very low dollar cost.

Other aids to the observer are pieces of apparatus which allow him to record his observations more quickly or more thoroughly. Sometimes the gain comes from forcing the observer into using a series of varied codes, sometimes it is just the gain of having a more permanent record of his perceptions of the behavior under study. Steiner and Field (1960), for example, timed vocal contributions to a group discussion by means of a polygraph, and a popular supplementary device has been the Interaction Chronograph, a recent use of which is reported in Chapple (1962).

A big boom exists, and properly, for audio tape recorders. With the development of superior omnidirectional and highly directional microphones, many of the former mechanical limitations have been resolved. The tape becomes the first source of data, and it is often considered the initial input into a hardware system. Thus, Andrew (1963) took tapes of sound patterns from primates (including man) and fed them into a spectrograph for more detailed analyses.

Similarly, Heusler, Ulett, and their associates (Heusler, Ulett, and Blasques, 1959; Callahan et al., 1960; Heusler, Ulett, and Callahan, 1960; Ulett et al., 1961; Ulett, Heusler, and Callahan, 1961) developed what they termed a noise-level index for hospital wards. Their substantive concern was measuring the effects of drugs on hospital patients, and they planted tape recorders to pick up ward noises. These sounds are meshed in an integrator which provides a numerical total of the activity. Originally, this noise level had been rated by judges; in later work, however, the authors used a direct index of noise level, thus reducing biases, among them the possible confounding due to a judge's recognition of a patient's voice.

A highly opportunistic use of audiotapes was demonstrated by Matarazzo and his associates (Matarazzo et al., 1964) in their study of speech duration. The National Aeronautics and Space Administration made available to these investigators the audiotapes of conversations between astronauts and ground communicators for two orbital flights. From these tapes and the published transcripts of the communications (NASA Manned Spacecraft Center, 1962a; NASA Manned Spacecraft Center, 1962b) they coded the duration of each unit of speech by means of an Interaction Chronograph. These data provided a test of propositions developed in the experimental laboratories and previously
reported (Matarazzo, 1962b; Matarazzo et al., 1963). In space, as in the laboratory, the length of a response is positively correlated with the length of a question. It could hardly be claimed that the astronauts were thinking of Matarazzo's hypotheses at the time they were steering their craft, and the astronaut findings supported the work of the laboratory. This highly imaginative research dipped into archives that were available, archives known not to suffer from intermediary distortions.

The fidelity and breadth of content of the audiotapes give them an edge over written records for archival analysis. Not only are they uncontaminated by other hands, but they contain more pertinent material physically unavailable on the written record. Matarazzo, for example, could not have conducted so accurate a study had he been limited to transcripts alone. Interested in duration of speech, where the natural unit is a second of time, he would have had to make estimates of duration from word counts, which are pockmarked with substantial individual response-set errors in rate of speech, different levels of noncontent interjections (ummm's), and the like.

There is a weighty mass of research material almost untouched by social scientists, although known and used by historians. It is found in the oral archives of the national radio and television networks, which have kept disc, film and tape recordings of radio and television shows over the years. The recent advent of video tape recordings has provided another dimension to these archives.

Videotapes are being used in some experimental research to validate the results of paper-and-pencil tests. A student of the effectiveness of television commercials has run some preliminary checks on an advertising exposure test. He called friends and asked them to send their secretaries to his office on the pretext of picking up a package. After arriving, the secretary was asked to wait in a reception room which contained newspapers, magazines, and a turned-on television set. A hidden television camera monitored her behavior as she turned to the printed material, watched television, or just sat. She left, unsuspecting, and was subsequently interviewed by standard questionnaire methods to determine her exposure to television commercials and magazines and newspapers. This is a more advanced variation of the obvious one-way mirror setting and provides a medium for a good check on observation and self-report data.

For some reason, still photography has never had much of a vogue as research hardware. Boring (1953) mentions that "Voliva supported his theory of the flat earth by a photograph of twelve miles of the shore line of Lake Winnebago: you could see, he argued, that the shore is horizontal and bowed."¹ Both still and movie films have been used in the study of eye behavior: direction, duration of looking, pupil dilation, and the like.

The physical location of eyes was used by Politz (1959) in a study of commercial exposure of advertising posters placed on the outside of buses. Politz' equipment was movie cameras placed in buses and automatically activated in a series of short bursts spread throughout a day. The camera was faced outward, over the poster under study, and, in a switch on the Bunker Hill advice, a person whose two eyes could be seen in the developed film was classified as "in" the advertising audience. This design, which used a random sample of both locations and time, is exemplary for its control of a large number of extraneous variables that could jeopardize external validity. The visibility of the camera raises a question. It occupied a bus seat, and there is the probability that it attracted some attention. That is, the eyes counted as looking at the poster were looking at the camera instead. This should result in an overestimate of the size of the bus-poster advertising audience. If one could assume that the novelty of the camera would wear off over time, one could test the hypothesis by making a longitudinal analysis of the material. If the test materials were controlled for novelty and extinction themselves, the estimated audience level should decline over the time period if a significant number of people were viewing the camera and not the poster.

Walters, Bowen, and Parke (1963) recorded the eye movements of male undergraduates viewing a series of pictures of nude or almost nude males and females. The men were told that a moving spot of light on the film indicated the eye movements of a previous subject. For about half the subjects, the light roved over

¹ In the same article, Professor Boring comments on changes in the human observer as an instrument. "I remember how a professor of genetics many years ago showed me published drawings of cell nuclei dated both before and after the discovery and description of chromosomes. Chromosomes kept showing up in the later drawings, not in the earlier" (p. 176).
the bodies portrayed in the film, while with the other half, the
light centered on the background of the pictures. The eye movement
of the subjects was influenced, and

    Subjects who had been exposed to a supposedly sexually
    uninhibited model spent a significantly longer time looking at
    the nude and semi-nude bodies, and significantly less time
    looking at the background of the pictures [Walters, Bowen, and
    Parke, 1963, p. 77].

Another investigation showed that the presence of a female
inhibited the interest of male students in "sexy" magazines. The
magazines were avoided with a woman present, but upon her
leaving, they were quickly retrieved (Sechrest, 1965b).

Zamansky (1956; 1959) used "time looking" at different types
of photos in his studies of homosexuality and paranoid delusions.
Exline (1963; 1964) and Exline and Winters (1964) worked behind
one-way mirrors to make controlled observations of mutual
glances, time spent looking at someone while speaking to him,
time spent looking when being spoken to, and the like.

Then, of course, there are the apparatus studies of pupil
dilation. Gump (1962) reports that Chinese jade dealers were sensitive
to this variable and determined a potential buyer's interest in
various stones by observing the dilation of his pupils as pieces were
shown (astute buyers countered this by wearing dark glasses).
Hess and Polt (1960) measured pupil dilation on 16-millimeter
film and related it to stimulus materials. The stimulus objects were
a series of pictures - a baby, a mother holding a child, a partially
nude woman, a partially nude man, and a landscape. The six
pictures elicited clear-cut differences in pupil size, and sex
differences were present.

Commercial applications of this method have been based on
work under the direction of Hess. See Foote (1962); West (1962);
Anonymous (1964b); Krugman (1964).

The pupil-dilation studies have all been conducted in
potentially reactive settings. Whether or not such a measure could be
employed without the subject's awareness is questionable. There
are technical difficulties with laboratory apparatus as is, and
resolving field-use problems might be too much to expect.

But certainly the eye-direction and duration-of-looking
measures are amenable to naturalistic use. If Politz (1959) was able to
solve the technical problems of a jiggling bus, more stable
situations should present little difficulty.

Many of our laboratory experiences could be replicated in
natural settings. Landis and Hunt's (1939) method of studying
movement responses could easily be applied in nonlaboratory
settings. Shooting off a gun, the authors filmed the subject's
gestural response pattern, which included such movements as
drawing the shoulders forward, contracting the abdomen, and
bending the knees. Facial patterns included closing the eyes and
widening the mouth. It will be remembered that Krout found that a
gesture of placing the hand to the nose was correlated with fear.
With the stimulus of an unexpected gunshot, the immediate
response may be independent of any contaminants due to the
experimental setting.

HARDWARE: PHYSICAL SUPPLANTING OF THE OBSERVER

The hardware measures mentioned so far have been mainly
concerned with reducing the risk associated with the human
observer's fallibility as a measuring instrument - his selective
perceptions and his lack of capacity to note all elements in a complex
set of behaviors. Another use to which hardware has been put is to
obtain research entrée into situations which are excluded by the
usual simple observational method. Some of these content areas
have been unattainable because of the privacy of the behavior,
others because of the prohibitive costs of maintaining
observational scrutiny over a substantial enough sample of time. Sitting in
for the observer, hardware can help resolve both problems.²

²Galton, writing from Africa, sent the following letter to his
brother: "I have seen figures that would drive the females of our
native land desperate - figures that could afford to scoff at
crinoline, nay more, as a scientific man and as a lover of the
beautiful I have dexterously even without the knowledge of the
parties concerned, resorted to actual measurement. Had I been a
proficient in the language, I should have advanced, and bowed and
smiled like Goldney, I should have explained the dress of the ladies
of our country, I should have said that the earth was ransacked for
iron to afford steel springs, that the seas were fished with
consummate daring to obtain whalebone, that far distant lands were
overrun to possess ourselves of caoutchouc - that these three
products were ingeniously wrought by competing artists, to the
utmost perfection, that their handiwork was displayed in every street
corner and advertised in every periodical but that on the other
hand, that great as is European skill, yet it was nothing before the
handiwork of a bounteous nature. Here I should have blushed bowed
and smiled again, handed the tape and requested them to make
themselves the necessary measurement as I stood by and registered
the inches or rather yards. This however I could not do - there were
none but Missionaries near to interpret for me, they would never
have entered into my feelings and therefore to them I did not
apply - but I sat at a distance with my sextant, and as the ladies
turned themselves about, as women always do, to be admired, I
surveyed them in every way and subsequently measured the distance of
the spot where they stood - worked out and tabulated the results at
my leisure" (Pearson, 1914, p. 232).

"Blind bugging" via audiotapes is one such approach - a
controversial one when applied in certain settings, and illegal in
many. Jury deliberations are not observable because of standard
legal restraints, but Strodtbeck and his colleagues (Strodtbeck &
James, 1955; Strodtbeck & Mann, 1956; Strodtbeck, James, &
Hawkins, 1957) received the approval of the court and counsel
from both sides to place hidden microphones in the jury room. The
use of concealed recording devices presents ethical questions
that have been underlined by Amrine and Sanford (1956),
Burchard (1957), and Shils (1959).

We may add as an aside that among the most astute devices
for concealed recording is a microphone rigged in a mock hearing
aid. It works extremely well in inducing the subject to lean over
and shout directly into the recording apparatus. The presence of a
dangling cord does not inhibit response.

The "cocktail-party effect" is an acoustical term for the
process of listening to one among a multitude of talkers. First
suggested by Pollack and Pickett (1957) and expanded by
MacLean (1959), it was used by Legget and Northwood (1960) in
conducting experiments at eight gatherings, relating recorded
sound level to number of people attending and drawing on records
for total consumption of food and drink. The experimenters found
that the nature of the beverage served made no significant
difference in the buildup of sound levels, that all-male gatherings were
slightly quieter than mixed gatherings, and that the maximum
sound levels were 80 to 85 decibels, ". . . not quite high enough to
cause permanent impairment of hearing" (Legget & Northwood,
1960, p. 18). See also Hardy (1959) and Carhart (1965).

Riesman and Watson (1964) met with failure in their attempts
to record party conversations on tape

    . . . losses of comments lasting over one minute occurred at the
    rate of about seven times per recorded hour . . . the critical
    objection lay . . . in the tape's sometimes useful lack of
    selectivity: the record was a long-drawn-out tissue of inanities in
    which the very diffuseness made analysis more difficult than
    when one was dealing with the more condensed material of
    recollection [p. 299].

Which are the "better" data - the true conversation or the
"condensed material of recollection" - is up to the investigator to
decide. The loss of content, however, is a severe limiting condition,
demonstrating the selective utility of some hardware. In this case,
the recorder would be adequate for recording sound level, but
inadequate for providing a complete record of conversations.

Many pieces of hardware have been developed for measuring
the level of physical activity, a variable that has been viewed as
symptomatic of many things. Perhaps the earliest mention of this
type of measure was made by Galton (1884), who was at that time
interested in the physical equivalents of metaphorical language.
He took as his example the "inclination of one person toward
another." This situation is clearly seen when two people are sitting
next to each other at a dinner table, according to Galton. To
demonstrate this empirically in quantitative terms (Galton, 1884;
Watson, 1959), he suggested a pressure gauge with an index and
dial to indicate changes in stress arranged on the legs of the chair
on the side nearest the other person. Galton specified three
necessary conditions for this type of experiment: the apparatus
must be effective; it must not attract notice; and it must be capable
of being applied to ordinary furniture. All of these criteria are
appropriate for contemporary apparatus studies.

It is obvious that such a device may be a substitute for human
observers when their presence might contaminate the situation,
and where no convenient hidden observation site is available.
Indeed, many of the studies discussed earlier as "simple
observation" are amenable to mechanization, provided Galton's criteria
can be met.

There is F. Scott Fitzgerald's fictional account in The Last
Tycoon, of how the title character, a movie executive, evaluated
the quality of rushes (preliminary, unedited film "takes") by
observing how much they made him wiggle in his chair. The more
the wiggles, the poorer the movie scenes. Simple observational
measures could be made of twistings by a concealed observer, but
they would clearly be inferior to a more mechanical device.

Galton (1885) suggested a fidget measure based on the amount
of body sway among an audience. The greater the sway, the
greater the boredom. "Let this suggest to observant philosophers,
when the meeting they attend may prove dull, to occupy
themselves in estimating the frequency, amplitude and duration of the
fidgets of their fellow sufferers" (p. 175). The American playwright
Robert Ardrey notes coughing as an audience response to
boredom.

    One cougher begins his horrid work in an audience, and the
    cough spreads until the house is in bedlam, the actors in rage,
    and the playwright in retreat to the nearest saloon. Yet let the
    action take a turn for the better, let the play tighten up, and that
    same audience will sit in a silence unpunctuated by a single
    tortured throat [p. 85].³

³From African Genesis, by Robert Ardrey. Copyright © 1961 by
Literat S.A. Reprinted by permission of Atheneum Publishers.

A mechanical device has been employed by Kretsinger (1952;
1959). He used what he terms an electromagnetic movement meter
to study gross bodily movement in theater audiences - a very
difficult observational setting because of inadequate illumination.
(See also Lyle, 1953.) Kretsinger claims that this method was
"objective, essentially linear, and completely removed from the
subject's awareness." The technique

    . . . was based upon a capacity operated electronic system often
    used in burglar alarm applications. As modified by the author, it
    consisted of an oscillator detector, a D.C. amplifier, and an
    Esterline-Angus ink writing recorder. A concealed copper
    screen was located near the head of the S watching the film.
    As the S moved, the effective capacity of the oscillator circuit
    varied, changing the frequency of its oscillation. This
    frequency shift was converted to a change in D.C. voltage
    amplified sufficiently to drive an ink pen on a moving paper
    chart . . . completely removed from the S's awareness
    [Kretsinger, 1959, p. 74].

The importance of heeding population characteristics is borne
out by his conclusion, "There was some evidence that the
presence of girls had a disquieting effect upon the boys . . . " (p. 77).

Cox and Marley (1959) devised another movement measure in
their study of the restlessness of patients as a partial measure of
the effect of various drugs. Their rather complex apparatus
consists basically of a series of pulleys and springs, set under the
springs of the bed, which record the displacement of the mattress.
When the patient is perfectly motionless, the relay system does not
operate, but the slightest movement will be recorded. A more
simple device is possible with baby cribs: the activity level of the
child is measured by shaving down one of the four crib legs and
attaching a meter which records the frequency of jiggling. This is
much less fine a measure than that of Cox and Marley, for the child
could move without activating the meter. For studies which don't
require such fine calibration, however, the simplicity of the device
is appealing.

As beds, cribs, and chairs can be wired, so, too, can desks.
Foshee (1958) worked on the hypothesis that a greater general
drive state would manifest itself in greater activity. Here is a good
theoretical proposition testable by a device appropriate for natural
settings away from the laboratory. To measure activity, Foshee
used a schoolroom desk which was supported at each corner by
rubber stoppers. Attached to the platform which supported the
desk was a mechanical level arrangement which amplified the
longitudinal movements of the platform. Through an elaborate
transmission system, the amplitude and frequency of the subject's
movement could be measured. Foshee does not mention whether
the subjects (in this case a group of mental defectives) were aware
of the apparatus or not, but it would seem likely that the device
could be constructed to evade detection.

To reach into the difficult setting of a darkened movie house
for the study of expressive movement, several investigators have
employed infrared photography (Siersted & Hansen, 1951; Bloch,
1952; Field, 1954; Greenhill, 1955; Gabriele, 1956). This type
of filming eliminates almost entirely the element of subject
awareness of the observational apparatus. It is clearly superior
to unaided observation because of the advantage of working
in the dark; the brighter the light in which to see the subject,
the brighter the light for him to see you. This is illustrated in
Leroy-Boussion's (1954) visible-observer study of emotional
expressions of children during a comedy film. Although
Leroy-Boussion claimed to be only a projection aide during the film, she
did have to eliminate certain subjects who "seemed to be aware"
of her presence as an observer. It would thus seem likely that there
were other subjects who did not make their awareness known to
the investigator. Putting aside the question of reducing the sample
size, the more important issue is whether those who were aware
(and discarded as subjects) were a selective group (the more
suspicious or paranoid, for example). Infrared photography
drastically reduces such selective loss. Further, it is possible to match
the infrared camera with the regular projection machine so that in
subsequent analysis the photographs of audience reaction can be
matched easily to the specific film sequence.

The danger of relying solely on interview or questionnaire
self-reports is sharply illustrated in the Siersted and Hansen (1951)
study. They supplemented their filming by interviewing the
children who had seen the film. There were marked differences
between these interview responses and both the filmed reactions
and verbal comments made during the film (recorded with hidden
tape recorders).

For some reason, the French have been leaders in research on
movie hardware. Toulouse and Mourgue (1948) worked with
respiratory reactions in order to index reactions to films, and it has
even been suggested that the temperature of the room in which a
film is viewed might be monitored as an indicator.

The estimation of attendance at an event or an institution can
be carried out by planting observers who count heads. Another
way is to mechanize and count circuit breaks. The "electric eye,"
particularly when supplemented by a time recorder, provides a
useful record of frequency of attendance and its pacing. The
photoelectric cells are typically set up on either side of a doorway
so that any break of the current will register a mark on an attached
recording device. As Trueswell (1963) shrewdly pointed out,
however, this apparatus is not free from mechanical or reactivity
contaminants. Particularly when the device is first installed, it is
common for people to step back and forth through the light or to
wave arms and legs, thus registering three or four marks for a single
entry. Another difficulty is the placement of the cells. If they are
set too low, it is possible for each leg to register a separate mark as
the person walks through. If the doorway is wide enough to admit
two people, a couple may walk together and register only one
mark. It is important to note that these are not random errors
which balance out, but are constant for the method. Because of
this, it is equally important that human observers be there with the
mechanical device, particularly in its early period of installation, to
study whether any such errors are admitted to the data.

Ellis and Pryer (1959) have demonstrated the complexities
possible with photoelectric cells in their study of the movements of
children with severe neuropathology. Their apparatus consisted of
a square plywood enclosure in which the children played.
Electronic devices were located on the outside surface of the walls and
arranged so that the beams crisscrossed the enclosure at two-foot
intervals. Interruption of a beam would be recorded, with each
beam recorded separately.

With the light beams visible, the behavior of subjects may be
modified - either because they dart back and forth in a playful
game with the beam, or because it inhibits their movement to know
that they are under observation. Ellis and Pryer suggest
modifications of their technique to avoid such risks, among them installing
infrared exciter lamps, noiseless relays, and soundproofing.

As a final example of the use of hardware to get otherwise
difficult content, there is Weir's (1963) report of her audiotape
recordings of a two-and-a-half-year-old boy falling to sleep. The
child practices language, working with noun substitution and
articulation. In the evening of the day when he was first offered
raspberries, he says, "berries, not bayreez, berries." Maccoby
(1964), who summarized the study, states: "These observations
provide insight into language learning processes which are
ordinarily covert and not accessible to observation" (p. 211).

Most of the observation studies reported so far have been ones
in which the observer is passive. He may take the behavior as it
comes, or he may introduce mechanization to improve the
accuracy and span of his observations, but he has not typically
altered the cues in the environment to which the person or group is
responding. This passivity has two costs. It is possible that the
behavior under study occurs so infrequently that an inordinate
amount of effort is expended on gathering large masses of data,
only a small segment of which is useful. Or, paying the second
cost, the naturally occurring behavior is not stimulated by events
of sufficient discriminability. The investigator may want four or
five levels of intensity of a condition, say, when the convenient
simple observation approach can produce only two.

Rather than pay these costs, many investigators have actively
stepped into the research environment and "forced" the data in a
way that did not attract attention to the method. In some cases,
this has meant grading experimental conditions over equivalent
groups, with each group getting a different "natural" treatment. In
a smaller number of studies, the conditions have been varied over
the individual. Both classes are illustrated in this section.

Allen Funt of the television show "Candid Camera," perhaps
the most visible of the hidden observers, gave up simple
observation because of the high dross rate (Flagler, 1960). In the early
years of the program, Funt's episodes consisted largely of studies
of gestures and conversation (Hamburger, 1950; Martin, 1961).
Particularly with conversation, Funt found that a large amount of
time was required to obtain a small amount of material, and he
turned to introducing confederates who would behave in such a
way to direct attention to the topic of study.

In one magnificent sequence of film, Funt prepared a
cross-cultural comparison of how men from different countries respond
to the request of a female confederate to carry a suitcase to the
corner. Filmed abroad, the episodes centered on the girl indicating
she had carried the suitcase for a long time and would like a hand.
The critical material is the facial expression and bodily gestures of
the men as they attempted to lift the suitcase and sagged under the
weight. It was filled with metal. The Frenchman shrugged; the
Englishman kept at it. Funt has offered to open his extensive film
library to social scientists. For students of response to frustration
or unexpected stimuli, this is rich ground.

Obviously, experimental manipulation is not a contaminant. It
is only when the manipulation is seen as such that reactivity enters
to threaten validity. Carroll (1962) showed that active initiation of
stimuli can have its comic side. In an exploratory venture, he sent
out wires to 12 distant friends, congratulating each on his "recent
achievement." Back came 11 acknowledgements of humble
thanks. This approach lacks control, for we cannot know how
many acknowledged in puzzled courtesy and how many felt Carroll
had given them their due. If the study is replicated, it might be well
to send a control sample a wire saying, "It doesn't matter. We're
with you anyway." The wire was an efficient way to stimulate a
response. An analogue may be an attempt to teach automobile
driving by operant conditioning procedures. It is possible, but may
take a hazardously long time contrasted to active control by the
teacher (Bandura, 1962).

Simple observation, mechanized or not, is appropriate to a
broad range of imaginative and useful research comparisons.
Some of these we have mentioned. The advantage of contrived
observation is to extend the base of simple observation and permit
more subtle comparisons of the intensity of effect.

The early work of Hartshorne and May yields good examples
of the manipulating observer. In Studies in Service and Self
Control (Hartshorne, May, & Maller, 1929), they report on a long
series of experiments - the first behavioral studies of "service."
Employing "production methods of measurement," they indexed
service or helpfulness by the subject's willingness to produce
something - a toy in a shop, or the posting of a picture. Similarly,
they employed "self sacrifice" techniques, measures on which the
subject had to give up something.

The subjects were school children, and the active involvement
of the experimenter (teacher) in defining alternatives of behavior
was both expected and normal. The threat of subjects' awareness
of being tested is less an issue in educational research, and the
long line of studies on lecture versus discussion methods, as well
as the current research on educational television, are a fine source
of learning research because the risk of the contaminant is so
reduced.

In the same way, it is not a patently false condition for a
teacher to present students with the chance to help some other
children in hospitals. Hartshorne and May graded the
opportunities to help in an "envelope test." The student could put pictures,
jokes, or stories in envelopes to give to hospitalized children, could
promise to do so, or not do so. In another behavioral measure of
sacrifice, one with more artificiality, however, the students were
told they would be given some money. They were provided
opportunities to bank it, give it to a charity, or keep it themselves. In
another phase, one that presaged many small-group experiments,
the children were given a choice of whether they would work for
themselves or for the class in a spelling contest.

In Hartshorne and May's Studies of Deceit (1928), children
were offered the opportunity not to return all of the coins
distributed for arithmetic practice, to cheat by changing original
answers in grading their own exam papers (which had previously
been collected and then handed back with some excuse), to peek
during "eyes-closed" tests and thus perform with unbelievable
skill, to exaggerate the number of chin-ups when allowed to make
their own records "unobserved." Forty separate opportunities
were administered in whole or in part to about eight thousand
pupils.

In one of their reports (May & Hartshorne, 1927) is found the
first presentation of what is now known as Guttman scale analysis.
The experimenters found high unidimensionality for a series of
paper regrading opportunities: those students who cheated when
an ink eraser was required cheated on every easier opportunity.

These studies of Hartshorne and May in the Character
Education Inquiry are the classics of contrived observation, and nothing
so thorough and ingenious has been done since. It is unfortunate
for subsequent measurement efforts that interpretation of the
cheating results was viewed as specific to the situation. To be sure,
honesty was found to be relative to situation; for example, in one
study (May & Hartshorne, 1927), only 2 per cent cheated when
corrections required an ink eraser, while 80 per cent cheated when
all that was needed was either erasing or adding a penciled digit.
But this is not inconsistent with the six cheating opportunities
forming a single-factored test or unidimensional scale. The data
show a Guttman reproducibility coefficient of .96. Even though the
measure was only six items long, there was a Kuder-Richardson
reliability of .72 which becomes .84 when corrected for
item-marginal ceiling effects as suggested by Horst (1953). Pooling all
their disguised performance tests for a given trait, the
experimenters checked the character tests against reputational scores from
the so-called Guess-Who tests. The validities ranged from .315 to
.374. Although very low values in terms of the standards of their
day, they are now recognized to be reasonable values typical of
those found for personality tests. Of course, the reputational
measures contributed their full share of the error in validity.

Contrived observation, then, is observation in which the
stimuli or the available responses are varied in an inconspicuous
way. For Hartshorne and May, the variation was primarily of the
response alternatives.

The recent series of studies by Fantz and associates (Fantz,
1961a; Fantz, 1961b; Fantz, Ordy, & Udelf, 1962; Fantz, 1963;
Fantz, 1964) shows the more usual variation of the stimuli. They
too, worked with subjects where the reactivity risk is low -
newborn infants. The simple response measure was visual fix on a
target, with the stimulus varied along such dimensions as novelty,
color, and pattern. As far as a 48-hour-old infant is concerned, a
series of concentric rings in his visual field is as natural as
anything else.

Stechler (1964) also studied newborns, observing the effect of
medication administered to the mother during labor on the baby's
attentiveness. Each child received three stimuli, "held near the
baby's face for a total of nine minutes . . . . An observer hidden
from the babies recorded the total time they looked at and away
from the stimuli" (p. 315).

A much more hardheaded group of subjects, automobile
salesmen, were studied by Jung (1959; 1960; 1961; 1962) in his
evaluations of the effect of various bargaining postures. The
response measure was simple, the quoted price of an automobile
with specified features, and three different bargaining postures
were struck. In this well-designed series of experiments,
confederates posed as customers and adopted one of these three poses: an
eager, naive "I just got my license. Where do I sign?" approach, an
engineering, price-sophisticated approach, and one in between.
The differences among test conditions are smaller in absolute
quotations than might have been expected - only $33 between
the extremes in one study. Because the research was conducted
in the heavily price-competitive Chicago area, the dollar
differences may well be less than in areas in which competition is
along other lines. The Chicago buyer, real or feigned, gets an
automatic discount without any haggling. Jung has used the same
feigned-shopper approach in studies of mortgage financing and the
sale of mobile homes (Jung, 1963; Jung, 1964).

Brock (1965) turned the conditions around. His experimenter
in a study of decision change was a paint salesman. After
customers chose paint at a certain price, the salesman suggested either a
more or less expensive paint. The salesman was more successful
when he described himself as having recently bought the same
amount of paint at a different price.

Franzen (1950) conducted a sales experiment with
pharmacists. The confederate was again a "customer," who related
various symptoms of illness to the pharmacist. The symptoms
were graded by their severity as told by the "customer," and it was
noted whether or not a visit to a physician was suggested.
Examples of the symptoms, all of which could be related to early
cancer, were loss of voice, sore on lip, heartburn and stomach
trouble, and constipation.

Franzen also administered opinion questionnaires to an
equivalent, randomly selected group of pharmacists. The results
from the questionnaire are different from those of the field study.
The contrived observation results, we suspect, have a higher
predictive value than those from the questionnaires.

These findings recall the classic study comparing verbal
attitudes and overt acts: LaPiere's (1934) research on prejudice.
He and a Chinese couple visited 250 hotels and restaurants and
were refused service just once. Yet when questionnaires were sent
to those same places, asking if Chinese customers were welcome,
some 92 per cent answered negatively. As a control, LaPiere sent
identical questionnaires to 100 similar establishments which his
party had not visited, and the response was similar.

J. S. Adams (1963a; 1963b) has demonstrated in his important
work on wages what fine research can be undertaken with simple
productivity data. He conducted three experiments "to test how
people behave when they are working on a relatively highly paid
job for which they feel underqualified (that is, when they feel their
pay, or outcome, exceeds their qualifications or input)" (1963b, p.
10).

The subjects were students who were hired for part-time
temporary interviewing, not knowing they were part of an
experiment. In the first experiment, they were paid $3.50 an hour and
divided into two groups. The experimental group was led to believe

In the second experiment, interest was focused on the
relationship between method of pay and dissonance. Thirty-six
students were hired for the same interviewing job and randomly
assigned to the four conditions:

1. Experimental dissonance condition - students were paid
   $3.50 an hour and made to feel overpaid.
2. Control condition - students were paid $3.50 an hour and
   made to feel it was an equitable wage.
3. Experimental dissonance condition - students were paid
   30 cents an interview and made to feel overpaid on a piece
   rate.
4. Control condition - students were paid 30 cents an
   interview and made to feel payment was fair.

Adams' hypothesis was supported: "hourly workers in the
dissonance condition had a higher mean productivity than their
controls, whereas piece-workers in the dissonance condition had a
lower mean productivity than their controls" (1963b, p. 13).

In the final experiment, Adams showed that pieceworkers
who perceive that they are inequitably overpaid will perform better
quality work at a lower productivity level than pieceworkers paid
the same rate who believe that the wage is fair. The experimental
subjects increased their "inputs" on each unit of work, thereby
increasing the quality, but decreasing the quantity.

We have detailed this study because of the example that
Adams provides of the shrewd hypothesis-testing potential of some
of the rudimentary, but natural, measures available for
experimental study.

The violation of prohibitions has offered the subject matter for
several field studies of contrived observation. In these, the
physical world of the observed was actively manipulated, and the
dependent variables were simple motor acts.

Freed and his colleagues (Freed et al., 1955) experimented
with sign violation. To what extent will students violate a sign
urging the use of an inconvenient side door rather than the
customary main door of a university building? The degree of
prohibition was varied on the sign (high, medium, or low) with the
that they were not qualified for the job as interviewers; they "were
interesting fillip of a confederate who conformed to the sign or
treated quite harshly." The control group was told they had met all
violated it. In a control condition no confederate was present.
qualifications for the job. With productivity the dependent varia-
Ninety subjects were assigned to nine different experimental
ble, the experimental subjects produced significantly more than
groups, combinations of prohibition strength of the sign and
the control subjects.
a confederate's presence or behavior. They showed main effects for the two variables with no interaction between them. In an extension of these findings, another investigator (Sechrest, 1965b) found that a politely worded sign, "Please Do Not Use This Door," elicited fewer violations than the more abrupt "Use Other Door."

Violation of traffic signals was the dependent variable in a study by Lefkowitz, Blake, and Mouton (1955). An experimenter was again an active element in the setting: a male who dressed in either high- or low-status clothing. The confederate either conformed to or violated a traffic signal that ordered him to "wait" on a street corner. An observer a hundred feet from the corner noted the number of people on the corner who went along with the confederate. Would the difference in dress elicit differences in the number of pedestrians violating the light? Austin, Texas, was the locale, and pedestrians violated the sign more often in the presence of a model, significantly more often when the nonconforming model was in his higher-status dress.

Cratty (1962) attempted a replication of this study in Evanston, Illinois. He added observation of the race of violators and conformers, and his analysis suggests a significant difference between Negroes and whites on violations under both conditions. The racial composition of the sample could thus confound comparisons.* Cratty's findings also illustrate the problem of the cross-sectional stability of a research measure. To have good intersectional comparisons, the degree of usual conformity to the signs should be equivalent. Comparison of the Texas and Illinois data is muddied by a law-abiding difference. Only 1 per cent of the Texas sample violated the sign when the confederate was absent; over 60 per cent of the Illinois sample did.

*It might be interesting to draw a sample of United States cities with varying degrees of reported racial tension and compare, by race, the extent to which minor violations such as walking on a red signal are committed.

Moore and Callahan (1943) conducted research in New Haven, concentrating on traffic and parking behavior. Three sets of studies are reported, all of them nonreactive, all examples of contrived observation. In the first set, the "parking" studies, observations were made over a four-year period. Moore and Callahan first observed the number and length of time parked of cars in areas where there was no formal ordinance against parking. Then a formal ordinance prohibiting parking was introduced, and the observation continued. Thus, there is a time-series analysis in which the baseline is the period preceding the introduction of signs.

In a second set of "administrative" studies, the militancy of enforcement of traffic violations was the independent variable. Parking was limited by ordinance (sign) to 30 minutes. In one condition, violations were tagged after 80 minutes and in a second condition, after 45 minutes. The frequency and duration of parking under these two conditions were compared to a control condition in which no tickets were given at all.

The third set centered on rotary traffic. All studies in this group used a large circular tract of pavement from which five streets radiated. Observation consisted of noting the path in which the area was crossed.

Five observation periods were employed. The first was before there was any ordinance regulating the direction of traffic. The second was before a formal ordinance was enacted, but in the presence of signs to "keep right." The third followed enactment of the ordinance. The fourth was while the ordinance was visibly in effect: barriers would not allow cars through the center of the circle. Finally, the barrier and signs were removed when the ordinance was still in effect. The observed directions of flow under the five conditions were compared.

Another driving study, reported by Sechrest (1965b), was concerned with willingness of drivers to accept a challenge to "drag" at stop signals. The investigators challenged by pulling alongside a car, gunning the engine of their car, and looking once at their "opponent." They used different stimulus cars and recorded several attributes of the responding cars. Results showed a strong decline in acceptance of the challenge with increasing age and with presence of passengers other than the driver in the respondent's car. As for the stimulus cars? Very few drivers wanted to drag with a Volkswagen.

Traffic behavior offers a splendid opportunity for naturalistic experimentation. A large body of control data is already available, produced for engineering studies or by market-research firms to document the exposure of outdoor advertising. It should be possible, for example, to study the effect of different degrees of threat in
a persuasive message by studying the degree to which drivers slow down, if they do, after passing different classes of signs, say, those threatening legal enforcement or the danger to personal safety. Radar devices or filming from an overhead helicopter could provide the measure of speed. What might be of particular interest in such a study would be the extinction rate under the different conditions. Does an enforcement warning have a faster rate of decay than a personal safety message? What happens if both classes are equated for initial effect on speed?

Another study relating to legal processes is that of Schwartz and Skolnick (1962b). They investigated the effect of criminal records on the employment opportunities of unskilled workers. Four employment folders were prepared for an applicant; all folders were identical save for description of the criminal-court record of the applicant. Each of 100 employers was assigned one of four treatment folders, and the employer was asked whether he could "use" the man described in the folder. The employers were never given any indication that they were participating in an experiment. Even when the applicant was described as having been acquitted with an excusing letter from a judge or acquitted without a letter, the incidence of employers who thought they might use the applicant dropped.

Jones (1946) has provided an excellent summary of early behavioral studies of character development, many of which follow the entrapment strategy of Hartshorne and May. A recent example is the work of Freeman and Ataov (1960). They contrived a situation in which the subjects had a chance to cheat by grading their own examinations by a scoring sheet. Using three classes of questions (fill-in blanks, multiple-choice, and true-false), they found the number of changed answers for each class. The three formed a Guttman scale with a reproducibility coefficient of .94.

One of the more interesting studies on honesty has been reported by Merritt and Fowler (1948), who "lost" two kinds of stamped and addressed envelopes, one containing a trivial message, the other a lead slug of the dimensions of a fifty-cent piece. They dropped the letters "on many different days and in many different cities to insure a representative sample of the public at large" (p. 91). While 85 per cent of the control letters were returned, only 54 per cent of the test (containing slug) were, and some 13 per cent of the test letters that were returned had been opened. After the letters were dropped, there was a chance to work with auxiliary information on the unsuspecting subjects.

    Watching the pickup of the letters proved to be a most entertaining pastime. Some were picked up and immediately posted at the nearest mailbox. Others were examined minutely, evidently precipitating quite a struggle between the finder and his conscience, before being pocketed or mailed. Some were carried a number of blocks before being posted, one person carrying a letter openly for nine blocks before mailing it. A lady in Ann Arbor, Michigan, found a letter and carried it six miles in her car to deliver it personally, although she was not acquainted with the addressee. One letter, picked up in Harrisburg, Pennsylvania was mailed from York, Pennsylvania. Another picked up in Toledo, Ohio was mailed in Cleveland. Still another from the Toledo streets was mailed from Monroe, Michigan. Two missives left on the steps of the cradle of liberty in Philadelphia failed to find their way into a mailbox. Two of five letters left on church steps during Sunday services failed to return [p. 93].

Grinder (1961; 1962) and Grinder and McMichael (1963) have reported studies using a "ray gun" type of apparatus. Like Hartshorne and May, their interest was in studying character, or "conscience development." Children operated a "ray gun" individually in a realistic game situation.

    Seated seven feet from a target box, subjects were asked to shoot the ray gun pistol 20 times at a rotating rocket. With each pull of the ray gun trigger, prearranged scores from zero to five were registered by score lights also housed in the target box. High scores were rewarded with a marksman, sharpshooter, or expert badge . . . subjects cumulated their scores on a paper score sheet. Subjects were judged to have resisted temptation if the scores recorded on their score sheets indicated that they had not earned a badge (they could not honestly). They were judged to have yielded to temptation if their score sheet showed that they had falsified their scores in order to earn one of the badges [Grinder & McMichael, 1963, p. 504].

Similar in its assumption of dishonesty by subjects is a study by Brock and Guidice (1963). The subjects were students ranging
from the second to the sixth grade, and these children were individually asked if they would leave the class, go to another room, and participate in an experiment. Upon entering the room, the subject found the experimenter in a flustered state with her purse spilled and money lying on the floor. At this point the experimenter left, saying she would be back shortly, and asking the subject to pick up the contents of her purse while she was gone. The measure used was the amount of money stolen by the subject. There is a good risk that some of the children, having been told to go to another room for an "experiment," became suspicious. The effect of this is to make uncertain how many of those who took no money were honest and how many were acute. This bias could possibly interact with a practice effect on the part of the experimenter. Did she play her role differently over time, as she became more practiced or more bored? If professional actresses have problems "keeping fresh," it is reasonable to ask about amateurs.

The more complex the action of the confederate, the greater the risk of an experimenter effect, and the greater the possibility for more gradations in the experimental variable.

Petition-signing has been the dependent variable in observational studies by Blake, Mouton, and Hain (1956) and Helson, Blake, and Mouton (1958). In the first study, the strength of the plea to sign was varied, and frequency of signing was closely associated with this variable. A confederate was then introduced into this situation to provide varying reactions of another person. In some situations he signed readily, in others refused, and in still others his response was unknown to the person approached. The behavior of the confederate influenced signing, and an interesting finding was that both variables, the strength of the plea and endorsement by another, operated independently.

The Helson, Blake, and Mouton (1958) study used petition-signing as a response behavior within a larger experimental setting. Students who had volunteered to participate in an experiment were taken by a guide to an experimental room. On the way, the pair was stopped, and a confederate asked the guide to sign a petition. After the guide (also a confederate) signed or did not sign, the subject was asked to sign. Four conditions were employed, with each student receiving one:

1. A petition on a proposal that had previously elicited 96 per cent positive response and confederate signed
2. The same 96 per cent positive proposal, but confederate refused to sign
3. A proposal that had received only a 15 per cent positive response and confederate signed
4. The same 15 per cent proposal; confederate did not sign.

Solicitation to sign a petition is a common enough event in academic settings, and may be becoming more common outside the cloistered world. Certainly it offers a broad freedom of movement in experimentation and structuring of contrived conditions. Searching for volunteers does, too, and several studies have used observation of the simple "volunteer-not volunteer" alternatives as the behavioral variable.

Schachter and Hall (1952) experimented with college students, employing different situational elements in eliciting volunteers for an experiment and then noting whether or not the volunteers did in fact show up for the experiment. Classes were divided into four groups and given different restraints. In one group, after a requesting speech, the listeners were asked to fill out a questionnaire, whether or not they wished to participate in the experiment. Those who did wish to participate were merely asked to check an appropriate place on the form. In a second group, the forms were passed around the room, and anyone who wanted to participate in the experiment could take one. A third group was asked to raise hands if interested; a fourth group also raised hands, but half of the class had been enlisted as confederates and "volunteered." Schachter and Hall's conclusion was that neither the high- nor low-constraint condition was particularly desirable in soliciting volunteers. If the experimenters made it easy to refuse, they got a high refusal rate, but high attendance among those who did volunteer. Contrariwise, placing high pressure on volunteering yielded a higher level of volunteers, but a smaller number who held to their promise.

Rosenbaum and Blake (1955) wanted to test the hypothesis that the act of volunteering is "a special case of conformance with social norms or standards, rather than . . . an individualistic act
conditioned by an essentially unidentifiable complex of inner tensions, needs, etc." (p. 193). Subjects were plucked from students studying in the university library, and conditions were varied so that the subject saw either an acceptance or rejection of the request from a confederate. In a third group, the student accepted or rejected the volunteering request in the absence of a model. As predicted, acceptances were high with a conforming confederate, low with a nonconforming confederate, and in between on the control condition.

The University of Texas library was also the site of another study by Rosenbaum (1956). Volunteering was the dependent variable, and stimulus conditions included three request strengths (determined by a pilot study) and three background conditions employing confederates. The confederate entered the library, sat next to an unsuspecting student, and then the volunteer-seeker entered and started with the confederate.

Blake and his associates (Blake et al., 1956) determined the effect on the level of volunteering of varying the attractiveness of alternatives to volunteering. The public or private character of the volunteering was also varied, with conditions altered so that a class might substitute volunteering time for time otherwise devoted to: (1) a pop quiz, (2) released time from class, (3) a control, with no time gained. It might be observed that under the pop-quiz alternative, 98.8 per cent of the subjects volunteered under the private-commitment situation and 100.0 per cent under public commitment. The 1.2 per cent above is accounted for by the single aberrant student who preferred a pop quiz to participation in an experiment.

Volunteering for social action was the subject of a study by Gore and Rotter (1963). The action in this case was the willingness of students in a southern Negro college to engage in different types of segregation-protest activity. This criterion measure was correlated with previously obtained scores on control of reinforcement and social desirability scales, these scales not specifically dealing with the segregation issue. A generalized attitude toward internal or external control was shown to predict the type and degree of behavior subjects were willing to perform in attempts at social change.

AN OVER-ALL APPRAISAL OF HIDDEN HARDWARE AND CONTROL

As the discussion and examples of observational research have progressed over these past two chapters, the reader may have been sensitive to a movement along a passivity-activity dimension. In the studies reported in the chapter on simple observation methods, the observer was a non-intervening passive onlooker of behavior that came before his eyes or ears. He may have scrambled about in different locations to reduce some population restrictions, but his role was a quiet, receptive one. In many ways, this is appropriate for the covert character of the studies we have outlined.

With the hardware employed in the studies cited in this chapter, the investigator engaged his data more actively, expanding the possible scope of the content of research and achieving a more faithful record of what behaviors did go on. Yet the hardware varied, too. Some of the hardware devices are static, while others are mobile. To the degree that the hardware is mobile (say, a microphone in a mock hearing aid versus one secreted in a table), the experimenter has flexibility to make more economical forays into locational sampling. He could sample in a number of locations by installing more permanent recording devices, but commonly a more feasible method is to sample occasions and time with mobile equipment. As electronic technology develops, more opportunities arise. It isn't so long ago that television cameras had to be anchored in one spot.

When, through deliberate choice or no realistic alternative, the investigator was limited to a fixed instrument, he was forced to depend on the character of the population which flowed past that spot and the content appropriate to it. The waiting game can give accurate and complete measurement of a limited population and limited content, and the decision to use such an approach is posited on two criteria, one "theoretical," one "practical." Are the limitations likely to be selective enough to inhibit the generalizability of the findings? Can the investigator absorb the time and money costs of developing material with a low saturation of pertinent data for his comparisons?
In the contrived-observation studies, the experimenters took the next step and intervened actively in the production of the data, striding away from passive and critically placed observations. The effect was to produce very dense data, of which a high proportion was pertinent to the research comparisons. Further, a finer gradation of stimuli was then possible, and more subtle shadings of difference could be noted. By active intervention, as the petition and volunteering studies of conformity show, it was also possible to make estimates of the interaction of variables, an extremely difficult matter with passive observation.
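
What an estimate of the interaction of variables amounts to in a two-by-two contrived design can be sketched briefly. The percentages below are invented for illustration and are not data from Blake, Mouton, and Hain or from any other study cited; the function name `factorial_effects` is likewise hypothetical.

```python
def factorial_effects(cells):
    """Main effects and interaction contrast for a 2 x 2 table of cell means.

    cells[(a, b)] is the mean response with factor A at level a and
    factor B at level b, where each level is 0 or 1.
    """
    a_effect = ((cells[1, 0] + cells[1, 1]) - (cells[0, 0] + cells[0, 1])) / 2
    b_effect = ((cells[0, 1] + cells[1, 1]) - (cells[0, 0] + cells[1, 0])) / 2
    # Difference of differences: does the effect of B change with A?
    interaction = (cells[1, 1] - cells[1, 0]) - (cells[0, 1] - cells[0, 0])
    return a_effect, b_effect, interaction


# Invented percentages of subjects signing a petition.
# Factor A: strength of plea (0 = weak, 1 = strong).
# Factor B: confederate's response (0 = refuses, 1 = signs).
additive = {(0, 0): 20, (0, 1): 50, (1, 0): 40, (1, 1): 70}
crossed = {(0, 0): 20, (0, 1): 50, (1, 0): 40, (1, 1): 90}

additive_effects = factorial_effects(additive)
crossed_effects = factorial_effects(crossed)
```

In the `additive` table the confederate's signing adds the same amount under a weak and a strong plea, so the interaction contrast is zero and the two variables operate independently. In the `crossed` table the confederate matters more under a strong plea, and the contrast is positive. Passive observation rarely supplies all four cells, which is why such an estimate usually requires intervention.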
As the experimenter's activity increases, and he achieves the gains of finer measurement and control, the price paid is the increased risk of being caught: that the subjects of the observation will detect the recording device, or will suspect that the confederate is really a "plant." This is a high price, for if he is detected, the experimenter's research is flooded with the reactive measurement errors which the hidden-observation approach, regardless of its simplicity or complexity, is designed to avoid. At the extreme end of contrivance, when a confederate is a visible actor in the subject's world, it requires the greatest finesse to protect against detection and against changes in the behavior of the confederate damaging to comparison. The best defense, as always, is knowledge, and almost all of the observational approaches have built into them the capacity to examine whether or not population or instrument contaminants are working to confound the data.

A Final Note

    In the dialectic between impulsivity and restraint, the scientific superego became too harsh - a development that was particularly effective in intimidating adventurous research, because the young were learning more about methodological pitfalls than had their elders. . . .

    (Riesman, 1959, p. 11)

David Riesman's remarks on the evolution of communications research apply equally well to the broader panoply of the study of social behavior. As social scientists, we have learned much of the labyrinth that is research on human behavior, and in so doing discovered an abundance of cul-de-sacs. Learning the complexities of the maze shortened our stride through it, and often led to a pattern of timid steps, frequently retraced. No more can the knowledgeable person enjoy the casual bravura that marked the sweeping and easy generalizations of an earlier day.

The facile promulgation of "truth," backed by a few observations massaged by introspection, properly met its end, flattened by a more questioning and sophisticated rigor. The blackballing of verification by introspection was a positive advance, but an advance by subtraction. Partly as a reaction to the grandiosities of the past, partly as a result of a growing sophistication about the opportunities for error, the scope of individual research studies shrank, both in the range of content considered and in the diversity of procedures.

The shrinkage was understandable and desirable, for certainly no science can develop until a base is reached from which
reliable and consistent empirical findings can be produced.¹ But if reliability is the initial step of a science, validity is its necessary stride. The primary effect of improved methodological practices has been to further what we earlier called the internal validity of a comparison: the confidence that a true difference is being observed. Unfortunately, practices have not advanced so far in improving external validity: the confidence with which the findings can be generalized to populations and measures beyond those immediately studied.

¹"Almost all experiments on the effects of persuasion communications, including those reported in the present volume, have been limited to investigating changes in opinion. The reason, of course, is that such changes can readily be assessed in a highly reliable way, whereas other components of verbalizable attitudes, although of considerable theoretical interest, are much more difficult to measure" (Janis and Hovland, 1959, p. 3).

Slowing this advance in ability to generalize was the laissez-faire intellectualism of the operational definition. Operational definitionalism (to use a ponderously cumbersome term) provided a methodological justification for the scientist not to stray beyond a highly narrow, if reliable, base. One could follow a single method in developing data and be "pure," even if this purity were more associated with sterility than virtue.

The corkscrew convolutions of the maze of behavior were ironed, by definitional fiat, into a two-dimensional T maze. To define a social attitude, for example, solely by the character of responses to a list of questionnaire items is eminently legitimate, so much so that almost everything we know about attitudes comes from such research. Almost everything we know about attitudes is also suspect because the findings are saturated with the inherent risks of self-report information. One swallow does not make a summer; nor do two "strongly agrees," one "disagree," and an "I don't know" make an attitude or social value.

Questionnaires and interviews are probably the most flexible and generally useful devices we have for gathering information. Our criticism is not against them, but against the tradition which allowed them to become the methodological sanctuary to which the myopia of operational definitionalism permitted a retreat. If one were going to be limited to a single method, then certainly the verbal report from a respondent would be the choice. With no other device can an investigator swing his attention into so many different areas of substantive content, often simultaneously, and also gather intelligence on the extent to which his findings are hampered by population restrictions.

The power of the questionnaire and interview has been enormously enhanced, as have all methods, by the development of sensitive sampling procedures. With the early impetus provided by the Census Bureau to locational sampling, particularly to the theory and practice of stratification, concern about the population restrictions of a research sample has been radically diminished. Less well developed is the random sampling of time units, either over long periods such as months, or within a shorter period such as a day. There is no theoretical reason why time sampling is scarce, for it is a simple question of substituting time for location in a sampling design. Time sampling is of interest not only for its control over population fluctuations which might confound comparisons, but also because it permits control over the possibility of variable content at different times of the day or different months of the year.

The cost is high. And for that reason, government and commercial research organizations have led in the area, while academic research continues to limp along with conscripted sophomores. The controlled laboratory setting makes for excellent internal validity, as one has tight control over the conditions of administration and the internal structure of the questionnaire, but the specter of low generalizability is ever present.

That same specter is present, however, even if one has a national probability sample and the most carefully prepared questionnaire form or interview schedule. So long as one has only a single class of data collection, and that class is the questionnaire or interview, one has inadequate knowledge of the rival hypotheses grouped under the term "reactive measurement effects." These potential sources of error, some stemming from an individual's awareness of being tested, others from the nature of the investigator, must be accounted for by some other class of measurement than the verbal self-report.

It is too much to ask of any single class that it eliminate all the rival hypotheses subsumed under the population-, content-, and reactive-effects groupings. As long as the research strategy is based on a single measurement class, some flanks will be exposed,
and even if fewer are exposed with the choice of the questionnaire method, there is still insufficient justification for its use as the only approach.

If no single measurement class is perfect, neither is any scientifically useless. Many studies and many novel sources of data have been mentioned in these pages. The reader may indeed have wondered which turn of the page would provide a commentary on some Ouija-board investigation. It would have been there had we known of one, and had it met some reasonable criteria of scientific worth. These "oddball" studies have been discussed because they demonstrate ways in which the investigator may shore up reactive infirmities of the interview and questionnaire. As a group, these classes of measurement are themselves infirm, and individually contain more risk (more rival plausible hypotheses) than does a well-constructed interview.

This does not trouble us, nor does it argue against their use, for the most fertile search for validity comes from a combined series of different measures, each with its idiosyncratic weaknesses, each pointed to a single hypothesis. When a hypothesis can survive the confrontation of a series of complementary methods of testing, it contains a degree of validity unattainable by one tested within the more constricted framework of a single method (Campbell & Fiske, 1959). Findings from this latter approach must always be subject to the suspicion that they are method-bound: Will the comparison totter when exposed to an equally prudent but different testing method? There must be a multiple operationalism. E. G. Boring (1953) put it this way:

    . . . as long as a new construct has only the single operational definition that it received at birth, it is just a construct. When it gets two alternative operational definitions, it is beginning to be validated. When the defining operations, because of proven correlations, are many, then it becomes reified [p. 222].

This means, obviously, that the notion of a single "critical experiment" is erroneous. There must be a series of linked critical

collection methods will be best for my research problem? We suggest the alternative question: Which set of methods will be best? Here "best" is defined as a series which provides data to test the most significant threats to a comparison with a reasonable expenditure of resources.

There are a number of research conditions in which the sole use of the interview or questionnaire leaves unanswerable rival explanations. The purpose of those less popular measurement classes emphasized here is to bolster these weak spots and provide intelligence to evaluate threats to validity. The payout for using these measures is high, but the approach is more demanding of the investigator. In their discussion of statistical records, Selltiz and her associates (Selltiz et al., 1959) note:

    The use of such data demands a capacity to ask many different questions related to the research problem. . . . The guiding principle for the use of available statistics consists in keeping oneself flexible with respect to the form in which research questions are asked [p. 318].

This flexibility of thought is required to handle the reactive measurement effects which are the most systematic weakness of all interview and questionnaire studies. These error threats are also systematically present in all observation studies in which the presence of an observer is known to those under study. To varying degrees, measurements conducted in natural settings, without the individual's knowledge, control this type of error possibility. In all of them (hidden observation, contrived observation, trace analysis, and secondary records) the individual is not aware of being tested, and there is little danger that the act of measurement will itself serve as a force for change in behavior or elicit role-playing that confounds the data. There is also minimal risk that biases coming from the physical appearance or other cues provided by the investigator will contaminate the results.

In the observational studies, however, hiding the observer does not eliminate the risk that he will change as a data-collecting
experinzerzts, ecrch testing a different orctcropj)ing of the hypothesis. instrument over time. Any change, for the better or worse, will
It is through triangulation of data procured from different measure- introduce shifts that might be erroneously interpreted as stemming
ment classes that the investigator can most effectively strip of from the causal variable. This source of error inust be guarded
plausibility rival explanations for his comparison. T h e usual pro- against in the same way that it is in other measurement classes-
cedural question asked is, Which of the several available data- by careful training of the observer (interviewer), by permitting
practice effects to take place before the critical data are collected,
and by "blinding" the observer to the hypothesis. There is no way
of knowing, of course, whether all reasonable precautions have
worked. For this, the only solution is an internal longitudinal
analysis of data from a single observer and cross-analysis of data
from different observers at various times during the data collection.

Finally, none of the methods emphasized here, by themselves,
can eliminate response sets which might strongly influence the
character of the data. These must be brought under experimental
control by manipulation of the setting itself (as in contrived field
experimentation) or by statistical operations with the data if the
character of the response sets is known well enough to permit
adjustments. With archival records, it may be extremely difficult to
know if response sets were operating at the time the data were
produced.

These methods also may counter a necessary weakness of the
interview and questionnaire - dependence upon language. When
one is working within a single society, there is always the question
whether the differential verbal skills of various subcultures will
mislead the investigator. It is possible, if groups vary in articulate-
ness, to overgeneralize the behavior or attitudes of the group or
individuals with the greater verbal fluency. This risk is particularly
marked for the interpretation of research reports which employ
quotations liberally. The natural tendency of the writer is to choose
illustrative quotations which are fluent, dramatic, or engaging. If
the pool of good quotations is variable across the subcultures, the
reader may mistakenly overvalue the ideas in the quotations, even
though the writer himself does not. This is a question of presenta-
tion, but an important one because of the disproportionate weight
that may be placed on population segments.

The differential capacity to use the language artfully is one
source of error, while the absolute capacity of the language to
convey ideas is another.² This is an issue strongly present in cross-
cultural comparisons, where different languages may vary radi-
cally as a medium of information transfer. The effect of this is to
limit the content possible for study with questionnaires or inter-
views. If one worked in New Guinea, for example, and had to
depend upon the lingua franca pidgin widely spoken there, he
would find it adequate to indicate an answer to "Where do you
keep your fishing nets?" but too gross a filter to study the ethno-
centricism of a tribe. Pidgin simply does not possess the subtle
gradients required to yield textured responses to questions on
attitudes toward neighboring tribes or one's own tribe. Although it
is theoretically possible to learn all the regional dialects well
enough to be competent in a language, in practice this does not
occur. A more pragmatic approach is to search for observational
or trace evidence which will document aspects of ethnocentrism
(e.g., reactions to outsiders, disposition and use of weapons) and
then relate it to the verbal responses in the inadequate pidgin.

²In a similar note on observers, Heyns and Lippitt (1954) ask if the "observer
lacks the sensitivity or the vocabulary which the particular observation requires"
(p. 372).

One more weakness of the dependence on language is that
sometimes there is silence. So long as a respondent talks, glibly or
not, in a rich language or not, checks and controls can be worked
on the reported content.³ There are, however, situations in which
refusals to cooperate preclude any chance of correcting distorted
information. This usually results in a biased research population
and not a rejection of all findings, because it is almost always
possible to find some people who will discuss any topic. But it can
also result in a complete stalemate if only the verbal report is
considered as the research instrument.

³For an extended discussion of this issue, see Hyman et al. (1954) and Kahn
and Cannell (1957).

An amusing example of this inability to get data by verbal
report, and a nonreactive circumvention, is provided by Shadegg
(1964). In his book on political campaign methods, Shadegg writes
of a campaign manager who used every available means to learn
the plans of his opponent, who, reasonably enough, was unwilling
to grant a revealing interview. One method arranged for procuring
the contents of his opponent's wastebasket: "He came into posses-
sion of carbon copies of letters . . . memos in the handwriting of his
opponent's manager." Admittedly a less efficient method than the
interview, it admirably met the criterion of being workable: "It
took a lot of digging through the trash to come up with the nuggets.
But . . . daily panning produced some very fine gold." The "inves-
tigator" did not limit himself to inferences drawn from observa-
tions of his opponent's public acts, but was able to develop
ingeniously (although perhaps not ethically) a trace measure to
complement the observation. Each aided the other, for the obser-
vations give a validity check on the nuggets among the trash (Was
misleading material being planted?), and the nuggets gave a more
accurate means of interpreting the meaning of the public acts.

Evidence of how others are sensitive to wastebaskets is seen
in the practice in diplomatic embassies of burning refuse under
guard, the discussion of refuse purchase by industrial spies
(Anonymous, 1964c), and the development of a new electric waste-
basket that shreds discarded paper into unreadable bits.

Generally speaking, then, observational and trace methods
are indicated as supplementary or primary when language may
serve as a poor medium of information - either because of its
differential use, its absolute capacity for transfer, or when signifi-
cant elements of the research population are silent.

The verbal methods are necessarily weak along another di-
mension, the study of past behavior or of change. For historical
studies, there is no alternative but to rely mainly on records of the
past time. Behavioral research on the distant past is rare, however;
more common are studies which center on experiences within the
lifetime of respondents. For example, there is a large literature on
child-rearing practices, in which mothers recollect their behavior
of years past. A sole dependence on this type of data-gathering is
highly suspect. It may be enough to note that Thomas Jefferson, in
his later years, observed that winters weren't as cold as they used
to be. Available records could be used to check both Mr. Jefferson
and other observers of secular changes in winter's fierceness.

For more current evidence on the fallibility of such recall
data, see Pyles, Stolz, and Macfarlane, 1935; McGraw and Molloy,
1941; Smith, 1958; Weiss and Dawis, 1960 - all of whom comment
on, or test, the validity of mothers' recall of child-rearing practices.
Weiss and Dawis wrote, "It is indefensible to assume the validity
of purportedly factual data obtained by interview" (p. 384). The
work of Haggard, Brekstad, and Skard (1960) and Robbins (1963)
suggests that it is a problem of differentially accurate recall. In
Haggard's phrase, the interviews "did not reflect the mothers'
earlier experiences and attitudes so much as their current picture of
the past" (p. 317).

When, through death or refusal, reports of past behavior are
unavailable, a proper contingent strategy is to interview others
who have had access to the same information, or who can report at
second hand. This is very shaky information, but useful if other
intelligence is available as a check. For many investigations, of
course, the nature of the distortion is itself an important datum and
can become a central topic of study when a reliable baseline is
possible.⁴ If other materials are present, and they usually are in a
record-keeping society, the best way to estimate past behavior is to
combine methods of study of archival records, available traces,
and verbal reports, even if secondhand. Clearly, direct observa-
tional methods are useless for past events.

⁴The courts have handled secondary information by excluding it under the
"hearsay" rule (Wigmore, 1935; Morgan, 1963). Epically put, "Pouring rumored
scandal into the bent ear of blabbering busybodies in a pool room or gambling house
is no more disreputable than pronouncing it with clipped accents in a courtroom"
(Donnelly, Goldstein, & Schwartz, 1962, p. 277). The case from which this is cited is
Holmes, 379 Pa. 599 (1954).

With studies of social change, the most practical method is to
rely on available records, supplemented by verbal recall. If one
wanted more control over the data, it would be possible to conduct
a continuing series of field experiments extending over a long
period of years. But the difficulty of such an approach is evidenced
by the scarcity of such longitudinal, original-data studies in social
science. Forgetting the number of years required, there is the
problem of unstable populations over time, a growing problem as
the society becomes more mobile. Potential errors lie in both
directions as one moves forward or backward in time, and the
more practical approach of the two is to analyze data already
collected - making the ever present assumption that such are
available.

A more integrative approach for studying change is to develop
two discrete time series - one based on available records, the other
freshly developed by the investigator. With this strategy, it is
necessary to have an overlap period in which the relationships
between the two series are established. Given knowledge of the
relationships, the available records can be studied retrospectively,
thereby providing more intelligence than would be possible if they
existed alone. Again, there is a necessary assumption: one must be
able to reject the plausibility of an interaction between time and
the method. If there is any content or population fluctuation
beyond chance, such a method is invalid. Diagrammatically, where
O is an observation and the subscript n equals new data and a
available data:

[Diagram not reproduced.]

A final gain from the less reactive methods is frequently the
lower cost of data collection. Many scholars know how to conduct
massive surveys which effectively control major sources of error;
few do so. This knowledge is an underdeveloped resource. With
survey interviews often costing $10 or more apiece, the failure is
understandable, however regrettable. When the interview or ques-
tionnaire is viewed as the only method, the researcher is doomed
to either frustration or a studied avoidance of thoughts on external
validity. Peace of mind will come if the investigator breaks the
single-method mold and examines the extent to which other mea-
surement classes can substitute for verbal reports. The price of
collecting each unit of data is low for most of the methods we have
stressed. In some cases, the dross rate is high, and it may be
necessary to observe a hundred cases before one meets the
research specifications. Nonetheless, even under these high dross-
rate conditions, the cost per usable response is often lower than
that of a completed interview or returned questionnaire. The lower
cost permits flexibility to expand into content and population areas
otherwise precluded, and the result of this is to increase the
confidence one has in generalizing findings. Just as in the case of
studying social change, it may be possible to generate different
data series, some based on verbal reports, others based on sec-
ondary or observational data. Providing for enough cases of the
more expensive procedures to yield a broad base for linkage, the
larger number of cases can be allocated to the usually less expen-
sive observational or secondary methods. It is important to note
that we add "usually" before "less expensive." The savings are
centered in data-collection costs, and it may be that all the savings
are vitiated by the elaborate corrections or transformations that a
particular data series may require. The cost of materials and
analysis is an equivocal area indeed.

In the multimethod pattern of testing, the primary gains
coming from the less popular methods are protection against
reactive measurement threats, auxiliary data in content areas
where verbal reports are unreliable, an easier method of determin-
ing long-term change, and a potentially lower-cost substitute for
some standard survey practices.

Offsetting these gains, there are associated problems for each
of the less popular measurement classes - indeed, if they were less
problematic, we would be writing an argument in favor of an
increased use of the interview.

The most powerful aspect of the verbal methods - their ability
to reach into all content areas - is a soft spot in the hidden-
observation, trace, and archival analysis procedures. We have
noted remarkably adept and nonobvious applications of data from
these sources, but for some content areas, the most imaginative of
investigators will have trouble finding pertinent material. Individu-
ally, those methods are simply not as broad gauged.

Often missing, too, is complete knowledge of the conditions
under which the data were collected, the definitions of important
terms used in classification, and the control or lack of it over error
risks that may be salient. This is particularly disturbing when
dealing with comparisons of public records from different areas or
from widely different times. The variation in definitions of "sui-
cide" versus "accidental death," or the differential thoroughness
with which marriages are entered in official records are examples
of this issue. In general, for trace evidence and archival records, a
dominant concern is the possibility of selective deposit and selec-
tive survival of the research data. Through supporting research
designed to learn of these errors, it is sometimes possible to apply
corrections to what is available. At other times, the researcher
must remain in ignorance and make assumptions. If he restricts
himself to working with only such data, he remains helpless before
their vagaries. If he uses other measurement classes, the process
of triangulating all the different data may provide a test of his
assumptions and reveal the presence or extent of error. The
comparison of data from the different classes can always add
intelligence unavailable from comparisons of data from within the
single class.

Because of the risks of error and the danger of unknown
biases, we have stressed the importance of careful data sampling.
Wherever feasible, locational sampling should be employed, ex-
tending over regions as well as areas within a single locality.
Similarly, time sampling should be considered not only as a device
employed within a single day or week, but applied over months and
years. By such effort, we are able to protect against both popu-
lation and content restrictions, and very often produce inter-
esting data from comparisons of results from different locations or
times.⁵ The need for time and location sampling is no less for
observational or archival data than it is for interviews or question-
naires, for sampling is a problem that transcends the class of
measurement.

⁵See, for example, Caplow and McGee's (1958) discussion of variation in
salaries in American universities - particularly the relationship between beginning
salaries and the prestige of the institution. In a related report, the head of an
employment agency reported, "The Chicago advertising man on the average makes
10 per cent more than his New York counterpart, 25 per cent more than he would
make on the West Coast, and 40 per cent more than he would make in a small town
or in the south" (Baxter, 1962, p. 65).

Another common demand, this one not so applicable to the
verbal-report approaches, is that for data adjustment and conver-
sion. The need comes from the experimenter's decreased control
over the production of his materials. The exception to this is the
contrived field experiment, where the investigator can have full
control, but the data from archives, trace sources, and observa-
tions are frequently too raw to be used as is. The need is under-
lined because of one of the major advantages of the secondary data -
their ability to produce fine time-series information. In time
series, it is usually necessary to account for extraneous sources of
variation, such as secular trends or cyclical patterns. Thus, the
"score" which is the basis of comparison is some transformed
measure which is a residual of the total "score." In other studies,
the absolute number of cases varies from unit time to unit time,
and the only reasonable comparison score is one which is related
in some way, through an average or percentage, for example, to the
variable baseline. The investigator may have no control over the
flow of an observed population, but he can obtain a count of that
flow and use this intelligence as the basis for modifying his
comparison score.

The more sophisticated forms of transformation, such as
index numbers based on multiple components, demand more
information, particularly as one assigns relative weights to compo-
nents collected into a single score. This is not as awesome as it
sounds, and if the investigator is sensitive to the potential useful-
ness of index numbers, he often finds enough secondary data
available for the task, or may obtain new information without
extraordinarily high marginal costs. Insofar as these transforma-
tions demand time and labor to make the raw data more precise,
they are disadvantageous compared with standard questionnaire
procedures. There are, however, as we have suggested in various
points in the text, indications that index numbers and more simple
transformations could be used properly in all classes of measure-
ment. The Zeitgeist may as yet be inappropriate, but an important
work will someday link index-number theory and literature to
social-science measurement theory and practice.

These, then, are the gains; these the losses. There are no
rewards for ingenuity as such, and the payoff comes only when
ingenuity leads to new means of making more valid comparisons.
In the available grab bag of imperfect research methods, there is
room for new uses of the old.

Max Eastman once suggested that books should start with a
first section consisting of a few sentences, the second section a
few pages, and so on. He even wrote one like that - The Enjoy-
ment of Laughter. Since this has been an unconventional mono-
graph on unconventional research procedures, it is proper
that it should have an unconventional close. We reverse Eastman's
formula and offer a one-phrase final chapter and a one-paragraph
penultimate chapter.
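
The two simple adjustments described in the discussion of time series, relating a raw count to its variable baseline and taking a residual after a trend is removed, can be illustrated with a minimal sketch. It is only a sketch: the function names and all of the figures are hypothetical, chosen for illustration, and a least-squares straight line stands in for whatever secular trend a particular investigator would actually have to model.

```python
from statistics import mean


def rate_per_baseline(counts, baseline):
    """Express each period's count as a percentage of that period's flow."""
    return [100.0 * c / b for c, b in zip(counts, baseline)]


def detrend(series):
    """Return residuals after removing a least-squares straight-line trend.

    Assumes equally spaced periods and at least two observations.
    """
    t = list(range(len(series)))
    t_bar, y_bar = mean(t), mean(series)
    slope = (
        sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series))
        / sum((ti - t_bar) ** 2 for ti in t)
    )
    intercept = y_bar - slope * t_bar
    return [yi - (intercept + slope * ti) for ti, yi in zip(t, series)]


# Hypothetical figures: cases meeting the research specification in each
# period, and the total flow of persons past the observer in that period.
observed = [12, 18, 15, 30]
flow = [100, 150, 120, 250]

rates = rate_per_baseline(observed, flow)   # comparison score tied to baseline
residuals = detrend(rates)                  # what remains once the trend is out
```

In this made-up example the raw counts more than double between the first and last periods, while the rate per hundred persons in the flow is nearly constant; a comparison based on the raw counts alone would mistake a change in flow for a change in behavior.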
CHAPTER 8

A Statistician on Method

    We must use all available weapons of attack, face our
    problems realistically and not retreat to the land of fashionable
    sterility, learn to sweat over our data with an admixture of
    judgment and intuitive rumination, and accept the usefulness of
    particular data even when the level of analysis available for
    them is markedly below that available for other data in the
    empirical area.
    (Binder, 1964, p. 294)

CHAPTER 9

Cardinal Newman's Epitaph

    From symbols and shadows to the truth.

References

Adams, J. S. Toward an understanding of inequity. Journal of Abnormal
and Social Psychology, 1963, 67, 422-436. (a)
Adams, J. S. Wage inequities, productivity, and work quality. In Psycho-
logical research on pay. Reprint No. 220. Berkeley: Univer. of
California, Institute of Industrial Relations, 1963, pp. 9-16. (b)
Advertising Service Guild. The press and its readers. London: Art &
Technics, 1949.
Alger, C. F. Interaction in a committee of the United Nations General
Assembly. In J. D. Singer (Ed.), International yearbook of behavior
research, 6. New York: The Free Press, 1965. (In press.)
Allport, G. W. The use of personal documents in psychological science.
New York: Social Science Research Council, 1942.
Amrine, M., & Sanford, F. In the matter of juries, democracy, science,
truth, senators, and bugs. American Psychologist, 1956, 11, 54-60.
Amthauer, R. Ergebnisse einer studie über krankheitsbedingte Fehlzei-
ten. Psychologische Rundschau, 1963, 14, 1-12.
Anastasi, A. Differential psychology. (3rd ed.) New York: Macmillan,
1958.
Andrew, R. J. The origin and evolution of the calls and facial expressions
of the primates. Behavior, 1963, 20, 1-109.
Angell, R. C. The moral integration of American cities. American Journal
of Sociology, 1951, 57, 123-126.
Anonymoi. Hair style as a function of hard-headedness vs. long-hairedness
in psychological research, a study in the personology of science.
Unprepared manuscript, Northwestern Univer. & Univer. of Chi-
cago, 1953-1960.
Anonymous. Z-Frank stresses radio to build big Chevy dealership. Adver-
tising Age, 1962, 33, 83.
Anonymous. Help wanted ads in September hit new high, NICB reports.
Advertising Age, November 2, 1964, 35, 74. (a)
Anonymous. In the eye of the beholder. Sponsor, December 28, 1964, 18,
25-29. (b)
Anonymous. Litter bugged. Advertising Age, November 2, 1964, 35, 74. (c)
Anonymous. Senator Salinger? Newsweek, August 10, 1964, 63, 28. (d)
Anonymous. Civil rights: by the book. Newsweek, March 1, 1965, 65, 37.
Ardrey, R. African genesis. New York: Delta, 1961.
Aronson, E. The need for achievement as measured by graphic expres-
sion. In J. W. Atkinson (Ed.), Motives in fantasy, action, and society.
Princeton: Van Nostrand, 1958. Pp. 249-265.
Arrington, R. Time sampling in studies of social behavior: a critical
review of techniques and results with research suggestions. Psycho-
logical Bulletin, 1943, 40, 81-124.
Arsenian, J. M. Young children in an insecure situation. Journal of
Abnormal and Social Psychology, 1943, 38, 225-249.
Ashley, J. W. Stock prices and changes in earnings and dividends: some
empirical results. Journal of Political Economy, 1962, 70, 82-85.
Athey, K. R., Coleman, J. E., Reitman, A. P., & Tang, J. Two experi-
ments showing the effect of the interviewer's racial background on
responses to questionnaires concerning racial issues. Journal of
Applied Psychology, 1960, 44, 244-246.
Babchuk, N., & Bates, A. P. Professor or producer: the two faces of
academic man. Social Forces, 1962, 40, 341-348.
Back, K. W. The well-informed informant. In R. N. Adams & J. J. Preiss
(Eds.), Human organization research. Homewood, Ill.: Dorsey Press,
1960. Pp. 179-187.
Bain, H. M., & Hecock, D. S. Ballot position and voter's choice: the
arrangement of names on the ballot and its effect on the voter. Detroit:
Wayne State Univer. Press, 1957.
Bain, R. K. The researcher's role: a case study. In R. N. Adams & J. J.
Preiss (Eds.), Human organization research. Homewood, Ill.: Dorsey
Press, 1960. Pp. 140-152.
Bales, R. F. Interaction process analysis. Cambridge: Addison-Wesley,
1950.
Bandura, A. Lecture on imitation learning, Northwestern Univer., 1962.
Bandura, A., & Walters, R. Social learning and personality development.
New York: Holt, Rinehart & Winston, 1963.
Barch, A. M., Trumbo, D., & Nangle, J. Social setting and conformity to a
legal requirement. Journal of Abnormal and Social Psychology, 1957,
55, 396-398.
Barker, R. G., & Wright, H. F. One boy's day: a specimen record of
behavior. New York: Harper & Bros., 1951.
Barker, R. G., & Wright, H. F. Midwest and its children: the psychologi-
cal ecology of an American town. Evanston, Ill.: Row, Peterson,
1954.
Barry, H. Relationships between child training and the pictorial arts.
Journal of Abnormal and Social Psychology, 1957, 54, 380-383.
Barzun, J. The delights of detection. New York: Criterion Books, 1961.
Bass, B. M. Ultimate criteria of organizational worth. Personnel Psy-
chology, 1952, 5, 157-173.
Baxter, J. Chicago shops pay better than N. Y.: big agencies pay more too.
Advertising Age, May 28, 1962, 33, 65.
Baumrind, D. Some thoughts on ethics of research: after reading Mil-
gram's "Behavioral Study of Obedience." American Psychologist,
1964, 19, 421-423.
Becker, H. S. Problems of inference and proof in participant observation.
American Sociological Review, 1958, 23, 652-660.
Belknap, G. M. A method for analyzing legislative behavior. Midwest
Journal of Political Science, 1958, 2, 377-402.
Beloff, J., & Beloff, H. The influence of valence on distance judgments of
human faces. Journal of Abnormal and Social Psychology, 1961, 62,
720-722.
Benney, M., Riesman, D., & Star, S. Age and sex in the interview.
American Journal of Sociology, 1956, 62, 143-152.
Berelson, B. Content analysis in communication research. Glencoe, Ill.:
Free Press, 1952.
Berger, C. S. An experimental study of doodles. Psychological Newsletter,
1954, 6, 138-141.
Berkson, G., & Fitz-Gerald, F. L. Eye fixation aspect of attention to visual
stimuli in infant chimpanzees. Science, 1963, 139, 586-587.
Berlyne, D. E. Emotional aspects of learning. In P. R. Farnsworth, O.
McNemar, & Q. McNemar (Eds.), Annual Review of Psychology,
1964, 15, 115-142.
Bernberg, R. E. Socio-psychological factors in industrial morale: I. the
prediction of specific indicators. Journal of Social Psychology, 1952,
36, 73-82.
Bernstein, E. M. Money and the economic system. Chapel Hill: Univer.
of North Carolina Press, 1935.
Berreman, J. V. M. Factors affecting the sale of modern books of fiction: a
study of social psychology. Unpublished doctoral dissertation, Stan-
ford Univer., 1940.
Binder, A. Statistical theory. In P. R. Farnsworth, O. McNemar, & Q.
McNemar (Eds.), Annual Review of Psychology, 1964, 15, 277-310.
Birdwhistell, R. Kinesics and communication. In E. Carpenter (Ed.),
Exploration in communication. Boston: Beacon Hill, 1960. Pp. 54-64.
Birdwhistell, R. The kinesic level in the investigations of emotions. In P.
Knapp (Ed.), The expression of emotions in man. New York: Interna-
tional Universities Press, 1963. Pp. 123-140.
Blake, R. R., Berkowitz, H., Bellamy, R. Q., & Mouton, J. S. Volunteer-
ing as an avoidance act. Journal of Abnormal and Social Psychology,
1956, 53, 154-156.
Blake, R. R., Mouton, J. S., & Hain, J. D. Social forces in petition signing.
Southwest Social Science Quarterly, 1956, 36, 385-390.
Blau, P. M. The dynamics of bureaucracy. Chicago: Univer. of Chicago
Press, 1955.
Blau, P. The research process in the study of The dynamics of bureauc-
racy. In P. E. Hammond (Ed.), Sociologists at work. New York:
Basic Books, 1964. Pp. 16-49.
Bloch, V. L'étude objective du comportement des spectateurs. Revue
Internationale de Filmologie, 1952, 3, 221-222.
Blomgren, G. W., & Scheuneman, T. W. Psychological resistance to seat
belts. Research Project RR-115, Northwestern Univer., Traffic Insti-
tute, 1961.
Blomgren, G. W., Scheuneman, T. W., & Wilkins, J. L. Effects of
exposure to a safety poster on the frequency of turn signalling. Traffic
Safety, 1963, 7, 15-22.
Boring, E. G. The role of theory in experimental psychology. American
Journal of Psychology, 1953, 66, 169-184. (Reprinted in E. G. Boring,
History, psychology, and science. Ed. R. I. Watson & D. T. Campbell,
New York: Wiley, 1963. Pp. 210-225.)
Boring, E. G. The beginning and growth of measurement in psychology.
Isis, 1961, 52, 238-257. (Reprinted in E. G. Boring, History, psy-
chology, and science. Ed. R. I. Watson & D. T. Campbell, New
York: Wiley, 1963. Pp. 140-158.)
Boring, E. G. History, psychology and science. Ed. R. I. Watson & D. T.
Campbell, New York: Wiley, 1963.
Boring, E. G., & Boring, M. D. Masters and pupils among the American
psychologists. American Journal of Psychology, 1948, 61, 527-534.
(Reprinted in E. G. Boring, History, psychology, and science. Ed. R.
I. Watson & D. T. Campbell, New York: Wiley, 1963. Pp. 132-139.)
Brayfield, A. H., & Crockett, W. H. Employee attitudes and employee
performance. Psychological Bulletin, 1955, 52, 396-424.
Bridgman, P. W. The logic of modern physics. New York: Macmillan,
1927.
Brock, T. C. Communicator-recipient similarity and decision change.
Journal of Personality and Social Psychology, 1965, 1, 650-654.
Brock, T. C., & Guidice, C. D. Stealing and temporal orientation. Journal
of Abnormal and Social Psychology, 1963, 66, 91-94.
Brogden, H., & Taylor, E. The dollar criterion - applying the cost ac-
counting concept to criterion construction. Personnel Psychology,
1950, 3, 133-154.
Brookover, L. A., & Back, K. W. Time sampling as a field technique.
Human Organization, 1965, in press.
Brown, J. W. A new approach to the assessment of psychiatric therapies.
Unpublished manuscript, 1960.
Brown, J. W. The use of the single case study with actuarial and indirect
indices in psychiatric research. Unpublished manuscript, 1961.
Brozek, J. Recent developments in Soviet psychology. In P. R. Farns-
worth, O. McNemar, & Q. McNemar (Eds.), Annual Review of
Psychology, 1964, 15, 493-594.
homogamous and interreligious marriages. Social Forces, 1963, 41,
353-362.
Burchinal, L. G., & Kenkel, W. F. Religious identification and occupa-
tional status of Iowa grooms. American Sociological Review, 1962, 27,
526-532.
Burma, J. H. Self-tattooing among delinquents: a research note. Sociology
and Social Research, 1959, 43, 341-345.
Burwen, R., & Campbell, D. T. The generality of attitudes toward
authority and nonauthority figures. Journal of Abnormal and Social
Psychology, 1957, 54, 24-31.
Callahan, J. D., Morris, J. C., Seifried, S., Ulett, G. A., & Heusler, A. F.
Objective measures in psychopharmacology: baseline observations.
Missouri Medicine, 1960, 57, 714-718.
Campbell, D. T. The informant in quantitative research. American Jour-
nal of Sociology, 1955, 60, 339-342.
Campbell, D. T. Leadership and its effects upon the group. Ohio Studies in
Personnel, Research Monograph 83. Columbus: Ohio State Univer.,
Bureau of Business Research, 1956.
Campbell, D. T. Factors relevant to the validity of experiments in social
settings. Psychological Bulletin, 1957, 54, 297-312.
Campbell, D. T. Systematic error on the part of human links in communi-
cation systems. Information and Control, 1959, 1, 334-369.
Campbell, D. T. Recommendations for APA test standards regarding
construct trait or discriminant validity. American Psychologist, 1960,
15, 546-553.
Campbell, D. T. The mutual methodological relevance of anthropology
and psychology. In F. L. K. Hsu (Ed.), Psychological anthropology
approaches to culture and personality. Homewood, Ill.: Dorsey
Press, 1961. Pp. 333-352.
Campbell, D. T. Administrative experimentation, institutional records
and nonreactive measures. In B. G. Chandler, E. F. Carlson, F.
Bertolaet, C. Byerly, J. Lee, R. Sperber (Eds.), Research Seminar on
Teacher Education, Report on Cooperative Research Project No. G-
011 supported by the Cooperative Research Program of the Office of
Education, U. S. Department of Health, Education, and Welfare,
Northwestern Univer., August, 1963. Pp. 75-120. (Duplicated.) (a)
Campbell, D. T. From description to experimentation: interpreting trends
as quasi-experiments. In C. W. Harris (Ed.), Problems in measuring
Bryan, J. Personal communication, 1965. change. Madison, Wis.: Univer. of Wisconsin Press, 1963. Pp. 212-
Bugental, J. F. T. An investigation of the relationship of the conceptual 242. (b)
matrix to the self-concept. Unpublished doctoral dissertation, Ohio Campbell, D. T. Social attitudes and other acquired behavioral disposi-
State Univer., 1948. tions. In S. Koch (Ed.), Psychology: a study of a science, Vol. 6,
Burchard, W. W. A study of attitudes towards the use of concealed Investigations of man as socius. New York: McGraw-Hill, 1963. Pp.
devices in social science research. Social Forces, 1957, 36, 111. 94-176. (c)
Burchinal, L. G., & Chancellor, L. E. Ages at marriage, occupations of Campbell, D. T. Pattern matching as an essential in distal knowing. In K.
grooms and interreligious marriage rates. Social Forces, 1962, 40, R. Hammond (Ed.), The psychology of Egon Brunswik. New York:
348-354. i
I
i
Holt, Rinehart & Winston, 1965. (a)
Burchinal, L. G., & Chancellor, L. E. Survival rates among religiously

Campbell, D. T. On the use of both pro and con items in attitude scales. Unpublished manuscript, 1965. (b)
Campbell, D. T., & Fiske, D. W. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 1959, 56, 81-105.
Campbell, D. T., Kruskal, W. H., & Wallace, W. P. Seating aggregation as an index of attitude. Sociometry, 1966, in press.
Campbell, D. T., & Mack, R. W. The steepness of interracial boundaries as a function of the locus of social interaction. In preparation.
Campbell, D. T., & McCormack, T. H. Military experience and attitudes toward authority. American Journal of Sociology, 1957, 62, 482-490.
Campbell, D. T., & Stanley, J. C. Experimental and quasi-experimental designs for research on teaching. In N. L. Gage (Ed.), Handbook of research on teaching. Chicago: Rand McNally, 1963. Pp. 171-246.
Cane, V. R., & Heim, A. W. The effects of repeated testing: III. Further experiments and general conclusions. Quarterly Journal of Experimental Psychology, 1950, 2, 182-195.
Cantril, H. Gauging public opinion. Princeton: Princeton Univer. Press, 1944.
Caplow, T., & McGee, R. The academic marketplace. New York: Basic Books, 1958.
Capra, P. C., & Dittes, J. E. Birth order as a selective factor among volunteer subjects. Journal of Abnormal and Social Psychology, 1962, 64, 302.
Carhart, R. The binaural reception of meaningful materials. Unpublished manuscript, Northwestern Univer., 1965. To appear in A. B. Graham (Ed.), Sensorineuro hearing processes and disorders. Boston: Little, Brown, in press.
Carlson, J., Cook, S. W., & Stromberg, E. L. Sex differences in conversation. Journal of Applied Psychology, 1936, 20, 727-735.
Carroll, P. F. Personal communication, 1962.
Chapman, L. J., & Bock, R. D. Components of variance due to acquiescence and content in the F-scale measure of authoritarianism. Psychological Bulletin, 1958, 55, 328-333.
Chapple, E. D. Quantitative analysis of complex organizational systems. Human Organization, 1962, 21, 67-80.
Christensen, H. T. Cultural relativism and premarital sex norms. American Sociological Review, 1960, 25, 31-39.
Clark, K. America's psychologists. Washington, D. C.: American Psychological Association, 1957.
Clark, W. H. A study of some of the factors leading to achievement and creativity with special reference to religious skepticism and belief. Journal of Social Psychology, 1955, 41, 57-69. (Abstracted in M. I. Stein & S. J. Heinze [Eds.], Creativity and the individual. Glencoe, Ill.: Free Press, 1960. Pp. 147-148.)
Cogley, J. Cited in D. McCahill, Parleys to evaluate Catholic status. Chicago Sun-Times, June 8, 1963, 16, 12.
Coleman, J., Katz, E., & Menzel, H. The diffusion of an innovation among physicians. Sociometry, 1957, 20, 253-270.
Coleman, R. P., & Neugarten, B. Social class in the city. In preparation.
Conrad, B. The death of Manolete. Cambridge: Houghton Mifflin, 1958.
Cook, S. W., & Selltiz, C. A multiple-indicator approach to attitude measurement. Psychological Bulletin, 1964, 62, 36-55.
Coombs, C. A theory of data. New York: Wiley, 1963.
Cooper, S. L. Random sampling by telephone: a new and improved method. Journal of Marketing Research, 1964, 1, 45-48.
Couch, A., & Keniston, K. Yeasayers and naysayers: agreeing response set as a personality variable. Journal of Abnormal and Social Psychology, 1960, 60, 151-174.
Couch, A., & Keniston, K. Agreeing response set and social desirability. Journal of Abnormal and Social Psychology, 1961, 62, 175-179.
Cox, C. M. The early mental traits of three hundred geniuses. Stanford, Calif.: Stanford Univer. Press, 1926. (Abstracted in M. I. Stein & S. J. Heinze [Eds.], Creativity and the individual. Glencoe, Ill.: Free Press, 1960. Pp. 128-133.)
Cox, G. H., & Marley, E. The estimation of motility during rest or sleep. Journal of Neurology, Neurosurgery and Psychiatry, 1959, 22, 57-60.
Craddick, R. A. Size of Santa Claus drawings as a function of time before and after Christmas. Journal of Psychological Studies, 1961, 12, 121-125.
Craddick, R. A. Size of witch drawings as a function of time before, on, and after Halloween. American Psychologist, 1962, 17, 307. (Abstr.)
Cratty, J. Conformity behavior as a function of dress and race. Unpublished manuscript, Northwestern Univer., 1962.
Crespi, L. P. The interview effect on polling. Public Opinion Quarterly, 1948, 12, 99-111.
Cronbach, L. J. Response sets and test validity. Educational and Psychological Measurement, 1946, 6, 475-494.
Cronbach, L. J. Proposals leading to analytic treatment of social perception scores. In R. Tagiuri & L. Petrullo (Eds.), Person perception and interpersonal behavior. Stanford, Calif.: Stanford Univer. Press, 1958. Pp. 353-379.
Crowald, R. H. Soviet grave markers indicate how buried rated with regime. El Universal (Mexico City), August 15, 1964, 196, 12.
Dalton, M. Preconceptions and methods in Men Who Manage. In P. E. Hammond (Ed.), Sociologists at work. New York: Basic Books, 1964. Pp. 50-95.
Darwin, C. The expression of the emotions in man and animals. London: Murray, 1872.
Davis, R. C. Physiological responses as a means of evaluating information. In A. D. Biderman & H. Zimmer (Eds.), The manipulation of human behavior. New York: Wiley, 1961. Pp. 142-168.

departmental identification of executives. Sociometry, 1958, 21, 140-144.
DeCharms, R., & Moeller, G. Values expressed in American children's readers: 1800-1950. Journal of Abnormal and Social Psychology, 1962, 64, 136-142.
DeFleur, M. L., & Petranoff, R. M. A televised test of subliminal persuasion. Public Opinion Quarterly, 1959, 23, 168-180.
Dempsey, P. Liberalism-conservatism and party loyalty in the U. S. Senate. Journal of Social Psychology, 1962, 56, 159-170.
Deutsch, M. An experimental study of the effects of cooperation and competition upon group process. Human Relations, 1949, 2, 199-231.
Dexter, L. A. What do congressmen hear? In N. Polsby, R. Dentler, & P. Smith (Eds.), Politics and social life. Boston: Houghton Mifflin, 1963. Pp. 485-495.
Dexter, L. A. Communications-pressure, influence or education? In L. A. Dexter & D. M. White (Eds.), People, society and mass communications. New York: Free Press, 1964. Pp. 394-409.
Diamond, S. Some early uses of the questionnaire. Public Opinion Quarterly, 1963, 27, 528-542.
Digman, J., & Tuttle, D. An interpretation of an election by means of obverse factor analysis. Journal of Social Psychology, 1961, 53, 183-194.
Dittman, A. T., & Wynne, L. C. Linguistic techniques and the analysis of emotionality in interviews. Journal of Abnormal and Social Psychology, 1961, 63, 201-204.
Dollard, J., & Mowrer, O. H. A method for measuring tension in written documents. Journal of Abnormal and Social Psychology, 1947, 42, 3-32.
Donnelly, R. C., Goldstein, J., & Schwartz, R. D. Criminal law. New York: Free Press, 1962.
Doob, L. W. Communication in Africa. New Haven: Yale Univer. Press, 1961.
Dornbusch, S., & Hickman, L. Other-directedness in consumer goods advertising: a test of Riesman's historical theory. Social Forces, 1959, 38, 99-102.
DuBois, C. N. Time Magazine's fingerprints' study. Proceedings: 9th Conference, Advertising Research Foundation. New York: Advertising Research Foundation, 1963.
Duncan, C. P. Personal communication, 1963.
Durand, J. Mortality estimates from Roman tombstone inscriptions. American Journal of Sociology, 1960, 65, 365-373.
Durkheim, E. Suicide. Trans. J. A. Spaulding & G. Simpson, Glencoe, Ill.: Free Press, 1951.
Edwards, A. L. The social desirability variable in personality assessment and research. New York: Dryden Press, 1957.
Edwards, A. L., & Walker, J. N. A note on the Couch and Keniston measure of agreement response set. Journal of Abnormal and Social Psychology, 1961, 62, 173-174.
Ehrle, R. A., & Johnson, B. G. Psychologists and cartoonists. American Psychologist, 1961, 16, 693-695.
Ekelblad, F. A. The statistical method in business. New York: Wiley, 1962.
Ellis, N. R., & Pryer, R. S. Quantification of gross bodily activity in children with severe neuropathology. American Journal of Mental Deficiency, 1959, 63, 1034-1037.
Enciso, J. Design motifs of ancient Mexico. New York: Dover, 1953.
Evan, W. M. Peer-group interaction and organizational socialization: a study of employee turnover. American Sociological Review, 1963, 28, 429-435.
Exline, R. V. Explorations in the process of person perception: visual interaction in relation to competition, sex and affiliation. Journal of Personality, 1963, 31, 1-20.
Exline, R. V. Affective phenomena and the mutual glance: effects of evaluative feedback and social reinforcement upon visual interaction with an interviewer. Technical Report No. 12, Office of Naval Research Contract No. Nonr-2285(02), 1964.
Exline, R. V., Gray, D., & Schuette, D. Visual behavior in a dyad as affected by interview content and sex of respondent. Journal of Personality and Social Psychology, 1965, 1, 201-209.
Exline, R. V., & Winters, L. C. Interpersonal preference and the mutual glance. Technical Report No. 13, Office of Naval Research Contract No. Nonr-2285(02), 1964.
Fairbanks, H. The quantitative differentiation of samples of spoken language. Psychological Monographs, 1944, 56, No. 2 (Whole No. 255), 19-38.
Fantz, R. L. A method for studying depth perception in infants under six months of age. Psychological Record, 1961, 11, 27-32. (a)
Fantz, R. L. The origin of form perception. Scientific American, May 1961, 204, 66-72. (b)
Fantz, R. L. Pattern vision in newborn infants. Science, 1963, 140, 296-297.
Fantz, R. L. Visual experience in infants: decreased attention to familiar patterns relative to novel ones. Science, 1964, 146, 668-670.
Fantz, R. L., Ordy, J. M., & Udelf, M. S. Maturation of pattern vision in infants during the first six months. Journal of Comparative and Physiological Psychology, 1962, 55, 907-917.
Farris, C. D. A method of determining ideological groupings in the Congress. Journal of Politics, 1958, 20, 308-338.
Feshbach, S., & Feshbach, N. Influence of the stimulus object upon the complementary and supplementary projection of fear. Journal of Abnormal and Social Psychology, 1963, 66, 498-502.
Festinger, L., & Katz, D. Research methods in the behavioral sciences. New York: Holt, Rinehart & Winston, 1953.
Fiedler, F. E. The nature of teamwork. Discovery, February 1962.

Fiedler, F. E., Dodge, J. S., Jones, R. E., & Hutchins, E. B. Interrelations among measures of personality adjustment in nonclinical populations. Journal of Abnormal and Social Psychology, 1958, 56, 345-351.
Field, M. Children and films: a study of boys and girls in the cinema. Dunfermline, Fife: Carnegie United Kingdom Trust, 1954.
Fisher, I. The making of index numbers. Boston: Houghton Mifflin, 1923.
Fiske, D. W. Values, theory and the criterion problem. Personnel Psychology, 1951, 4, 93-98.
Flagler, J. M. Profiles: student of the spontaneous. New Yorker, December 10, 1960, 36, 59-92.
Flugel, J. C. Psychology of clothes. London: Hogarth, 1930.
Foote, E. Pupil dilation-new measurement of ad's effectiveness. Advertising Age, March 5, 1962, 33, 12.
Forshufvud, S. Vem mördade Napoleon? Stockholm: A. Bonnier, 1961.
Foshee, J. G. Studies in activity level: I. Simple and complex task performances in defectives. American Journal of Mental Deficiency, 1958, 62, 882-886.
Fowler, E. M. Help-wanted ads show sharp rise. New York Times, May 13, 1962, 111, 1.
Franzen, R. Scaling responses to graded opportunities. Public Opinion Quarterly, 1950, 14, 484-490.
Freed, A., Chandler, P. J., Mouton, J. S., & Blake, R. R. Stimulus and background factors in sign violation. Journal of Personality, 1955, 23, 499. (Abstract.)
Freeman, L. C., & Ataov, T. Invalidity of indirect and direct measures of attitude toward cheating. Journal of Personality, 1960, 28, 443-447.
French, E. G. Some characteristics of achievement motivation. Journal of Experimental Psychology, 1955, 50, 232-236.
French, N. R., Carter, C. W., & Koenig, W. The words and sounds of telephone conversations. Bell System Technical Journal, 1930, 9, 290-324.
Freud, S. Psychopathology of everyday life. London: Unwin, 1920.
Fry, C. L. The religious affiliations of American leaders. Scientific Monthly, 1933, 36, 241-249. (Abstracted in M. I. Stein & S. J. Heinze [Eds.], Creativity and the individual. Glencoe, Ill.: Free Press, 1960. Pp. 148-149.)
Gabriele, C. T. The recording of audience reactions by infrared photography, Technical Report, NAVTRADEVCEN 269-7-56, 1956.
Gage, N. L., & Shimberg, B. Measuring senatorial progressivism. Journal of Abnormal and Social Psychology, 1949, 44, 112-117.
Galton, F. Hereditary genius. New York: D. Appleton, 1870. (Abstracted in M. I. Stein & S. J. Heinze [Eds.], Creativity and the individual. Glencoe, Ill.: Free Press, 1960. Pp. 85-90.)
Galton, F. Statistical inquiries into the efficacy of prayer. Fortnightly Review, 1872, 12, 125-135.
Galton, F. Measurement of character. Fortnightly Review, 1884, 36, 179-185.
Galton, F. The measure of fidget. Nature, 1885, 32, 174-175.
Garner, W. R. Context effects and the validity of loudness scales. Journal of Experimental Psychology, 1954, 48, 218-224.
Garner, W. R., Hake, H. W., & Eriksen, C. W. Operationism and the concept of perception. Psychological Review, 1956, 63, 149-159.
Gearing, F. The response to a cultural precept among migrants from Bronzeville to Hyde Park. Unpublished master's thesis, Univer. of Chicago, June, 1952.
Ghiselli, E. E., & Brown, C. W. Personnel and industrial psychology. (2nd ed.) New York: McGraw-Hill, 1955.
Goncourt, Edmond de. The Goncourt Journals: 1851-1870. Ed. & trans. by Louis Galantière from Journal of Edmond & Jules de Goncourt. New York: Doubleday, Doran, 1937.
Good, C. V., & Scates, D. E. Methods of research, educational, psychological, sociological. New York: Appleton-Century-Crofts, 1954.
Goode, W. J., & Hatt, P. K. Methods in social research. New York: McGraw-Hill, 1952.
Gordon, T. The development of a method of evaluating flying skill. Personnel Psychology, 1950, 3, 71-84.
Gore, P. M., & Rotter, J. B. A personality correlate of social action. Journal of Personality, 1963, 31, 58-64.
Gosnell, H. F. Getting out the vote: an experiment in the stimulation of voting. Chicago: Univer. of Chicago Press, 1927.
Gottschalk, L. A., & Gleser, G. C. An analysis of the verbal content of suicide notes. British Journal of Medical Psychology, 1960, 33, 195-204.
Gould, J. Costello TV's first headless star; only his hands entertain audience, New York Times, March 4, 1951, 100 (34), 1. Cited in I. Doig, Kefauver and crime; the rise of television news and a senator. Unpublished master's thesis. Northwestern Univer., 1962.
Grace, H., & Tandy, M. Delegate communication as an index of group tension. Journal of Social Psychology, 1957, 45, 93-97.
Gratiot-Alphandery, H. L'enfant et le film. Revue Internationale de Filmologie, 1951, 2, 171-172. (a)
Gratiot-Alphandery, H. Jeunes spectateurs. Revue Internationale de Filmologie, 1951, 2, 257-263. (b)
Green, E. Judicial attitudes in sentencing. New York: St. Martin's Press, 1961.
Green, H. B., & Knapp, R. H. Time judgment, aesthetic preference, and need for achievement. Journal of Abnormal and Social Psychology, 1959, 58, 140-142.
Greenhill, L. P. The recording of audience reactions by infrared photography. Technical report from Pennsylvania State Univer. to U.S.
Navy, Special Devices Center, SPECDEVCEN 269-7-56, September 20, 1955, pp. 1-11.
Griffin, J. R. Coia "catch," kicking draw much criticism. Chicago Sun-Times, October 27, 1964, 17, 76.
Griffith, R. M. Odds adjustments by American horse race bettors. American Journal of Psychology, 1949, 62, 290-294.
Grinder, R. E. New techniques for research in children's temptation behavior. Child Development, 1961, 32, 679-688.
Grinder, R. E. Parental child learning practices, conscience, and resistance to temptation of sixth grade children. Child Development, 1962, 33, 803-820.
Grinder, R. E., & McMichael, R. E. Cultural influence on conscience development: resistance to temptation and guilt among Samoans and American caucasians. Journal of Abnormal and Social Psychology, 1963, 66, 503-507.
Grusky, O. Organizational goals and the behavior of informal leaders. American Journal of Sociology, 1959, 65, 59-67.
Grusky, O. The effects of formal structure on managerial recruitment: a study of baseball organization. Sociometry, 1963, 26, 345-353. (a)
Grusky, O. Managerial succession and organizational effectiveness. American Journal of Sociology, 1963, 69, 21-31. (b)
Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1954.
Guilford, J. P. The relation of intellectual factors to creative thinking in science. In C. Taylor (Ed.), The 1955 University of Utah research conference on the identification of creative scientific talent. Salt Lake City: Univer. of Utah Press, 1956. Pp. 69-95.
Guion, R. M. Criterion measurement and personnel judgments. Personnel Psychology, 1961, 14, 141-149.
Gullahorn, J., & Strauss, G. The field worker in union research. In R. N. Adams & J. J. Preiss (Eds.), Human organization research. Homewood, Ill.: Dorsey Press, 1960. Pp. 153-165.
Gump, R. Jade: stone of heaven. New York: Doubleday, 1962.
Gusfield, J. R. Field work reciprocities in studying a social movement. In R. N. Adams & J. J. Preiss (Eds.), Human organization research. Homewood, Ill.: Dorsey Press, 1960. Pp. 99-108.
Hafner, E. M., & Presswood, Susan. Strong inference and weak interactions. Science, 1965, 149, 503-510.
Haggard, E. A., Brekstad, A., & Skard, A. G. On the reliability of the anamnestic interview. Journal of Abnormal and Social Psychology, 1960, 61, 311-318.
Halbwachs, M. Les causes de suicide. Paris: Felix Alcan, 1930.
Hall, E. T. Silent assumption in social communication. Disorders of Communication, 1964, 42, 41-55.
Hall, R. L., & Willerman, B. The educational influence of dormitory roommates. Sociometry, 1963, 26, 294-318.
Hamburger, P. Peeping Funt. New Yorker, January 7, 1950, 25, 72-73.
Hamilton, T. Social optimism in American protestantism. Public Opinion Quarterly, 1942, 6, 280-283.
Hansen, A. H. Cycles of prosperity and depression in the United States. Univer. of Wisconsin Studies in Social Sciences and History. Madison, 1921.
Hanson, N. R. Patterns of discovery. Cambridge: Cambridge Univer. Press, 1958.
Hardy, H. C. Cocktail party acoustics. Journal of the Acoustical Society of America, 1959, 31, 535.
Hartmann, G. W. A field experiment on the comparative effectiveness of "emotional" and "rational" political leaflets in determining election results. Journal of Abnormal and Social Psychology, 1936, 31, 99-114.
Hartshorne, H., & May, M. A. Studies in the nature of character. Vol. 1. Studies in deceit. New York: Macmillan, 1928.
Hartshorne, H., May, M. A., & Maller, J. B. Studies in the nature of character. Vol. 2. Studies in service and self control. New York: Macmillan, 1929.
Hartshorne, H., May, M. A., & Shuttleworth, F. K. Studies in the nature of character. Vol. 3. Studies in the organization of character. New York: Macmillan, 1930.
Harvey, J. The content characteristics of best-selling novels. Public Opinion Quarterly, 1953, 17, 91-114.
Haworth, M. R. An exploratory study to determine the effectiveness of a filmed puppet show as a group projective technique for use with children. Unpublished doctoral dissertation, Pennsylvania State Univer., 1956. University Microfilms, Ann Arbor, Mich. No. 19305.
Helson, H., Blake, R. R., & Mouton, J. S. Petition-signing as adjustment to situational and personal factors. Journal of Social Psychology, 1958, 48, 3-10.
Hemphill, J. K., & Sechrest, L. B. A comparison of three criteria of aircrew effectiveness in combat over Korea. Journal of Applied Psychology, 1952, 36, 323-327.
Henle, M., & Hubble, M. B. "Egocentricity" in adult conversation. Journal of Social Psychology, 1938, 9, 227-234.
Henry, H. Motivation research: its practice and uses for advertising, marketing, and other business purposes. London: Crosby Lockwood, 1958.
Herbinière-Lebert, S. Pourquoi et comment nous avons fait "Mains Blanches": premières expériences avec un film éducatif réalisé spécialement pour les moins de sept ans. Revue Internationale de Filmologie, 1951, 2, 247-255.
Hess, E. H., & Polt, J. M. Pupil size as related to interest value of visual stimuli. Science, 1960, 132, 349-350.
Heusler, A., Ulett, G., & Blasques, J. Noise-level index: an objective measurement of the effect of drugs on the psychomotor activity of patients. Journal of Neuropsychiatry, 1959, 1, 23-25.
Heusler, A. F., Ulett, G. A., & Callahan, J. D. Comparative EEG studies of tranquilizing drugs. Research Laboratories of the St. Louis State
Hospital, St. Louis, Mo. Paper read at Pan-American Medical Congress, Mexico City, May 3, 1960.
Heyns, R., & Lippitt, R. Systematic observational techniques. In G. Lindzey (Ed.), Handbook of social psychology. Vol. 1. Cambridge: Addison-Wesley, 1954. Pp. 370-404.
Hildum, D. C., & Brown, R. W. Verbal reinforcement and interviewer bias. Journal of Abnormal and Social Psychology, 1956, 53, 108-111.
Hillebrandt, R. H. Panel design and time-series analysis. Unpublished master's thesis, Northwestern Univer., 1962.
Holmes, L. D. Ta'u: Stability and change in a Samoan village. Reprint No. 7. Wellington, N.Z.: Polynesian Society, 1958.
Horst, P. Correcting the Kuder-Richardson reliability for dispersion of item difficulties. Psychological Bulletin, 1953, 50, 371-374.
Houseman, E. E., & Lipstein, B. Observation and audit techniques for measuring retail sales. Agricultural Economics, 1960, 12, 61-72.
Hovland, C. I., Lumsdaine, A. A., & Sheffield, F. D. Experiments on mass communication. Princeton: Princeton Univer. Press, 1949.
Howells, L. T., & Becker, S. W. Seating arrangement and leadership emergence. Journal of Abnormal and Social Psychology, 1962, 64, 148-150.
Hughes, E. C. Men and their work. Glencoe, Ill.: Free Press, 1958.
Humphreys, L. G. Note on the multitrait-multimethod matrix. Psychological Bulletin, 1960, 57, 86-88.
Hyman, H. H., Cobb, W. J., Feldman, J. J., Hart, C. W., & Stember, C. H. Interviewing in social research. Chicago: Univer. of Chicago Press, 1954.
Ianni, F. A. Residential and occupational mobility as indices of the acculturation of an ethnic group. Social Forces, 1957-58, 36, 65-72.
Imanishi, K. Social organization of subhuman primates in their natural habitat. Current Anthropology, 1960, 1, 393-407.
Jackson, D. N., & Messick, S. J. A note on "ethnocentrism" and acquiescent response sets. Journal of Abnormal and Social Psychology, 1957, 54, 132-134.
Jacques, E. Measurement of responsibility. Cambridge: Harvard Univer. Press, 1956.
Jaffe, A. J., & Stewart, C. D. Manpower, resources and utilizations. New York: Wiley, 1951.
Jahoda-Lazarsfeld, M., & Zeisel, H. Die Arbeitslosen von Marienbad. Leipzig: Hirzel, 1932.
James, J. A preliminary study of the size determinant in small group interaction. American Sociological Review, 1951, 16, 474-477.
James, R. W. A technique for describing community structure through newspaper analysis. Social Forces, 1958, 37, 102-109.
James, W. The principles of psychology. New York: Holt, 1890.
Janis, I. L., & Hovland, C. I. An overview of persuasibility research. In I. L. Janis & C. I. Hovland (Eds.), Personality and persuasibility. New Haven: Yale Univer. Press, 1959. Pp. 1-26.
Janowitz, M. Inferences about propaganda impact from textual and documentary analysis. In W. E. Daugherty & M. Janowitz (Eds.), A psychological warfare casebook. Baltimore: Johns Hopkins Press, 1958. Pp. 732-735.
Jay, R., & Copes, J. Seniority and criterion measures of job proficiency. Journal of Applied Psychology, 1957, 41, 58-60.
Jecker, J., Maccoby, N., Breitrose, H. S., & Rose, E. D. Teacher accuracy in assessing cognitive visual feedback from students. Journal of Applied Psychology, 1964, 48, 393-397.
Jones, R. W. Progressivism in Illinois communities as measured by library services. Transactions of the Illinois State Academy of Science, 1960, 53, 166-172.
Jones, V. Character development in children: an objective approach. In L. Carmichael (Ed.), Manual of child psychology. New York: Wiley, 1946. Pp. 707-751.
Jung, A. F. Price variations among automobile dealers in Chicago, Illinois. Journal of Business, 1959, 32, 315-326.
Jung, A. F. Prices of Falcon and Corvair cars in Chicago and selected cities. Journal of Business, 1960, 33, 121-126.
Jung, A. F. Impact of the compact cars on new-car prices. Journal of Business, 1961, 34, 167-182.
Jung, A. F. Impact of the compact cars on new-car prices: a reappraisal. Journal of Business, 1962, 35, 70-76.
Jung, A. F. Dealer pricing practices and finance charges for new mobile homes. Journal of Business, 1963, 36, 430-439.
Jung, A. F. Mortgage availability and terms in Florida. Journal of Business, 1964, 37, 274-279.
Kadish, S. On the tactics of police-prosecution oriented critics of the courts. Cornell Law Quarterly, 1964, 49, 436-477.
Kahn, R. L., & Cannell, C. F. The dynamics of interviewing: theory, technique and cases. New York: Wiley, 1957.
Kaminski, G., & Osterkamp, U. Untersuchungen über die Topologie sozialer Handlungsfelder. Zeitschrift für experimentelle und angewandte Psychologie, 1962, 9, 417-451.
Kane, F. Clothing worn by out-patients to interviews. Psychiatric Communications, 1958, 1 (2).
Kane, F. Clothing worn by an out-patient: a case study. Psychiatric Communications, 1959, 2 (2).
Kane, F. The meaning of the form of clothing. Psychiatric Communications, 1962, 5 (1).
Kanfer, F. H. Verbal rate, eyeblink and content in structured psychiatric interviews. Journal of Abnormal and Social Psychology, 1960, 61, 341-347.
Kappel, J. W. Book clubs and the evaluation of books. Public Opinion Quarterly, 1948, 12, 243-252.
Katz, D. Do interviewers bias poll results? Public Opinion Quarterly, 1942, 6, 248-268.

Kavanau, J. L. Behavior: confinement, adaptation, and compulsory re- Krout, M. H. An experimental attempt to determine the significance of
gimes in laboratory studies. Science, 1964, 143, 490. unconscious manual symbolic movements. Journal of General Psy-
Kendall, L. M. The hidden variance: what does it measure? American chology, 1954, 51, 121-152. (a)
Psychologist, 1963, 18, 452. Krout, M. H. An experimental attempt to produce unconscious manual
Kerlinger, F. N. Foundations of behavioral research: educational and symbolic movements. Journal of General Psychology, 1954, 51, 93-
psychological inquiry. New York: Holt, Rinehart & Winston, 1964. 120. (b)
Kimbrell, D. L., & Blake, R. R. Motivational factors in the violation of a prohibition. Journal of Abnormal and Social Psychology, 1958, 56, 132-133.
Kinsey, A. C., Pomeroy, W. B., Martin, C. E., & Gebhard, P. H. Sexual behavior in the human female. Philadelphia: W. B. Saunders, 1953.
Kintz, B. L., Delprato, D. J., Mettee, D. R., Persons, C. E., & Schappe, R. H. The experimenter effect. Psychological Bulletin, 1965, 63, 223-232.
Kirk, P. L. Criminalistics. Science, 1963, 140, 367-370.
Kitsuse, J. I., & Cicourel, A. V. A note on the uses of official statistics. Social Problems, 1963, 11, 131-139.
Knox, J. B. Absenteeism and turnover in an Argentine factory. American Sociological Review, 1961, 26, 424-428.
Kort, F. Predicting Supreme Court decisions mathematically: a quantitative analysis of "right to counsel" cases. American Political Science Review, 1957, 51, 1-12.
Kort, F. Reply to Fisher's "Mathematical Analysis of Supreme Court Decisions." American Political Science Review, 1958, 52, 339-348.
Kramer, E. Judgment of personal characteristics and emotions from nonverbal properties of speech. Psychological Bulletin, 1963, 60, 408-420.
Kramer, E. Elimination of verbal cues in judgments of emotion from voice. Journal of Abnormal and Social Psychology, 1964, 68, 390-396.
Krasner, L. Studies of the conditioning of verbal behavior. Psychological Bulletin, 1958, 55, 148-170.
Kretsinger, E. A. An experimental study of gross bodily movement as an index to audience interest. Speech Monographs, 1952, 19, 244-248.
Kretsinger, E. A. An experimental study of restiveness in preschool educational television audiences. Speech Monographs, 1959, 26, 72-77.
Krislov, S. Amicus curiae brief: from friendship to advocacy. Yale Law Journal, 1963, 72, 694-721.
Krout, M. H. Major aspects of personality. Chicago: College Press, 1933.
Krout, M. H. Further studies on the relation of personality and gestures: a nosological analysis of autistic gestures. Journal of Experimental Psychology, 1937, 20, 279-287.
Krout, M. H. Gestures and attitudes: an experimental study of the verbal equivalents and other characteristics of a selected group of manual autistic gestures. Unpublished doctoral dissertation, Univer. of Chicago, 1951.
Krueger, L. E., & Ramond, C. K. References. In M. Mayer, The intelligent man's guide to sales measures of advertising. New York: Advertising Research Foundation, 1965. Pp. 29-71.
Krugman, H. E. Some applications of pupil measurement. Journal of Marketing Research, 1964, 1, 15-19.
Kuhn, T. The structure of scientific revolutions. Chicago: Univer. of Chicago Press, 1962.
Kupcinet, I. Kup's column. Chicago Sun-Times, March 9, 1965, 18, 46.
Lander, B. Towards an understanding of juvenile delinquency. New York: Columbia Univer. Press, 1954.
Landis, C. National differences in conversation. Journal of Abnormal and Social Psychology, 1927, 21, 354-357.
Landis, C., & Hunt, W. A. The startle pattern. New York: Farrar & Rinehart, 1939.
Landis, M. H., & Burtt, H. E. A study of conversations. Journal of Comparative Psychology, 1924, 4, 81-89.
Lang, K., & Lang, G. E. Decisions for Christ: Billy Graham in New York City. In M. Stein, A. J. Vidich, & D. M. White (Eds.), Identity and anxiety. Glencoe, Ill.: Free Press, 1960. Pp. 415-427.
LaPiere, R. T. Attitudes vs. actions. Social Forces, 1934, 13, 230-237.
Lasswell, H. D. The world attention survey. Public Opinion Quarterly, 1941, 5, 456-462.
Lea, T. The brave bulls. Boston: Little, Brown, 1949.
Lefkowitz, M., Blake, R. R., & Mouton, J. S. Status factors in pedestrian violation of traffic signals. Journal of Abnormal and Social Psychology, 1955, 51, 704-706.
Legget, R. F., & Northwood, T. D. Noise surveys of cocktail parties. Journal of the Acoustical Society of America, 1960, 32, 16-17.
Lehman, H. C., & Witty, P. A. Scientific eminence and church membership. Scientific Monthly, 1931, 36, 544-549. (Abstracted in M. I. Stein, & S. J. Heinze, Creativity and the individual. Glencoe, Ill.: Free Press, 1960. Pp. 149-150.)
Leipold, W. D. Psychological distance in a dyadic interview as a function of introversion-extraversion, anxiety, social desirability and stress. Unpublished doctoral dissertation, Univer. of North Dakota, 1963.
Lenski, G. E., & Leggett, J. C. Caste, class, and deference in the research interview. American Journal of Sociology, 1960, 65, 463-467.
Leroy-Boussion, A. Etude du comportement émotional enfantin au cours de la projection d'un film comique. Revue Internationale de Filmologie, 1954, 5, 105-123.
Lewin, H. S. Hitler youth and the Boy Scouts of America. Human Relations, 1947, 1, 206-227.
Lewis, O. The children of Sanchez. New York: Random House, 1961.
Libby, W. I. Accuracy of radio-carbon dates. Science, 1963, 140, 278-280.
Lippmann, W. The public philosophy. New York: New American Library, 1955.
Lodge, G. T. Pilot stature in relation to cockpit size: a hidden factor in Navy jet aircraft accidents. American Psychologist, 1963, 17, 468. (Abstr.)
Lombroso, C. The man of genius. London: Walter Scott, 1891. (Abstracted in M. I. Stein, & S. J. Heinze, Creativity and the individual. Glencoe, Ill.: Free Press, 1960. Pp. 350-353.)
Loomis, C. P. Political and occupational changes in a Hanoverian village, Germany. Sociometry, 1946, 9, 316-333.
Lucas, D. B., & Britt, S. H. Advertising psychology and research. New York: McGraw-Hill, 1950.
Lucas, D. B., & Britt, S. H. Measuring advertising effectiveness. New York: McGraw-Hill, 1963.
Lustig, N. I. The relationships between demographic characteristics and pro-integration vote of white precincts in a metropolitan southern community. Social Forces, 1962, 40, 205-208.
Lyle, H. M. An experimental study of certain aspects of the electromagnetic movement meter as a criterion to audience attention. Speech Monographs, 1953, 20, 126. (Abstr.)
Mabie, E. A study of the conversation of first-grade pupils during free play periods. Journal of Educational Research, 1931, 24, 135-138.
Mabley, J. Mabley's report. Chicago American, January 22, 1963, 62, 3.
McCarroll, J. R., & Haddon, W. A controlled study of fatal accidents in New York City. Journal of Chronic Diseases, 1961, 15, 811-826.
McCarthy, D. A comparison of children's language in different situations and its relation to personality traits. Journal of Genetic Psychology, 1929, 36, 583-591.
Maccoby, E. E. Developmental psychology. In P. R. Farnsworth, O. McNemar, & Q. McNemar (Eds.), Annual Review of Psychology, 1964, 15, 203-250.
McClelland, D. C. The achieving society. Princeton: Van Nostrand, 1961.
McGranahan, D., & Wayne, I. German and American traits reflected in popular drama. Human Relations, 1948, 1, 429-455.
McGrath, J. E. The influence of positive interpersonal relations on adjustment and effectiveness in rifle teams. Journal of Abnormal and Social Psychology, 1962, 65, 365-375.
McGraw, M., & Molloy, L. B. The pediatric anamnesis: inaccuracies in eliciting developmental data. Child Development, 1941, 12, 255-265.
MacKinney, A. C. What should ratings rate? Personnel, 1960, 37, 75-78.
MacLean, W. R. On the acoustics of cocktail parties. Journal of the Acoustical Society of America, 1959, 31, 79-80.
MacRae, D. The role of the state legislator in Massachusetts. American Sociological Review, 1954, 19, 185-194. (a)
MacRae, D. Some underlying variables in legislative roll call votes. Public Opinion Quarterly, 1954, 18, 191-196. (b)
MacRae, D., & MacRae, E. Legislators' social status and their votes. American Journal of Sociology, 1961, 66, 599-603.
Madge, J. The tools of social science. New York: Doubleday Anchor, 1965.
Mahl, G. Disturbances and silences in the patient's speech in psychotherapy. Journal of Abnormal and Social Psychology, 1956, 53, 1-15.
Maller, J. B. The effect of signing one's name. School and Society, 1930, 31, 882-884.
Manago, B. R. Mad: out of the comics rack and into satire. Add One, 1962, 1, 41-46.
Marsh, R. M. Formal organization and promotion in a pre-industrial society. American Sociological Review, 1961, 26, 547-556.
Martin, P. I call on the Candid Camera man. Saturday Evening Post, May 27, 1961, 234, 26-27.
Matarazzo, J. D. Control of interview behavior. Paper read at American Psychological Association, St. Louis, September, 1962. (a)
Matarazzo, J. D. Prescribed behavior therapy: suggestions from noncontent interview research. In A. J. Bachrach (Ed.), Experimental foundations of clinical psychology. New York: Basic Books, 1962. Pp. 471-509. (b)
Matarazzo, J. D., Weitman, M., Saslow, G., & Wiens, A. N. Interviewer influence on duration of interviewee speech. Journal of Verbal Learning and Verbal Behavior, 1963, 1, 451-458.
Matarazzo, J. D., Wiens, A. N., Saslow, G., Dunham, R. M., & Voas, R. B. Speech durations of astronaut and ground communicator. Science, 1964, 143, 148-150.
Matthews, T. S. The sugar pill. New York: Simon & Schuster, 1957.
May, M. A., & Hartshorne, H. First steps toward a scale for measuring attitude. Journal of Educational Psychology, 1927, 17, 145-162.
Mechanic, D., & Volkart, E. H. Stress, illness behavior and the sick role. American Sociological Review, 1961, 26, 51-58.
Melbin, M. Organization practice and individual behavior: absenteeism among psychiatric aides. American Sociological Review, 1961, 26, 14-23.
Melton, A. W. Some behavior characteristics of museum visitors. Psychological Bulletin, 1933, 30, 720-721. (a)
Melton, A. W. Studies of installation at the Pennsylvania Museum of Art. Museum News, 1933, 11, 508. (b)
Melton, A. W. Problems of installation in museums of art. Studies in museum education. Washington, D.C.: American Association of Museums, 1935.
Melton, A. W. Distribution of attention in galleries in a museum of science and industry. Museum News, 1936, 13, 3, 5-8.
Melton, A. W., Feldman, N. G., & Mason, C. W. Experimental studies of the education of children in a museum of science. Publications of the American Association of Museums, New Series, No. 15, 1936.
Merritt, C. B., & Fowler, R. G. The pecuniary honesty of the public at large. Journal of Abnormal and Social Psychology, 1948, 43, 90-93.
Middleton, R. Fertility values in American magazine fiction: 1916-1956. Public Opinion Quarterly, 1960, 24, 139-143.
Miller, G. A. Population, distance and the circulation of information. American Journal of Psychology, 1947, 60, 276-284.
Mills, F. C. The behavior of prices. New York: National Bureau of Economic Research, 1927.
Mindak, W. A., Neibergs, A., & Anderson, A. Economic effects of the Minneapolis newspaper strike. Journalism Quarterly, 1963, 40, 213-218.
Mitchell, W. C. Index numbers of wholesale prices in the U.S. and foreign countries: I. the making and using of index numbers. Bulletin No. 284. Washington, D.C.: U.S. Department of Labor, Bureau of Labor Statistics, 1921.
Moore, H. T. Laboratory tests of anger, fear, and sex interest. American Journal of Psychology, 1917, 28, 390-395.
Moore, H. T. Further data concerning sex differences. Journal of Abnormal and Social Psychology, 1922, 17, 210-214.
Moore, U., & Callahan, C. Law and learning theory: a study in legal control. New Haven: Yale Law Journal Co., 1943.
Moore, W. E. The exploitability of the "labor force" concept. American Sociological Review, 1953, 18, 68-72.
Morgan, E. M. Basic problems of evidence. New York: Joint Committee on Continuing Legal Education of the American Law Institute and the American Bar Association, 1963.
Morgenstern, O. On the accuracy of economic observations. (2nd ed.) Princeton: Princeton Univer. Press, 1963.
Mosteller, F. Use as evidenced by an examination of wear and tear on selected sets of ESS. In K. Davis et al., A study of the need for a new encyclopedic treatment of the social sciences. Unpublished manuscript, 1955. Pp. 167-174.
Mosteller, F., & Wallace, D. L. Inference in an authorship problem: a comparative study of discrimination methods applied to the authorship of The Federalist Papers. Journal of the American Statistical Association, 1963, 58, 275-309.
Mudgett, B. D. Index numbers. New York: Wiley, 1951.
Murphy, G., & Murphy, L. Soviet life and Soviet psychology. In R. A. Bauer (Ed.), Some views on Soviet psychology. Washington, D.C.: American Psychological Association, 1962. Pp. 253-276.
Murray, E. J., & Cohen, M. Mental illness, milieu therapy, and social organization in ward groups. Journal of Abnormal and Social Psychology, 1959, 58, 48-54.
Nagel, S. Ethnic affiliations and judicial propensities. Journal of Politics, 1962, 24, 92.
Naroll, R. The preliminary index of social development. American Anthropologist, 1956, 58, 687-715.
Naroll, R. Controlling data quality. In Series Research in Social Psychology. Symposia Studies Series, No. 4, September, 1960. Pp. 7-12.
Naroll, R. Two solutions to Galton's problems. Philosophy of Science, 1961, 28, 15-39.
Naroll, R. Data quality control. Glencoe, Ill.: Free Press, 1962.
Naroll, R., & Naroll, F. On bias of exotic data. Man, 1963, 25, 24-26.
NASA Manned Spacecraft Center. Results of the first United States manned orbital space flight, February 20, 1962. Washington, D.C.: U.S. Government Printing Office, 1962. (a)
NASA Manned Spacecraft Center. Results of the second United States manned orbital space flight, May 24, 1962. Washington, D.C.: U.S. Government Printing Office, 1962. (b)
National Advertising Company. Shopping center research study. Bedford Park, Ill.: Author, 1963.
Nixon, H. K. Attention and interest in advertising. Archives of Psychology, 1924, 11, 1-68.
North, R. C., Holsti, O. R., Zaninovich, M. G., & Zinnes, D. A. Content analysis. Evanston, Ill.: Northwestern Univer. Press, 1963.
O'Connor, N. (Ed.) Recent Soviet psychology. Trans. Ruth Kish, R. Crawford, & H. Asher. New York: Pergamon, 1961.
Olson, W. C. The measurement of nervous habits in normal children. Minneapolis: Univer. of Minnesota Press, 1929.
Orne, M. T. The nature of hypnosis: artifact and essence. Journal of Abnormal and Social Psychology, 1959, 58, 277-299.
Orne, M. T. On the social psychology of the psychological experiment: with particular reference to demand characteristics and their implications. American Psychologist, 1962, 17, 776-783.
Orne, M. T., & Evans, F. J. Social control in the psychological experiment: antisocial behavior and hypnosis. Journal of Personality and Social Psychology, 1965, 1, 189-200.
Orne, M. T., & Scheibe, K. E. The contribution of nondeprivation factors in the production of sensory deprivation effects: the psychology of the "panic button." Journal of Abnormal and Social Psychology, 1964, 68, 3-12.
Osgood, C. E. Method and theory in experimental psychology. New York: Oxford Univer. Press, 1953.
Osgood, C. E., & Walker, E. Motivation and language behavior: a content analysis of suicide notes. Journal of Abnormal and Social Psychology, 1959, 59, 58-67.
OSS Assessment Staff. Assessment of men. New York: Rinehart, 1948.
Paisley, W. J. Identifying the unknown communicator in painting, literature and music: the significance of minor encoding habits. Journal of Communication, 1964, 14, 219-237.
Parker, E. B. The effects of television on public library circulation. Public Opinion Quarterly, 1963, 27, 578-589.
Parker, E. B. The impact of a radio book review program on public library circulation. Journal of Broadcasting, 1964, 8, 353-361.
Pearson, K. The life, letters and labours of Francis Galton. Vol. 1. Cambridge: Cambridge Univer. Press, 1914.
Perrine, M., & Wessman, A. W. Disguised public opinion interviewing with small samples. Public Opinion Quarterly, 1954, 18, 92-96.
Pettigrew, T. A profile of the Negro American. Princeton: Van Nostrand, 1964.
Phillips, R. H. Miami goes Latin under Cuban tide. New York Times, March 18, 1962, 111, 85.
Platt, J. R. Strong inference. Science, 1964, 146, 347-353.
Polansky, N., Freeman, W., Horowitz, M., Irwin, L., Paponia, N., Rapaport, D., & Whaley, F. Problems of interpersonal relations in research on groups. Human Relations, 1949, 2, 281-291.
Politz Media Studies. The readers of "The Saturday Evening Post." Philadelphia: Curtis Publishing Co., 1958.
Politz Media Studies. A study of outside transit poster exposure. New York: Alfred Politz, 1959.
Pollack, I., & Pickett, J. M. Cocktail party effect. Journal of the Acoustical Society of America, 1957, 29, 1262.
Pool, Ithiel de Sola (Ed.). Trends in content analysis. Urbana: Univer. of Illinois Press, 1959.
Popper, K. Logik der Forschung. Wien: Springer, 1935.
Popper, K. The logic of scientific discovery. New York: Basic Books, 1959.
Popper, K. Conjectures and refutations. New York: Basic Books, 1962.
Prosser, W. L. Handbook of the law of torts. (3rd ed.) St. Paul: West, 1964.
Pyles, M. K., Stolz, H. R., & Macfarlane, J. W. The accuracy of mothers' reports on birth and developmental data. Child Development, 1935, 6, 165-176.
Quine, W. V. From a logical point of view. Cambridge: Harvard Univer. Press, 1953.
Rashkis, H., & Wallace, A. F. C. The reciprocal effect. Archives of General Psychiatry, 1959, 1, 489-498.
Ray, M. L. Cross-cultural content analysis: its promise and its problems. Unpublished manuscript, Northwestern Univer., 1965.
Reddy, J. Heady thieves find Wheeling their Waterloo. Chicago Sun-Times, February 28, 1965, 18, 66.
Riesman, D. Orbits of tolerance, interviewers and elites. Public Opinion Quarterly, 1956, 20, 49-73.
Riesman, D. Comment on "The State of Communication Research." Public Opinion Quarterly, 1959, 23, 10-13.
Riesman, D., & Ehrlich, J. Age and authority in the interview. Public Opinion Quarterly, 1961, 25, 39-56.
Riesman, D., & Watson, J. The sociability project: a chronicle of frustration and achievement. In P. E. Hammond (Ed.), Sociologists at work. New York: Basic Books, 1964. Pp. 235-321.
Riker, W., & Niemi, D. The stability of coalitions on roll calls in the House of Representatives. American Political Science Review, 1962, 56, 58-65.
Riley, M. W. Sociological research: I. a case approach. New York: Harcourt, Brace & World, 1963.
Robbins, L. C. The accuracy of parental recall of aspects of child development and of child rearing practices. Journal of Abnormal and Social Psychology, 1963, 66, 261-270.
Robins, L. N., Hyman, H., & O'Neal, P. The interaction of social class and deviant behavior. American Sociological Review, 1962, 27, 480-492.
Robinson, D., & Rohde, S. Two experiments with an anti-semitism poll. Journal of Abnormal and Social Psychology, 1946, 41, 136-144.
Robinson, E. S. The behavior of the museum visitor. Publications of the American Association of Museums, New Series, No. 5, 1928.
Roens, B. B. New findings from Scott's special advertising research study. In Proceedings: 7th Annual Conference, Advertising Research Foundation. New York: Advertising Research Foundation, 1961. Pp. 65-70.
Rogow, A. A., & Lasswell, H. D. Power, corruption and rectitude. Englewood Cliffs, N.J.: Prentice-Hall, 1963.
Rorer, L. G. The great response-style myth. Psychological Bulletin, 1965, 63, 129-156.
Rosenbaum, M. E. The effect of stimulus and background factors on the volunteering response. Journal of Abnormal and Social Psychology, 1956, 53, 118-121.
Rosenbaum, M. E., & Blake, R. R. Volunteering as a function of field structure. Journal of Abnormal and Social Psychology, 1955, 50, 193-196.
Rosenthal, A. M. Japan, famous for politeness, has a less courteous side, too. New York Times, February 25, 1962, 111, 20.
Rosenthal, R. On the social psychology of the psychological experiment: the experimenter's hypothesis as unintended determinant of experimental results. American Scientist, 1963, 51, 268-283.
Rosenthal, R. Experimenter outcome-orientation and the results of the psychological experiment. Psychological Bulletin, 1964, 61, 405-412.
Rosenthal, R., & Fode, K. L. Psychology of the scientist: V. three experiments in experimenter bias. Psychological Reports, 1963, 12, 491-511.
Rosenthal, R., & Lawson, R. A longitudinal study of the effects of experimenter bias on the operant learning of laboratory rats. Journal of Psychiatric Research, 1963, 2, 61-72.
Rosenthal, R., Persinger, G. W., Vikan-Kline, L., & Fode, K. L. The effect of early data returns on data subsequently obtained by outcome-biased experimenter. Sociometry, 1963, 26, 487-498.
Ross, H. L. The inaccessible respondent: a note on privacy in city and country. Public Opinion Quarterly, 1963, 27, 269-275.
Ross, H. L., & Campbell, D. T. Time series data in the quasi-experimental analysis of the Connecticut speeding crackdown. Unpublished manuscript, 1965.
Rotter, J. B., Liverant, S., & Crowne, D. P. The growth and extinction of expectancies in chance, controlled and skilled tasks. Journal of Psychology, 1961, 52, 161-177.
Ruesch, J., & Kees, W. Nonverbal communication: notes on the visual perception of human relations. Berkeley: Univer. of California Press, 1956.
Rush, C. A factorial study of sales criteria. Personnel Psychology, 1953, 6, 9-24.
Salzinger, K. A method of analysis of the process of verbal communication between a group of emotionally disturbed adolescents and their friends and relatives. Journal of Social Psychology, 1958, 47, 39-53.
Sawyer, H. G. The meaning of numbers. Speech before the American Association of Advertising Agencies, 1961.
Schachter, S. The psychology of affiliation. Stanford, Calif.: Stanford Univer. Press, 1959.
Schachter, S., & Hall, R. Group derived restraints and audience persuasion. Human Relations, 1952, 5, 397-406.
Schanck, R. L., & Goodman, C. Reactions to propaganda on both sides of a controversial issue. Public Opinion Quarterly, 1939, 3, 107-112.
Schneidman, E. S., & Farberow, N. L. Some comparisons between genuine and simulated suicide notes in terms of Mowrer's concepts of discomfort and relief. Journal of General Psychology, 1957, 56, 251-256.
Schubert, G. Quantitative analysis of judicial behavior. Glencoe, Ill.: Free Press, 1959.
Schubert, G. Judicial decision-making. New York: Free Press, 1963.
Schulman, J. L., Kasper, J. C., & Throne, J. M. Brain damage and behavior. Springfield: W. I. Thomas, 1965.
Schulman, J. L., & Reisman, J. M. An objective measure of hyperactivity. American Journal of Mental Deficiency, 1959, 64, 455-456.
Schwartz, M. S., & Stanton, A. H. A social psychological study of incontinence. Psychiatry, 1950, 13, 399-416.
Schwartz, R. D. Field experimentation in sociolegal research. Journal of Legal Education, 1961, 13, 401-410.
Schwartz, R. D., & Skolnick, J. H. Television and tax compliance. In L. Arons & M. A. May (Eds.), Television and human behavior. New York: Appleton-Century-Crofts, 1962. (a)
Schwartz, R. D., & Skolnick, J. H. Two studies of legal stigma. Social Problems, 1962, 10, 133-142. (b)
Sebald, H. Studying national character through comparative content analysis. Social Forces, 1962, 40, 318-322.
Sechrest, L. Handwriting on the wall: a view of two cultures. Unpublished manuscript, Northwestern Univer., 1965. (a)
Sechrest, L. Situational sampling and contrived situations in the assessment of behavior. Unpublished manuscript, Northwestern Univer., 1965. (Mimeographed.) (b)
Sechrest, L., & Flores, L. The occurrence of a nervous mannerism in two cultures. Journal of Nervous and Mental Disease, in press.
Sechrest, L., Flores, L., & Arellano, L. Social distance and language in bilingual subjects. Unpublished manuscript, Northwestern Univer., 1965.
Sechrest, L., & Wallace, J. Figure drawing and naturally occurring events: elimination of the expansive euphoria hypothesis. Journal of Educational Psychology, 1964, 55, 42-44.
Selltiz, C., Jahoda, M., Deutsch, M., & Cook, S. W. Research methods in social relations. New York: Holt, Rinehart & Winston, 1959.
Severin, D. The predictability of various kinds of criteria. Personnel Psychology, 1952, 5, 93-104.
Shadegg, S. C. How to win an election. New York: Toplinger, 1964.
Shepard, H. R., & Blake, R. R. Changing behavior through cognitive change. Human Organization, 1962, 21, 88-92. Published by the Society for Applied Anthropology.
Shils, E. A. Social inquiry and the autonomy of the individual. In D. Lerner (Ed.), The human meaning of the social sciences. Cleveland: Meridian, 1959. Pp. 114-157.
Siersted, E., & Hansen, H. L. Réaction des petits enfants au cinéma: résumé d'une série d'observations faites au Danemark. Revue Internationale de Filmologie, 1951, 2, 241-245.
Singh, P. H., & Huang, S. C. Some socio-cultural and psychological determinants of advertising in India: a comparative study. Journal of Social Psychology, 1962, 57, 113-121.
Sleeper, C. B. Samplings of leisure-time conversations to find sex differences in drives. Unpublished, cited in G. Murphy & S. L. Murphy, Experimental social psychology. New York: Harper, 1931.
Sletto, R. F. A construction of personality scales by the criterion of internal consistency. Hanover, N.H.: Sociological Press, 1937.
Smedslund, J. Educational psychology. In P. R. Farnsworth, O. McNemar, & Q. McNemar (Eds.), Annual Review of Psychology, 1964, 15, 251-276.
Smith, H. T. A comparison of interview and observation methods of mother behavior. Journal of Abnormal and Social Psychology, 1958, 57, 278-282.
Snyder, E. C. Uncertainty and the Supreme Court's decisions. American Journal of Sociology, 1959, 65, 241-245.
Snyder, R., & Sechrest, L. An experimental study of directive group therapy with defective delinquents. American Journal of Mental Deficiency, 1959, 63, 117-123.
Solley, C. M., & Haigh, G. A. A note to Santa Claus. Topeka Research Papers, The Menninger Foundation, 1957, 18, 4-5.
Solomon, R. L. An extension of control group design. Psychological Bulletin, 1949, 46, 137-150.
Sommer, R. Studies in personal space. Sociometry, 1959, 22, 247-260.
Sommer, R. Personal space. Canadian Architect, 1960, pp. 76-80.
Sommer, R. Leadership and group geography. Sociometry, 1961, 24, 99-110.
Sommer, R. The distance for comfortable conversations: further study. Sociometry, 1962, 25, 111-116.
Spiegel, D. E., & Neuringer, C. Role of dread in suicidal behavior. Journal of Abnormal and Social Psychology, 1963, 66, 507-511.
Stechler, G. Newborn attention as affected by medication during labor. Science, 1964, 144, 315-317.
Stein, M. I., & Heinze, S. J. Creativity and the individual. Glencoe, Ill.: Free Press, 1960.
Steiner, G. A. The people look at television. New York: Knopf, 1963.
Steiner, I. D. Group dynamics. In P. R. Farnsworth, O. McNemar, & Q. McNemar (Eds.), Annual Review of Psychology, 1964, 15, 421-446.
Steiner, I. D., & Field, W. L. Role assignment and interpersonal influence. Journal of Abnormal and Social Psychology, 1960, 61, 239-245.
Stephan, F. F., & McCarthy, P. J. Sampling opinions. New York: Wiley, 1958.
Stern, R. Golk. New York: Criterion Books, 1960.
Stewart, J. Q. Empirical mathematical rules concerning the distinction and equilibrium of population. Geographical Review, 1947, 37, 461-485.
Stoke, S. M., & West, E. D. Sex differences in conversational interests. Journal of Social Psychology, 1931, 2, 120-126.
Stouffer, S. A. Problems in the application of correlation to sociology. Journal of the American Statistical Association, 1934, 29, 52-58. (Reprinted in S. A. Stouffer, Social research to test ideas. Glencoe, Ill.: Free Press, 1962. Pp. 264-270.)
Stouffer, S. A. Social research to test ideas. Glencoe, Ill.: Free Press, 1962. (Reprinted from P. F. Lazarsfeld, Radio and the printed page. New York: Duell, Sloan and Pearce, 1940. Pp. 266-272.)
Stouffer, S. A., Lumsdaine, A. A., Lumsdaine, M. H., Williams, R., Smith, M., Janis, I., Star, S., & Cottrell, L. The American soldier: combat and its aftermath. Vol. 2. Princeton: Princeton Univer. Press, 1949.
Strodtbeck, F. L., & James, R. M. Social process in jury deliberations. Paper read at American Sociological Society, 1955.
Strodtbeck, F. L., James, R. M., & Hawkins, C. Social status in jury deliberations. American Sociological Review, 1957, 22, 713-719.
Strodtbeck, F. L., & Mann, R. D. Sex role differentiation in jury deliberations. Sociometry, 1956, 19, 3-11.
Stuart, I. R. Minorities vs. minorities: cognitive, affective and conative components of Puerto Rican and Negro acceptance and rejection. Journal of Social Psychology, 1963, 59, 93-99.
Sussman, L. Mass political letter writing in America. Public Opinion Quarterly, 1959, 23, 203-212.
Sussman, L. Dear F. D. R. New York: Bedminster, 1963.
Swift, A. L., Jr. The survey of the YMCA of the City of New York. (Limited ed.) New York: Association Press, 1927.
Tannenbaum, P. H., & Noah, J. E. Sportugese: a study of sports page communication. Journalism Quarterly, 1959, 36, 163-170.
Tarde, G. L'Opinion et la foule. Paris: Felix Alcan, 1901.
Terman, L. M. The intelligence quotients of Francis Galton in childhood. American Journal of Psychology, 1917, 28, 209-215.
Thomas, D. S. Some new techniques for studying social behavior. New York: Columbia Univer. Press, 1929.
Thomas, W. I., & Znaniecki, F. The Polish peasant in Europe and America: monograph of an immigrant group. Vol. 1. Chicago: Univer. of Chicago Press, 1918.
Thorndike, E. L. Your City. New York: Harcourt, Brace, 1939.
Thorndike, R. L. Personnel selection. New York: Wiley, 1949.
Toulouse, M. M., & Mourgue, R. Des réactions respiratoires au cours de projections cinématographiques. Revue Internationale de Filmologie, 1948, 2, 77-83.
Trueswell, R. W. A survey of library users' needs and behavior as related to the application of data processing and computer technique. Unpublished doctoral dissertation, Northwestern Univer., 1963.
Turner, W. Dimensions of foreman performance: a factor analysis of criterion measures. Journal of Applied Psychology, 1960, 44, 216-223.
Udy, S. H. Cross-cultural analysis: a case study. In P. E. Hammond (Ed.), Sociologists at work. New York: Basic Books, 1964. Pp. 161-183.
Ulett, G. A., Heusler, A., & Callahan, J. Objective measures in psychopharmacology (methodology). In E. Rothlin (Ed.), Neuro-psychopharmacology, 1961, 2, 401-409.
Ulett, G. A., Heusler, A., Ives-Word, V., Word, T., & Quick, R. Influence of chlordiazepoxide on drug-altered EEG patterns and behavior. Medicina Experimentalis, 1961, 5, 386-390.
Ulmer, S. S. Quantitative processes: some practical and theoretical applications. In Hans W. Baade (Ed.), Jurimetrics. New York: Basic Books, 1963.
Underwood, B. J. Psychological research. New York: Appleton-Century-Crofts, 1957.
Vernon, D. T. A., & Brown, J. The utilization of secondary or less preferred sources of information by persons in potentially stressful situations. Unpublished manuscript, 1963.
Vidich, A. J., & Shapiro, G. A. A comparison of participant observation and survey data. American Sociological Review, 1955, 20, 28-33.
Vincent, C. E. Socioeconomic status and familial variables in mail questionnaire responses. American Journal of Sociology, 1964, 69, 647-653.
Vose, C. E. Caucasians only. Berkeley: Univer. of California Press, 1959.
Walters, R. H., Bowen, Norma V., & Parke, R. D. Experimentally induced disinhibition of sexual responses. Unpublished manuscript, Univer. of Waterloo, 1963. Cited in A. Bandura & R. H. Walters, Social learning and personality development. New York: Holt, Rinehart & Winston, 1964. Pp. 76-79.
Warner, W. L. The living and the dead. New Haven: Yale Univer. Press, 1959.
Warner, W. L., Meeker, M., & Eells, K. Social class in America. Chicago: Science Research Associates, 1949.
Washburne, C. The good and bad in Russian education. New Era, 1928, 9, 8-12.
Watson, J., Breed, W., & Posman, H. A study in urban conversation: sample of 1001 remarks overheard in Manhattan. Journal of Social Psychology, 1948, 28, 121-123.
Watson, R. I. Historical review of objective personality testing: the search for objectivity. In B. M. Bass & I. A. Berg (Eds.), Objective approaches to personality assessment. Princeton: Van Nostrand, 1959. Pp. 1-23.
Wax, R. H. Reciprocity in field work. In R. N. Adams & J. J. Preiss (Eds.), Human organization research. Homewood, Ill.: Dorsey Press, 1960. Pp. 90-98.
Webb, E. J. Men's clothing study. Chicago: Chicago Tribune Co., 1957.
Webb, E. J. How to tell a columnist: I. Columbia Journalism Review, 1962, 1, 23-25. (a)
Webb, E. J. Television programming and the effect of ratings. Paper read at Association for Education in Journalism, Chapel Hill, N.C., 1962. (b)
Webb, E. J. How to tell a columnist: II. Columbia Journalism Review, 1963, 2, 20.
Webb, E. J. The orthographies of seven African languages. In preparation.
Wechsler, H. Community growth, depressive disorders and suicide. American Journal of Sociology, 1961, 67, 9-16.
Weir, R. H. Language in the crib. The Hague: Mouton, 1963.
Weiss, D. J., & Dawis, R. V. An objective validation of factual interview data. Journal of Applied Psychology, 1960, 44, 381-385.
Weitz, J. Selecting supervisors with peer ratings. Personnel Psychology, 1958, 11, 25-35.
Werner, H., & Wapner, S. Changes in psychological distance under conditions of danger. Journal of Personality, 1953, 24, 153-167.
West, D. V. In the eye of the beholder. Television Magazine, 1962, 19, 60-63.
Whisler, T. L., & Harper, S. F. Performance appraisal: research and practice. New York: Holt, Rinehart & Winston, 1962.
White, R. K. Hitler, Roosevelt and the nature of war propaganda. Journal of Abnormal and Social Psychology, 1949, 44, 157-174.
Whyte, W. H. The organization man. New York: Simon & Schuster, 1956.
Wigmore, J. H. A student's textbook of the law of evidence. Brooklyn: Foundation Press, 1935.
Wigmore, J. H. The science of judicial proof as given by logic, psychology, and general experience and illustrated in judicial trials. (3rd ed.) Boston: Little, Brown, 1937.
Williams, R. Probability sampling in the field: a case history. Public Opinion Quarterly, 1950, 14, 316-330.
Wilson, E. B. An introduction to scientific research. New York: McGraw-Hill, 1952.
Windle, C. Test-retest effect on personality questionnaires. Educational and Psychological Measurement, 1954, 14, 617-633.
Winick, C. Thoughts and feelings of the general population as expressed in free association typing. The American Imago, 1962, 19, 67-84.
Winship, E. C., & Allport, G. W. Do rosy headlines sell newspapers? Public Opinion Quarterly, 1943, 7, 205-210.
Winston, S. Birth control and the sex-ratio at birth. American Journal of Sociology, 1932, 38, 225-231.
Wolff, C. A psychology of gesture. London: Methuen, 1948.
Wolff, C. The hand in psychological diagnosis. London: Methuen, 1951.
Wolff, W., & Precker, J. A. Expressive movement and the methods of experimental depth psychology. In H. H. Anderson & G. L. Anderson (Eds.), An introduction to projection techniques. New York: Prentice-Hall, 1951. Pp. 457-497.
Wolfson, R. Graphology. In H. H. Anderson & G. L. Anderson (Eds.), An introduction to projection techniques. New York: Prentice-Hall, 1951. Pp. 416-456.
Yates, F. Sampling methods for censuses and surveys. New York: Hafner, 1949.
Yule, G. U., & Kendall, M. G. An introduction to the theory of statistics. (14th ed.) New York: Hafner, 1950.
Zamansky, H. S. A technique for assessing homosexual tendencies. Journal of Personality, 1956, 24, 436-448.
Zamansky, H. S. An investigation of the psychoanalytic theory of paranoid delusions. Journal of Personality, 1958, 26, 410-425.
Zeisel, H. Say it with figures. (4th ed.) New York: Harper, 1957.
Zipf, G. K. Some determinants of the circulation of information. American Journal of Psychology, 1946, 59, 401-421.
Zipf, G. K. Human behavior and the principle of least effort. Cambridge: Addison-Wesley, 1949.
Index

Ability to replicate, 33-34
Access to descriptive cues, 33, 85, 138
Accretion measures, controlled, 44-46
Accretion measures, natural, 38-43
Acquiescent response set, 19-20
Actuarial records, 57-65
Adams, J. S., 160, 161
Advertising Service Guild, 123
Alger, C. F., 89, 126
Allport, G. W., 79, 95, 104
Amrine, M., 150
Amthauer, R., 101
Anastasi, A., 19
Anderson, A., 73, 97
Andrew, R. J., 145
Angell, R. C., 73
Anonymity guarantees, 15
Archival records, 14, 175, 179: advantages of, 53, 87; biases of, 54-55, 84; selective deposit of, 54-55; selective survival of, 54-55
Archives, continuous: bedside records, 81, 82; birth records, 57, 59; book sales, 80, 81; cartoons, 77, 78; city budgets, 73; Congressional Record, 68; death records, 73; directories, 63, 64, 65; divorce records, 59; election statistics, 69, 70; freight shipments, 76; horse race betting, 80; journal articles, 82; judicial voting, 70, 71; legislative voting, 65-67; library withdrawals, 81; magazine fiction, 58; marriage records, 59, 60; moon phases, 72; newspaper circulation, 76, 79, 80; obituaries, 63; park acreage, 73; parking meter collections, 73; patents, 75; photographs, 78, 79; plays, 78; political speeches, 67, 68, 78; population change, 61; power failures, 74, 75; press conferences, 77; sales records, 76; school attendance, 73; society news, 77; songbooks, 78; sports records, 76; tax payments, 70; television repair, 78; tombstones, 53, 54, 55, 61, 62; traffic fatalities, 73; travel rates, 76; water pressure, 73, 74; weather data, 72
Archives, episodic and private: absenteeism, 100, 101, 102; advertising, 97, 98; air travel, 90; alcohol consumption, 89, 90; art, 109; autograph prices, 93; book sales, 92, 97; children's drawings, 109; college grades, 102, 103; courts-martial, 103; desk calendars, 94; diaries, 106; drug adoption, 93, 94; drug sales, 94; economic forecasts, 94, 95; height of pilots, 89; insurance purchases, 90, 91; job promotion, 101; job seniority, 101; job turnover, 100, 101; laundry activity, 104; letters, 104, 105, 106, 107, 108; medical visits, 102, 103; pay increases, 101; peanut sales, 91; re-enlistment rates, 100; seat belt sales, 96; soap consumption, 89; stamp sales, 93; stock sales, 95; suicide notes, 108, 109; tardiness, 102; thievery, 90; ticket sales, 97; union grievances, 102; work production, 99
Ardrey, R., 152
Arellano, Lourdes, 125
Armstrong, R., 80
Arrington, R., 136
Arsenian, J. M., 113
Ashley, J. W., 95
Ataov, T., 164
Athey, K. R., 21
Awareness of being tested, 13-16, 50, 175
Audiotapes, 144, 145, 146, 150, 151, 155

Babchuk, N., 64
Back, K. W., 27, 115
Bain, H. M., 69, 70
Bain, R. K., 113, 114
Bales, R. F., 113
Bandura, A., 156
Barch, A. M., 127
Barker, R. G., 137
Barry, H., 109
Barzun, J., 39
Bass, B. M., 99n
Bates, A. P., 64
Baxter, J., 182n
Baumrind, D., 25
Becker, H. S., vii
Becker, S. W., 125
Bellamy, R. Q., 168
Beloff, H., 125
Beloff, J., 125
Benney, M., 21
Berelson, B., 75n
Berger, C. S., 109
Berkowitz, H., 168
Berkson, G., 134
Berlyne, D. E., 110
Bernberg, R. E., 101
Bernstein, E. M., 7
Berreman, J. V. M., 80, 97
Binder, A., 184
Birdwhistell, R., 120
Blake, R. R., 136, 161, 162, 166, 167, 168
Blasques, J., 145
Blau, P., 14
Bloch, V., 153
Blomgren, G. W., 96, 127
Bock, R. D., 19
Body movement, 119-123
Boring, E. G., 64, 84, 95, 144, 147, 174
Boring, M. D., 64
Bowen, Norma V., 147
Brayfield, A. H., 102
Breed, W., 132
Brekstad, A., 178
Britt, S. H., 31, 96
Brock, T. C., 159, 165
Brogden, H., 99n
Brookover, L. A., 27
Brown, J. W., 38, 81, 89
Brown, R. W., 30
Brozek, J., 82
Bryan, J., 144
Bugental, J. F. T., 68
Bullfighters, v, 115, 116
Burchard, W. W., 150
Burchinal, L. G., 59
Burma, J. H., 116
Burtt, H. E., 129, 130
Burwen, R., 28

Callahan, C., 162
Callahan, J. D., 145
Campbell, D. T., 3, 5, 7, 12, 13, 14, 15, 20, 23, 28, 73, 100, 103, 123, 174
Cane, V. R., 19
Cannell, C. F., 21, 177n
Cantril, H., 18, 21
Caplow, T., 182n
Capra, P. C., 25
Carhart, R., 150
Carlson, J., 131
Carroll, P. F., 156
Carter, C. W., 130
Chancellor, L. E., 59
Chandler, P. J., 161
Chapman, L. J., 19
Chapple, E. D., 145
Cheating, 157, 158, 164, 165, 166
Christensen, H. T., 59
Cicourel, A. V., 23
Clark, K., 65
Clark, W. H., 64
Cobb, W. J., 21, 177n
"Cocktail-party effect," 150
Cogley, J., 68
Coleman, J., 93, 94
Coleman, J. E., 21
Coleman, R. P., 77
Complementary measurement, 1, 3, 5, 89, 178, 179
Conrad, B., 115, 116
Content: restrictions on, 31, 32, 50, 54, 57, 85, 105, 129, 133, 142, 146, 149, 176, 177, 178, 181, 182; rigidity of, 30, 31, 86; stability of over area, 31, 32, 51, 86, 181; stability of over time, 31, 51, 86, 179, 180
Conversation sampling, 127-134
Cook, S. W., 5, 13, 22, 112n, 131, 175
Coombs, C., 100
Cooper, S. L., 26
Copes, J., 101
Cottrell, L., 102
Cox, C. M., 104
Cox, G. H., 152, 153
Craddick, R. A., 109, 111
Cratty, J., 162
Crespi, L. P., 18
Criteria issues, 98-100
Cronbach, L. J., 19
Crowald, R. H., 62

Dalton, M., 115
Darwin, C., 119, 120, 144
Data transformation, 46-50, 82-84, 111, 180, 182-183
Dawis, R. V., 178
DeCharms, R., 75, 80, 82
DeFleur, M. L., 97
Delprato, D. J., 23
Demand characteristics of research, 17
Dempsey, P., 66
Deutsch, M., 13, 22, 112n, 113, 175
Dexter, L. A., 106, 108
Dickens, C., 113
Digman, J., 69
Direct mail experiments, 96
Disc recordings, 146
Dittman, A. T., 128
Dodge, J. S., 103
Dollard, J., 104
Donnelly, R. C., 179
Doob, L. W., 116, 133
Dornbusch, S., 98
"Dragging," 163
Dross rate, 32-33, 51, 105, 107, 133, 141, 155, 156, 169, 170, 177, 180
DuBois, C. N., 40, 44
Duncan, C. P., 37
Dunham, R. M., 30, 145
Durand, J., 53, 62
Durkheim, E., 61, 72

Eastman, M., 183
Eells, K., 118
Ehrle, R. A., 77
Ehrlich, J., 21
Ekelblad, F. A., 8
"Electric" eyes, 154, 155
Ellis, N. R., 155
Enciso, J., 50
Entrapment, 164-166
Eriksen, C. W., 3
Erosion measures, controlled, 43, 44
Erosion measures, natural, 36-38
Error: from investigator, 21-23, 113, 138, 139, 142, 147n; from respondent, 13-21, 178
Ethics of unobtrusive research, v-vi, 150
Evan, W. M., 100
Evans, F. J., 17
Exline, R. V., 148
"Exotic data" bias, 114
Experimental design, 6
Experimenter error, 23
"Experting" error, 17-18
Expressive movement, 119-123
Exterior physical signs, 115-119

Facial expressions, 119, 120, 122, 149
Fairbanks, H., 130
Fantz, R. L., 159
Farberow, N. L., 109
Farris, C. D., 66
Feldman, J. J., 21, 177n
Feldman, N. G., 20
Feshbach, N., 123, 124
Feshbach, S., 123, 124
Festinger, L., 112n
Fidget measures, 152, 153
Fiedler, F. E., 103
Field, M., 153
Field, W. L., 145
Film recordings, 146, 147, 164
Fisher, I., 8
Fiske, D. W., 3, 5, 28, 99n, 103, 174
Fitz-Gerald, F. L., 134
Flagler, J. M., 156
Flores, L., 122, 125
Flugel, J. C., 117
Fode, K. L., 23
Foote, E., 148
Forshufvud, S., 39
Foshee, J. G., 153
Fowler, E. M., 97
Fowler, R. G., 164, 165
Franzen, R., 160
Freed, A., 161
Freeman, L. C., 164
Freeman, W., 113
French, E. G., 15
French, N. R., 130
Freud, S., 118, 128
Fry, C. L., 64
Funt, A., 156

Gabriele, C. T., 153
Gage, N. L., 66
Galton, F., 49, 60, 149n, 150n, 151, 152
Garner, W. R., 3
Gearing, F., 117
Gebhard, P. H., 42
Ghiselli, E. E., 31, 99, 100
Goldstein, J., 179
Goncourt, Edmond de, 137n
Good, C. V., 112n
Goode, W. J., 112n
Goodman, C., 12, 18
Gordon, T., 99n
Gore, P. M., 168
Gosnell, H. F., 69
Gottschalk, L. A., 109
Gould, J., 121
Government records, 65-75
Grace, H., 68
Graham, Billy, 115
Gratiot-Alphandery, H., 124
Green, E., 71
Green, H. B., 118
Greenhill, L. P., 153
Griffin, J. R., 122
Griffith, R. M., 80, 95
Grinder, R. E., 165
Grusky, O., 76, 126
Guidice, C. D., 165
Guilford, J. P., 7, 101
Guinea pig effect, 13-16
Guion, R. M., 98, 99
Gullahorn, J., 114
Gump, R., 148
Gusfield, J. R., 114
Guttman scaling, 158

Haddon, W., 140
Hafner, E. M., 9
Haggard, E. A., 178
Haigh, G. A., 109, 111
Hain, J. D., 166
Hake, H. W., 3
Halbwachs, M., 22
Hall, R. L., 102, 167
Hamburger, P., 156
Hamilton, T., 95
Hansen, A. H., 8
Hansen, H. L., 153-154
Hanson, N. R., 10
Hardware: for avoiding human instrument error, 142-149; for physically supplanting observer, 149-155
Hardy, H. C., 150
Harper, S. F., 31, 99, 101
Hart, C. W., 21, 177n
Hartshorne, H., 157, 158, 159
Harvey, J., 80
Hatt, P. K., 112n
Hawkins, C., 150
Hecock, D. S., 69, 70
Heim, A. W., 19
Heinze, S. J., 60
Helson, H., 166
Henle, M., 131
Henry, H., 95, 96
Herbiniere-Lebert, S., 124
Hess, E. H., 148
Heusler, A. F., 145
Heyns, R., 176n
Hickman, L., 98
Hildum, D. C., 30
Hillebrandt, R. H., 90
Holmes, L. D., 21
Holsti, O. R., 75n
Honesty, 157, 158, 164, 165, 166
Horowitz, M., 113
Horst, P., 158
Hovland, C. I., 12, 172n
Howells, L. T., 125
Huang, S. C., 98
Hubble, M. B., 131
Hughes, E. C., 41
Humphreys, L. G., 3
Hunt, W. A., 149
Hutchins, E. B., 103
Hyman, H. H., 21, 177n

Ianni, F. A., 82
Imanishi, K., 123
Imperfection of all measures, 3-4, 172
Inclination measure, 151
Index correlations, 7
Index numbers, 6-8, 47-50, 82, 83, 182-183
Industrial and institutional records, 98-104
Informants, 114, 115
Infrared photography, 153, 154
Interaction chronograph, 145
Interpretable comparisons and plausible rival hypotheses, 5-10
Interviewer effects, 21-22, 50, 87, 113, 114, 115
Interviews, 1, 3, 11, 21, 22, 30, 32, 33, 34, 53, 98, 138, 144, 154, 172-173, 176, 178, 180
Invalidity of measures: sources of, 12-27
Irwin, L., 113
Ives-Word, V., 145

Jackson, D. N., 19
Jacques, E., 135
Jaffee, A. J., 6
Jahoda, M., 13, 22, 96, 112n, 175
Jahoda-Lazarsfeld, M., 96
James, J., 32
James, R. M., 150
James, R. W., 77
James, W., 144
Janis, I. L., 102, 172n
Janowitz, M., 106
Jay, R., 101
Jokes, 115
Jones, R. E., 103
Jones, R. W., 81
Jones, V., 164
Jung, A. F., 159

Kadish, S., 14
Kahn, R. L., 21, 177n
Kaminski, G., 125
Kane, F., 118
Kappel, J. W., 80
Kasper, J. C., 43
Katz, D., 21, 112n
Katz, E., 93, 94
Kees, W., 120
Kendall, L. M., 5
Kendall, M. G., 8
Kenkel, W. F., 59
Kerlinger, F. N., 112n
Kinesics, 120
Kinsey, A. C., 42
Kintz, B. L., 23
Kitsuse, J. I., 23
Knapp, R. H., 118
Knox, J. B., 100
Koenig, W., 130
Kort, F., 70
Kramer, E., 128
Krasner, L., 30
Kretsinger, E. A., 152
Krislov, S., 71
Krout, M. H., 120, 122, 149
Krueger, L. E., 95
Krugman, H. E., 148
Kruskal, W. H., 123
Kuhn, T., 10
Kupcinet, I., 108

Landis, C., 130, 149
Landis, M. H., 129, 130
Lang, G. E., 115
Lang, K., 115
Language behavior, 127-134
LaPiere, R. T., 160
Lasswell, H. D., 62, 78
Lawson, R., 23
Lea, T., 116
Leacock, S., 115
Lefkowitz, M., 162
Legget, R. F., 150
Leggett, J. C., 21
Lehman, H. C., 64
Leipold, W. D., 125
Lenski, G. E., 21
Leroy-Boussion, A., 153
Lewin, H. S., 78
Lewis, O., 59
Libby, W. I., 56
Lippitt, R., 176n
Lippmann, W., 86
Location, physical, 123-127
Lodge, G. T., 88
Lombroso, C., 72
Lucas, D. B., 31, 96
Lumsdaine, A. A., 12, 102
Lumsdaine, M. H., 102
Lustig, N. I., 69
Lyle, H. M., 152

Mabie, E., 131
Mabley, J., 74
Maccoby, E. E., 155
McCarroll, J. R., 140
McCarthy, D., 131
McCarthy, P. J., 24
McClelland, D. C., 40, 135
McCormack, T. H., 5, 15
Macfarlane, J. W., 178
McGee, R., 182n
McGranahan, D., 78
McGrath, J. E., 102
McGraw, M., 178
Mack, R. W., 14
MacKinney, A. C., 99n
MacLean, W. R., 150
McMichael, R. E., 165
MacRae, D., 66, 67, 83, 118
MacRae, E., 118
Madge, J., 112n
Mahl, G., 128
Maller, J. B., 157
Manago, B. R., 115
Mann, R. D., 150
Marley, E., 152, 153
Marsh, R. M., 65
Martin, C. E., 42
Martin, P., 156
Mason, C. W., 20
Mass media records, 75-82
Matarazzo, J. D., 30, 145, 146
Matthews, T. S., 78
May, M. A., 157, 158, 159
Measurement, absolute, 5-6
Measurement as change agent, 18-19, 50, 175
Measurement as comparison, 5-10, 172, 181
Mechanic, D., 103
Meeker, M., 118
Melbin, M., 100
Melton, A. W., 20, 37, 134
Menzel, H., 93, 94
Merritt, C. B., 164, 165
Messick, S. J., 19
Mettee, D. R., 23
Middleton, R., 58, 77
Miller, G. A., 76
Mills, F. C., 8
Mindak, W. A., 73, 97
Mitchell, W. C., 8
Moeller, G., 75, 80, 82
Molloy, L. B., 178
Moore, H. T., 129, 134
Moore, U., 162
Moore, W. E., 6
Morgan, E. M., 179
Morgenstern, O., 56
Morris, J. C., 145
Mosteller, F., 38, 111
Mourgue, R., 154
Mouton, J. S., 161, 162, 166, 168
Movement, expressive, 119-123
Mowrer, O. H., 104
Mudgett, B. D., 8
Multiple operationism, 1-5, 34, 53, 98, 174
Murphy, G., 120
Murphy, L., 120

Nagel, S., 70
Nangle, J., 127
Naroll, F., 114
Naroll, R., 40, 50, 54, 55, 56, 114
NASA, 145
National Advertising Company, 96
Neibergs, A., 73, 97
Neugarten, B., 77
Neuringer, C., 108
Niemi, D., 66, 83
Noah, J. E., 79
Noise-level index, 145
North, R. C., 75n
Northwood, T. D., 150

Observation, participant, 114, 115, 132
Observation, simple, 112-141, 179: conversation sampling, 127-134; expressive movement, 119-123; exterior physical signs, 115-119; physical location, 123-127; time duration, 134, 135
Observation measures, simple: aggregation, 123, 124; athletes' movement, 123; bullfighter's beard, v, 115, 116; calluses, 119; chair position, 124; clothing, 117, 118; conversations, 128-134; eye movement, 122; facial expression, 119, 120, 122, 149; frowning, 119, 120; gait, 120; grimaces, 122; hair erection, 120; hair length, 116, 117; hand movement, 120, 121; handprints, 121; houses, 118; jewelry, 118; jokes, 115; language of the mass media, 119; leg jiggling, 122; physical distance, 123, 124, 125; physical position, 125, 126, 127; pupil dilation, 120; scars, 116; shoe styles, 117; store names, 119; street signs, 119; tattoos, 116; teeth uncovering, 120; telephone speech, 130; time duration, 134, 135; tribal markings, 116; turn signalling, 127; visual duration, 134; window signs, 119
Observer, capacity weakness of, 143, 144
Observer, intervening, 155-164
Observers, visible, 113, 114
O'Connor, N., 82
Olson, W. C., 137
Operating ease and validity checks, 32-34
Operational definition as mistaken concept, 4
Operationism and multiple operations, 3-5, 172
Ordy, J. M., 159
Orne, M. T., 17
Osgood, C. E., 108, 110, 143
OSS Assessment Staff, 117
Osterkamp, U., 125
Outcroppings, measurement of, 27-29, 34

Paisley, W. J., 111n
Paponia, N., 113
Parke, R. D., 147
Parker, E. B., 81, 111
Participant observation, 114, 115, 132
Pearson, K., 150n
Perrine, M., 132
Persinger, G. W., 23
"Personal equation," 95
Personal space, 125
Persons, C. E., 23
Petitions, 166-168
Phillips, R. H., 119
Photoelectric cells, 155, 156
Photography, 147, 153, 154, 156, 164
Physical location, 123-127
Physical signs, exterior, 115-119
Physical trace measures: "actometer," 43; book wear, 37, 38; door locking, 40; fingerprints, 40, 45; glue seals on pages, 44, 47; hair analysis, 39; litter, 42; noseprints, 45, 46; radio station frequencies, 39; refuse, 41; shoe wear, 43; suits of armor, 40; stair wear, 35; tile wear, 2, 36, 37; toilet inscriptions, 42; urns, 40, 41; weighing food, 38; whiskey bottles, 2, 41, 42; wristwatches, 43
Physical traces, 35-52: advantages of, 50, 51, 52; selective deposit of, 50; selective survival of, 50; weaknesses of, 36, 50, 51
Pickett, J. M., 150
Platt, J. R., 9
Plausible rival hypotheses, 8-10, 174-175
Polansky, N., 113
Political and judicial records, 65-72
Politz Media Studies, 23, 147, 149
Pollack, I., 150
Polt, J. M., 148
Pomeroy, W. B., 42
Pool, I. de Sola, 75n
Popper, K., 10
Population restrictions, 23-26, 51, 85, 105, 107, 108, 133, 138, 169, 173, 177, 182
Population stability: over areas, 27, 51, 139; over time, 26-27, 51, 139, 179-180
Posman, H., 132
"Preamble" effect, 18
Precker, J. A., 120
Presswood, Susan, 9
Probability sampling, 24
Prosser, W. L., 5
Pryer, R. S., 155
Pupil dilation, 120, 148
Pyles, M. K., 178

Questionnaires, 1, 3, 19, 20, 33, 34, 53, 98, 138, 144, 154, 160, 176, 178, 180
Quick, R., 145
Quine, W. V., 10

Ramond, C. K., 95
Rapaport, D., 113
Rashkis, H., 81
Ray, M. L., 78
Reactive measurement effect, 13-23, 50, 53, 86, 113-115, 131, 139, 170, 172
Reddy, J., 123
Reisman, J. M., 43
Reitman, A. P., 21
Replication, 33-34
Research instrument change, 22-23, 87
Response sets, 19-21, 50, 134, 139, 142, 143, 176
Riesman, D., 21, 150, 171
Riker, W., 66, 83
Riley, M. W., 105, 112n, 114
Robbins, L. C., 178
Robinson, D., 21
Robinson, E. S., 20
Roens, B. B., 97
Rogow, A. A., 62
Rohde, S., 21
Role selection, 16-18, 50
Roles, confounding, 17
Rorer, L. G., 19
Rosenbaum, M. E., 167, 168
Rosenthal, A. M., 117
Rosenthal, R., 23, 139
Ross, H. L., 24, 73
Rotter, J. B., 168
Ruesch, J., 120
Rush, C., 99n

Sales records, 90-98
Sales situations, experimental, 159, 160
Salzinger, K., 106
Sampling: areas, 27, 31-32, 138, 169, 182; telephone, 26; time, 26, 31, 58, 135-138, 173, 182; volunteers, 25
Sampling error, 23-27, 29-32
Sanford, F., 150
Saslow, G., 30, 145, 146
Sawyer, H. G., 41
Scates, D. E., 112n
Schachter, S., 102, 167
Schanck, R. L., 12, 18
Schappe, R. H., 23
Scheibe, K. E., 17
Scheuneman, T. W., 96, 127
Schneidman, E. S., 109
Schubert, G., 70, 121
Schulman, J. L., 43
Schwartz, M. S., 103
Schwartz, R. D., 14, 70, 164, 179
Sebald, H., 78
Sechrest, L. B., 40, 42, 100, 109, 111, 118, 121, 122, 123, 125, 127, 148, 162
Seifried, S., 145
Selective deposit of materials, 50, 54, 55, 56
Selective survival of materials, 50, 54, 56, 57
Selltiz, C., 13, 22, 112n, 175
Severin, D., 99n
Shadegg, S. C., 177
Shapiro, G. A., 5
Sheffield, F. D., 12
Shepard, H. R., 136
Shils, E. A., vi, 150
Shimberg, B., 66
Siersted, E., 153, 154
Singh, P. H., 98
Skard, A. G., 178
Skolnick, J. H., 70, 164
Sleeper, C. B., 130
Sletto, R. F., 19
Smith, H. T., 125, 178
Smith, M., 102
Snyder, E. C., 71
Snyder, R., 100
Solley, C. M., 109, 111
Solomon, R. L., 12
Sommer, R., 124
Spiegel, D. E., 108
Stanley, J. C., 12, 13
Stanton, A. H., 103
Star, S., 21, 102
Stechler, G., 159
Stein, M. I., 60
Steiner, G. A., 78
Steiner, I. D., 102, 145
Stember, C. H., 21, 177n
Stephan, F. F., 24
Stern, R., 63
Stewart, C. D., 6
Stewart, J. Q., 76
Stoke, S. M., 130
Stolz, H. R., 178
Stouffer, S. A., 7, 102
Strauss, G., 114
Strodtbeck, F. L., 150
Stromberg, E. L., 131
Stuart, I. R., 102, 111
Superstitious behavior, 122
Sussman, L., 106

Tandy, M., 68
Tang, J., 21
Tannenbaum, P. H., 79
Tarde, G., 129
Taylor, E., 99n
Telephone sampling, 26
Temperature measure, 154
Terman, L. M., 104-105, 110
Test familiarity, 17
Theory, outcroppings of, 28
Thomas, D. S., 137
Thomas, W. I., 104
Thorndike, E. L., 72, 73, 83
Thorndike, R. L., 31, 99, 101, 103
Throne, J. M., 43
Time duration, 134, 135
Toulouse, M. M., 154
Triangulation of measurement, 3, 174, 179, 181
Trueswell, R. W., 154
Trumbo, D., 127
Turner, W., 99n
Tuttle, D., 69

Udelf, M. S., 159
Udy, S. H., 84, 85
Ulett, G. A., 145
Ulmer, S. S., 70
Underwood, B. J., 15

Validity, external, 11, 12, 172, 173, 180
Validity, internal, 10, 12, 172
Validity, threats to, 12-27
Vending machines, 97
Vernon, D. T. A., 81
Videotape recording, 146, 147, 156
Vidich, A. J., 5
Vikan-Kline, L., 23
Vincent, C. E., 25
Violation of prohibitions, 161, 162, 163
Voas, R. B., 30, 145
Volkart, E. H., 103
Volunteering, 25, 166-168
Vose, C. E., 71

Walker, E., 108, 110
Wallace, A. F. C., 81
Wallace, J., 109, 111
Wallace, D. L., 111
Wallace, W. P., 123
Walters, R., 147
"Waltz technique," 124
Wapner, S., 125
Warner, W. L., 61, 62, 118
Washburne, C., 135
Watson, J., 132, 150
Watson, R. I., 150
Wax, R. H., 114
Wayne, I., 78
Webb, E. J., 68, 80, 117, 118, 143
Wechsler, H., 61
Weir, R. H., 155
Weiss, D. J., 178
Weitman, M., 146
Weitz, J., 101
Werner, H., 125
Wessman, A. W., 132
West, D. V., 148
West, E. D., 130
Whaley, F., 113
Whisler, T. L., 31, 99, 101
White, R. K., 78
Whyte, W. H., 125
Wiens, A. N., 30, 145, 146
Wiggle measures, 151, 152
Wigmore, J. H., 9, 122, 179
Wilkins, J. L., 127
Willerman, B., 102
Williams, R., 24, 102
Wilson, E. B., 4
Windle, C., 19
Winick, C., 128
Winship, E. C., 79, 95
Winston, S., 57
Winters, L. C., 148
Witty, P. A., 64
Wolff, C., 121
Wolff, W., 120
Word, T., 145
Wright, H. F., 137
Written documents, 104-109
Wynne, L. C., 128

Yule, G. U., 8

Zamansky, H. S., 148
Zaninovich, M. G., 75n
Zeisel, H., 8, 96
Zinnes, D. A., 75n
Zipf, G. K., 75, 76, 91
Znaniecki, F., 104

PRINTED IN U.S.A.