
Sage Research Methods

Content Analysis: An Introduction to Its Methodology


Author: Klaus Krippendorff


Pub. Date: 2022
Product: Sage Research Methods
DOI: https://doi.org/10.4135/9781071878781
Methods: Content analysis, Research questions, Sampling
Disciplines: Communication and Media Studies, Marketing
Access Date: October 3, 2024
Publisher: SAGE Publications, Inc.
City: Thousand Oaks
Online ISBN: 9781071878781

© 2022 SAGE Publications, Inc. All Rights Reserved.



The Logic of Content Analysis Designs

As a technique, content analysis relies on several specialized procedures for handling text. These
can be thought of as tools for designing suitable analyses. This chapter outlines the key components
of content analysis and distinguishes among several research designs, especially designs used in
the preparation of content analyses and designs for content analyses that collaborate with other re-
search methods to contribute to larger research efforts.

4.1 Content Analysis Designs

The very idea of research—a repeated search within data for generalizations, patterns that appear to per-
meate the data—presupposes explicitness about methodology. Unless researchers explain clearly what they
have done, how can they expect to be able to replicate their analyses or to process more texts than an in-
dividual can read? Beyond that, how can they convince others that their research was sound and thus their
results should be accepted?

A datum is a unit of information that is recorded in a durable medium, distinguishable from and comparable
with other data, analyzable through the use of clearly delineated techniques, and relevant to a particular prob-
lem. Data are commonly thought of as representing observations or readings, but they are always the prod-
ucts of chosen procedures and geared toward particular ends—in content analysis, data are justified in terms
of the procedures the researcher has chosen to answer specific questions concerning phenomena in the con-
text of given texts. Hence data are made, not found, and researchers are obligated to say how they generated
their data.

The network of steps a researcher takes to conduct a research project is called the research design, and
what knits the procedural steps into the fabric of a coherent research design is the design’s logic. Generally,
this logic concerns two qualities: the efficiency of the procedural steps (avoiding structural redundancies while
preventing “noise” from entering an analysis) and the evenhandedness of data processing (preventing preju-
dices from biasing the data or the favoring of one outcome over another). This logic enables analysts to account
to their scientific community for how the research was conducted. For a research design to be replicable,
not merely understandable, the researcher’s descriptive account of the analysis must be complete enough to
serve as a set of instructions to coders, fellow researchers, and critics—much as a computer program deter-
mines what a machine is to do. Although the thoroughness of a computer program may serve as a scientific
ideal, in social research the best one can hope for is an approximation of that ideal. Content analysts in par-
ticular must cope with a good deal of implicitness in their instructions. (I will return to this topic in subsequent
chapters.)

Traditional guides to research methods tend to insist that all scientific research tests hypotheses concerning
whether or not patterns are evident in the data. Content analysis, however, has to address prior questions
concerning why available texts came into being, what they mean and to whom, how they mediate between
antecedent and consequent conditions, and, ultimately, whether they enable the analysts to select valid an-
swers to questions concerning their contexts. Hence the logic of content analysis designs is justifiable not
only according to accepted standards of scientific data processing (efficiency and evenhandedness) but also
by reference to the context in relation to which texts must be analyzed.

Figure 2.2 contextualizes content analysis and content analysts most comprehensively. It may be seen to
contain Figure 4.1, which represents the simplest content analysis design. Here, the analyst relies solely on
available texts to answer a research question. Although this figure locates texts and results—inputs and out-
puts of the analysis—in a chosen context, it suggests nothing about the nature of the context that justifies the
analysis (discussed in Chapter 3) or about the network of needed analytical steps, which I address below.

Figure 4.1 ■ Content Analysis: Answering Questions Concerning a Context of Texts


4.1.1 Components

Here we open the “content analysis” box in Figure 4.1 and examine the components the analyst needs to pro-
ceed from texts to results. Listing these components is merely a convenient way to partition, conceptualize,
talk about, and evaluate content analysis designs step by step. As accounts of what the components do must al-
so serve as instructions for replicating them elsewhere, each component has a descriptive and an operational
state:

• Unitizing: relying on definitions of relevant units


• Sampling: relying on sampling plans
• Recording/coding: relying on coding instructions
• Reducing data to manageable representations: relying on established statistical techniques or other
methods for summarizing or simplifying data
• Abductively inferring contextual phenomena: relying on established analytical constructs or pre-
sumed models of the chosen context as warrants
• Narrating the answer to the research question: relying on narrative traditions or discursive conven-
tions established within the discipline of the content analyst

Together, the first four components constitute what may be summarily called data making—creating com-
putable data from raw or unedited texts. In the natural sciences, these four are embodied in physical mea-
suring instruments. In the social sciences, the use of mechanical devices is less common—often impossi-
ble—and data making tends to start with observations. The fifth component, abductively inferring contextual
phenomena, is unique to content analysis and goes beyond the representational qualities of data. I describe
each of the components in turn below.

Unitizing draws systematic distinctions within a continuum of otherwise undifferentiated text—documents, im-
ages, voices, videos, websites, and other observables—that are of interest to an analysis, omitting irrelevant
matter but keeping together what cannot be divided without loss of meaning. In Chapter 5, I discuss differ-
ent kinds of units—sampling units, recording units, context units, units of measurement, units of enumera-
tion—and the different analytical purposes they serve. With these diverse uses, unitizing may occur at various
places in a content analysis design. Content analysts must justify their methods of unitizing, and, to do so,
they must show that the information they need for their analyses is represented in the collection of units, not
in what unitizing omits and not in the relationships between the units, which an analysis discards by treating
them as independent of each other.
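
As an illustration only, a unitizing rule of this kind can be written down explicitly. The following minimal Python sketch assumes hypothetical splitting rules (blank lines for paragraphs, simple end punctuation for sentences); an actual coding manual would define its units far more carefully:

```python
import re

def unitize(document: str, level: str = "paragraph") -> list[str]:
    """Split a raw document into recording units at a chosen level.

    The splitting rules here are illustrative stand-ins for whatever
    definitions an actual coding manual would specify.
    """
    if level == "paragraph":
        units = re.split(r"\n\s*\n", document)        # paragraphs separated by blank lines
    elif level == "sentence":
        units = re.split(r"(?<=[.!?])\s+", document)  # naive sentence boundaries
    else:
        raise ValueError(f"unknown unit level: {level}")
    # Drop empty fragments but keep everything else, so that no potentially
    # meaningful material is silently omitted.
    return [u.strip() for u in units if u.strip()]

# The same document yields different unit counts at different levels.
doc = "First claim. Second claim.\n\nA new paragraph with one more claim."
print(len(unitize(doc, "paragraph")))  # 2
print(len(unitize(doc, "sentence")))   # 3
```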

Sampling allows the analyst to economize on research efforts by limiting observations to a manageable sub-
set of units that is statistically or conceptually representative of the set of all conceivable relevant units, the
population or universe of interest. Ideally, an analysis of a whole population and an analysis of a represen-
tative sample of that population should come to the same conclusion. This is possible only if the population
manifests redundant properties that do not need to be repeated in the sample drawn for analysis. But samples
of text do not relate to the issues that interest content analysts in the same way that samples of individuals
relate to populations of individuals of interest in surveys of public opinion, for example. Texts can be read
on several levels—at the level of words, sentences, paragraphs, chapters, or whole publications; as literary
works or discourses; or as concepts, frames, issues, plots, genres—and may have to be sampled accordingly.
Hence creating representative samples for content analyses is far more complex than creating samples for,
say, psychological experiments or consumer research, in which the focus tends to be on one level of units,
typically individual respondents with certain attributes (I discuss the issues involved in sampling for content
analysis in depth in Chapter 6). In qualitative research, samples may not be drawn according to statistical
guidelines, but the selective use of quotes and examples that qualitative researchers present to their readers
are intended to serve the same function as do samples. Quoting typical examples in support of a general
point implies the claim that they are fair representations of the phenomena of concern.
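
A sampling plan, too, can be stated explicitly enough to be replicated. The sketch below assumes a simple random sample drawn from a frame of hypothetical article identifiers; an actual plan would also document the population definition, the sampling frame, and any stratification:

```python
import random

def draw_sample(units, sample_size, seed=42):
    """Draw a simple random sample of sampling units without replacement.

    The fixed seed makes the selection reproducible, which is what allows
    other analysts to reconstruct exactly which units were examined.
    """
    rng = random.Random(seed)
    if sample_size >= len(units):
        return list(units)  # the population is small enough to analyze in full
    return rng.sample(list(units), sample_size)

# Hypothetical usage: 200 articles sampled from a frame of 10,000 identifiers.
article_ids = [f"article-{i:05d}" for i in range(10_000)]
sample = draw_sample(article_ids, 200)
print(len(sample), sample[:3])
```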

Recording/coding bridges the gap between texts and someone’s reading them, between distinct images and
what people see in them, or between separate observations and their situational interpretations. One rea-
son for this analytical component is researchers’ need to create durable and analyzable records of otherwise
transient phenomena, such as spoken words or passing visual events. Once such phenomena are suitably
recorded, analysts can compare them across time, apply different methods to them, and replicate the analy-
ses of other researchers. Written text is always already recorded in this sense, and, as such, it is rereadable.
It has a material base—much like an audiotape, which can be replayed repeatedly—without being in an ana-
lyzable form, however. The second reason for recording/coding is, therefore, content analysts’ need to trans-
form unedited texts, original images, and/or unstructured sounds into analyzable representations. The coding
of text is mostly accomplished through human intelligence. I discuss the processes involved in recording and
coding in Chapter 7, and then, in Chapter 8, I discuss the data languages used to represent the outcomes
of these processes. In content analysis, the scientific preference for mechanical measurements over human
intelligence is evident in the increasing use of computer-aided text analysis (discussed in Chapter 11); the key
hurdle for such text analysis software, not surprisingly, is the difficulty of programming computers to respond to
the meanings of texts. Such software often confuses character strings with what they mean to particular readers
or users of documents.
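
One way to picture the analyzable representations that recording/coding produces is as a table of coded units. The sketch below assumes a hypothetical record structure (unit identifier, coder, assigned category); it is illustrative only, not a prescribed data language:

```python
from dataclasses import dataclass

@dataclass
class CodedUnit:
    """One row of the analyzable record that coding produces."""
    unit_id: str   # which recording unit was coded
    coder: str     # who applied the coding instructions (needed for reliability checks)
    category: str  # the value assigned from a fixed set of categories

records = [
    CodedUnit("doc01-par03", "coder_A", "ECONOMY"),
    CodedUnit("doc01-par03", "coder_B", "ECONOMY"),
    CodedUnit("doc02-par01", "coder_A", "SECURITY"),
]
# Duplicate codings of the same unit by different coders are what later
# reliability assessments compare.
print(len({r.unit_id for r in records}), "units,", len(records), "codings")
```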

Reducing data serves analysts’ need for efficient representations, especially of large volumes of data. A type/
token statistic (a list of types and the frequencies of tokens associated with each), for example, is a more
efficient representation than a tabulation of all occurrences. It merely replaces duplications by a frequency.
Because one representation can be created from the other, nothing is lost. However, in many statistical tech-
niques for aggregating units of analysis—correlation coefficients, parameters of distributions, indices, and
tested hypotheses—information is lost. In qualitative pursuits, rearticulations and summaries have similar ef-
fects: They reduce the diversity of text ideally to what matters.
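
The type/token statistic mentioned above takes only a few lines to compute; the token list here is a toy stand-in for real data:

```python
from collections import Counter

tokens = "the cat saw the dog and the dog saw the cat".split()

# Type/token frequency list: each distinct type with the number of tokens
# that instantiate it. Repeated listings of the same type are replaced by a count.
type_frequencies = Counter(tokens)
print(type_frequencies)  # Counter({'the': 4, 'cat': 2, 'saw': 2, 'dog': 2, 'and': 1})

# A further, lossy reduction: the type-token ratio summarizes the same data
# in a single number from which the frequency list can no longer be recovered.
ttr = len(type_frequencies) / len(tokens)
print(round(ttr, 2))  # 0.45
```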

Abductively inferring contextual phenomena from texts moves an analysis outside the data. It bridges the
gap between descriptive accounts of texts and what they mean, refer to, entail, provoke, or cause. It points
to unobserved phenomena in the context of interest to an analyst. As I have noted in Chapter 2, abductive
inferences—unlike deductive or inductive ones—require warrants, which in turn may be backed by evidence.
In content analysis, such warrants are provided by analytical constructs (discussed in Chapter 9) that are
backed by everything known about the context. Abductive inferences distinguish content analysis from other,
largely inductive modes of inquiry.

Narrating the answers to content analysts’ questions amounts to researchers’ making their results compre-
hensible to others. Sometimes, this means explaining the practical significance of the findings or the contri-
butions they make to available literature. At other times, it means arguing the appropriateness of the use of
content analysis rather than direct observational techniques. It could also entail making recommendations for
actions—legal, practical, or for further research. Narrating the results of a content analysis is a process in-
formed by traditions that analysts believe they share with their audiences or the beneficiaries of their research
(clients, for example). Naturally, most of these traditions are implicit in how social scientists conduct them-
selves. Academic journals may publish formal guidelines for researchers to follow in narrating their results
and let peer reviewers decide whether a given content analysis is sound, interesting, and worthwhile.

The six components of content analysis do not need to be organized as linearly as suggested by Figure 4.2.
A content analysis design may include iterative loops—the repetition of particular processes until a certain
quality is achieved. Or components may recur in various guises. For example, unitizing may precede the sam-
pling of whole documents, but it may also be needed to describe the details of their contents. Thus, coding
instructions may well include unitizing schemes. Moreover, a content analysis could use components that are
not specifically highlighted in Figure 4.2. Decisions, to mention just one analytical action, typically direct the
content analysts along an inferential path with many forks and turns toward one or another answer to the re-
search question. Here, decisions are part of the inference component. Finally, it is important to note that there
is no single “objective” way of flowcharting research designs.

Figure 4.2 ■ Components of Content Analysis

The analyst’s written instructions (represented in boldface type in Figure 4.2), which specify the components
in as much detail as feasible, include all the information the analyst can communicate to other analysts so that
they can replicate the design or evaluate it critically. The traditions of the analyst’s discipline (in medium type
in Figure 4.2) are an exception to the demand for explicitness. Most scientific research takes such traditions
for granted.

Any set of instructions, it must be noted, imposes a structure on the available texts. Ideally, this structure feels
natural, but it may feel inappropriate or forced, if not alien, relative to the analyst’s familiarity with the texts’
context. Take unitizing, for example. Texts may be cut into any kind of units, from single alphabetical char-
acters to whole publications. Unitizing is arbitrary in the abstract but not for a particular content analysis. For example, if an
analyst wants to infer public opinion from newspaper accounts, stories may be more natural for an examina-
tion of what readers think and talk about than, say, value-laden words that occur in these accounts. The use
of inappropriate units leads analysts to experience conceptual troubles. Or an analyst may apply a particular
sampling plan and then discover, perhaps too late, not only that the sampled documents are unevenly rele-
vant but also that the sampling plan has excluded the most significant ones. Finally, in reading given texts, an
analyst may encounter important concepts for which the coding instructions fail to provide suitable categories.
Such a discovery would render the recording/coding task irrelevant or of uncertain validity. During the devel-
opment phase of content analysis design, a sensible analyst should “resist the violence” that poor instructions
can inflict on the texts and attempt to reformulate instructions as needed so that they are appropriate to the
texts at hand. The path that a sensible approach might take is illustrated in Figure 4.2 by the dashed lines,
which show another flow of information that is motivated by the analyst’s resistance to inappropriate analytical
steps. The instructions in good content analysis designs always take such information into account.

A final point regarding Figure 4.2: As noted in Chapter 2, texts are always the observable parts of a chosen
context. The context directs the analysis of a text, and the results of the analysis contribute to a (re)concep-
tualization of the context, redirecting the analysis, and so forth. This reveals the essentially recursive nature
of the process of designing content analyses. This recursion contrasts sharply with the application of a con-
tent analysis design, which is essentially a one-way transformation of available texts into the answers to the
analyst’s research questions. We must therefore distinguish between the development of a content analysis,
during which a design with context-sensitive specificity should emerge, and the execution of a content analy-
sis, during which the design is relatively fixed and ideally replicable, regardless of what the texts could teach
the analyst. Interestingly, the context-sensitive path that the content analyst takes while developing the design
may no longer be recognizable when the finished design is applied to large volumes of text and/or replicated
elsewhere.

4.1.2 Quantitative and Qualitative Content Analysis

In Chapter 2, I noted that quantification cannot be a defining criterion for content analysis. Text is always
qualitative to begin with, categorizing textual units is considered the most elementary form of measurement
(Stevens, 1946), and a content analysis may well result in verbal answers to a research question. Using num-
bers instead of verbal categories or counting instead of listing quotes is merely convenient; it is not a require-
ment for obtaining valid answers to a research question. In Chapter 1, I suggested that the quantitative/qual-
itative distinction is a mistaken dichotomy between the two kinds of justifications of content analysis designs:
the explicitness and objectivity of scientific data processing on one side and the appropriateness of the pro-
cedures used relative to a chosen context on the other. For the analysis of texts, both are indispensable. My
view is the subject of a continuing debate. Proponents of quantification, starting with Berelson and Lazarsfeld
(1948), but most explicitly Lasswell (1949/1965b), equate quantification with science and dismiss qualitative
analyses as mere literature. This view has been rightly criticized for restricting content analysis to numerical
counting exercises (George, 1959b) and for uncritically buying into the measurement theories of the natural
sciences or commercial or political interests in large markets. Proponents of qualitative approaches, who have
come largely from the traditions of political analysis, literary scholarship, ethnography, and cultural studies
(Bernard & Ryan, 1998), have been criticized for being unsystematic in their uses of texts and impressionistic
in their interpretations. Although qualitative researchers compellingly argue that each body of text is unique,
affords multiple interpretations, and needs to be treated accordingly, there is no doubt that the proponents
of both approaches sample text, in the sense of selecting what is relevant; unitize text, in the sense of dis-
tinguishing words, propositions, or larger narrative units and using quotes or examples; contextualize what
they are reading in light of what they know about the circumstances surrounding the texts; and have specific
research questions in mind. Thus, the components of content analysis in Figure 4.2 are undoubtedly present
in qualitative research as well, albeit less explicitly so. I think it is fair to state the following:

• Avowedly qualitative scholars tend to find themselves in a hermeneutic circle, using known literature
to contextualize their readings of given texts, rearticulating the meanings of those texts in view of the
assumed contexts, and allowing research questions and answers to arise together in the course of
their involvement with the given texts. The process of recontextualizing, reinterpreting, and redefining
the research question continues until some kind of satisfactory interpretation is reached (see Figure
4.3). Scholars in this interpretive research tradition acknowledge the open-ended and always tenta-
tive nature of text interpretation. Taking a less extreme position, content analysts are more inclined
to limit such hermeneutic explorations to the development phase of research design.
• Qualitative scholars resist being forced into a particular sequence of analytical steps, such as those
illustrated in Figure 4.2. Acknowledging the holistic qualities of texts, these scholars feel justified in
going back and revising earlier interpretations in light of later readings; they settle for nothing less
than interpretations that do justice to a whole body of texts. As such readings cannot easily be stan-
dardized, this process severely limits the volume of texts that a single researcher can analyze con-
sistently and by uniform standards. Because this process is difficult to describe and to communicate,
qualitative studies tend to be carried out by analysts working alone, and replicability is generally of
little concern. By contrast, faced with larger volumes of text and working in research teams, content
analysts have to divide a body of texts into convenient units, distribute analytical tasks among team
members, and work to ensure the consistent application of analytical procedures and standards. For
these reasons, content analysts have to be more explicit about the steps they follow than qualitative
scholars may need to be.
• Qualitative researchers tend to acknowledge the possibility of multiple interpretations of textual units
by considering diverse voices (readers), alternative perspectives (from different ideological posi-
tions), oppositional readings (critiques), or varied uses of the texts examined (by different groups).
In Figure 2.2 these are referred to as the many worlds of others. This conflicts with the measure-
ment model of the natural sciences—the assignment of unique measures, single values, typically
numbers, to distinct objects—but not with content analysts’ ability to use more than one context for
justifying multiple inferences from texts.
• Qualitative content analysts support their interpretations by weaving quotes from the analyzed texts
and literature about the contexts of these texts into their conclusions, by constructing parallelisms,
by engaging in triangulations, and by elaborating on any metaphors they can identify. Such research
results tend to be compelling for readers who are interested in the contexts of the analyzed texts.
Quantitative content analysts, too, argue for the context sensitivity of their designs (or take this as
understood), but they compel readers to accept their conclusions by assuring them of the careful ap-
plication of their design.
• Committed qualitative researchers tend to apply criteria other than reliability and validity to their re-
sults. It is not clear, however, whether they take this position because intersubjective verification of
their interpretations is extraordinarily difficult to accomplish or whether the criteria they propose are
truly incompatible with making abductive inferences from texts. Among the many alternative criteria
qualitative scholars have advanced are, according to Denzin and Lincoln (2000, p. 13), trustworthi-
ness, credibility, transferability, embodiment, accountability, reflexivity, and emancipatory capabilities.

In other words, qualitative approaches to text interpretation are not incompatible with content analysis. The
recursion (hermeneutic circle) shown in Figure 4.2 is visible in Figure 4.3 as well, although the former figure
provides more details and is limited to the design phase of a content analysis. Multiple interpretations are
not limited to qualitative scholarship either. Content analysts can adopt multiple contexts, pursue multiple re-
search questions, and apply multi-valued codes to the phenomena of interest. Content analysis researchers’
reflexive involvement—systematically ignored in naturalist inquiries, often acknowledged in qualitative schol-
arship—manifests itself in the awareness that they are the ones who construct contexts for their analysis,
acknowledge the worlds of others in the pursuit of their own research questions, and adopt analytical con-
structs based on available literature or prior knowledge about the contexts of given texts. Whether a close but
uncertain reading of small volumes of text is superior to a systematic content analysis of large bodies of text
is undecidable in the abstract.

Figure 4.3 ■ Qualitative Content Analysis

The reason this book appears to rely most heavily on examples involving quantitative content analyses is
that researchers working within this tradition have tended to encourage greater explicitness and transparency
than have qualitative scholars.

4.2 Designs Preparatory to Content Analysis

Generating data—describing what was seen, heard, or read—is relatively easy. Content analyses succeed or
fail, however, with the validity of the analytical constructs that guide their coding instructions and inform their
inferences. Once established, analytical constructs may become applicable to a variety of texts and may be
passed on from one analyst to another, much like a computational theory concerning the stable features of a
context. Below, I discuss three ways of establishing analytical constructs.


4.2.1 Operationalizing Available Knowledge of the Context

Content analysts, by their very ability to read and have an interest in given texts, acknowledge at least cursory
knowledge of their sources: who else reads, appreciates, or uses the texts (note that authors always are the
first readers of what they write); what the texts in question typically mean and to whom; which institutionalized
processes are invoked in publishing and disseminating the texts; and what makes the texts hang together.
Knowledge of this kind, intuitive as it may seem in the beginning, concerns the stable features surrounding
given texts. Figure 4.4 suggests that such knowledge needs to be rearticulated into an inference mechanism.
Without clarifying one’s conceptions of the context of given texts, the analytic procedure employed may not
qualify as a “design.” I provide more specific discussions of this process in Chapter 9, but because the three
preparatory designs all yield the same result, an analytical construct, I present them here for comparison.

Operationalizing available knowledge may be as simple as equating the frequency of co-occurrence of two
categories of text with the strength of the association between the two conceptual categories in an author’s
mind. Other operationalizations include constructing a tagging dictionary for a computer program,
which requires extensive knowledge of language use; formulating an algorithm that formalizes the proposi-
tions found in the message effects literature; and writing a computer program to trace the linguistic entail-
ments of selected political slogans through a body of texts. Operationalizations must be justified, of course,
and available theory, literature, or acknowledged experts may be consulted for this purpose. Ultimately, the
application of any operationalization of analysts’ knowledge to given texts must yield valid inferences.
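
The first operationalization mentioned above, equating the co-occurrence of two conceptual categories within a recording unit with their association, can be sketched as follows. The tagging dictionary and the example units are hypothetical and far smaller than anything a real analysis would justify:

```python
from collections import Counter
from itertools import combinations

# Hypothetical tagging dictionary: surface words mapped to conceptual categories.
tag_dictionary = {
    "economy": "ECONOMY", "jobs": "ECONOMY", "inflation": "ECONOMY",
    "security": "SECURITY", "defense": "SECURITY", "border": "SECURITY",
}

def categories_in(unit: str) -> set[str]:
    """Return the set of conceptual categories tagged in one recording unit."""
    return {tag_dictionary[w] for w in unit.lower().split() if w in tag_dictionary}

def co_occurrences(units: list[str]) -> Counter:
    """Count, over all units, how often each pair of categories appears together."""
    counts = Counter()
    for unit in units:
        for pair in combinations(sorted(categories_in(unit)), 2):
            counts[pair] += 1
    return counts

# Illustrative units, not real data.
units = [
    "inflation threatens jobs and the economy",
    "border security requires more defense spending",
    "jobs depend on border security",
]
print(co_occurrences(units))  # Counter({('ECONOMY', 'SECURITY'): 1})
```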


Figure 4.4 ■ Operationalizing Expert Knowledge

4.2.2 Testing Analytical Constructs as Hypotheses

The most traditional way to establish a valid analytical construct is to test it empirically: develop alternative
hypotheses about relations between textual and contextual variables and let empirical evidence select the most
predictive one. This is how researchers establish psychological tests, validate behavioral indices, and devel-
op predictive models of message effects. Once the correlations between textual and extratextual features are
known, content analysts can use these correlations to infer contextual correlates from given texts—provided
the correlations are sufficiently decisive and generalizable to the current context. This is why we speak of sta-
ble or relatively enduring relations operating in the chosen context. Osgood (1959), for example, conducted
word-association experiments with subjects before building the correlation he found between word co-occur-
rences in text and patterns of recall into his contingency analysis (see also Krippendorff & Bock, 2009, Chap-
ter 3.1). In a carefully executed study, Phillips (1978) established a correlation between reports of suicides of
important celebrities and the fatality rate due to private airplane crashes. He found that the circulation of such
suicide reports did predict an increase in airplane crashes (see also Krippendorff & Bock, 2009, Chapter 2.4).
Whether such an index has practical consequences is another matter.
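
The logic of letting evidence select among candidate constructs can be sketched in a few lines. The indicator names and numbers below are illustrative placeholders, not findings from any study: two textual indicators are correlated with an independently observed contextual variable, and the more predictive one is retained:

```python
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical validation data: for five sources, two candidate textual
# indicators and one independently observed contextual variable.
candidate_indicators = {
    "indicator_A": [0.1, 0.4, 0.2, 0.6, 0.5],
    "indicator_B": [0.3, 0.2, 0.4, 0.3, 0.2],
}
observed_context = [1.0, 2.5, 1.5, 3.0, 2.8]

# Retain the indicator whose correlation with the observed variable is strongest.
best = max(candidate_indicators,
           key=lambda k: abs(pearson(candidate_indicators[k], observed_context)))
print(best)  # indicator_A
```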

To test such statistical hypotheses empirically, data must not only be large enough to support statistically
sound analytical constructs, but they must also be generalizable to the textual matter currently under analy-
sis. It follows that this design applies mainly to repeatedly asked research questions or situations in which
the relations between texts and the answers to these questions concerning their context are stable, not unique
(see Figure 4.5).

Figure 4.5 ■ Testing Analytical Constructs as Hypotheses

4.2.3 Developing Analytical Constructs by Trial and Error

Most traditional content analysis designs develop their analytical constructs from sound background knowl-
edge or theories of the context of texts and apply them to obtain potentially valid inferences of interest.
However, there are many occasions where individual content analysts or larger scholarly communities stay
with an analytical effort for longer periods of time and systematically work on improving the quality of their
inferences. Such efforts may start by making tentative inferences from given texts, comparing them to validat-
ing evidence and using observed discrepancies to alter their analytical construct for subsequently improved
inferences from similar texts. In the absence of validating evidence, the corrective feedback may also come
from misleading decisions to which the inferences lead or from respected critics. The iterative process of this
content analysis design converges toward a “best fit” of the analytical construct and the context within which
the analysts work. This is how intelligent content analysts learn from their failures and improve the quality of
their research over time, as did the Federal Communications Commission propaganda analysts during World
War II, who simply became better analysts as they repeatedly corrected the assumptions of their predictions
(George, 1959a).

A particularly transparent example of developing analytical constructs is the iterative computation of dis-
criminant functions, sometimes called “machine learning” (Alpaydin, 2004). A discriminant function is an ana-
lytical construct, whether a procedure, a decision rule, or an algorithm, which, when applied to a set
of texts, distinguishes among them according to well-defined criteria. In content analysis, these criteria are
tied to the inferences that a researcher wishes to make from texts.

Earl B. Hunt was perhaps the first to develop the methodology of (1) starting with a sample of documents
categorized in ways to be generalized to other documents; (2) using an algorithm that iteratively searches
for the strongest relationships between the presence or absence of textual qualities in these documents and
their categorization; (3) applying the thereby acquired discriminant function to distinguish among documents
whose categories are known to the analyst but withheld from the function; and, if successful, (4) inferring the
categories of other documents. Hunt (Stone & Hunt, 1963) had two sets of suicide notes: one collected by
Shneidman (Shneidman & Farberow, 1957) at the San Francisco Suicide Prevention Center, hence real, and the
other generated by a panel of writers from Osgood’s research group, hence simulated. Hunt relied on Stone’s
(Stone & Hunt, 1963) dictionary-based General Inquirer, which gave him the frequencies of semantically re-
lated groups of words for each note. His discriminant function emerged in the process of examining 15 pairs
of notes, testing numerous hypotheses regarding how the real and simulated notes differed. After obtaining
a good fit, the discriminant function was applied to another set of notes and inferred the psychological states
of the writers correctly in 17 out of 18 cases. This was a remarkable success, achieved without suicide-related precon-
ceptions.
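
A deliberately simple stand-in for such a discriminant function, not Hunt's actual procedure, is a nearest-centroid rule over word-group frequencies. The word groups, labels, and example texts below are invented for illustration and bear no relation to the General Inquirer's categories or to the original notes:

```python
from collections import Counter

# Hypothetical word groups standing in for a dictionary's semantic categories.
word_groups = {
    "pain": "DISTRESS", "tired": "DISTRESS", "sorry": "DISTRESS",
    "love": "AFFECT", "dear": "AFFECT",
    "bank": "PRACTICAL", "keys": "PRACTICAL", "insurance": "PRACTICAL",
}

def profile(text: str) -> Counter:
    """Data making: frequencies of word groups present in one document."""
    return Counter(word_groups[w] for w in text.lower().split() if w in word_groups)

def centroid(profiles):
    """Average group frequencies over a set of documents."""
    total = Counter()
    for p in profiles:
        total.update(p)
    return {g: total[g] / len(profiles) for g in total}

def classify(text, centroids):
    """Assign the category whose centroid the document's profile is closest to."""
    p = profile(text)
    def dist(c):
        keys = set(p) | set(c)
        return sum((p.get(k, 0) - c.get(k, 0)) ** 2 for k in keys)
    return min(centroids, key=lambda label: dist(centroids[label]))

# (1) Documents whose categories are known (illustrative stand-ins only).
training = {
    "category_1": ["sorry i am so tired of the pain", "take care of the insurance dear"],
    "category_2": ["i love you all goodbye", "the keys are at the bank"],
}
# (2) Derive the discriminating construct: here, one centroid per category.
centroids = {label: centroid([profile(t) for t in docs]) for label, docs in training.items()}
# (3)-(4) Applied to further documents, the rule tests the construct when their
# categories are known and yields the inference itself when they are not.
print(classify("so tired and sorry for the pain i caused", centroids))  # category_1
```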

It is important to note that this discriminant function emerged during a “learning” period prior to its application,
and was limited to a very small universe of existing notes with known authenticity. I do not know whether it
found applications outside the scholarly effort. However, the idea of developing an analytical construct by trial
and error before applying it to generate inferences contrasts with the most common practice of developing
analytical constructs from sound preconceptions or theories of the context under consideration. For exam-
ple, Osgood, who had posed the problem of distinguishing real from fake suicide notes and had generated
the simulated notes before Stone and Hunt got hold of his data, applied an analytical construct derived from
a psychological theory to distinguish between these notes. Stone and Hunt reported that Osgood’s content
analysis was far less successful than theirs.


While Hunt’s discriminant function contributed little to understanding the motivation of the authors of these
notes (see Section 9.2.1), the process of developing it became the prototype for formulating algorithms capa-
ble of distinguishing among textual characteristics from which inferences could be justified, now called super-
vised machine learning algorithms (see Section 11.4.2).

Hunt’s design was limited to very small sets of texts, a single research team, and little interest in theories under-
lying the inferences, but the process of developing analytical constructs by trial and error
is quite common, especially in larger contexts and when time for learning is available. Besides the example
of slowly improving predictions from domestic enemy propaganda during World War II, already mentioned,
there are numerous occasions where analytical constructs evolve within research communities facing and
coping with emerging challenges. Efforts to distinguish between good and bad journalistic practices have
been around for a century. Media critics and governmental regulators of the public use of radio frequencies
had a long history of content analyzing media presentations using objectivity, public service, and balanced
reporting as the criteria for such evaluations. The measurability of these criteria came to be challenged by new
forms of communication media and institutions with commercial or political interests in bypassing criticism
and regulations. Today, content analyses that measure the quality of journalism in these three dimensions are
undermined by the very journalistic practices to be judged. The meaningfulness of the concept of objectivity
is challenged when journalists correctly quote lies by politicians. Balanced reporting faces limits when applied
to giving criminals and their victims equal attention. Whether news outlets report their sources is easy to es-
tablish. However, the distinction between trustworthy and unreliable sources is not always easy, and when
it is, reporting their identity may violate their privacy. When reporting on controversies, trusted sources may
well become the target of harassment by opponents when identified by name. Protecting their privacy may
conflict with the value of transparency. In the United States, free speech is protected by the Constitution, but
public responsibility calls for excluding hate speech, the promotion of ethnic prejudices, and the use of slander,
distinctions that are not always easy to draw. Journalists are not mere reporters of facts; good journalistic research needs
to uncover barely noticeable and often deliberately hidden connections between them. Unless evidence of
these connections is unquestionably compelling, opponents can free themselves of apparent wrongdoing by
dismissing them as conspiracy theories or witch hunts. In other words, the analytical constructs that could en-
able the community of media scholars to evaluate journalistic practices have become exceedingly complex
and multidimensional, and they require openness to continuous improvement as new media and rhetorical
strategies are invented that make journalistic accountability difficult.


In all of these examples, content analyses, designed to improve their analytical constructs in the face of fail-
ures, need to be iteratively coupled with validating information from the contexts they are addressing, as
shown in Figure 4.6. Whether this feedback results in computable algorithms (discriminant functions, further
discussed in Section 11.4.3), analytical techniques, or improved abilities of scholarly communities to make
valid or usable inferences from available texts to their contexts of use, it always involves time and clear crite-
ria of success and failure, even when the analytical objectives and contexts are evolving.

Figure 4.6 ■ Developing Discriminant Functions

4.3 Designs Going Beyond Content Analysis

Unfortunately, starting with Berelson’s (1952) account, the content analysis literature is full of insinuations that
content analyses are aimed at testing scientific hypotheses, which brings us back to the notion of content
as something inherent in or indistinguishable from text, a conception we abandoned in Chapter 2. According
to the definition of content analysis employed in this book, content analysts rely on hypothetical generaliza-
tions in the form of analytical constructs. But the proof of these generalizations lies in the validity or useful-
ness of the inferences they support. That proof becomes evident mainly after content analysts have answered their
research questions, made their abductive inferences, or interpreted their texts systematically. For example,
to test a hypothesis concerning the behavioral correlates of anxiety, one must know the level of anxiety and
separately observe the behavioral correlates of interest. By inferring the level of anxiety from an individual’s
talk—from accounts of feelings, distress vocabulary, or speech disturbances (Mahl, 1959)—the content analy-
sis becomes a necessary part of a larger research effort. Despite what Figure 4.1 might suggest, content
analyses do not need to stand alone, and they rarely do. Below, I briefly discuss three research designs in
which content analysis is instrumental.

4.3.1 Comparing Similar Phenomena Inferred From Different Bodies of Texts

In this design, researchers have reasons to draw distinctions within a body of text and apply the same content
analysis to each part (see Figure 4.7). For example, to study speeches made before, during, and after a given
event—or trends—analysts must distinguish texts according to time periods. To compare the treatment of one
event in different media, analysts would have to distinguish texts by their sources. To examine how candi-
dates for a political office tailor their promises to different audiences, analysts would want to distinguish texts
according to audience demographics. And to test hypotheses regarding the impacts of competition between
newspapers on the papers’ journalistic qualities, analysts would want to distinguish texts by how their sources
are situated. What content analysts compare in this design—the hypotheses they test—concerns not differ-
ences among textual properties but differences among the inferences drawn from texts, which are a function
of the assumed context and are not directly observed.


Figure 4.7 ■ Comparing Similar Phenomena Inferred From Different Texts

4.3.2 Testing Relationships Among Phenomena Inferred From One Body of Texts

In this design, the researcher analyzes one body of texts from different perspectives, with reference to differ-
ent contexts, through different analytical constructs, or addressing different dimensions of meaning, and then
correlates the results (see Figure 4.8). In behavioral research, such separately inferred phenomena tend to
appear as different variables, which can be compared, correlated, or subjected to hypothesis testing. On a
micro level, examples of such designs are found in analyses of attributions (multiple adjectives that qualify
nouns), co-occurrences of concepts (inferred from word co-occurrences), KWIC lists (keywords in their tex-
tual contexts), contingencies (Osgood, 1959), and conversational moves (adjacency pairs or triplets). On a
macro level, examples include efforts to understand how public concerns—crime, environment, health, unem-
ployment, and politics—compete with or stimulate each other in the mass media. Such designs also enable
an analyst to compare different readings of the same texts by readers of unlike gender or from divergent so-
cioeconomic, educational, ethnic, or ideological backgrounds. Here, the content analyst would define diverse
contexts in reference to which texts are being read and analyzed.
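
Of the micro-level examples above, a KWIC list is the easiest to make concrete. The sketch below uses invented units and a naive whitespace tokenizer:

```python
def kwic(units, keyword, window=3):
    """Produce a keyword-in-context list: for every occurrence of the keyword,
    show it together with `window` words on either side."""
    lines = []
    for unit in units:
        words = unit.split()
        for i, w in enumerate(words):
            if w.lower() == keyword.lower():
                left = " ".join(words[max(0, i - window):i])
                right = " ".join(words[i + 1:i + 1 + window])
                lines.append(f"{left:>25} | {w} | {right}")
    return lines

# Illustrative units, not real data.
units = [
    "the council debated the new housing policy at length",
    "critics called the policy a failure of planning",
]
print("\n".join(kwic(units, "policy")))
```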

Figure 4.8 ■ Testing Hypotheses Concerning Relations Among Various Inferences From
One Body of Texts

4.3.3 Testing Hypotheses Concerning How Content Analysis Results Relate to Other Variables

Typically, this kind of design brings communicational or symbolic and behavioral variables together. For ex-
ample, the cultivation hypothesis, which asserts that there are correlations between media coverage and au-
dience perceptions, calls for comparing the results of a content analysis of mass-media presentations with
interview data on audience members’ perceptions of everyday reality. Gerbner and his colleagues have ex-
plored the relationship between the “world of TV violence” and how TV audiences perceive the world outside
television (see, e.g., Gerbner, Gross, Morgan, & Signorielli, 1995; and the debate reproduced in Krippendorff
& Bock, 2009, Chapter 6.6). In comparing newspaper coverage of crime with crime statistics and public opin-
ion, Zucker (1978) found that the frequency of crime reports in the media correlated more highly with public
opinion than with official crime statistics. Conversation analysts usually are satisfied with their own accounts
of what they see in the transcripts of naturally occurring conversations; thus their approach conforms to the
design illustrated in Figure 4.8. However, if they were to relate their interpretations to participants’ awareness
of the phenomena being inferred, then they would compare inferences from texts with other accounts.

Such designs have three primary aims:

• To provide variables about the nature of communications that enable the testing of hypotheses con-
cerning the causes, correlates, and effects of such communications
• To enrich indicators of observed behavioral phenomena by adding measures that concern the mean-
ings of these phenomena (multiple operationalism), especially concerning individuals’ perceptions or
interpretations of social phenomena, which cannot be observed as such
• To substitute more economical measures for measures that are cumbersome (for example, using
content analysis of TV news instead of surveys of what the public knows)

This design is represented in Figure 4.9.


Figure 4.9 ■ Testing Hypotheses Concerning Relations Between Observations and Infer-
ences From Texts

I should emphasize that content analysts are not limited to the research designs distinguished above. Re-
searchers can combine designs to obtain more complex forms that embrace many variables, and they can
use any design in tandem with other techniques. There is no methodological limit to the use of content analy-
sis in large social research projects.
