CODING
an overview and guide to qualitative data analysis for integral researchers
Version 1.0
What is Coding?
• Coding is a heuristic, a word of Greek origin meaning “to discover.” Yet coding
simultaneously highlights the constructivist dimension of research and is thus
as much about enacting—through the epistemic profile and methodological
orientation of the researcher—as it is about discovering. As Saldana puts it,
“the act of coding requires that you wear your researcher’s analytic lens. But
how you perceive and interpret what is happening in the data depends on what
type of filter covers that lens” (Saldana, 2009, p. 6). Thus, a key practice for
integral scholars engaging the coding process is to be conscious and self-
reflexive with respect to one’s own epistemological lens and methodological
approach. See the section on integral coding below for more detail.
1 Partly abstracted from The Coding Manual for Qualitative Researchers by Johnny Saldana (2009).
(Behar & Gordon, 1995; Stanfield & Dennis, 1993), and whether you
collect data from adults or children (Greene & Hogan, 2005; Zwiers &
Morrissette, 1999).
• Creswell (2007) notes that codes can emerge in response to not only expected
patterning, but also what you find to be striking, surprising, unusual or
conceptually captivating (p. 153).
• As you identify patterns and construct categories in the coding process, keep in
mind that, according to Tesch (1990), “a confounding property of category
construction in qualitative inquiry is that data within them cannot always be
precisely and discretely bounded; they are within ‘fuzzy’ boundaries at best”
(pp. 135-8).
What is a Code?
• A code is a lower level of data analysis on the way to labels, categories, themes,
and theory. As Charmaz (2006) puts it metaphorically, coding “generates the
bones of your analysis…. [I]ntegration will assemble those bones into a
working skeleton” (p. 45, as cited in Saldana, 2009).
• Saldana (2009) states, “…be aware that a code can sometimes summarize or
condense data, not simply reduce it…depending on the researcher’s academic
discipline, ontological and epistemological orientations, theoretical and
conceptual frameworks, and even the choice of coding method itself, some
codes can attribute more evocative meanings to the data” (p. 4).
• You don’t have to wait until all your data has been collected and assembled to
begin your preliminary coding.
o As you are transcribing interviews, writing up field notes, or filing
relevant documents, you can jot down preliminary phrases or words as
“analytic memos” in a research journal for future reference.
o Memory doesn’t always serve us as well as we might like! Make sure
that such analytic memos are distinctively marked so as to avoid mixing
them with the raw data.
o Researchers may also choose to “precode” (Layder, 1998) by
underlining, highlighting, circling, bolding, or coloring passages
that strike them as salient.
• The magnitude of the data coded can range from a single word to a full
sentence, to an entire page of text, to a stream of moving images. First Cycle
coding can be repeated numerous times before proceeding to Second Cycle
coding. If higher-level categories, concepts, or themes pop out at you, that is
fine; just make note of them in a separate analytic memo (as they can sometimes
inform later analysis) and don’t muddle higher-level analysis with the First
Cycle (a kind of ‘pre/trans fallacy’ in Wilber’s terms).
• In Second Cycle coding, the portions coded can be the exact same units coded
in First Cycle, longer passages of text, or even a reconfiguration of the First
Cycle codes.
Second Cycle coding can be repeated numerous times, since coding demands
that researchers pay meticulous attention to the nuances of language and reflect
deeply on the emergent patterning embedded within the data.
• Some researchers argue that every detail of the raw field data should be coded,
while others feel that only the salient aspects of the data should be coded and
that even up to half of the record can be summarized or deleted altogether. In
my opinion, researchers should use caution when deleting data, as what appears
irrelevant in one coding cycle may turn out to hold keys to unlocking the larger
emergent pattern that comes forth in subsequent coding cycles. Generally,
beginning researchers may want to err on the side of conserving (and not
deleting) data while their “data sense” is developing.
• Abbott (2004) likens the cycles of coding to the process of decorating a room:
“you try it, step back, move a few things, step back again, try a serious
reorganization, and so on” (p. 215).
Some Brief Coding Examples: (see the section “28 Coding Methods” below for a
more comprehensive and detailed list of coding methods)
• In Vivo Code: when a code is taken verbatim directly from the data and placed in
quotation marks. (e.g., I really feel inspired around him. Code “INSPIRED”).
• Simultaneous Coding: the application of two or more codes within a single datum
(e.g., when one code refers to an embedded or interconnected part of the single
datum).
o Such analysis or codification “is the search for patterns in the data and
for ideas that help explain why those patterns are there in the first
place” (Bernard, 2006, p. 452).
Themes:
Theory Development:
Quantities of Qualities:
• A typical qualitative research study might generate 80-100 codes, which will be
organized into 15-20 categories, which are ultimately synthesized into 5-7 major
themes/concepts.
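
As a rough illustration of the funnel described above (many codes, fewer categories, a handful of themes), the following Python sketch nests codes under categories and categories under themes. It is only a minimal sketch: the code labels, category names, theme name, and the `roll_up` helper are all hypothetical and invented for this example, and the groupings themselves remain an interpretive judgment of the researcher.

```python
from collections import defaultdict

# Hypothetical codes and groupings, invented purely for illustration.
code_to_category = {
    "INSPIRED": "Emotional responses",
    "COMFORTABLE": "Emotional responses",
    "SEEKING ADVICE": "Mentoring behaviors",
    "SHARING STORIES": "Mentoring behaviors",
}
category_to_theme = {
    "Emotional responses": "Relational trust",
    "Mentoring behaviors": "Relational trust",
}


def roll_up(code_to_category, category_to_theme):
    """Nest codes under their categories and categories under their themes."""
    themes = defaultdict(lambda: defaultdict(list))
    for code, category in code_to_category.items():
        theme = category_to_theme[category]
        themes[theme][category].append(code)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {theme: dict(cats) for theme, cats in themes.items()}


hierarchy = roll_up(code_to_category, category_to_theme)

# Count how many items sit at each level of the funnel.
counts = {
    "codes": len(code_to_category),
    "categories": len(set(code_to_category.values())),
    "themes": len(hierarchy),
}
print(counts)
```

In a real study the numbers at each level would be far larger than in this toy example, along the lines of the ranges given above.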
Coding Software:
• Saldana (2009) recommends learning to code manually (pen and paper) before
delving into the world of coding software, known as CAQDAS (Computer
Assisted Qualitative Data Analysis Software). CAQDAS is highly recommended
for larger scale research projects (multiple participant interviews, et cetera).
o ATLAS.ti: www.atlasti.com
o MAXQDA: www.maxqda.com
o NVivo: www.qsrinternational.com
* Refer to Lewins & Silver (2007) and Bazeley (2007) for supporting literature on
these programs.
• As you code, Saldana (2009) recommends that you keep a record of your
emergent codes, their content descriptions, and a brief data example in a
codebook, a separate file, or a CAQDAS program. It can be quite useful in the
analysis process to be able to see all your codes together in one place, without
having to sort through your raw data documentation.
• While most coding is likely to be done alone, researchers can bring
additional researchers into the process so as to widen the sphere of possibilities
in terms of interpreting and enacting the data. Multiple researchers may then
code the same raw data and then attempt to bridge or synthesize the spheres of
divergence and/or move towards interpretive convergence.
• Researchers can also bring their research subjects into the coding/analytical
process to varying degrees, either inviting them to collaboratively code the
data or simply cross-checking your interpretations against theirs (so-called
“member checks”). Such member checks, or triangulation of interpretations in
the coding process, will tend to increase the validity of your knowledge claims.
• In choosing the most appropriate coding methods for your particular study, you
should clarify your a priori orientations or goals.
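
For researchers who keep their codebook in a separate file rather than in a CAQDAS program, the record described above (each emergent code, its content description, and a brief data example) could be sketched in Python roughly as follows. This is a minimal sketch, not a prescribed format: the `CodebookEntry` and `Codebook` names and the two content descriptions are assumptions made for illustration, while the two data examples are taken from this guide.

```python
from dataclasses import dataclass


@dataclass
class CodebookEntry:
    """One codebook record: the code, what it covers, and a data example."""
    code: str
    description: str
    example: str


class Codebook:
    """Hypothetical helper that keeps all codes together in one place."""

    def __init__(self):
        self._entries = {}

    def add(self, code, description, example):
        key = code.upper()
        if key in self._entries:
            # Flag duplicates so that a code is not silently redefined.
            raise ValueError(f"code already defined: {key}")
        self._entries[key] = CodebookEntry(key, description, example)

    def listing(self):
        """Return one line per code, sorted alphabetically by code."""
        entries = sorted(self._entries.values(), key=lambda e: e.code)
        return [f'{e.code}: {e.description} (e.g., "{e.example}")' for e in entries]


book = Codebook()
# The descriptions below are illustrative assumptions, not Saldana's wording.
book.add("INSPIRED", "Participant reports feeling inspired by another person",
         "I really feel inspired around him.")
book.add("COMFORTABLE", "Participant reports feeling at ease with another person",
         "I really feel comfortable around him.")

for line in book.listing():
    print(line)
```

Keeping the codebook apart from the raw data in this way makes it possible to scan every code at a glance, which is the benefit noted above.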
• Grammatical Methods:
• Elemental Methods: are primary coding methods for qualitative data analysis,
providing basic, focused filters which help lay the foundation for subsequent
coding cycles.
o In Vivo Coding: when a code is taken verbatim directly from the data and
placed in quotation marks (e.g., I really feel comfortable around him.
Code “COMFORTABLE”) (p. 3). In Vivo Coding is particularly well
suited for extracting and highlighting “folk” or “indigenous” terms
(participant-generated words indicative of a group’s, culture’s, or
subculture’s categories of meaning). In Vivo Coding is associated with
grounded theory.
o Initial Coding (known as “Open Coding” in earlier grounded theory
publications): an open-ended approach to coding in which the
researcher codes for their “first impression” words or phrases in
response to engaging the datum (p. 4). (As such, all codes are held as
tentative and provisional formulations used to inform further iterative
coding cycles.) Initial Coding is a fundamental approach to coding that
is not formulaic, yet employs some broad guidelines. Initial Coding is an
opportunity for the researcher to begin to deeply reflect on the contents
of the data, break it down into discrete parts, and examine them for
similarities and differences. Initial Coding is an important method for
grounded theory studies, as its goal is to “remain open to all possible
theoretical directions indicated by your readings of the data” (Charmaz,
2006, p. 46). As such, the typically phenomenological practice of
“bracketing” or “epoché” can be appropriately coupled with this
method. Note that initial coding can also make use of other (typically
basic, descriptive) coding methods, such as In Vivo Coding or Process
Coding.
o Evaluation Coding: focuses on how data can be analyzed such that the
relative merit of various programs or policies can be assessed or judged.
It is often used in the context of program, organization, policy, or
personnel evaluation. Evaluation Coding often focuses on the patterning
of participant responses relative to attributes or factors that can be used
to assess quality or performance. It is particularly apt for policy, critical,
action, organizational, and evaluation studies.
Second Cycle Coding Methods: are more advanced ways of reorganizing and
reanalyzing the data coded through First Cycle methods. Second Cycle methods
involve the exploration of the interrelationships across multiple codes and categories so
as to develop a coherent synthesis of the data. Second Cycle coding serves to develop a
sense of thematic, conceptual, and/or theoretical organization and coherence from your
First Cycle codes.
material into a more meaningful and parsimonious unit of analysis. They are a
sort of meta-code. Pattern Coding is a way of grouping those summaries into a
smaller number of sets, themes, or constructs” (Miles & Huberman, 1994, p.
69). According to some theorists, Pattern Coding works well following a First
Cycle coding method, such as Initial Coding.
• Focused Coding: done subsequent to Initial Coding, searches for the most
frequent or significant Initial Codes to develop the categories that are deemed
to be the most salient, largely based on an evaluation of which Initial Codes
appear to generate the most analytic traction. Focused coding is associated with
grounded theory.
• Axial Coding: attempts to strategically reassemble data that were split via the
analysis of Initial Coding. Axial Coding relates categories to sub-categories and
defines the properties (i.e., characteristics or attributes) and dimensions (the
location of a property along a continuum or range) of each category. This
coding process is intended to develop constructs that inform the researcher
about “if, when, how, and why” something happens (Charmaz, 2006, p. 60).
This is a well-suited approach for grounded theory studies in particular.
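
One part of Focused Coding as described above, the search for the most frequent Initial Codes, lends itself to a simple tally. The Python sketch below is offered under stated assumptions: the list of Initial Codes and the `most_frequent_codes` helper are hypothetical, and frequency is only one of the two criteria named above. Judging which codes are most significant, or which generate the most analytic traction, remains an interpretive task for the researcher.

```python
from collections import Counter

# Hypothetical Initial Codes, one entry per coded datum (illustration only).
initial_codes = [
    "INSPIRED", "COMFORTABLE", "INSPIRED",
    "UNCERTAIN", "COMFORTABLE", "INSPIRED",
]


def most_frequent_codes(codes, top_n=2):
    """Return the top_n codes with their counts, most frequent first."""
    return Counter(codes).most_common(top_n)


candidates = most_frequent_codes(initial_codes)
print(candidates)  # [('INSPIRED', 3), ('COMFORTABLE', 2)]
```

A tally like this only nominates candidates for categories; a code that appears once may still deserve attention if it proves analytically significant.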
1. Pre-coding
2. First Cycle: Initial coding
a. including In Vivo and Process coding;
3. Second Cycle: axial coding, theoretical coding