Interviews II - Data Analysis
The aim of this session is to outline and evaluate the various stages and practices involved in
analysing interview data in qualitative research. In doing so, it aims to encourage a confident,
considerate, ethical and reflexive approach to analysing research interviews with individual and
group-interview participants.
Silverman (2010) outlines two common positions adopted by PhD students, one involving ‘drowning
in data’, and another based on a linear progression through stages of data collection, analysis and
writing-up. Citing Coffey and Atkinson (1996: 10), he argues that ‘both positions imply a woeful lack
of appreciation of what is and can be meant by analysis [which] is a pervasive activity throughout the
life of a research project’. With this in mind, Silverman advises students to begin data analysis early
(using data already available in the public sphere or other people’s data, analysing your own data as
soon as it becomes available, and asking key questions about the data early on). He gives some useful
advice on getting started early by developing small and achievable tasks:
1. Offer a ‘snap’ characterization of what seems to be happening in your data, and write a
response to it to discuss with your supervisor. What are your ‘hunches’?
2. Take a manageable focus and begin analysis (e.g. one transcript, one theme, one day of
observation, one event).
A basic epistemological question underpinning data analysis is: are you trying to produce data that
will reflect an external ‘reality’, or which will have resonance with the ‘internal’ sense-making of
your research participants? If the former, or if you have a particularly large amount of data,
qualitative analysis software may help you to organise and analyse your material. However, if the
latter, and you are seeking depth of meaning, ‘immersion’ analysis is likely to be your preferred
method, or of course some combination of these two approaches. A more constructionist approach
to data analysis (involving immersion in the data) is based on the premise that interview participants
actively construct meaning, rather than simply reporting a pre-existing, external reality.
This latter perspective is emphasised in narrative methods, which focus on discursive shifts as
signifying a change in participant role: ‘speaking as a mother’, ‘speaking personally’, ‘with my
‘professional hat on’, ‘if I were in his shoes’, etc. These kinds of prefaces indicate the range of subject
positions that participants in a research project occupy, and give indications of what triggers them to
‘switch’ roles, as well as providing insight into the identity work they undertake in moving between
their respective roles (see Holstein and Gubrium, 1995: 33-4, cited in Silverman, 2010: 227). As
Silverman (2010: 227) notes, this approach is a ‘useful antidote to the assumption that people have
a single identity waiting to be discovered by the interviewer. By contrast, it reveals that we are
active narrators who weave skilful, appropriately located, stories’. As Silverman also notes, however,
if we focus solely on participants’ own accounts as self-presentations or narratives, we risk losing
sight of broader issues that make up ‘the bigger picture’.
Bryman and Bell (2011: 584) note that ‘coding is the starting point for most forms of qualitative
analysis’. The immersion process they describe involves reading individual transcripts once, and then
reading them again in order to identify key words, coding and annotating the text, reviewing codes
and generating concepts. This allows for each transcript to be coded in several different ways. They
emphasize that coding is not analysis, just the first step towards it (this is important, particularly in
ethnographic research if you are to avoid a ‘see here!’ approach to presenting your findings –
Bryman and Bell remind us: ‘you are not just a mouthpiece’).
Coding refers simply to the process of categorizing qualitative data, involving the development of a
set of analytical codes that help make sense of the data. Codes can be predefined (based on the
literature, the research questions and the interview schedule), or can emerge from the data.
Predefined codes categorize findings you would expect to see based on your prior knowledge and
assumptions (they may reflect, refute or develop concepts established in published literature, for
instance). Emergent codes categorise findings that have emerged from the data, and which were not
necessarily expected or anticipated.
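Where software is used to support coding, the distinction between predefined and emergent codes can be made quite concrete. The following sketch (Python, with hypothetical transcript names and illustrative code labels, not a prescription for any particular package) seeds a code book with predefined codes and treats any label applied from outside that book as emergent:

    from dataclasses import dataclass

    @dataclass
    class CodedSegment:
        transcript: str   # e.g. "participant_03.txt"
        text: str         # the quoted passage
        code: str         # the label applied to it

    # Predefined codes, drawn from the literature, the research
    # questions and the interview schedule (illustrative labels only).
    code_book = {"work-life balance", "career progression"}
    segments = []

    def apply_code(transcript, text, code):
        """Tag a passage; any code not already in the book is emergent."""
        if code not in code_book:
            print(f"Emergent code added: {code!r}")
            code_book.add(code)
        segments.append(CodedSegment(transcript, text, code))

    apply_code("participant_03.txt",
               "I stopped answering emails after six",
               "work-life balance")      # predefined
    apply_code("participant_07.txt",
               "nobody tells you how lonely promotion is",
               "isolation")              # emergent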
It is useful for more than one person to code so that it becomes an interactive process, and for more
than one person to discuss the data in relation to the codes. As you proceed, you may find that your
initial codes are too narrow or too broad, and need to be refined.
A typical sequence of analytical steps is as follows:
1. Read through individual transcripts and ‘batches’ as they become available, then read them
again (a preliminary analysis may reveal important themes that are being overlooked in the
interviews, and prompt a revision of the sample and/or interview schedule).
2. Engage in data reduction: select, focus, simplify and abstract thematic ‘raw’ data to subject
it to coding. Coding simply allows you to reduce the complexity of the data into manageable
components that can be compared and connected to identify thematic patterns.
3. Re-assemble data thematically. Step back from it and view the ‘coherent’ whole, looking for
emergent themes and patterns, as well as similarities and differences (a minimal sketch of
steps 2 and 3 follows this list).
4. Begin to make connections to identify links and gaps (discuss and disseminate emergent
findings – hold analytical workshops or meetings).
5. Identify ‘deviant’ (or simply unanticipated) findings – ‘outlying data’ – and look for
explanations. Identify contradictions in the data, and consider the surprises. What are the
remaining puzzles?
6. Move from thematic to theoretical analysis (from what is happening to why – this may
involve going back to the data collection process).
7. Distil data into ‘findings’ by relating the data back to research questions emerging from
the literature.
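As signalled above, here is a minimal sketch of the reduce-and-reassemble steps, assuming coded passages are held as invented (transcript, text, code) records of the kind produced by the earlier coding sketch:

    from collections import defaultdict

    coded = [
        ("p03", "I stopped answering emails after six", "work-life balance"),
        ("p07", "weekends are sacred now", "work-life balance"),
        ("p07", "nobody tells you how lonely promotion is", "isolation"),
    ]

    # Data reduction: collapse raw passages into code -> evidence.
    by_code = defaultdict(list)
    for transcript, text, code in coded:
        by_code[code].append((transcript, text))

    # Thematic re-assembly: view the corpus code by code, making
    # similarities and differences across transcripts visible.
    for code, passages in sorted(by_code.items(),
                                 key=lambda kv: -len(kv[1])):
        sources = {t for t, _ in passages}
        print(f"{code}: {len(passages)} passages, {len(sources)} transcripts")
        for transcript, text in passages:
            print(f"  [{transcript}] {text}")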
This approach has recognised limitations:
1. Coding is a highly subjective process (based largely on the interpretations of the researcher).
2. Data reduction involves losing the context of the data (though coded extracts can be
combined with field notes).
3. Fragmentation – meaning and understanding can be taken out of context, and the totality of
a carefully constructed narrative disassembled.
Write summaries of your data, considering how it links to your research questions and to the
literature, illustrating key themes, including quotations from transcripts and/or field notes. Consider
what your data adds to what is already known about each theme/question. How does your data
develop what is known empirically, conceptually, methodologically and theoretically? Summarise
your overall results, and results by theme. Once you have reached this stage, you are ready to
synthesize your findings across multiple data sources, including visual data.
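One way to scaffold such summaries mechanically, assuming a hypothetical mapping of themes to research questions and the code-to-evidence structure from the sketches above (the mapping itself remains analytic work, not something the code decides):

    by_code = {
        "work-life balance": [("p03", "I stopped answering emails after six")],
        "isolation": [("p07", "nobody tells you how lonely promotion is")],
    }

    # Hypothetical research questions for illustration only.
    theme_to_question = {
        "work-life balance": "RQ1: How do staff negotiate boundaries?",
        "isolation": "RQ2: How is promotion experienced?",
    }

    def draft_summary(code, passages):
        """Skeleton summary: theme, linked question, supporting quotes."""
        question = theme_to_question.get(code, "unmapped - revisit coding")
        quotes = "\n".join(f'  - "{text}" ({t})' for t, text in passages)
        return f"Theme: {code}\nLinks to: {question}\nEvidence:\n{quotes}\n"

    for code, passages in by_code.items():
        print(draft_summary(code, passages))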
References
Bryman, A. and Bell, E. (2011) Business Research Methods. Oxford: Oxford University Press.
Coffey, A. and Atkinson, P. (1996) Making Sense of Qualitative Data. Thousand Oaks, CA: Sage.
Holstein, J.A. and Gubrium, J.F. (1995) The Active Interview. Thousand Oaks, CA: Sage.
Silverman, D. (2010) Doing Qualitative Research. London: Sage.
Immersion in the Data
Just as you immerse yourself in the field when you are collecting data, in
qualitative analysis it is essential to immerse yourself in the data. This happens
in a number of ways. First, there is no substitute for doing your own
transcription. When you transcribe interview data from tapes, you engage a
number of senses and begin the immersion process. It is a good idea to keep
your field journal nearby to make analytic notes as you transcribe. You will then
read transcripts many times in order to become completely familiar with the
data. How many times? There is no set number. However, a good guide is that, when you
come up with an analytic insight, you know exactly where to go in the data to find
supporting or disconfirming evidence.
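There is no substitute for knowing the data, but a simple mechanical aid can support that kind of checking. The sketch below (Python, assuming a hypothetical transcripts/ directory of plain-text files) prints every transcript line mentioning a term, with its location:

    import re
    from pathlib import Path

    def find_evidence(term, transcript_dir="transcripts"):
        """Locate every transcript line mentioning `term`, so an
        analytic insight can be checked against the whole corpus."""
        pattern = re.compile(term, re.IGNORECASE)
        for path in sorted(Path(transcript_dir).glob("*.txt")):
            lines = path.read_text(encoding="utf-8").splitlines()
            for lineno, line in enumerate(lines, start=1):
                if pattern.search(line):
                    print(f"{path.name}:{lineno}: {line.strip()}")

    # e.g. find_evidence("lonely")  # test a hunch about isolation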
Even as you are interviewing a participant or observing in the field, you are
already thinking analytically. Fortunately, as thinking human beings, we can't
help it. Again, be sure your field journal is handy, and write down everything
that comes to your mind. Do not trust your memory. No idea or hunch is too
insignificant to write down.
Coding and Categorizing the Data
A code is a concept that is given a name, one that describes as exactly as possible what is
being said. Typically, in an interview transcript, the researcher might highlight
a word, phrase, sentence, or even paragraph that describes a specific
phenomenon. This word, phrase, sentence, or paragraph is a meaning unit.
After highlighting this segment of text, the researcher gives it a name (code).
The code should be as close to the language of the participant as possible. The
difference between a code and a theme is relatively unimportant. Codes tend
to be shorter, more succinct basic analytic units, whereas themes may be
expressed in longer phrases or sentences.
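As a data structure, a meaning unit is little more than a highlighted span plus the name given to it. A minimal sketch (Python, with an invented example passage and offsets):

    from dataclasses import dataclass

    @dataclass
    class MeaningUnit:
        transcript: str  # source file or participant ID
        start: int       # character offsets of the highlighted span
        end: int
        code: str        # name, kept close to the participant's words

    text = "Honestly, nobody tells you how lonely promotion is."

    # Highlight a span and name it in the participant's own language.
    unit = MeaningUnit("p07", start=10, end=51, code="lonely promotion")
    print(text[unit.start:unit.end])
    # -> nobody tells you how lonely promotion is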
After identifying and giving names to the basic meaning units, it is time to put
them in categories, or families. Similar codes can all be gathered together into
a category, or family of codes, and given a common name. Again,
stay as close as you can to the language of participants. However, as you
gather codes into categories, and then categories into larger more overarching
categories, you will find that you will necessarily have more abstract names for
the categories in order to make them more inclusive. A good guideline is, when
you move to greater abstraction, still use language that would be
understandable to participants. In grounded theory, the goal would be to
ultimately have all the data subsumed under one overarching core category.
However, for the most part, having a few top-level categories is fine. As
you code and categorize the data, also look for the interrelationships among
the various categories.
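The resulting structure is simply a tree. A sketch with invented codes and categories, in which names grow more abstract towards the root but stay in participant-friendly language:

    # codes -> categories (families) -> one or a few top-level categories
    hierarchy = {
        "pressures of the job": {                 # overarching category
            "time squeeze": ["emails after six",
                             "weekends disappearing"],
            "going it alone": ["lonely promotion",
                               "no one to ask"],
        },
    }

    def walk(tree, depth=0):
        """Print the hierarchy, showing how codes roll up into categories."""
        for name, children in tree.items():
            print("  " * depth + name)
            if isinstance(children, dict):
                walk(children, depth + 1)
            else:
                for code in children:
                    print("  " * (depth + 1) + code)

    walk(hierarchy)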
After extensive analysis, you will finally end up with an explanatory framework
that describes your results. It may be helpful to the reader to see a figure with
the interrelationships among the various parts of the model. In some instances,
researchers reconstruct a narrative to illustrate their findings. Even as you build
toward this framework, you will continue to reanalyze your data. The emerging
model should be examined and re-examined, compared and contrasted with all
of the data and between the various components of the model, and revised until
no new changes need to be made. Just as you collected data to the point of
redundancy, you are looking for a similar point in your analysis, called
saturation. When all the codes and categories are saturated, that means that
(a) all the data are accounted for, with no outlying codes or categories; and (b)
every category is sufficiently explained in depth by the data that support it.
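Those two conditions can be checked in rough mechanical form. A sketch with invented tallies, assuming the codes in use and per-category support counts come from coding structures like those above (the threshold is purely illustrative, not a fixed rule):

    codes_in_use = {"emails after six", "lonely promotion"}
    framework_codes = {"emails after six", "weekends disappearing",
                       "lonely promotion", "no one to ask"}
    category_support = {"time squeeze": 6, "going it alone": 1}

    # (a) all data accounted for: no code in use falls outside the framework
    outliers = codes_in_use - framework_codes

    # (b) every category explained in depth by its supporting data
    MIN_SUPPORT = 3  # illustrative threshold
    thin = [c for c, n in category_support.items() if n < MIN_SUPPORT]

    if not outliers and not thin:
        print("Framework looks saturated.")
    else:
        print("Not yet saturated.")
        print("Outlying codes:", sorted(outliers))
        print("Thinly supported categories:", thin)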