Exploring Second Language

Classroom Research

A COMPREHENSIVE GUIDE

David Nunan
University of Hong Kong/Anaheim University

Kathleen M. Bailey
Monterey Institute of International Studies

HEINLE
CENGAGE Learning

Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States
Exploring Second Language Classroom Research: A Comprehensive Guide
David Nunan and Kathleen M. Bailey

Publisher: Sherrise Roehr
Acquisitions Editor: Tom Jefferies
Editorial Assistant: Cecile Bruso
Director, US Marketing: Jim McDonough
Product Marketing Manager: Katie Keiley
Academic Marketing Manager: Caitlin Driscoll
Content Project Manager: John Sarantakis
Print Buyer: Susan Carroll
Cover Designer: Lisa Mezikofsky/Dawn Elwell
Compositor: ICC Macmillan Inc.

© 2009 Heinle, Cengage Learning

ALL RIGHTS RESERVED. No part of this work covered by the copyright herein
may be reproduced, transmitted, stored or used in any form or by any means
graphic, electronic, or mechanical, including but not limited to
photocopying, recording, scanning, digitizing, taping, Web distribution,
information networks, or information storage and retrieval systems, except
as permitted under Section 107 or 108 of the 1976 United States Copyright
Act, without the prior written permission of the publisher.

For product information and technology assistance, contact us at
Cengage Learning Customer & Sales Support, 1-800-354-9706
For permission to use material from this text or product,
submit all requests online at cengage.com/permissions
Further permissions questions can be emailed to
[email protected]

Library of Congress Control Number: 2008935117
ISBN-13: 978-1-4240-2705-7
ISBN-10: 1-4240-2705-5

Heinle
25 Thomson Place
Boston, MA 02210
USA

Cengage Learning is a leading provider of customized learning solutions
with office locations around the globe, including Singapore, the United
Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at:
international.cengage.com/region

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

Visit Heinle online at elt.heinle.com

Visit our corporate website at www.cengage.com

Printed in Canada
123456789 12 11 10 09 08
For my research students, who helped me refine my thinking
by trying out these ideas and activities.
David Nunan

For Rob McMillan,
Sarah Springer,
Will Radecki,
Marie-Lise Bouscaren,
Analisa Bouscaren Radecki,
Noel Isaiah Cortez Radecki Bouscaren,
and Fady—
thanks for being here these past few years.
Kathi Bailey
CONTENTS

Preface
Acknowledgments

PART I Second Language Classroom Research: An Overview
Chapter 1 Introducing Second Language Classroom Research
Chapter 2 Getting Started on Classroom Research
Chapter 3 Key Concepts in Planning Classroom Research

PART II Research Design Issues: Approaches to Planning and Implementing Classroom Research
Chapter 4 The Experimental Method
Chapter 5 Surveys
Chapter 6 Case Study Research
Chapter 7 Ethnography
Chapter 8 Action Research

PART III Data Collection Issues: Getting the Information You Need
Chapter 9 Classroom Observation
Chapter 10 Introspective Methods of Data Collection
Chapter 11 Elicitation Procedures

PART IV Data Analysis and Interpretation Issues: Figuring Out What the Information Means
Chapter 12 Analyzing Classroom Interaction
Chapter 13 Quantitative Data Analysis
Chapter 14 Qualitative Data Analysis
Chapter 15 Putting It All Together

References
Index
Text Credits
PREFACE

We wrote this book with the intention of providing an introduction to
research in second and foreign language classrooms. It grew out of courses
and workshops on classroom research that we have developed and taught for
teachers' conferences and graduate programs in applied linguistics. We aim
to provide an introduction to language classroom research that is accessible
to readers who do not necessarily have specialist training in research
methods.

The book has two overriding objectives. The first is to provide an overview
of and introduction to language classroom research. To this end, we look at
both substantive issues (that is, the topics and questions that have been
investigated by classroom researchers) and methodological issues (the
techniques and methods that researchers have employed for collecting data,
interpreting the data, and presenting the results). The second objective is
to help readers develop practical skills for carrying out original empirical
investigations. Although the context of the book is the second language
classroom, our intention has been to cover concepts and techniques that will
be broadly applicable to a wide range of applied linguistics contexts.

Our intended readership includes language teachers, researchers,
teachers-in-training, and language teacher educators. When we wrote the
book, we had in mind a worldwide audience of language teaching
professionals, including language teachers, teacher trainees, teacher
educators, and those who run courses on classroom observation and research.
We believe that it will be particularly well suited for candidates in
teaching credential, master's degree, and Ph.D. programs for language
teachers. To this end, the examples and studies reviewed were designed to
appeal to teachers of many languages—not just English.

The structure of the book is transparent and will support people learning
about the research process. There are four major thematically organized
sections to the book. The first of these provides an overview of second
language classroom research. The second deals with research design issues,
looking at approaches to planning and implementing classroom research. In
the third section, we examine a range of procedures for collecting data. In
the final section, we focus on data analysis and interpretation issues,
providing tools and perspectives for making sense of the data that have been
collected.

Each section is preceded by an introduction that maps out the territory and
sets goals for the readers. At the end of each chapter, you will find
questions and tasks to help you solidify your understanding of the concepts
covered. Throughout the book we have critiqued our own studies to show that
language classroom research seldom proceeds as smoothly as published
accounts may make it seem.

After reading the book, you should be well positioned to carry out your own
classroom-based investigations, regardless of the research tradition in
which you choose to work. You will also be familiar with the findings of
several key studies and be well prepared as critical consumers of others'
publications in language classroom research. We hope that reading this book
will be positive and productive for you.

David Nunan and Kathi Bailey

ACKNOWLEDGMENTS

Like most authors, we are grateful for the support of the many people who
helped us launch this book.

At the Monterey Institute of International Studies in California, Kathi
Bailey's work was supported by a professional development grant from Joseph
and Sheila Mark, long-term supporters of higher education at MIIS.

At the University of Hong Kong, we were aided in our file management by
Sanny Kwok, who never grew impatient with the flood of e-mail traffic.

We appreciate the meticulous word processing done by Ryan Damerow, Jaala
Thibault, and Mica Tucci. We also benefited greatly from the Internet
research, bibliographic skills, and computer prowess of Lawrence Lawson, our
"digital detective" and editorial assistant.

At Cengage (formerly Heinle & Heinle), we appreciate the encouragement of
Eunice Yeats, who got the project started; Tom Jefferies and Cecile Bruso,
who saw it to fruition; and Ian Martin, who believed in it from its
inception.

Finally, we acknowledge the valuable input of our research students, who
have worked with us to refine these ideas over the years. They have asked
questions, raised concerns, and pushed us to new levels of awareness as they
conducted their own language classroom research. In particular, we
appreciate the contributions of Julie Choi, Kevin Jepson, Elaine Martyn,
Will Radecki, Sarah Springer, and John Thorpe, who allowed us to print
excerpts from their original research projects. We hope that more teachers
and graduate students like them will be motivated to undertake language
classroom research as a result of reading this book.

PART I

Second Language
Classroom Research
An Overview

The three chapters in this section provide an introduction to and overview
of the field of second language classroom research. The first chapter takes
a historical stance and provides definitions of key terms, considers the
quantitative and qualitative traditions, and serves as an advance organizer
for the book as a whole. The second chapter jumps right into the research
process itself, outlining some of the typical procedures that allow us to
plan, implement, and evaluate classroom research projects, regardless of the
traditions within which we are working. Chapter 3, which concludes this
section, builds on the ideas discussed in Chapters 1 and 2 and introduces
three important concepts: variables, validity, and reliability.

Chapter 1: Introducing Second Language Classroom Research


By the end of this chapter, readers will
• gain an understanding of empirical research;
• have a clear understanding of what is meant by classrooms and classroom
research;
• understand some similarities and differences among the psychometric
tradition, naturalistic inquiry, and action research;
• see some value in combining data collection and analysis procedures from
these various traditions;
• understand the differences among product studies, process studies, and
process-product studies;
• see multiple possible roles for teachers in language classroom research.

Chapter 2: Getting Started on Classroom Research

By the end of this chapter, readers will
• have some ideas about developing researchable questions of their own;
• understand the difference between an annotated bibliography and a
literature review;
• understand the steps involved in writing a literature review of their own;
• know the differences among nominal, ordinal, and interval data and
variables;
• understand the need for operationalizing constructs and be familiar with
three procedures for writing operational definitions of key terms;
• recognize control and experimental groups in an experimental study and be
able to explain the logic behind them;
• be aware of some basic design considerations in both experimental and
nonexperimental research.

Chapter 3: Key Concepts in Planning Classroom Research

By the end of this chapter, readers will
• understand the logic of hypothesis testing and be able to pose their own
research questions;
• understand the main types of variables that are used in language classroom
research;
• be familiar with internal and external reliability measures for insuring
quality control;
• have a clear understanding of internal and external validity as central
quality control concepts in any approach to language classroom research;
• know the difference between samples and populations as these terms are used
in research;
• be familiar with the correlation design and the criterion groups design as
they are used in language classroom research;
• be able to explain the difference among several research designs that are
frequently used in language classroom research.

1

Introducing Second Language


Classroom Research

There are two kinds of people in this world—those that divide the world
into two kinds of people and those that don't. (attributed to Robert
Benchley, as cited in Sreedharan, 2006, p. 54)

INTRODUCTION AND OVERVIEW

The primary goal of this book is to provide an introduction to research in
second and foreign language classrooms. Our interest in writing this book
grew out of courses and workshops on classroom research that we have
developed and taught for teachers' conferences and graduate programs in
applied linguistics. The book is intended for people who want to do
classroom research—specifically language teachers, teacher trainees, teacher
educators, researchers, and those who run courses on classroom observation
and research. We aim to provide an introduction to language classroom
research that is accessible to readers who do not necessarily have
specialist training in research methods.

Overriding Objectives
The book has two overriding objectives. The first of these is to provide an
overview and introduction to language classroom research. To this end, we
look at both substantive issues (the topics and questions that have been
investigated by classroom researchers) and methodological issues (the
techniques and methods that researchers have employed for collecting data,
analyzing and interpreting their data, and presenting the results). The
second objective is to help readers develop confidence and practical skills
for carrying out their own empirical investigations. To this end, we will
suggest topical foci and provide tasks to help readers identify their own
areas of interest. We will also provide guidance and activities for
developing research skills.

Many years ago, Long (1983a) advanced several reasons for the study of
second language classrooms, particularly on the part of teachers in
preparation. In the first place, Long argued that classroom-centered
research can provide a great deal of useful information about how foreign
language instruction is actually carried out (in contrast to what people
imagine happens in classrooms). Secondly, classroom-centered research can
promote self-monitoring by classroom practitioners. Third, the various
observation schemes for classifying classroom interaction can be used by
teachers to investigate their own classes and the classes of colleagues.
Finally, involvement of teachers in classroom research can help them to
resist the temptation to jump on the various methodological bandwagons that
come rolling along from time to time. Descriptive studies of what actually
goes on in classrooms can help teachers evaluate the competing claims of
different syllabi, materials, and methods.

Our hope, then, is that having reached the end of this book, readers will
have a clear idea of the state of the art in terms of what researchers
(including teacher researchers) have looked at and how they have gone about
their investigations. We also hope that teachers and future teachers who
read this book will be encouraged to examine their own classroom contexts
systematically. Having read the book and completed the questions and tasks
set out in the various chapters, readers should understand key concepts and
methods in classroom-based research. They should also be able to relate
these concepts and issues to their own teaching situations.

Recurring Themes
This book has several recurring themes. The first is that empirical research
matters. The second is that there are many roles for teachers in the
research process. Even those teachers who do not plan to do research
themselves should be knowledgeable, informed, and critical consumers of
others' research. The third is that as classrooms are specifically
constituted to facilitate language learning, we should develop skills for
systematically finding out what goes on in them. Fourth, we argue that no
single approach to language classroom research is inherently superior to
others; instead, the choice of a research method should be determined by the
purpose of the study—a point we will return to throughout this volume.

REFLECTION

What are your own personal goals in reading this book? What do you hope
to get out of it?

When we began our own teaching careers (more years ago now than we wish to
admit), there was a preoccupation with a search for the one best method, and
a range of methods was presented to teachers for adoption. However, as it
turned out, methodological prescriptions were rarely supported by research.
As teachers, we were supposed to take such prescriptions on faith. Our view
now is that research is unlikely ever to provide a packaged solution to the
challenges of language teaching. However, empirical research does have an
important place, alongside common sense and experience, in helping teachers
to determine what they can and should do to facilitate language learning.

Our second theme is that there is a central place for teachers in the
research process. In many instances, teachers are the people best positioned
to conduct classroom investigations. We are not suggesting that teachers
should replace academic researchers in all cases, but that it is often
appropriate that they become partners in the research enterprise to find
answers to questions of pedagogy. In other cases, teachers themselves should
investigate their own teaching and their students' learning. The ability to
do research is not a matter of one's appointed position, but rather of one's
knowledge, skill, and attitude.

The third theme—developing skills for investigating classroom processes—is
one we hope readers will take to heart. We believe our own teaching has
improved through the systematic application of research procedures in our
work. Whether we are taping and transcribing our learners' group work or
making regular entries in our teaching journals, we have found answers to
questions and solutions to puzzles by viewing our own classes through
research lenses.

The fourth theme is that of questioning the alleged superiority of one
research method over another. We are not willing to take sides in what has
historically been the quantitative-versus-qualitative debate. Instead, we
will argue that appropriateness should be the guiding principle. That is, as
researchers we must be eclectic and choose data collection and analysis
procedures that are appropriate for answering the research questions we
pose.

We begin this first chapter by reviewing two broad traditions that have been
important in language classroom research over the past fifty years. This
orientation is followed by a section in which we trace the historical
development of classroom research. Here we look at the evolution of the
field from large-scale, highly expensive, and largely inconclusive methods
comparison studies to more localized, naturalistic (and often
teacher-driven) studies. This section provides a backdrop to the next, in
which we define classroom research, starting with definitions of the two
concepts that are central to this book, namely classrooms and research. We
then introduce action research, which is emerging as the third main approach
to language classroom research.

The final section of the chapter briefly considers the impact of technology
on language education and, in particular, how technology is currently
forcing a redefinition of what we mean when we refer to classrooms. To
examine a technology-based context, we will summarize a sample study
conducted about online language learning. The chapter concludes with
questions and tasks designed to encourage the reader to contest the ideas
presented in the chapter against the reality of their own context and
situation. Suggestions for further reading and a Web site to explore are
also provided.

REFLECTION

We have used the term empirical research in the text above. What does this
phrase mean to you? Is it the same as experimental research?

TWO MAIN CLASSROOM RESEARCH TRADITIONS

In general, two broad approaches or traditions to classroom research have
been historically dominant in language education. These are the psychometric
and the naturalistic traditions. (A third approach, action research, has
become increasingly important in the past three decades and will be
introduced below.) In later chapters, we will explore specific research
methods and procedures situated within these broad traditions. Here we will
provide a brief overview to orient you to these two main approaches.

The Psychometric Research Tradition


Early classroom research was dominated by the psychometric tradition.
Psychometric studies are studies that seek to measure psychological
properties—such as attitude—and psychological operations—such as language
learning. In this approach, the aim is to test the influence of different
variables on one another. It is sometimes called experimental research
because it often involves setting up formal experiments to test hypotheses
using psychometric data collection and analytic procedures. It is sometimes
referred to as quantitative research because the data are typically numeric
in nature. Such data consist of measurements, tabulations, ratings, or
rankings.

In language classroom research, the psychometric tradition has often been
used to investigate the mental mechanisms hypothesized to underpin second
language acquisition. Assuming that the more interaction there is, the
greater the language acquisition will be, researchers ask questions such as,
"What kinds of classroom tasks maximize student-student interaction?" These
sorts of questions are important because answering them can help us to
understand how to guide students in improving their language proficiency.

Language classroom researchers working in the psychometric tradition
typically investigate the effect of different methods, materials, teaching
techniques, types of classroom delivery, and so on, on language learning. A
typical question might be, "Does Method X result in more effective language
learning than Method Y?" In this sort of study, different groups of students
are taught using the two different methods. At the end of the process, the
students are tested, and the groups' average test scores are analyzed
statistically to determine whether any differences between the scores are
powerful enough to be considered significant. Making this determination
depends on the researchers controlling the variables involved in the study.
If the research has been carefully designed and carried out, the researchers
can determine whether observed differences in the groups' scores are due to
the different teaching methods or are a matter of chance.
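
The logic of the Method X versus Method Y comparison can be sketched in a few
lines of code. What follows is a minimal illustration under stated
assumptions, not a procedure taken from this book: it uses a permutation
test, which is only one of several ways of asking whether a difference in
group means could plausibly be a matter of chance. The test scores, the
function name permutation_test, and its parameters are all invented for the
example.

```python
import random
from statistics import mean


def permutation_test(scores_x, scores_y, n_resamples=10_000, seed=0):
    """Estimate how often a difference in group means at least as large as
    the observed one arises when group labels are reassigned at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(scores_x) - mean(scores_y))
    pooled = list(scores_x) + list(scores_y)
    n_x = len(scores_x)
    at_least_as_large = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # pretend the method labels were arbitrary
        diff = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_resamples


# Hypothetical end-of-course test scores, invented for illustration only.
method_x = [72, 85, 78, 90, 66, 81, 77, 88]
method_y = [65, 70, 74, 68, 59, 73, 71, 62]

observed_diff, p_value = permutation_test(method_x, method_y)
print(f"Observed difference in group means: {observed_diff:.3f}")
print(f"Estimated probability of a difference this large by chance: {p_value:.4f}")
```

A small estimated probability suggests that a difference this large would
rarely arise from chance alone. It says nothing about whether other variables
were controlled, which remains a matter of research design rather than of
computation.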
In order to make claims about the effectiveness of one method over another,
or one set of materials over another, researchers working in the
psychometric tradition must be sure that it is the particular method or the
materials under investigation that are causing any observed differences in
the students' test scores rather than some other variable(s). For this
reason, working in the experimental mode requires researchers to exercise a
great deal of control over the variables that might influence the outcome of
a study. Such control is seldom possible in real classrooms, where teachers
and students are going about the business of teaching and learning. For this
reason, among others, classroom research has broadened its methodological
scope to include more naturalistic approaches to data collection and
analysis in recent years.

The Naturalistic Research Tradition


In naturalistic research, also called naturalistic inquiry, the aim is to
obtain insights into the complexities of teaching and learning through
uncontrolled observation and description rather than to support the claim
that Method X works better than Method Y, or that Course Book A is better
than Course Book B. In classroom research, therefore, this approach is
centrally concerned with documenting and analyzing what goes on in naturally
occurring classrooms that have been constituted for the purposes of teaching
and learning rather than for the purposes of investigating teacher and
learner behavior. Naturalistic research is sometimes called qualitative
research, as opposed to quantitative research, because it is concerned with
capturing the qualities and attributes of the phenomena being investigated
rather than with measuring or counting.

Naturalistic inquiry is sometimes seen as subjective in nature while
psychometric research values objectivity. Originally these terms, subjective
and objective, referred to point of view. That is, objective information was
information about an object that existed independent of the observer. In
contrast, subjective information existed in the mind of the subject rather
than external to that person. Over the years, however, these two terms, as
they are used in research, have taken on slightly different meanings.
Objective information has come to be seen as verifiable, factual, and,
therefore, valuable, while subjective information is not usually verifiable
by an external observer and, therefore, is sometimes considered less
valuable in the psychometric tradition. However, in action research and in
some realizations of the naturalistic inquiry tradition, the participants'
viewpoints are indeed valued.
In his book on interpretive approaches to classroom research, van Lier
(1988, p. 37) justifies a focus on the subjective, qualitative tradition on
five grounds:
1. Our knowledge of what actually goes on in the classroom is extremely
limited.
2. It is relevant and valuable to increase that knowledge.
3. This can only be done by going into the classroom for data.
4. All data must be interpreted in the classroom context, i.e., the context
of their occurrence.
5. This context is not only a linguistic or cognitive one, but it is also
essentially a social context.

Although van Lier's comments were written more than two decades ago, they
remain true today.
Naturalistic research is actually a cover term for several different
research methods. Two of the methods that have frequently been used in
language classroom research are ethnographies and case studies. Both of
these methods will be covered in detail in future chapters, but here we will
briefly consider them to illustrate naturalistic inquiry.

Large-scale, long-term studies aimed at investigating classrooms as cultural
systems are called ethnographies. The roots of the ethnographic tradition in
language classroom research can be traced to anthropology:

In anthropology, the ethnographer observes a little-known or 'exotic' group
of people in their natural habitat and takes fieldnotes. In addition,
working with one or more informants is often necessary, if only to describe
the language. Increasingly, recording is used for description and analysis,
not just as a mnemonic device, but more importantly as an estrangement
device, which enables the ethnographer to look at phenomena (such as
conversations, rituals, transactions, etc.) with detachment. The same ways
of working are applied in classrooms. However, recording (and subsequent
transcription) is of even greater importance here than in anthropological
field work, since many more things go on at the same time and in rapid
succession, and since the classroom is not an exotic setting for us but
rather a very familiar one, laden with personal meaning. (van Lier, 1988,
p. 37)

Thus, a classroom ethnography views the language classroom (or program or
school) as a cultural system whose patterns can be discovered through
longitudinal data collection and analysis.

Case studies are another prevalent form of naturalistic research in
classrooms, but this term is a bit tricky to define. In a case study, we
investigate a unit of something. That unit can be a learner or a teacher or
a teacher-in-training. It can be a particular class or a program or a
school. A case study is often characterized as being an in-depth analysis of
one particular exemplar of the thing we wish to understand—one
teacher-in-training, one disruptive learner, or one after-school program.
However, there are also case studies that examine parallel cases—two
learners, for example, or three classrooms—and yet still use the case study
methodology.

Typically, case studies involve the researcher's long-term, or longitudinal,
involvement in the research context, as well as detailed data collection
about the person or entity being investigated. Case studies are not as
broadly contextualized as ethnographies and will often not work at the
levels of cultural description that characterize ethnographies. Instead,
case studies focus on one case (or a few cases) to explore a research issue
in depth.
There is increasing interest in applying naturalistic inquiry techniques to
the investigation of language classrooms. In 1996, we published a collection
of original classroom-oriented research studies. In our introduction to the
volume, we wrote:

Our hope was to bring together a series of rich descriptive and interpretive
accounts, documenting the concerns of teachers and students as they teach,
learn and use languages. The book was born partly out of frustration as we
sought in vain for appropriate qualitative studies as models for our own
students, and partly out of respect for and fascination with teaching and
learning. (Bailey and Nunan, 1996, p. 9)

In the years since that collection was published, many more classroom
studies have been conducted in the naturalistic inquiry tradition, and there
are now methodological resources available for researchers and teachers who
wish to use the data collection and analysis procedures of naturalistic
inquiry. (See, for instance, Henze, 1995; K. Richards, 2003.) We will deal
with some of these procedures in detail in subsequent chapters of this book.

Combining Traditions in Language Classroom Research


The gap between the two dominant research traditions discussed above may
seem impossibly wide. However, they represent the two ends of a continuum
rather than two mutually exclusive domains. In fact, both the psychometric
tradition and the naturalistic inquiry tradition represent different
families, or cultures, of empirical research. That is, empirical research is
the cover term, meaning research based on the collection and analysis of
data.

REFLECTION

In an earlier reflection task, we asked you to think about the meaning of
empirical research. How closely did your understanding of that term match
the point of view presented here? If there were differences, what were they?

Both the more quantitativelyoriented psychometric tradition and the more


qualitatively oriented naturalistic inquiry tradition consist of a wide range of
research methods and procedures that can be helpful in investigating language

Chapter 1 Introducing Second Language Classroom Research


Quantitative Data Collection, Quantitative Data Analysis
   Computing statistical comparisons of learners' test scores to see if there
   are any statistically significant differences between groups taught with
   different methods
   (Test scores = data that have been quantitatively collected; computing
   statistical comparisons = a quantitative data analysis procedure)

Qualitative Data Collection, Quantitative Data Analysis
   Tabulating the observed frequency of certain errors in language students'
   writing samples
   (Samples of language students' writing = data that have been qualitatively
   collected; tabulating the occurrence of certain errors = a quantitative
   data analysis procedure)

Quantitative Data Collection, Qualitative Data Analysis
   Categorizing language students as advanced, upper intermediate,
   intermediate, or lower intermediate on the basis of the learners' test
   scores
   (Test scores = data that have been quantitatively collected; proficiency
   categorizations = a qualitative data analysis procedure)

Qualitative Data Collection, Qualitative Data Analysis
   Summarizing written field notes to yield prose profiles of various
   teachers' teaching styles in an observational study
   (Field notes = data that have been qualitatively collected; summarizing =
   a qualitative data analysis procedure)

FIGURE 1.1 Examples of combined qualitative and quantitative procedures in
data collection and analysis

classrooms empirically. In fact, Allwright and Bailey (1991) argue against the
oversimplistic contrast of quantitative-versus-qualitative research. They state
that classroom data can be collected either quantitatively or qualitatively, and
that it can also be analyzed quantitatively or qualitatively. Figure 1.1 shows
some examples from language classroom research.
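Two of the cells in Figure 1.1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the error codes, the test scores, the cut-off values, and the names coded_errors, test_scores, and proficiency_band are all invented for this example and do not come from any study.

```python
from collections import Counter

# Hypothetical error codes a researcher assigned while reading students'
# writing samples (qualitatively collected data).
coded_errors = ["article", "tense", "article", "preposition", "tense", "article"]

# Quantitative analysis of qualitative data: tabulate how often each
# error type occurs.
error_frequencies = Counter(coded_errors)

# Hypothetical test scores (quantitatively collected data).
test_scores = {"Learner A": 82, "Learner B": 67, "Learner C": 48, "Learner D": 31}


def proficiency_band(score):
    """Assign a proficiency label using illustrative, made-up cut-off scores."""
    if score >= 80:
        return "advanced"
    if score >= 60:
        return "upper intermediate"
    if score >= 40:
        return "intermediate"
    return "lower intermediate"


# Qualitative analysis of quantitative data: sort learners into categories.
bands = {learner: proficiency_band(score) for learner, score in test_scores.items()}

print(error_frequencies.most_common())
print(bands)
```

The same data set could be analyzed either way; the sketch simply pairs each kind of data with the less obvious kind of analysis.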

REFLECTION

Based on what you already know about language classroom research and
about yourself, do you have a predisposition for either quantitative or
qualitative data collection? What about quantitative or qualitative data
analysis? If you do favor one approach over another, explain your position to a
colleague or classmate.

10 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


ACTION

Skim through a recent issue of a professional journal that publishes
research in our field. (Likely journals include TESOL Quarterly, Modern
Language Journal, Prospect, Language Learning, System, Language Teaching
Research, and the Asian Journal of English Language Teaching.) Do the
data-based articles involve quantitative or qualitative data collection, or both?
Do they entail quantitative or qualitative data analysis, or both?

Many published studies contain elements of psychometric research and
elements of naturalistic research. In fact, Grotjahn (1987) argues that the
quantitative/qualitative distinction can refer to three different aspects of
research. These are (1) the design (whether the study is based on an experimental,
quasi-experimental, or non-experimental design); (2) the form of data collected
(whether the study yields quantitative or qualitative data); and (3) the type of
analysis (whether the data are analyzed statistically or interpretively).
Combinations of these three elements define the two pure research designs described
above—the psychometric approach (experimental design, quantitative data,
statistical analysis) and the naturalistic approach (non-experimental design,
qualitative data, interpretive analysis). Grotjahn calls these the analytical-nomological
paradigm and the exploratory-interpretive paradigm, respectively.
However, Grotjahn's framework also yields six mixed or 'hybrid' forms. For
example, R. L. Allwright (1980) investigated the length of learners' speaking
turns and how they got those turns in two lower-intermediate ESL classes. He
tape recorded the classes and transcribed the audiotapes. There was also a
classroom observer present to take notes during the taping. In this case, the design
was non-experimental, the data were qualitative in nature (audiotapes and
transcripts thereof), and the analysis was both interpretive and statistical. We will
return to these issues in Chapter 15, when we consider mixed-methods research.
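The count of two pure and six mixed forms follows if each of the three aspects is treated as a two-way choice. The Python sketch below enumerates the combinations. It is an illustration only, not part of Grotjahn's own presentation: it assumes the design aspect is collapsed to experimental versus non-experimental (quasi-experimental designs are left out for simplicity), and the names DESIGNS, DATA_FORMS, ANALYSES, PURE, and classify are hypothetical.

```python
from itertools import product

# Assumption: each aspect is reduced to two options so that the total
# number of combinations is 2 x 2 x 2 = 8.
DESIGNS = ("experimental", "non-experimental")
DATA_FORMS = ("quantitative", "qualitative")
ANALYSES = ("statistical", "interpretive")

# The two "pure" paradigms named in the text.
PURE = {
    ("experimental", "quantitative", "statistical"):
        "analytical-nomological (psychometric)",
    ("non-experimental", "qualitative", "interpretive"):
        "exploratory-interpretive (naturalistic)",
}


def classify(design, data_form, analysis):
    """Return the paradigm label for one combination of the three aspects."""
    return PURE.get((design, data_form, analysis), "mixed (hybrid)")


combinations = list(product(DESIGNS, DATA_FORMS, ANALYSES))
mixed = [combo for combo in combinations if classify(*combo) == "mixed (hybrid)"]

for combo in combinations:
    print(combo, "->", classify(*combo))
```

A study that uses both statistical and interpretive analysis, as in the Allwright example, would draw on more than one of these combinations.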
When we look at published research, we find Grotjahn's claims about
research design possibilities are borne out. As Bailey (2005) notes, "When the
experimental tradition was dominant, and alternative research paradigms were
scorned for yielding 'soft' data, one seldom saw researchers combining procedures
drawn from the different traditions" (p. 30). In recent years, however,
language classroom research has employed a range of procedures to address
research questions and test hypotheses.
In fact, the emerging trend has been to combine various means of data
collection and analysis. For example, Donato, Antonek, and Tucker (1994) describe
their study of a Japanese FLES (foreign language in the elementary school)
program as a multiple perspectives analysis because it captured numerous points of
view. These researchers collected a wide range of data derived from
questionnaires completed by parents and learners, oral interviews, reflections from the
Japanese teacher, questionnaires from other teachers at the school, and an
observation system. The authors conducted statistical analyses with some of the data
and did a descriptive analysis of the children's interviews. Donato et al. conclude,
"To understand the complexity of FLES programs requires diverse sources of
evidence anchored in the classroom and connected to the wider school
community" (p. 376). In other words, the language learning and teaching these authors
investigated was too complex to be treated satisfactorily with a single type of data
or a single analytic method.
The multiple perspectives analysis described by Donato et al. (1994) typifies
a methodological paradigm shift in language classroom research. Attempts to
combine various types of data collection and analysis from different research
traditions used to be rather uncommon. Nowadays, such efforts are frequently
viewed as appropriate and helpful. The general trend in language classroom
research has been toward a broadened acceptance of varied research approaches.
At this point, a brief historical overview of this transition will be helpful in terms
of understanding the various data collection and analysis procedures we will
explore in this book.

A BRIEF HISTORICAL BACKGROUND

In this section, we will take a historical approach and briefly review some
illustrative investigations into language acquisition in classroom settings. These
studies typify the development of second language classroom research and show
how the different research traditions have been realized over the years.
An early example of classroom research in the psychometric tradition was
conducted by Scherer and Wertheimer (1964). The project was a classic
'methods-comparison' study in which the researchers set out to compare the
grammar-translation method of teaching with the then innovative audiolingual
method. The research question guiding the study was as follows: Is audiolingualism
a more effective method of learning a foreign language for college-level
learners than grammar-translation?
The subjects in this study were two groups of college students learning
German as a foreign language. One group was instructed in listening, speaking,
reading, and writing using translation and grammar studies. The other group
was taught by the innovative audiolingual method, in which the emphasis was on
listening and speaking rather than on reading and writing. In audiolingualism,
translation was avoided and grammatical rules were learned inductively rather
than deductively. At the end of the two-year experimental period, both groups
were tested, and the scores were analyzed to decide whether differences were
statistically significant.
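The text does not say which statistical test Scherer and Wertheimer used. Purely to illustrate what comparing two groups' scores can involve, the Python sketch below computes Welch's t statistic, one common choice for comparing two group means. The scores are invented, and the function name welch_t is hypothetical. Judging significance would also require comparing the statistic with a t distribution, which is omitted here.

```python
from math import sqrt
from statistics import mean, variance

# Invented scores for illustration only; NOT data from the 1964 study.
grammar_translation_scores = [70, 75, 80]
audiolingual_scores = [60, 65, 70]


def welch_t(sample_a, sample_b):
    """Welch's t statistic: difference in means divided by its standard error."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


t_statistic = welch_t(grammar_translation_scores, audiolingual_scores)
print(round(t_statistic, 3))
```

A larger absolute value of the statistic indicates a group difference that is large relative to the variability of the scores.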

REFLECTION

What do you think these researchers found? Before reading further, try to
predict the results of their study based on your own reading and
experience, both as a teacher and as a language learner.

As it turned out, the study by Scherer and Wertheimer did not demonstrate
conclusively that one method was superior to the other, as students' scores
reflected the strengths of the respective methods. Students instructed according to
the grammar-translation method did significantly better than the audiolingual
students on tests of reading and translation, while the audiolingual students did
significantly better at listening and speaking.
Numerous criticisms were leveled at the Scherer and Wertheimer study.
One of these was that the researchers did not look at what actually went on in the
classrooms themselves. (See Clark, 1969.) They simply assumed that the
teachers were following the particular method under investigation. Nor did they
collect data on the teachers' understanding of the principles underpinning the
respective methods. This type of research is referred to by Long (1980) as black
box research because we have no way of knowing what actually happened in the
classroom itself. It is also called a product study because the researchers only
looked at the outcomes—the products of the teaching methods investigated.
In contrast, there are many published classroom-based investigations that
can be called process studies. That is, they focus on the classroom processes of
teaching and/or learning, but they do not try to measure learning outcomes in
any way. This type of research (e.g., the study of students' speaking turns by
R.L. Allwright [1980], described above) attempts to understand what happens in
classrooms without making causal claims as to one set of materials, a given
teaching method, or a particular curriculum being better than another in promoting
language learning.
One methods comparison study that did go inside the 'black box' is that by
Swaffar, Arens, and Morgan (1982). These researchers set out to evaluate the
relative efficacy of audiolingualism in comparison with cognitive code learning in
the teaching of German as a foreign language. This research project had several
similarities to Scherer and Wertheimer's in that both studies attempted to
compare two competing instructional methods. However, it differed from the earlier
study in that Swaffar et al. were aware of the need to look at what was actually
happening in the classrooms rather than making assumptions about what was
happening. In addition, before going into the classrooms, the researchers
surveyed the teachers by getting them to indicate how often they used certain
practices, such as the explicit teaching of grammar.

REFLECTION

Which conclusion below do you think Swaffar and her colleagues found?

• audiolingualism was superior to cognitive code learning
• cognitive code learning was superior to audiolingualism
• the two methods were equal in their effectiveness

What determined your choice?

In fact, Swaffar et al. (1982) found that at the level of classroom action, the
concept of teaching 'method' was questionable because teachers used a range of
techniques rather than adhering slavishly to either audiolingualism or the
cognitive code method:

Methodological labels assigned to teaching activities are, in themselves,
not informative because they refer to a pool of classroom practices
which are uniformly used. The differences among major methodologies
are to be found in the ordered hierarchy, the priorities assigned to the
tasks. Not what classroom activity is used, but when and how form the
crux of the matter in distinguishing methodological practice. (p. 31)
Swaffar et al. went on to recommend a fundamental rethinking of the search for
the best method because their investigation of actual classroom practices showed
that the distinction between different teaching methods at the level of classroom
activity was not a salient one.
In terms of research design, these authors were able to demonstrate the
importance of collecting process data from inside the classroom as well as product
data in the form of test scores. Thus, the report by Swaffar et al. is an example of
what has come to be known as a process-product study (Long, 1984) because it
combined observational data about what actually occurs in language classrooms
with measures of learning outcomes.
Despite the emerging awareness that it was useful to incorporate both
process and product data into the design of classroom research, the debate
between proponents of quantitative and qualitative research remained heated. This
opposition continued even though books were beginning to appear that
attempted to demonstrate the mutual dependence of the two traditions. (See, for
example, Chaudron, 1988.) It seemed that, for some, the psychometric approach
and naturalistic inquiry represented fundamentally different and possibly
incompatible ways of looking at the world.
The discussion of product studies, process studies, and process-product
studies did lead to positive developments, however. Recognizing the importance
of collecting data from inside the 'black box' stimulated the development of
observation instruments of various levels of complexity. One of the most
comprehensive instruments is the COLT (Communicative Orientation to Language
Teaching), which was originally developed for investigating different kinds of
second- and foreign-language programs in Canada. This scheme was
theoretically motivated by communicative language teaching, as can be seen in the
following quote (Allen, Frohlich, and Spada, 1984):
Our concept of communication feature has been derived from current
theories of communicative competence, from the literature on
communicative language teaching, and from a review of recent research into
first and second language acquisition. The observational categories are
designed (a) to capture significant features of verbal interaction in
L2 classrooms and (b) to provide a means of comparing some aspects
of classroom discourse with natural language as it is used outside the
classroom. (p. 233)

By reducing classroom behavior to sets of quantifiable categories, the COLT
observation system enabled researchers to make direct comparisons between
disparate language classes (and even language programs), and then to link the
behaviors to learning outcomes. Spada (1990), for example, used the scheme to
compare the ways in which three different teachers interpreted theories of
communicative language teaching in their classroom practice and "to determine
whether differences in the implementation of communicative language teaching
principles had any effect on learning outcomes" (p. 301).
As mentioned above, research like this is known as process-product research
because it attempts to link classroom processes with learning outcomes. For
instance, in Spada's (1990) study, significant differences were found between the
listening test scores of students in the three different groups, and these were
linked directly to the teachers' behavior:
In interpreting the different performance of learners on the listening
test, the investigator examined both quantitative and qualitative
differences in the listening practice offered in the three classes. The
quantitative results revealed that class A spent considerably more time in
listening practice than the other two classes, yet class A improved the
least. However, because the listening practice in this class did not
prepare learners for the listening input as carefully as the listening
comprehension instruction did in classes B and C, the investigator
concluded that qualitative rather than quantitative differences in instruction
seemed a more plausible explanation for significantly more
improvement in listening comprehension in classes B and C. (p. 303)
Thus, Spada's study connected the process variables (what students and teachers
actually did during lessons) with the product variable (the measurement of
students' improvement in listening comprehension).
This section has provided a brief historical overview of trends in language
classroom research. We will now turn to definitions of several key terms which
will be used throughout the remainder of this book. These terms include
classroom, data, and classroom research.

DEFINING CLASSROOM RESEARCH


In the foregoing sections we have repeatedly alluded to classroom research as if the
term were totally transparent. Indeed, the concept of classroom is probably
straightforward to most readers. A classroom is a place in which teachers and
learners are gathered together for instructional purposes: "The L2 classroom can
be defined as the gathering, for a given period of time, of two or more persons
(one of whom generally assumes the role of instructor) for the purposes of
language learning" (van Lier, 1988, p. 47). This definition encompasses everything
from one-on-one tutorial sessions to a professor lecturing to hundreds of
students. However, with the development of distance learning, and, in particular, the
use of technology, the "gathering together" may happen in a virtual classroom
rather than in a physical space. (We will look at the nature of the online classroom
later in the chapter.)
The general term research has been defined as "the organized, systematic
search for answers to the questions we ask" (Hatch and Lazaraton, 1991, p. 1). A
somewhat more elaborated description says that research is a "systematic process
of inquiry consisting of three elements or components: (1) a question, problem,
or hypothesis; (2) data; and (3) analysis and interpretation" (Nunan, 1992, p. 3).
The term data, as it is used in this book and elsewhere, refers to records of
events (Bateson, 1972). In language classroom research, data are not limited to
test scores or other measurements. They can include audiotapes or videotapes of
lessons, transcripts of interactions, entries from students' or teachers' journals,
responses to interview questions, recordings of students' speech as they do tasks,
responses to questionnaires, samples of students' written work, and so on.
To the three components of research discussed above, we would add that the
results of the inquiry should be published (in the sense of being made public) so
that they can be subjected to critical scrutiny and can inform the field. Sharing
research results can also provide us with new ideas and alternative ways of
analyzing and interpreting the data. Publication can be formal and written, in book
chapters or in print or electronic journal articles. But it can also be less formal
and unwritten, as in conference presentations or progress reports to colleagues.
Figure 1.2 illustrates various possible combinations of formal or informal, and
written or unwritten dissemination of research projects.
Classroom research is sometimes called classroom-based or classroom-centered
research. These terms refer to the procedures described above being conducted
in classrooms:

Classroom-centered research is just that—research centered on the
classroom, as distinct from, for example, research that concentrates on
the inputs to the classroom (the syllabus, the teaching materials) or the
outputs from the classroom (learner achievement scores). It does not
ignore in any way or try to devalue the importance of such inputs and
outputs. It simply tries to investigate what happens inside the classroom
when learners and teachers come together. (D. Allwright, 1983, p. 191)

Written, Formal
   Refereed or invited book chapters, journal articles (electronic or print
   medium)

Written, Informal
   Listserv postings about classroom research projects

Unwritten, Formal
   Refereed or invited conference presentations

Unwritten, Informal
   Discussions of research findings at in-house teachers' meetings

FIGURE 1.2 Possible types of publication of research results

You will also find references in the literature to classroom-oriented research.
This research consists of studies conducted outside the classroom, in laboratory,
simulated, or naturalistic settings, but which make claims for the relevance of the
outcomes for classroom teaching and learning. In this book, we will typically use
the cover term classroom research, which encompasses both classroom-based and
classroom-oriented studies. While the great bulk of the work we look at is
classroom-based, use of the slightly broader term enables us to refer to relevant
studies conducted outside classrooms as well.
In 1991, Nunan carried out a detailed review of classroom-based and
classroom-oriented studies. Of fifty studies reviewed, only fifteen were actually
classroom-based (Nunan, 1991a). In the years since that review, however, many
more classroom-based studies have been conducted, and we will summarize
several of those research projects in the various chapters of this book. One
particularly interesting development during the past two decades has been the
advent of classroom action research, the topic of the next section.

REFLECTION

What does the term action research mean to you? Have you read any action
research reports in the past? If so, what do you recall about this approach?
If not, what can you predict?

ACTION RESEARCH

Action research is an emerging tradition in language classroom research. This
method consists of the same elements as regular research, that is, questions, data,
and interpretation. What makes classroom action research unique is that it is
conducted by classroom practitioners investigating some aspect of their own
practice. In other words, it is carried out principally by those who are best placed
to change and, as a result, improve what goes on in the classroom.
This is not to say that research carried out by non-classroom-based
researchers doesn't lead to change. Nor does it suggest that researchers
and teachers might not collaborate in the research process. However,
nonpractitioner-driven research is often motivated by a desire to identify
relationships between variables that can be generalized beyond the specific sites
where the data are collected. The primary motivation for action research is the
more immediate one of bringing about change and improving teaching and
learning in the classrooms where the research takes place.
Action research as a method involves systematic procedures for collecting
data and understanding their meaning in a local context. Carr and Kemmis
(1986) describe action research as "a form of self-reflective enquiry undertaken
by participants in social situations in order to improve the rationality and justice
of their own practices, their understanding of these practices, and the situations
in which these practices are carried out" (p. 1). The name action research comes
from the fact that planning and taking action are central components of the
action research method.
Carr and Kemmis (1986) emphasize the ongoing cyclical nature of action
research. The first step is to develop a plan for action to improve what is
happening. The next steps are to implement the action plan and to gather data in order
to observe the effects of the action taken in the context in which it occurs. The
final steps in the cycle are to reflect critically on the process and then plan a
second round of research, and so on. The cyclic investigation continues until the
action researcher has accomplished his or her goals.
Another definition comes from Cohen and Manion (1985), who describe
action research as small-scale interventions "in the functioning of the real world"
(p. 208). Such research is intimately connected to the contexts in which it is
conducted. It can involve the collaboration of teachers and other researchers. Cohen
and Manion outline eight stages in the action research process:
1. Identify the problem.
2. Develop a draft proposal based on discussion and negotiation between
interested parties, i.e., teachers, advisors, researchers, and sponsors.
3. Review what has already been written about the issue in question.
4. Restate the problem or formulate hypotheses; discuss the assumptions
underlying the project.
5. Select research procedures, resources, materials, methods, etc.
6. Choose evaluation procedures.
7. Collect the data, analyze the data, and provide feedback to the research team.
8. Interpret the data, draw out inferences, and evaluate the project. (pp. 220-1)
While there are other models of action research, this list provides a helpful
overview of the steps involved in conducting this sort of classroom research.
Bailey (2005) contrasts action research with both experimental and
naturalistic research as follows:

While experimental research is often directed at hypothesis testing and
theory building, and naturalistic inquiry aims to understand and
describe phenomena under investigation, action research has a more
immediate and practical focus. Its results may contribute to emerging
theory, and to the understanding of phenomena, but it does not
necessarily have to be theory-driven. (p. 25)

To note that action research has a practical focus does not demean its value. As van
Lier (1994a) has observed, "We must never forget that it is ... important to do
research on practical activities and for practical purposes, such as the improvement
of aspects of language teaching and learning" (p. 31). (See also van Lier, 1994b.)
In this section, we have given a brief account of action research, a topic that
is described in greater detail later in this volume. To summarize,
it is a method of conducting research that involves the participants (such
as teachers and learners) in ongoing, cyclical investigations of their own contexts
(in our case, classrooms). However, action research should not be confused with
classroom research or teacher research.
As noted above, the term classroom research refers to research conducted in
classrooms—regardless of who does it or what methods are used. Action research,
in contrast, is an actual research method, in that it involves a codified sequence
of steps. It can be effectively employed in language classrooms but has also been
used in other settings—in neighborhoods, in community centers, etc. Teacher
research, in contrast, is characterized by who conducts the investigation. Of
course, teachers can conduct action research in classrooms (Bailey, 2001a), but
they are not limited either by method or locale. For instance, a university-level
EFL teacher might wish to investigate his students' use of English outside the
classroom—talking to tourists, using the Internet, and so on. The similarities
and differences between classroom research, teacher research, and action
research are summarized in Figure 1.3.

Classroom Research
   What: Investigations carried out in classrooms utilizing a range of
   qualitative and quantitative methods of data collection and analysis
   Who: University-based researchers, graduate students, and/or teachers
   Why: To generate insights and understanding, to test hypotheses, to
   generate theory, and/or to produce outcomes that can be generalized

Teacher Research
   What: Investigations carried out in or out of classrooms, utilizing a
   range of qualitative and quantitative methods of data collection and
   analysis
   Who: Teachers
   Why: To improve practice, and/or to generate insights and understanding
   to related practice and theory

Action Research
   What: A cyclical process of identifying practical problems or challenges,
   formulating a plan for addressing them, taking action, evaluating the
   results, and planning subsequent rounds of investigation
   Who: Participants in a setting, including teachers (sometimes in
   collaboration with others)
   Why: To improve one's own practice, to solve problems, and/or to satisfy
   curiosity

FIGURE 1.3 Comparing classroom research, teacher research, and
action research

REFLECTION

Think of a study—one that you have read or that you could imagine
doing—that would involve classroom action research conducted by a
teacher (or a team of teachers). Explain what makes that study simultane
ously (1) action research, (2) classroom research, and (3) teacher research.

ACTION

Skim the reference list for this book. Put a check mark by any items which
sound particularly interesting to you. Can you tell from the titles which
items are likely to involve the psychometric tradition, naturalistic inquiry,
or action research?

THE CLASSROOM REDEFINED—'VIRTUAL' CLASSROOMS

At the present time, technology is having a profound impact on all aspects of life.
In education, the ease with which computers can bring people together across
time and space is forcing a redefinition of the classroom. In the opening section,
we discussed ideas from D. Allwright (1983) and van Lier (1988), who suggested
that classrooms could be defined as places where individuals were gathered
together for purposes of teaching and learning. However, through technology, this
"gathering together" no longer requires the individuals to inhabit the same
physical space.
The following vignettes illustrate some of the changes wrought by technol
ogy in language education and teacher preparation:

• A student in Toronto who was unable to attend class reviews a transcript of
  the lesson that is posted on the Web after the conclusion of the class.
• A teacher educator in Auckland conducts a graduate class on second
  language acquisition through a text chat site with students in San Diego,
  Bangkok, Istanbul, and Buenos Aires.
• A secondary school teacher in Hong Kong posts all of her assignments and
  class handouts onto the class Web site. Students can work with these
  materials online and download those that they want to keep in hard-copy form.
• Using voice chat, EFL students in Beijing, Seoul, and Tokyo take part in a
  conversation class with a teacher based in Bogota.
• A school in Osaka has its students complete an online placement test which
  automatically assesses and places them into instructional groupings in a
  fraction of the time it used to take using a pencil-and-paper test.
• A student in an academic writing program on a field trip in Rio de Janeiro
  submits a draft of his assignment to his teacher in Los Angeles as an e-mail
  attachment. The teacher inserts comments and returns the assignment to
  the student via e-mail.

Changes brought about by technology "challenge our self-concept as foreign
language teachers, because, much more than in the past, we are now called
upon to redefine our roles as educators, since we need to mediate between the
world of the classroom and the world of natural language acquisition" (Legutke,
2000, p. 1).
This redefinition of the classroom is also having an effect on language
classroom researchers who now have to go beyond the four walls of the traditional
classroom to conduct research. For example, a recent study into the discourse
features of classroom interaction broadened its focus from face-to-face
classrooms and included virtual classrooms. The researchers looked at similarities
and differences in the two kinds of discourse as well as at the advantages and
disadvantages of these two kinds of classrooms (Christison and Nunan, 2001). We
will return to these issues and other ideas about online teaching and learning
when we consider discourse analysis and other approaches to analyzing
classroom interactions.

REFLECTION

Have you ever taken or taught a course online? If you have, what were the
differences between the online interaction and what you might experience in a
similar course taught in a face-to-face classroom context? If you haven't,
what do you think the differences might be?

SAMPLE STUDY

Kevin Jepson was an experienced language teacher with a strong interest in
technology. As he learned about research methods in our field and studied second
language acquisition (SLA), he became intrigued with the research about
conversational interaction promoting language learning. Jepson wanted to
investigate what happened during conversations in online classes. He was particularly
interested in negotiated interactions, since a great deal of SLA research suggests
that negotiation processes are key factors in language learning. Negotiated
interactions are those in which the interlocutors—the people who are speaking—
must try to understand one another's meaning. Based on his review of the
literature, Jepson (2005) defined negotiation of meaning as "a cognitive process that
speakers use to better understand one another, that is, to increase the
comprehensibility of language input" (p. 79).

Jepson (2005) compared the interaction in two kinds of online English chat
rooms: those with spoken (voice) chats and those with typed (text) chats. The
technology permitted the students in the voice chat rooms to actually talk to one
another in real time. The text chat rooms allowed the students to communicate
in real time but they had to type their ideas and comments on the computer and
then post them to the chat. He posed two research questions about this context:
1. Which types of repair moves occur in text and voice chats?
2. What are the differences, if any, in the repair moves in text chats and the
repair moves in voice chats when time is held constant?

The repair moves Jepson studied had been identified in previous research. They
consisted of two main categories: negotiation of meaning and negative feedback.
The negotiation of meaning category had five types: (1) clarification requests,
(2) confirmation checks, (3) comprehension checks, (4) self-repetitions and
paraphrases, and (5) incorporations. The negative feedback category also had five
types: (1) recasts, (2) explicit correction, (3) questions, (4) incorporations, and
(5) self-corrections.
The data for this study were recorded simultaneously in the voice chat
rooms and the text chat rooms of an online English program using two
computers. Jepson audio-recorded five minutes of voice chats at the same time he saved
the texts generated by the participants in the typed chat rooms. He repeated this
procedure five times, for a total of twenty-five minutes of typed chat and
twenty-five minutes of voice chat. Jepson later transcribed the interaction in the voice
chat rooms. He counted the number of times the participants engaged in
negotiation of meaning and in negative feedback.

REFLECTION

Imagine being a language learner participating in an online chat conducted
in a language other than your native language. What do you think the
differences would be for you as a participant if the chat were typed compared to
a chat in which you were actually talking? What differences do you think
Jepson found when he compared the repair moves in the voice chats and
the typed chats?

The results of this study are interesting and complex and too numerous to
list in detail here. The main findings can be summarized as follows: (1) There
were more repair moves in general in the voice chats than in the text chats.
(2) Likewise, there was more negotiation of meaning in the voice chats than in
the text chats. (3) There were fewer negative feedback repairs than
negotiation of meaning repairs. All of these differences were statistically significantly
different.

REFLECTION

Here we are intentionally using a key phrase that often appears in research
reports: statistically significantly different. In the culture of research,
significant (or significantly) has a special meaning. Perhaps you have seen
this phrase in other research reports. What do you think it means?
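
One way to make this phrase more concrete is to work through a small calculation. The sketch below is only an illustration: the counts are invented and are not Jepson's data, the function names are our own, and the chi-square goodness-of-fit test shown here is not necessarily the exact procedure Jepson followed. It assumes that both kinds of chat were observed for the same amount of time, so that chance alone would be expected to split the repair moves evenly between them. The critical value of 3.841 is the standard chi-square cutoff for one degree of freedom at the .05 level.

```python
# Hypothetical illustration only; the counts used below are NOT Jepson's data.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-square critical value, df = 1, alpha = .05


def chi_square_goodness_of_fit(observed, expected):
    """Return the chi-square statistic: the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def compare_two_counts(count_a, count_b):
    """Check whether two counts differ by more than chance would predict.

    Assumes equal observation time for both conditions, so the null
    hypothesis expects the total to be split evenly between them.
    Returns the statistic and whether it exceeds the .05 critical value.
    """
    total = count_a + count_b
    expected = [total / 2, total / 2]
    statistic = chi_square_goodness_of_fit([count_a, count_b], expected)
    return statistic, statistic > CRITICAL_VALUE_DF1_ALPHA_05


if __name__ == "__main__":
    voice_repairs, text_repairs = 60, 30  # invented numbers for illustration
    statistic, significant = compare_two_counts(voice_repairs, text_repairs)
    print(f"chi-square = {statistic:.2f}, significant at .05: {significant}")
```

With these invented counts the statistic is well above the cutoff, so a difference this large would be unlikely to arise by chance alone. Two counts that are nearly equal (say, 46 and 44) would produce a statistic far below the cutoff.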

PAYOFFS AND PITFALLS

There are a number of pitfalls in choosing to conduct research, perhaps
particularly classroom research. It is time-consuming, and it can be quite challenging.
In addition, the entire process can be frustrating if you don't have the proper
tools, training, or guidance. You may work in a system that does not value
teachers doing research, so you might wonder what the importance is of undertaking
the effort.
Research can also be frustrating if you conduct a study and the results are
not what you had hoped for. Or perhaps you complete your study and give a
presentation about it at a conference, but the audience is unimpressed with your
work, or even hostile. You might write a formal report on the research and
submit it to a journal, only to have the paper rejected for publication and to receive
harsh criticism from some unknown, anonymous reviewers. (Don't worry! This
is part of the process. We have both had numerous papers rejected by reviewers.)
Nevertheless, there are many payoffs associated with conducting research,
especially research in language classrooms. As teachers and teacher educators,
we are constantly amazed and renewed by the interesting things we learn about
teaching and learning through the research process. And on those occasions
when a paper or conference presentation has been accepted, we have benefited
from the processes of writing and delivering our ideas, and getting feedback
from others.
We have also found that our research efforts influence our teaching, and vice
versa. Particularly when we have conducted research, including action research,
in our own classrooms, we have gained new insights that we might not have
gained if we had simply gone on teaching day after day. Through the years, we
have developed new skills, both as teachers and as researchers, and have made
many new friends and acquaintances by getting involved in research.

CONCLUSION

The aim of this chapter has been to set the groundwork for the rest of the book.
We have described the main characteristics of two predominant traditions:
psychometric research and naturalistic inquiry. We have considered how those
traditions have underpinned the evolution of classroom research. We have
provided a brief historical overview, as well as a definition (and redefinition) of
classrooms and of research. We have contrasted action research, teacher
research, and classroom research as concepts, and have also explored the notion
of teacher-initiated action research. We have suggested that teachers have
many possible roles in the research process, including being active researchers
themselves.
We will end the chapter with discussion questions and tasks that you can use
to deepen your understanding of the issues presented here. We will also suggest
some readings that should be helpful to you if you wish to pursue these topics
further.

QUESTIONS AND TASKS

1. Identify a research question that would be most appropriately studied in
the psychometric tradition and one that would be best investigated using
naturalistic inquiry. Compare your ideas with a classmate or colleague.
2. What mental image does the term teacher-researcher evoke for you?
3. What is your attitude towards the idea of teachers doing research?
4. What skills would be needed by a teacher who wanted to do research?
Make a list and compare it with the list of a classmate or colleague.
5. Given your current teaching situation, or a situation with which you are
familiar, what do you see as the advantages and disadvantages of teacher
involvement in language classroom research?
6. Think of a situation in a language class (whether you are the teacher or a
learner) that you would like to change. What would be the action(s) you
could take if you were going to conduct an action research project about
that situation?
7. How have developments in technology caused you to redefine your own
notion of 'the classroom'?
8. List three to five key issues that you'd like to see addressed by classroom
researchers, or that you yourself would like to investigate in your
classroom.
9. What is your view of objectivity and subjectivity in language classroom
research? Can you see a place for subjective data and interpretations or
should all research be objective in nature?
10. In your view, what is the relationship between teaching and research,
specifically language classroom research as we have defined the term in this
chapter?

SUGGESTIONS FOR FURTHER READING

If you would like to learn more about the early history of language classroom
research, we recommend the articles by D. Allwright (1983), Brumfit and Mitchell
(1990a), Gaies (1983), Long (1983a; 1984), Pica (1997), and Seliger (1983a). For
book-length treatments of classroom research, see D. Allwright and Bailey
(1991), Brumfit and Mitchell (1990b), Chaudron (1988), McKay (2006), Nunan
(1992), and van Lier (1988).
For comparisons of the various traditions in language classroom research,
see Bailey (1998a; 2005). For good introductions to action research, we
recommend Burns (1998), Nunan (1990, 1993), van Lier (1994a), and Wallace
(1998). A classic methods comparison study is Smith's (1970) report on
the Pennsylvania Project, which contrasted cognitive language teaching and
the audiolingual method. See also Otto (1969).
If you would like to read Jepson's (2005) paper comparing voice chat and
typed chat, you can find it easily online. It is available at the Web site for the
online journal Language Learning and Technology:
http://llt.msu.edu/vol9num3/jepson.

CHAPTER 2

Getting Started on
Classroom Research

Writing is easy. All you do is sit and stare at a blank sheet of paper until
drops of blood form on your forehead. (Fowler, as cited in Applewhite, Evans, &
Frothingham, 2003, p. 300)

INTRODUCTION AND OVERVIEW

Most books on classroom research eventually get around to the practical side of
the business, usually culminating with a chapter on doing research. However, we
decided to jump in and introduce some of the practicalities, along with the
difficulties, of doing research from the very beginning of this book. There are several
reasons for this decision. First, we want you to start thinking about the research
process from your own perspective right at the outset. We also hope that you
might "get your feet wet" by doing some relatively informal problem posing,
data collection, and data analysis before you get to the end of the book. Thirdly,
we hope that this chapter will provide a lens through which you can view the rest
of the book. We want you to be able to read about data collection and analysis
issues with a strong grounding in the practicalities of the research process.
In essence, this chapter outlines some of the typical procedures that
underpin the planning and implementation of a classroom research project, regardless
of the research tradition you choose. In any such project, whether it is carried out
by experienced university researchers with large grants or classroom teachers
who are simply interested in understanding and improving the quality of what
goes on in their classrooms, researchers need to consider the following questions:
• What aspect of classroom teaching and learning am I interested in, and
what specifically is it about this issue that I really want to know?
• Does anybody else have an answer to my question?
• How do I get started?
• What kinds of data will be relevant to my research interest and question?
• How will I gather those data?
• What techniques exist for analyzing my data?
• How can I make my research available to anyone else who might be interested?
These questions correspond to the following phases of a research project:
(1) area identification, (2) question formation, (3) literature review, (4) planning
and implementation, (5) data collection, (6) data analysis, and (7) reporting. In
this chapter, we will deal with each of these areas although we will mainly focus
on area identification, question formation, the literature review, and defining
variables during the planning stages of research. Additional issues in data
collection and analysis will be dealt with further in subsequent chapters about
particular research methods.
We also want to introduce the idea that research is a kind of culture. It has
its own values, norms, artifacts, rules, and procedures. Like other sorts of
cultures, research has subcultures. If you keep this metaphor in mind as you read
this book, you can imagine yourself as being an anthropologist, exploring the
culture(s) of language classroom research. We will return to this theme from
time to time as we cover the concepts of research design and research methods.

DEVELOPING A RESEARCH QUESTION


We said in Chapter 1 that a minimum requirement for an activity to qualify as
research is that it needs to contain three key components: "(1) a question,
problem, or hypothesis, (2) data, and (3) analysis and interpretation" (Nunan, 1992,
p. 3). We also added a fourth element, that is, the activity should be published.
We are using the term publish here in its original and broadest sense of 'to make
public.' If we don't make our work public, then it is not available to others. And
if it is not available to others, then it cannot be subjected to critical scrutiny.
Without critical scrutiny, the activity is more like reflective teaching than
research. (Of course, reflective teaching is a valuable undertaking in its own
right, but that is not our focus in this book.)
As suggested by the components above, the first step is posing a research
question. Getting the research question right is one of the most important things
to do in beginning a research project. Unless you get the question right, the
subsequent steps in the research process are often impossible to carry out. In this
section, we look at some of the issues you need to consider in formulating a
research question.
In some approaches to research (such as the experimental method in the
psychometric tradition), the question is posed first, followed by the data collection
and the analysis. The research question is typically written in question form, but
it may be posed as a particular form of sentence called a hypothesis. A hypothesis
is a carefully worded statement that the researcher sets out to test by collecting
relevant data and analyzing those data. In the culture of the psychometric
tradition, and especially the experimental method, the skills of hypothesis
posing and testing are highly valued. (We will return to hypothesis testing in
Chapters 3 and 4.)
In other kinds of research, such as ethnography, the researcher may start out
with a research question that is somewhat general and broadly formulated. The
researcher then collects some initial data that help to refine and shape the
research. In other words, there is an interaction between the data and the research
question. It's almost as if the data and the question are having a conversation.
For many teachers who wish to do classroom research, the research idea is
initially prompted by a problem, puzzle, or challenge that arises in the course of
their teaching. For example, a foreign language teacher may find that when the
students get really involved in communicative group work tasks, they tend to
switch to their first language. So, the teacher's initial questions might be, "What
prompts students to switch from the target language to their first language, and
how can I encourage them to work in the target language?" This view of the
situation may then lead the teacher to explore the complexities of code mixing
and code switching.

REFLECTION

Take a few moments to reflect on your own classroom, or a classroom with
which you are familiar. Brainstorm a list of issues that might provide the
basis for classroom research.

If you are having difficulty coming up with ideas for research, you might
follow Hatch and Lazaraton's (1991) suggestion that you keep a research journal.
Such a journal can be a useful resource when it comes to defining what interests
you and identifying a research area. Here is how such a journal might be
developed:

Each time that you think of a question for which there seems to be no
ready answer, write the question down. Someone may write or talk
about something that is fascinating, and you wonder if the same results
would obtain with your students, or with bilingual children, or with a
different genre of text. Write this in your journal. Perhaps you take
notes as you read articles, observe classes, or listen to lectures. Place a
star or other symbol at places where you have questions. These ideas
will then be easy to find and transfer to the journal. Of course, not all
these ideas will evolve into research topics. Like a writer's notebook,
these bits and pieces of research ideas will reformulate themselves
almost like magic. Ways to redefine, elaborate or reorganize the
questions will occur as you reread the entries. (pp. 11-12)

TABLE 2.1 General areas of interest and more specific research topics

Area                    Topic

Teacher questions       The relationship between teacher questions and
                        learner responses
                        Closed versus open questions
                        Display versus referential questions

Direct instruction      Inductive versus deductive teaching
                        Teacher input and learner output
                        Wait time and learner output
                        Teacher speech modifications

Error correction        What types of errors to correct
and feedback            When to correct errors
                        How to correct errors
                        Student self-correction

Classroom management    Departures from the lesson plan
                        Managing mixed-ability groups
                        Teaching large classes

Student interaction     L1 versus L2 talk
                        Monitoring language use in groupwork
                        Increasing the amount of student talk

Task analysis           The different demands that tasks make on learners
                        Task type and difficulty
                        The relationship between task types and learner
                        language

Learning strategies     Strategy differences of good and poor learners
                        The effect of consciously teaching strategies
                        Strategy use by learners of various proficiency levels

Affective factors       Enhancing learners' motivation
                        Managing learners' anxiety
                        Learners' attitudes and achievement

Other sources of ideas can be found at the ends of articles, theses, or dissertations.
These reports usually include a section called "Suggestions for Further Research."
Table 2.1 contains some general research areas and topics for classroom
research. This list is very selective and is intended only to be illustrative since there
are many other topics that could form the basis for classroom investigation.
As we have suggested, research questions can come from many different
sources: a problem that arises in the classroom, an interesting article,
discussions with colleagues, a journal of research ideas that we have kept over time, and
so on. One advantage of reading about the topic in question is that we may well
find a study that can help us take our research interests forward. For example,
although we are particularly interested in listening comprehension, we may find
a study on the effect of background knowledge on reading comprehension. This
study might prompt us to ask whether the effects of background knowledge
function in listening activities as well. As another example, we may read a study
on the development of morphemes in German as a foreign language that
suggests the acquisition order of these morphemes can't be changed by instruction.
This idea may cause us to wonder whether there are certain grammatical forms
in English that are also impervious to instruction.
After identifying a general area that is of interest to us, such as listening
comprehension or grammatical development, the next step is to pose and refine
the question, so that answering it is both doable and worthwhile. We may well
have a worthy question such as, "Is there a list of grammar items in English that
appear in a certain order that can't be changed by instruction?" However, when
we reflect on the question, we may decide that it is beyond our ability to deal
with at the moment, perhaps because of lack of time, expertise, and/or resources.
Having identified a general area and a topic within that area, the next task is
to formulate a question or questions. This process is much more challenging than
it might seem to someone who hasn't done it before. Researchers and graduate
students sometimes take months to get their questions right. For example,
recognizing the breadth of the morpheme acquisition issue described above, we may
decide to work on a small part of the puzzle. After further thought and additional
reading, we may decide to restrict our attention to the acquisition of question
forms. Our new question might read, "What is the order of acquisition of
wh-questions in English? Do learners acquire questions with is (such as, 'Where is
your school?') before they acquire do questions (such as, 'Where do you live?')?"

ACTION

Select one of the topics in Table 2.1 that interests you and turn it into a
researchable question.

Wiersma (1986) recommends a three-step procedure in developing a
question: (1) general identification of area issue or problem, (2) restating the issue
(almost like writing a title for your study), and (3) refining the issue in forming a
specific question to be answered by the research. Table 2.2 provides examples of
area identification through to question formation.
In another treatment, Seliger and Shohamy (1989) suggest that the
preparatory stage of a research project should involve four phases:
Phase 1: Formulating the general research question (which may emerge
from the experience and interests of the researcher, other research in
language acquisition, or sources outside of second language acquisition)
Phase 2: Focusing on the question (in which the researcher decides on the
importance and feasibility of the question)
Phase 3: Deciding on an objective
Phase 4: Formulating the research plan or objective
As you can see, there is a great deal of thought and work involved in research
before we even begin to collect data!

TABLE 2.2 Deriving questions from original issues (adapted from
Wiersma, 1986)

Original Issue         Restatement                     Refined Question

Achievement and        A study of the effects of       Do three different teaching
teaching techniques    three teaching techniques on    techniques have differing
                       reading achievement scores      effects on the reading
                       of junior high school           achievement scores of junior
                       students                        high school students?

Bilingual education    A study of the characteristics  What are the characteristics
                       of bilingual education in the   of bilingual education as it
                       elementary schools of City A    is implemented in the
                                                       elementary school of City A?

ACTION

Think about the following important topics in language education. Try
doing Phases 1 and 2 of Seliger and Shohamy's (1989) list above, using one
of these topics.
Content-based instruction          Teacher questions

Having formulated a question or questions, the next step is to evaluate the
question. Not all questions are researchable, and not all researchable questions
are worthwhile. It is therefore important, as you begin to develop a research
plan, to keep your motives in mind. Ask yourself why you propose to investigate
a particular question. Coming up with a satisfactory justification before you
invest a lot of intellectual, physical, and emotional energy in the research may
save you a great deal of frustration once you actually embark on the research.

REFLECTION

Which of the following questions are researchable? Which, in your
opinion, are worth researching? Are there any that, while being researchable,
are not really feasible in practical terms?
1. Do learners of English as a foreign language acquire the ability to form
yes/no questions by inverting the subject and the auxiliary verb before
they acquire wh-questions formed through do insertion?
2. Should culture be part of the foreign language curriculum?
3. Are language teachers who wear formal dress more likely to work for
private language schools or university language centers?

You probably decided that all of these questions are problematic for
different reasons. The first is researchable, but probably only for someone who has a
grant or is doing doctoral research in second language acquisition, since
answering this question thoroughly would require considerable time and resources.
(For example, it might entail testing learners over a period of time, perhaps
tape-recording and transcribing their speech, or collecting and analyzing their
writing samples, and so on.)
The second question is not really a research question because it primarily
involves making a value judgment. It seeks an opinion rather than information
based on data. In empirical research, answers to research questions come directly
from the data that are collected and analyzed in the course of the research. Of
course, some people (and we would include ourselves in this group) believe that
it is impossible to learn a language without also learning the culture. Others,
including some of our acquaintances who have developed multilingual
competence themselves, believe that language and culture are separable. A more
researchable question in this area would be, "What attitudes do teachers hold
towards teaching culture as part of a foreign language curriculum?" However,
this question is quite different from Question 2 in the Reflection box above.
The third question is certainly answerable, but one would have to ask
whether doing so would be worthwhile. Sociologists may be able to read deep
meaning into the proclivity for formal business clothing on the part of teachers
who work in private schools, but for us the relationship is essentially meaningless.
Here are two important questions to ask yourself when you are formulating
your research questions: What will I have learned if I answer this research
question? What will the language-teaching field have gained when I have answered
this research question? If the answer is "Nothing," then it's probably best not to
proceed with your research question in its present form. It may require more
refinement or elaboration, or you may want to focus on other topics instead.
We have seen that at the outset of classroom research projects, it is
important to pose our research questions carefully. They should be researchable and
important questions that lead to an improved understanding of language,
language teaching, and/or language learning. We turn now to the literature
review, an important part of the overall research process and one that can
inform the posing of appropriate research questions.

REFLECTION

Here is a story that Kathi Bailey tells her research students. Read the story
and see what it has to do with posing appropriate research questions.

My mom took a course in ikebana (Japanese flower arranging) at
night school in California. The European and North American traditions
of flower arranging often involve large bouquets of flowers, but the
Japanese tradition uses fewer flowers and emphasizes the harmony and
simplicity of the arrangement.
The students (all American women) were told to bring flowers to the
first night of class. When the Japanese instructor entered the room, her
eager pupils were sitting at the large tables with mountains of flowers piled
in front of them. In true Anglo-American fashion, they had collected huge
bouquets to bring to class.
The Japanese instructor looked at the flowers, looked at the women,
and then said, "Ladies, this is a course on Japanese flower arranging.
Please divide your stack of flowers in half and set one half aside." She
waited while the students divided their flowers. Then she said, "Now
look at the flowers that remain. Divide the flowers in half and set half
aside." Again she waited while the students separated their flowers into
two smaller groups and put half of them off to the side. When they had
done so, the instructor said, "Now we are ready to begin Japanese flower
arranging."

WRITING THE LITERATURE REVIEW

Novice researchers often equate the process of doing a literature review with doing
research. It is true that reviewing the literature is important, but locating and
summarizing what others have written about a topic is often called library research or
secondary research (J. D. Brown, 1988). In this book, we are concerned with empirical
research, that is, investigations in which the researchers collect and analyze original
data of their own in order to answer the research questions they have posed.
A literature review can be published as a stand-alone piece or as part of a
report on empirical research. There are at least four main reasons for doing a
literature review when conducting empirical studies. We will examine each in turn.
The first reason for doing a literature review is to obtain background
information on the area that you have chosen to investigate. A systematic literature
review will acquaint you with previous work in the field and should alert you to
problems and potential pitfalls in your chosen area. It will also help you locate
clear definitions of key terms.
The second reason is to help you to identify research gaps (Cooley and
Lewkowicz, 2003), which involve work that hasn't yet been done. Having
reviewed and summarized the work of others, you will be able to spell out a
research space or gap in the research literature that you propose to fill. Finding
and articulating such gaps is part of developing the rationale for your research.
The third reason is to discover tools that could help you answer your own
research question. For example, you may find a questionnaire someone else has
developed that is directly related to your interest, or you might learn about a
classroom observation instrument that would be useful in your study. So, as long
as you choose judiciously and cite your sources properly, it is acceptable to
utilize research procedures and tools developed by others.

Finally, reading widely in the field can help to reassure you that your
proposed research question has not already been answered by someone else. There
are few things more disheartening than to invest time and energy into a research
project, only to find that someone else has been there before you. However,
there is a value to replicating studies in new contexts. By replicating, we mean
repeating a study (whether your own or someone else's) with different
participants, and perhaps with some improvement upon the earlier study.
The first step in developing the literature review is to create an annotated
bibliography. This resource lists those studies you have consulted that have a
bearing on your own research. These may range from brief reports to entire
books. Each entry in the bibliography would be annotated with a concise
summary of its contents as they relate to your proposed study. The annotated
bibliography should contain a full reference in whichever style you favor. We prefer
the American Psychological Association (APA) guidelines, which can be
obtained from their Web site at www.apa.org. The APA format is relatively
straightforward and is used by many of the leading journals in our field.

ACTION

Visit the APA Web site and note the format for citing books, articles in
journals, and chapters in books.

An annotation is a brief summary of a book, article, or chapter that you can
consult in planning and writing about your own research. Annotating a research
report can help you analyze and synthesize what you have read, and it will help
you understand and remember the information. Here is an example of an
annotation based on Jepson's (2005) research comparing the interaction in voice and
text chat rooms, which served as the sample study in Chapter 1:

Jepson, Kevin. (2005). Conversations—and negotiated interaction—in text and voice chat
rooms. Language Learning and Technology, 9(3), 79-98.

Jepson wanted to investigate the "quality of interaction among English L2 speakers in
conversational text or voice chat rooms" (p. 79). He compared the voice chats and text (typed)
chats of non-native speakers interacting on the Internet. He investigated which types of
repair moves occur in text and voice chats, and looked for any differences between the repair
moves in the typed chats and the voice chats. Jepson recorded ten 5-minute, synchronous chat
room sessions (five of text chats and five of voice chats). "Significant differences were found
between the higher number of total repair moves made in voice chats and the smaller number
in text chats" (p. 79). The repairs in the voice chats were often related to students'
pronunciation issues. He used chi-square to analyze the quantitative data.

Annotations like this can be entered into a computer database and filed
alphabetically in a word-processed document. Your annotated bibliography is then
ready to be drawn on when you create your literature review. You will see that we
have put the full reference at the top of this annotation. It is very important to
keep careful records of your sources, making sure your reference list is complete
and accurate, so you don't have to spend valuable time searching for missing or
incomplete references.
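
For readers who keep their annotations electronically, the record keeping described here might be sketched in Python as follows. This is only an illustration under our own assumptions: the `Annotation` class, its field names, and the `file_alphabetically` function are hypothetical, and the reference and summary strings are placeholders to be filled in from your own reading.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    author: str      # surname of the first author
    year: int        # year of publication
    reference: str   # the full reference, recorded once and kept complete
    summary: str     # the annotation itself


def file_alphabetically(annotations):
    """Return the annotations ordered by author surname, then by year."""
    return sorted(annotations, key=lambda a: (a.author.lower(), a.year))


# Placeholder entries: only the author names and years come from this chapter.
entries = [
    Annotation("Tuckman", 1999, "(full reference goes here)", "(summary goes here)"),
    Annotation("Jepson", 2005, "(full reference goes here)", "(summary goes here)"),
    Annotation("Merriam", 1988, "(full reference goes here)", "(summary goes here)"),
]

for entry in file_alphabetically(entries):
    print(f"{entry.author} ({entry.year})")
```

Keeping the full reference in the same record as the summary is one simple way to avoid the missing or incomplete references mentioned above.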
The difference between an annotated bibliography and a literature review is
that the former consists of separate entries arranged alphabetically by author,
while a literature review is thematically organized. It extracts, records, and
synthesizes the main points, issues, findings, and research methods of previous
studies. We like to use the analogy of a quilt to explain this relationship. The
annotations are like bits of cloth, the raw materials, assembled and organized
before you start quilting. An effective literature review, in contrast, is more like
a well-designed and carefully executed quilt. It is a unified whole.
In ordering and arranging your literature review, Merriam (1988) suggests
that it is a good idea to differentiate between data-based research and conceptual (or
non-data-based) research (referred to above as secondary research). Data-based
studies draw on and report empirical information collected by the researcher.
Non-data-based writings, on the other hand, "reflect the writers' experiences or
opinions and can range from the highly theoretical to popular opinions" (p. 61).
When writing literature reviews, some researchers distinguish between
substantive and methodological issues, and discuss these ideas in separate sections
of the review. Substantive issues refer to what researchers actually investigated,
that is, the content of their research (e.g., teacher talking time, decision making,
task types, etc.). Methodological issues refer to how the researchers did their
research, that is, the data collection and analysis techniques they used (e.g.,
observation, checklists, transcript analysis, teaching journals, stimulated recall, etc.).
Wiersma (1986) listed the following useful steps for creating a literature
review:

1. Select studies that relate most directly to the problem at hand.


2. Tie together the results of the studies so that their relevance is clear. Do
not simply provide a compendium of seemingly unrelated references in
paragraph form.
3. When conflicting findings are reported across studies—and this is quite
common in educational research—carefully examine the variations in the
findings and possible explanations for them. Ignoring variation and simply
averaging effects loses information and fails to recognize the complexity of
the problem.
4. Make the case that the research area reviewed is incomplete or requires
extension. This establishes the need for research in this area. (Note: This
does not make the case that the proposed research is going to meet the
need or is of significance.)
5. Although the information from the literature must be properly referenced,
do not make the review a series of quotations.

Chapter 2 Getting Started on Classroom Research 35


6. The review should be organized according to the major points relevant to
the problem. Do not force the review into a chronological organization,
for example, which may confuse the relevance and continuity among the
studies reviewed.
7. Give the reader some indication of the relative importance of the results
from studies reviewed. Some results have more bearing on the problem
than others and this should be indicated.
8. Provide closure for the section. Do not terminate with comments from the
final study reviewed. Provide a summary and pull together the most important
results. (pp. 376-377)

In our experience, these guidelines can be very helpful, especially to novice
writers. You can also use these guidelines as criteria in evaluating literature reviews
written by other researchers.

ACTION

Find a journal article or a book chapter about classroom research. Skim the
literature review. How have the authors organized the information in their
literature review?

SOME KEY CONCEPTS IN STARTING RESEARCH PROJECTS

In this section, we will describe and define some important concepts related to
research design. These concepts are fundamental to understanding research of
all kinds, not just classroom research. They are central to designing research
projects of your own. There are many concepts that you will need to master for
an advanced knowledge of research, and we don't plan to cover all of these here.
In this section, we will restrict our attention to the following: types of variables,
types of data, and operational definitions.
In Chapter 1, we noted that the psychometric tradition and naturalistic
inquiry have been the historically dominant approaches to conducting language
classroom research. We also said that action research is gaining ground as a
viable approach to doing classroom research. No matter which of these
approaches you use, the concepts of data and variables will be very important in
planning your research.

Types of Variables and Types of Data


The notion of a variable is intuitively straightforward, although students new to
the research process often find this concept tricky, particularly when they are
required to identify different kinds of variables in a research report. Simply put,
a variable is something that is free to vary. A moment's thought will therefore
reveal that the number of variables in the world is infinite. Think of the possible
variables that might be included under the category of "personal characteristics."
These might include, but would not be restricted to, height, weight, eye color,
handedness, income, number of siblings, number of hobbies, first language,
country of origin, and so on.
Turning to language classrooms, the number of variables is also great. This
action box lists several teachers' comments that are about variables that may
influence teaching and learning.

ACTION

Think about the following statements made by teachers and identify the
variables implicated in the statements.
Teacher A: I always used to teach grammar inductively, but I've recently
started experimenting with deductive techniques.
Teacher B: Your grades on the listening test I gave you Friday are much
improved.
Teacher C: I used to teach in an all girls' school, but I now teach in a coed
school.
Teacher D: I just can't get my reading group motivated.
Teacher E: I've been trying to wait longer between asking a question and
repeating the question or giving students the answer.
Teacher F: I seem to have a class of holistic learners this semester.

Here are the variable labels we would apply to these teachers' comments:

Teacher   Variable

A         Instructional method
B         Listening test scores
C         Students' gender
D         Learners' motivation
E         Instructional wait time
F         Learning style

These variables are different in kind because the data that comprise them are
different. The simplest kind of data is known as nominal data. The word nominal
is an adjective formed from the word noun. A nominal variable is something that can be
named. Nominal data are also called categorical data because they involve putting
things into categories to create variables. With nominal variables, we can create
a series of 'boxes' that can be given a label and to which instances of each
variable can be assigned. Gender, for example, is a nominal variable because we
can label the boxes to which individual learners will be assigned as either male or
female. (Jack goes in the male box; Jill goes in the female box.) Instructional
method is another nominal variable. In this case, the boxes could be labeled
inductive and deductive. Usually there is an "either/or" quality to nominal
variables—people or data fit in only one category or another. For instance,
students often come from one first-language background—say, Spanish or
Arabic or Swahili or Dutch. (Students raised in bilingual homes would be
classified differently.)

REFLECTION

Think of three other examples of variables made up of nominal data.

Not all variables are nominal to begin with, although they can be turned into
nominal variables. Consider test scores. Such data will typically be a sequence of
numbers: 26, 74, 33, 52, 88, 81, 45, 62, and 49. These data cannot be assigned to
boxes in the way gender and instructional method can. We could, if we wished,
turn them into nominal variables by creating two (or more) boxes, say "high
achievers" and "low achievers," and deciding, somewhat arbitrarily, that all
learners who scored 50 and above would be placed in the high achievers' box,
and those scoring less than 50 would be placed into the low achievers' box.
Notice that this data reduction exercise results in less detailed information than
do the original scores. Saying that Billy is a low achiever, while Nancy is a high
achiever gives us less precise information than providing the students' actual
scores. If we knew, for instance, that Billy had scored 49 and Nancy had scored
51, our interpretation of the students' proficiency would be much different than
if Billy had scored 24 and Nancy 89.
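
The data reduction described above can be sketched in a few lines of Python. This is only an illustration: the function name `to_achievement_group` is our own invention, the cut-off of 50 is the arbitrary one mentioned in the text, and the scores are the nine listed above.

```python
# Sketch: turning interval-like test scores into a nominal variable.
# The cut-off of 50 is the arbitrary threshold used in the text.

def to_achievement_group(score, cutoff=50):
    """Return the label of the 'box' into which a score falls."""
    return "high achiever" if score >= cutoff else "low achiever"


scores = [26, 74, 33, 52, 88, 81, 45, 62, 49]
groups = [to_achievement_group(s) for s in scores]

for score, group in zip(scores, groups):
    print(score, group)

# A score of 49 and a score of 51 land in different boxes even though
# they differ by only two points.
print(to_achievement_group(49), "/", to_achievement_group(51))
```

Once the labels replace the scores, a two-point gap and a sixty-five-point gap look exactly the same, which is the loss of precision discussed above.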
Test scores and other measurements are typically considered to be interval
data because they are measured on what is called an interval scale. This concept is
easy to understand in terms of measurements of length, such as kilometers or
inches. An inch is an inch whether it is the difference between fourteen inches
and fifteen inches or between sixty-five inches and sixty-six inches. A kilometer
remains the same length whether we are talking about the distance between four
and five kilometers, or between 320 and 321 kilometers. The unit of measure is
a constant interval. In the psychometric tradition, researchers often try to collect
interval data because there are powerful statistical analyses that can be done with
interval (or interval-like) data.
A third concept that is important in our field is ordinal data—that is, data
that are represented in an order. In the example above, we can say that Billy
scored lower than Nancy, whether their scores were 49 and 51, respectively, or
24 and 89, respectively. Ordinal data lack the precision of interval data, but they
are often useful.
Ordinal measures are sometimes used in classroom research when precise
counts or measurements are not appropriate or are not available. For example, if
we lack a clear means of measuring fluency, we might ask a teacher to rank four
students from the most fluent to the least fluent in speaking the target language.
Let us say the teacher tells us that Keiko's spoken English is more fluent than
Juan's, but Juan's is more fluent than Ahmed's. The teacher feels that Lee is the
least fluent of the four students. The rank ordering of these students' names by
fluency—Keiko, Juan, Ahmed, and Lee—is a set of ordinal data. Note that we
cannot tell from rank-ordered data how much more fluent Keiko is than Juan, or
how much less fluent Lee is than Ahmed. We only know their relative ranks in
terms of the teacher's view of their fluency.

REFLECTION

Think of three more examples of interval data and three more of ordinal
data that might occur in studies of language learning and teaching.

A final type of scale for measuring variables is the ratio scale, which
measures absolute values such as temperature. Ratio scales are very important in the
physical sciences, but they are of little value in applied linguistics because
constructs such as language proficiency do not exist in absolute quantities.
(Someone who obtains a score of 50 on a grammar test does not necessarily know twice
as much grammar as someone who obtains a score of 25.) As they are of little
practical interest to us in classroom research, we will not deal with ratio scales
any further in this book.

REFLECTION

Which of the following variables involve interval, ordinal, and nominal
(or categorical) data?
Test takers' being put into two groups on the basis of the initials of their
first names: those from A to N and O to Z
A list of test takers' names from the highest to the lowest scores (but
without the scores being listed)
A list of students' actual scores on the Test of English as a Foreign
Language

An interesting property of interval, ordinal, and nominal data is that they
can be appropriately altered in one direction, but not in the other. That is,
interval data can be converted to ordinal data (with some loss of precision, of course)
but not the opposite. For example, if we know that Chin is sixty-five inches tall
and that Maria is sixty inches tall, we can say that Chin is taller than Maria. But,
if we are simply told that Chin is taller than Maria, we cannot tell exactly how tall
the two students are unless we see them and measure their height. Likewise,
interval data can be converted into categorical data once the categories have been
established—tall, medium, and short, for instance. But just knowing that someone
has been classified as being tall or of medium height or short does not tell us
exactly what that person's height is.
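
A short Python sketch may make this one-way property concrete. The two heights are the ones given in the text; the category cut-offs (in inches) and the function name `height_category` are hypothetical and chosen only for illustration.

```python
# Sketch: interval data can be reduced to ordinal or nominal data,
# but the reduced data cannot be turned back into the original values.
# The cut-offs below are hypothetical, not established categories.

heights = {"Chin": 65, "Maria": 60}  # inches, as in the text

# Interval -> ordinal: order the names from tallest to shortest.
ranking = sorted(heights, key=heights.get, reverse=True)


# Interval -> nominal: place each person in a labeled category.
def height_category(inches, short_below=62, tall_from=68):
    if inches < short_below:
        return "short"
    if inches >= tall_from:
        return "tall"
    return "medium"


categories = {name: height_category(h) for name, h in heights.items()}

print(ranking)
print(categories)
```

Neither `ranking` nor `categories` contains enough information to recover the figures of sixty-five and sixty inches, so the conversion runs in one direction only.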

ACTION

Here is a list of scores on a 100-point French grammar test. Consider these
scores to be interval data, measuring the construct of French grammar
knowledge.
Fred: 82 points George: 95 points
John: 67 points Marilyn: 98 points
Andrew: 84 points Paul: 72 points
Meredith: 78 points Whitney: 58 points
Leslie: 100 points Steven: 52 points

Rank the students in order from highest to lowest in terms of their grammar
test scores.

Now divide the students into those whose names start with letters in
the first half of the alphabet and those whose names start with letters in the
second half of the alphabet.

REFLECTION

Answer the following questions and compare your answers with a colleague
or classmate.

1. What type of data were the students' actual French grammar test
scores?

2. What type of data did you get when you ranked the students' names
according to their grammar test scores (but did not include the scores
themselves)?
3. What type of data did you have when you divided the students into those
whose names start with letters in the first half of the alphabet and
those whose names start with letters in the second half of the alphabet?

We have spent a fair amount of time discussing variables and the sorts of
data you might use in your research (or read about in other people's reports)
because these concepts influence a great many decisions about how we will
collect and analyze our data. Another related and important issue is that of
constructs and how they are operationalized, the topic of our next section.



Operationalizing Constructs
Constructs are a bit trickier to define than variables. In this book, we will use the
term constructs to refer to all human qualities such as motivation, proficiency,
aptitude, intelligence, and acculturation. Constructs are typically qualities that
cannot be seen directly, but that have to be inferred from behavior. They are
called constructs because we construct or 'invent' the labels to account for the
behavior. For example, we observe a certain student spending three afternoons a
week studying in the self-access learning center and say, "Wow! He's really
motivated." You may have noticed that all constructs are variables because they
are qualities that vary from one learner to another. (However, not all variables
are constructs because some variables—like height, for instance—are directly
observable.)
Constructs such as attitude, motivation, and language proficiency are not
directly observable, so they pose challenges for people who want to investigate
them. How do you investigate something like learner motivation that can't
be seen? You have to create a way of eliciting behavior and/or information from
the learner that can be observed and use the resulting data to make inferences
about the unobservable quality. Questionnaires, for example, have been created
to measure attitude, motivation, anxiety, and so on. (We will examine
questionnaires in Chapter 5.)
This process of creating or choosing an instrument for eliciting or otherwise
recording behavior that is subsequently used for making inferences about
psychological constructs is known as construct operationalization. In the examples we
saw above, students' scores on the Test of English as a Foreign Language provide
one way of operationalizing the construct of language proficiency. Likewise, a
teacher's ranking of students' speaking fluency is one way to try to capture, or
operationalize, the construct of spoken fluency. Another way would be to define
speaking fluency in terms of speed of speech. For example, fluency could be
determined by the average number of syllables uttered per minute. If we were
to operationally define spoken fluency in this way, we would need to collect data
by recording speech samples and counting the number of syllables uttered per
minute.
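
The syllables-per-minute operationalization can be expressed as a simple calculation. The Python sketch below is illustrative only: the function name `syllables_per_minute` is our own, and the syllable counts and recording lengths are invented, since in a real study they would come from transcribed speech samples.

```python
# Sketch: operationalizing spoken fluency as syllables per minute.
# The counts and durations below are invented for illustration.

def syllables_per_minute(syllable_count, duration_seconds):
    """Average number of syllables uttered per minute of recorded speech."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return syllable_count / (duration_seconds / 60)


# Hypothetical speech samples: (syllables counted, length in seconds)
samples = {"Speaker 1": (230, 90), "Speaker 2": (150, 75)}

for speaker, (count, seconds) in samples.items():
    print(speaker, round(syllables_per_minute(count, seconds), 1))
```

Because the rule for turning a recording into a number is stated explicitly, another researcher could apply the same definition to different learners.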
Regardless of the approach to research we take, when we operationalize
constructs that interest us, it is important to define key terms (both variables and
constructs). In fact, researchers use a particular approach to defining key terms,
which involves writing operational definitions. These definitions are based on
particular (often observable) characteristics of the things being defined.
Operational definitions must be clear enough that the study could be replicated
by someone else.
Operational definitions are usually developed in one of three ways
(Tuckman, 1999). First, operational definitions can be based on experimenter
manipulation. That is, a researcher generates or creates the phenomenon under
investigation: "Individualized instruction can be operationally defined as
instruction that the researcher has designed to be delivered by a computer (or book) so
that students can work on it by themselves at their own pace" (p. 114). Second,
some operational definitions are based on what a thing or person does, or how it
operates. So, for instance, a researcher might define a directive teacher as one who
"gives instructions, personalizes criticism or blame, and establishes formal
relationships with students" (p. 116). Third, operational definitions can be
developed using the internal properties of an individual or entity (and these are
often documented through self-report). For instance, course satisfaction "might
be the perception—as reported by subjects on questionnaires—that a course
has been an interesting and effective learning experience" (p. 117). When you
operationally define key terms in your own research, choose the approach that
best suits the term(s) you are trying to define.

ACTION

Look at a research article on a topic that interests you. What key terms
have been defined? Which of the approaches described by Tuckman (1999)
were used to write the operational definitions?

Here is an example of how one researcher operationalized the constructs
he was investigating. You will recall that Jepson (2005) investigated repair
moves in voice chat rooms and text chat rooms. Based on his review of the
literature (particularly papers by Long [1983b and 1983c; 1996] and Y. Lin
and Hedgcock [1996]), Jepson operationalized repair moves in two categories:
"Negotiation of Meaning (NOM)" and "Negative Feedback (NF)," as shown
in Table 2.3.
Jepson used these categories in his data analysis. It was important to have
clear definitions and examples of the various types of repair moves because he
wanted to have a second researcher code the data from the chat rooms as a
quality control check. Also, if someone wanted to replicate Jepson's study (either with
similar or dissimilar participants), they could use this table to code their data.
Doing so would allow other researchers to compare their results to Jepson's.
Having clear operational definitions and examples is also helpful to readers of
research reports.
As Table 2.3 reveals, doing your literature review can provide you with
clear operational definitions of key terms and appropriate ways to
operationalize constructs important in your study. Making sure you are clear and
consistent in defining terms and operationalizing constructs improves the entire
research process—from articulating the research question or stating the
hypothesis, to designing the study, choosing or crafting the data collection
procedures, and analyzing the data. We will now turn to considerations of
research design.



TABLE 2.3 Codes and operationalizations for repair moves in text
and voice chat sessions (from Jepson, 2005, pp. 88-89)

Interlocutors (responding to text or speech initiated by another speaker)

  Negotiation of Meaning (NOM; Long, 1983c)
    CR: clarification requests (e.g., What do you mean by X?)
    CC: confirmation checks (e.g., Did you mean/say X?)

  Negative Feedback (NF; Long, 1996)
    R: recasts (The interlocutor corrects the speaker's word or utterance
      by repeating it in its correct form, e.g., This city is beautiful in
      response to the speaker's This city beautiful)
    EC: explicit correction (The interlocutor tells the speaker of his/her
      mistake, e.g., You should say, "This city is beautiful.")
    Q: questions (The interlocutor asks a question in order to prompt the
      speaker to make a correction, e.g., Can you try that again?)

Speakers (initiating text or speech)

  Negotiation of Meaning (NOM; Long, 1983c)
    COMP C: comprehension checks (e.g., Do you understand?)
    SR/P: self-repetition or paraphrase (e.g., Which /pli:s /uh,/ plus/, uh,
      which landmark can I visit?)
    I: incorporations (Speaker repairs utterance based on interlocutor cues;
      Y. Lin and Hedgcock, 1996, e.g., in response to a confirmation check:
      Yes, I mean X.)

  Negative Feedback (NF; Long, 1996)
    I/F: incorporations (Speaker repairs utterance based on interlocutor
      feedback; Y. Lin and Hedgcock, 1996, e.g., in response to an explicit
      correction: Sorry, this city is beautiful.)
    SC: self-corrections (The speaker initiates adjustments to her or his
      own previous errors without assistance from the interlocutor, e.g.,
      This has beeb, I mean been, great.)



INITIAL RESEARCH DESIGN CONSIDERATIONS

A research design is analogous to an architect's plan for building a house. It lays
out, in advance, approximately what the end product should look like. It suggests
measurements for the different components of the structure, materials to be
used, and steps for accomplishing the desired goal. Fortunately, in research there
are a number of existing research designs and methodological tools that we can
draw upon in planning a study, so that we don't have to start totally from scratch.
The design you choose and develop is directly related to the research question(s)
you are posing, and it will be influenced by your literature review.
So, as a next step, we will sketch out some of the initial design considerations
that follow on from the question identification and literature review. The issues
foreshadowed here are taken up and elaborated in succeeding chapters, so this
section is meant to provide only a brief overview of research design considerations.

Design Issues in Experimental Research


One of the first things researchers need to consider is whether the research
question suggests an experimental or a non-experimental research design. If the
research involves testing the strength of one variable's influence on another, then an
experiment may be needed. If the research question is more exploratory in nature,
then action research or some sort of naturalistic inquiry may be more appropriate.
You will recall that in the psychometric approach, researchers typically set
out to measure psychological constructs in some way. Often they do so because
they want to investigate issues of causality. That is, they want to determine
whether a change in one variable can influence, or cause, a change in another
variable. To do so, researchers look for significant differences between groups on
some common measurement. Remember the large-scale methods comparisons
described in Chapter 1? Those researchers were trying to determine whether
one particular method caused better language learning than a competing
method. That is, they were investigating the effect of one variable (method of
instruction) on another (language learning).
There are many types of research designs, but in the experimental method,
most of the designs for investigating causality share four characteristics. First,
the researcher pre-specifies and defines the variables of interest. Secondly, the
researcher intervenes in and manipulates the learning environment in some way
to see if he or she can cause a change to occur. In experimental research, the
researcher's intentional intervention is called the treatment. In the strongest
experimental designs, a third characteristic is that the subjects who participate in
the experiment are randomly selected from the population they represent and
then randomly assigned to the sample groups. The fourth characteristic is that at
least one group of subjects, called the control group, does not get the treatment.
When these conditions are met, the researcher is using what is called a true
experimental design.
There are reasons for randomly selecting participants, which are related
to the logic of the experimental method. Suppose you teach Japanese in
New Zealand and you have about forty beginning students of Japanese. Two
beginning classes of approximately twenty students each have been scheduled. One
of the beginner classes meets at 8 A.M. and the other meets at 3 P.M. The
curriculum for the two beginner classes is the same, but you have considerable freedom
to deliver it however you think best. At a conference, you attended a workshop
on vocabulary development through the use of Total Physical Response (TPR)
(Asher, Kusudo, and de la Torre, 1993)—a language teaching procedure that
involves the students in intensive listening practice in which they respond to
commands through physical actions. You would like to see whether using TPR
with your learners will help them to learn and retain Japanese vocabulary items
better than your regular teaching method does. So, you decide to conduct a
little experiment. For one full month, up until the first examination, you will use
TPR with one class of beginning students but not with the other class.

ACTION

Write the research question for the study about using TPR with the
beginning-Japanese students. What terms would you need to operationally
define in preparing to carry out this study?

You could, of course, use TPR with both beginner classes and then see how
well they do on the examination, but that would give you no point of
comparison. You could test all the students about their knowledge of Japanese vocabulary
before the TPR lessons and again after the six weeks of TPR lessons, but there
would be little reason to do so since you know the students are true beginners.
So, you decide to use TPR techniques with one class of Japanese learners and to
use your regular teaching method with the other group. That way you can
compare the vocabulary test scores of the two groups of students—those with whom
you used TPR and those with whom you did not. The purpose of withholding
the experimental treatment from one group is to be able to compare the learners'
test scores and see if those who receive the TPR lessons outperform those who
do not. (In other words, you want to see if using TPR causes a difference in the
learners' Japanese vocabulary test scores.)

REFLECTION

Imagine that as you are planning your research project about TPR lessons
and Japanese vocabulary learning, you find out that athletic practice for
school sports will be held at 3 P.M. every afternoon, so any beginning
students of Japanese who wish to participate in sports will enroll in your
8 A.M. class. You also learn that orchestra, band, and choir practices are
held every morning from 7 to 9, so any of the music students who wish to
take beginning Japanese will be in your afternoon class.
What are the implications of these facts for interpreting the outcomes
of your study?



As you can imagine, given the different sorts of students who will probably
enroll, these two beginning Japanese classes may be different from the outset
(i.e., the more musical students will be in the afternoon class and the more
athletic students will be in the morning class). Remember that your purpose in
setting up the comparison is to see if students who received TPR training would do
better than those without such training in terms of their Japanese vocabulary
learning. But you will not be able to make strong claims that any differences you
observe in the two groups' test scores are due to the use of TPR. The students'
preexisting differences may influence the outcome as much as—or even more
than—the TPR treatment.
For this reason—to counteract the possibility of preexisting differences
influencing the results of a study—in formal experiments, researchers
randomly select people to be in the experiment (and they are called the sample).
They are selected from the population they represent. The researchers then
randomly assign subjects to the different experimental and control conditions.
The logic here is that randomly assigning subjects to different conditions
distributes any possible 'contaminating' learner factors to both the experimental
and control groups. If you were able to use random selection (from the
population to the sample) and random assignment (from the randomly selected sample
to the two different class periods) in your TPR experiment, you would then be
in a better position to argue that any differences on the Japanese vocabulary
test are due to the experimental treatment because—thanks to
randomization—the other variables that might have had an effect (such as intelligence
and aptitude) presumably exist in equal quantities in both the experimental and
control groups, and therefore cancel one another out.
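
The two steps, random selection followed by random assignment, can be sketched in Python using the standard `random` module. Everything specific here is an assumption made for illustration: the population of 200 numbered students, the sample size of forty, the seed, and the function name `select_and_assign`.

```python
import random

# Sketch: random selection from a population, then random assignment
# of the selected sample to two conditions. All figures are hypothetical.


def select_and_assign(population, sample_size, seed=None):
    rng = random.Random(seed)
    # Random selection: draw the sample from the population.
    sample = rng.sample(population, sample_size)
    # Random assignment: shuffle the sample, then split it in half.
    rng.shuffle(sample)
    half = sample_size // 2
    return sample[:half], sample[half:]


population = [f"student_{i:03d}" for i in range(1, 201)]
experimental, control = select_and_assign(population, 40, seed=1)

print(len(experimental), len(control))
```

Neither step relies on any characteristic of the learners, which is why preexisting differences are expected to be spread across both groups rather than concentrated in one.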
In most real-world educational contexts, teachers typically don't have the
resources or the power to randomly select subjects from the population and/or
randomly assign them to groups. (In the Japanese class example above, you
cannot choose secondary school students at random to study Japanese. The
students themselves make this choice. Nor can you select at random who will
attend the morning and afternoon classes due to the schedule of music and
sports in the curriculum.) As a result, relatively little language classroom
research has used true experimental designs. Instead, researchers working
with regularly occurring classes often use what is called the intact groups design.
That is, the groups are preexisting, before the experiment was set up, or for
some reason they cannot be randomly selected and/or randomly assigned.
Comparing the vocabulary scores of the morning and the afternoon beginning
Japanese classes (when one group receives TPR training and the other does
not) is an example of an intact groups design.
We will return to the types of research designs used in experimental
research in Chapter 4, where we will examine that method in more detail. Keep
in mind that there are many, many variations on the types of research design
used in psychometric research, and here we have just begun to explore the
possibilities.



Design Issues in Nonexperimental Research
Nonexperimental research is often underpinned by what-if and what's-going-on
questions rather than by questions aimed at investigating causal relationships
between variables. Typical nonexperimental classroom research questions
include the following:
1. What are the classroom experiences of trainee teachers in inner-city
classrooms?
2. What do learners believe about the nature of language and learning?
3. What happens when teachers increase wait time (the time between asking a
question and then either reformulating the question or answering it)?
The best way to address these questions may not entail conducting an experiment.
For instance, answering the first question might involve observing and
interviewing teacher trainees completing their practice teaching assignments in
inner-city schools. The second question could be addressed by interviewing
language learners about their beliefs. There are no issues of causality involved in
these two research questions that would necessitate comparing one group of
teachers or learners to another.
The third research question above is somewhat more complex in terms of
design issues. We could, for instance, set up an experiment in which some language
teachers (those teaching the experimental groups) would intentionally
lengthen their wait time (by counting to ten before reformulating a question or
answering it if students do not respond). Other teachers (those teaching the control
groups in the experiment) would just continue with their usual wait time.
We could then compare the two sets of students on some measure of language
learning or satisfaction.
However, it could be just as valuable NOT to set up an experiment. This
question could be explored with action research. For example, a teacher might
read that increasing one's wait time enhances both the quantity and the quality
of student responses. She might then try to see if this would be true in her own
class. She could plan an action research project in which she systematically
increased her wait time and tape-recorded her lessons to see if there was an
apparent difference in how the students answered questions or responded to
comments. In this case, action research would be an appropriate choice of how
to address the research question.
In discussing design issues in experimental research, we introduced the concept
of sampling. This term often means the selection of research subjects from
the wider population to be in the sample (i.e., to be participants in the experiment).
However, there are sampling considerations in nonexperimental research
as well.
If you are conducting a case study in the naturalistic inquiry tradition, for
instance, you may choose a particular class, teacher, or learner as the unit of
analysis because that class or person is typical in some way. For instance, you may
choose to investigate the experiences of trainee teachers in inner-city classrooms

Chapter 2 Getting Started on Classroom Research 47


by choosing one trainee who typifies the others (e.g., a female teacher with less
than one year of experience doing her student teaching in a junior high school in
New York City). The wider group that interests you (here, teacher trainees working
in inner-city contexts) is called the population. The typical teacher you select
may be chosen on the grounds that she apparently represents the population.
Likewise, if you want to know what learners believe about the nature of
language and learning, you would first have to decide what the population is that
interests you. Do you want to find out about adult learners, adolescents, or
children? Do you want to find out about learners in second language contexts
(where the target language is used in the surrounding society as the general
language of communication) or foreign language contexts (where the target language
is not commonly used in that environment)? Answers to these questions
will strongly influence the choice of what learners to interview.

REFLECTION

Think about some classroom research that you would like to do. What case
or sample might you study? What population would be represented?

In other situations, you may choose a particular case as the object of your investigation
because that entity is somehow unique or special. For example, in an
early classroom investigation, R. L. Allwright (1980) wanted to investigate how
ESL learners got speaking turns in class. In the process, he found that some
learners got many more than the typical number of turns and others got far
fewer. So, after discussing the quantitative analysis of the number and length of
turns taken by all the students, Allwright investigated a particular learner who
got far more than his "fair share" of speaking turns in class. The close analysis of
that learner's conversation with the teacher allowed Allwright to discover how he
got so many turns. In this case, the learner's uniqueness was what caused the researcher
to focus on him.
Sometimes sampling decisions are influenced by practical issues such as the
availability of subjects or of a research setting. If you are living and working in
São Paulo, for instance, it will be much easier and cheaper to observe and interview
teacher trainees working there than in New York City. If you are teaching
adult EFL learners in Seoul, it will be much more practical to learn about their
beliefs regarding language learning and teaching than about the beliefs of learners
in Cairo. Making sampling decisions on the basis of availability is known as
opportunistic sampling.
In the experimental tradition, opportunistic sampling is frowned upon because
it is not as powerful as random sampling in controlling variables. However,
in some naturalistic inquiry, opportunistic sampling can increase our access to
subjects and add depth to the database. For example, many of the earliest case
studies of children acquiring two languages simultaneously were done by those
children's parents. The parents had broad exposure to the children's emerging
speech and could record their utterances at any hour of the night or day. In those
instances, opportunistic sampling was an effective way of choosing subjects for
in-depth observations over time.
Another sampling concern has to do with time. If you are collecting data
from students in a language program, when is the best time to do so? Is it reasonable,
for instance, to ask beginning learners of Chinese as a foreign language
about their beliefs regarding language learning and teaching before they have
had any experience with foreign language learning? Or should you wait until
they have been enrolled in the Chinese class for a month or a semester? In fact,
the answer depends on your research question(s). Think about the following
options for articulating the research question that embodies an interest in language
learners' beliefs:
1. What do learners believe about the nature of language and learning?
2. What do beginning learners believe about the nature of language and
learning before starting to study a foreign language?
3. What do beginning learners believe about the nature of language and
learning after one semester of studying a foreign language?
4. What is the difference, if any, between learners' beliefs about the nature of
language and learning before and after one semester of studying a foreign
language?

The way you articulate your research question(s) will clearly influence your
data collection (whether you do so through questionnaires, classroom observations,
etc.).
To summarize, research design issues, such as sampling, are important in
nonexperimental research as well as in experimental studies. The design you
choose is directly related to how you word your research question(s). The design
will also be influenced by the literature you review.

SAMPLE STUDY

In this chapter, rather than summarize an entire study, we will report on how one
classroom researcher got started on an investigation. This study was by Sarah
Springer, a teacher who had enrolled in a graduate program in TESOL (Teaching
English to Speakers of Other Languages). Through her course readings, she
became interested in what factors promoted second language acquisition. Here
are Springer's (2003) comments about choosing a research focus:
My own training and previous experience as an EFL teacher in Europe
had involved courses organized around grammatical syllabi that presented
discrete linguistic forms in a linear progression. The first
opportunity I had to work with groups of students using content- and
project-based syllabi was in the summer of 2002. The . . . summer
program had been designed to provide both a cultural exchange and
academic research experience for Japanese university students who over
the course of three short weeks investigated a socially relevant topic of
their choice. The research methods employed by the students include
library and Internet background research, participant observation, public
opinion surveys and focused interviews. In addition to visiting their
community placement sites and attending classes and workshops in support
of their developing research and technology skills, the students
were also enrolled in courses on American Life and Communication
Skills, and spent a considerable amount of time with their American
host families. At the end of the session, each student must synthesize the
results of their research by preparing and delivering a six- to eight-
minute PowerPoint presentation.
My experience as a teacher in this program was very rewarding.
However, as the session came to a close and I was attempting to write
the questions for an end-of-session student survey, I began to reflect on
the extent to which my previous conceptions of 'the English language'
as a subject of study and the language learning process itself were only
marginally relevant to the complex and clearly quite engaging experience
in which the students . . . had been involved. Without a list of
grammatical items that had been 'covered' I was at a loss as to how,
specifically, their experiences in the program might have contributed to
their language development. I was confronted by the discrepancy between
my conviction that it had been a valuable and productive experience
for them and my inability to answer a question which I initially
framed as "What did they learn?"
In the following semester I encountered numerous concepts in my
Sociolinguistics and Second Language Acquisition courses that provided
me with names for what I had experienced and gave me new
frameworks with which I could analyze my summer experiences. By
the end of the semester my question had evolved into the following:
"What implications does a much broader view of language and the language
learning process have for my role and responsibilities as a language
teacher? What changes come about in my expectations concerning
the roles and responsibilities of the students? What impact does the
shift to content- or project-based syllabi have on these classroom roles
and on the language produced by course participants—both teachers
and learners?" (pp. 2-3)

REFLECTION

Have you ever had an experience like Springer's, where a job change or a
new opportunity led you to reflect on what you were teaching and why? If
so, what issues arose for you? If not, can you imagine a context in which you
might face such issues? (If you are a new teacher with limited experience, it
would be worthwhile to ask an experienced teacher these questions.)


ACTION

Predict what Springer might have done to address these elaborated questions.
Sketch out a plan of how you would proceed to investigate these
issues in a language classroom.

We will return to Springer's study in future chapters. At this point, we will
consider some of the payoffs and pitfalls of designing language classroom
research.

PAYOFFS AND PITFALLS

There are many practical problems that occur in language classroom research.
Our purpose here is not to intimidate or worry you about conducting your own
research but rather to help you prepare to overcome such problems by anticipating
them. Careful planning in the design stages will help you avoid difficulties
and disappointments in the data collection and analysis stages.
In fact, once you get to the point of fleshing out your research ideas, it is a
good practice to spend some time anticipating the practical problems likely to
get in the way of successfully completing the project. Anticipating problems and
thinking about solutions can help smooth the research path. Problems encountered
by our own graduate students include the following:
1. Lack of time (a particular problem for students who are also working or
taking several courses)
2. Lack of expertise, particularly at critical points in the research process such
as formulating a researchable question, determining the appropriate
research design, and, in the case of quantitative research, selecting the
appropriate statistical tool
3. Identifying subjects willing to take part in the research
4. Negotiating access to research sites (Unless you are collecting data in your
own classroom or your own school, getting permission to collect data can
be both time-consuming and frustrating.)
5. Issues of confidentiality
6. Ethical questions relating to data collection, which become acute when
you want to collect data without alerting your subjects beforehand
7. The sensitivity of reporting negative findings, particularly if these relate to
individuals you work with or know well
8. The difficulty of actually writing up the research
The last problem can become acute, particularly for researchers who are not
native speakers of the reporting language or who lack confidence (and sometimes
skill) in producing an extended piece of writing. We recommend writing regularly
from the very beginning of the research process. This writing could be summaries
of the background reading you are doing that may eventually find its way
into the literature review in some shape or form. Or it could be reflections on the
research process. Even if you eventually discard much of your earlier writing, the
process itself will be an important stimulus to thinking.
It has been our experience that carefully planning the study in advance can
reduce frustration, save time, and generate better research. In particular, getting
the research question focused and worded correctly is central to all other decisions.
However, it is also important to be flexible and to adapt to new conditions
that may arise during your investigation.

ACTION

Think of some research you would like to conduct. Make a list of the problems
you think might arise in your own research situation.

There are numerous payoffs to doing classroom research and particularly to
investing time in the planning stages. Taking an idea from an initial interest to a
research focus and then to a refined research question is intellectually intriguing
and helps to clarify our thinking about the issue. Making research design and
sampling decisions based on our refined research questions leads to all sorts of
intriguing possibilities about how best to collect and analyze data to investigate
our topic.
Doing a thorough literature review can be a form of professional development
because there is so much to learn about the issues that interest us!
Likewise, if we return to our research-as-culture metaphor, reading widely and
writing a strong literature review is a means of beginning to understand and then
enter a particular part of the culture. It can also help you get up-to-date on new
developments that will have occurred since you last took a course or read about
a topic of interest.
Finally, although we have not gotten to this part of the story yet, doing
the research itself—collecting and analyzing the data—can be great fun and
extremely rewarding. As we will see when we revisit Sarah Springer's and Kevin
Jepson's studies, as well as encountering others in future chapters, doing original
research can help us make connections to the vast body of theory and existing
research in our field.

CONCLUSION

In this chapter, we have looked at some of the practicalities of doing research,
focusing on posing important and doable research questions, reviewing the literature,
and considering some of the basics of research design. We have also
reviewed important concepts including types of variables, types of data, constructs
and construct operationalization, and sampling decisions. Finally, we
have looked at initial design issues in experimental and nonexperimental contexts.
The chapter closes with the usual questions and tasks to help you internalize
these concepts, as well as suggestions for additional readings in case you
would like to pursue these topics in more depth.

QUESTIONS AND TASKS


1. Find a classroom research article on a topic that interests you. Did the
researchers pose research questions, hypotheses, or both? What key terms
and constructs did they operationalize?
2. Find a teacher who has conducted some kind of classroom research,
whether or not that study has been published. Ask the teacher what his or
her research questions were and how he or she arrived at them.
3. Read a published account of an experimental study. How were the participants
in the study (i.e., the sample) selected? What population were they
supposed to represent?
4. Read the published accounts of two or more case studies. How did the
researchers in those situations locate their subjects?
5. Find a report about an experiment on language learning or teaching.
Which characteristics of the so-called true experimental designs did the
study involve?
A. The researcher prespecified and defined the variables of interest.
B. The researcher intervened and provided a treatment.
C. The subjects in the experiment were randomly selected from the
population they represented and randomly assigned to groups.
D. At least one group of subjects (the control group) did not get the
treatment.

6. Ask an experienced researcher to discuss some of the challenges he or she
has encountered in trying to set up a study. Find out what research tradition(s)
he or she was using and what solutions emerged as the study
progressed.

SUGGESTIONS FOR FURTHER READING

Freeman's (1998) book Doing Teacher Research: From Inquiry to Understanding is a
good starting point for novice researchers. It uses extended examples of teacher
research to illustrate the issues involved in getting started.
A helpful guide for preparing a literature review is Galvan's (1999) book
Writing Literature Reviews: A Guide for Students of the Social and Behavioral
Sciences. We also recommend Dissertation Writing in Practice: Turning Ideas into
Text by Cooley and Lewkowicz (2003).


Chaudron's (1988) book Second Language Classrooms: Research on Teaching and
Learning provides good background reading about the state of the art in
language classroom research in the late 1980s.
Sometimes our students get discouraged because the articles they read in
professional journals seem to suggest that everything went smoothly during the
research process. In fact, this is seldom the case. Schachter and Gass (1996) have
edited a collection of papers by experienced language classroom researchers.
Their purpose was to give readers "an honest, behind the scenes look at what
happens from the beginning to the end of a research project within a classroom
context" (p. vii). The authors provide a candid look at the problems that almost
inevitably occur in conducting classroom research.


CHAPTER 3

Key Concepts in Planning Classroom Research

The blacksmith cannot criticize the carpenter for not heating a piece of
wood over a fire. However, the carpenter must demonstrate a principled
control over the materials. (van Lier, 1988, p. 42)

INTRODUCTION AND OVERVIEW

In this, the final chapter of Section 1 of this book, we will continue to build on
the ideas presented in Chapters 1 and 2, with a special focus on planning your
research design so that you can make good decisions about your data collection
and analysis procedures. Having posed a research question and done your literature
review, you are well set to begin planning your study, but what you actually
do depends on the research questions or hypotheses you have posed and what
your goals are in conducting the investigation.
For instance, if you wanted to conduct an action research study, like the possible
investigation of a teacher's wait time described in Chapter 2, your research
questions and procedures would be different from those you would use if you
developed an experiment on wait time in which one group of teachers increased
their wait time and another group did not, and you subsequently assessed the
quantity and quality of verbal turns taken by the students of those two sets of
teachers. Likewise, if you wanted to do a case study of one particular teacher
trainee's experiences while doing his or her practice teaching in an inner-city
school, the procedures would be quite different than if you wished to contrast
the questionnaire responses of 200 trainees in Manchester and another 200 in
Johannesburg. This questionnaire study would be more like classroom-oriented
research than actual classroom-based research (see Chapter 1), but it would still
be relevant and important to teacher educators and trainees.

REFLECTION

Based on what you've read so far and what you already knew about
research, think of a classroom-based or classroom-oriented study that you
would like to do. Keep that idea in mind as you read this chapter.

The key point here is that there is no perfect research design or research
method. As van Lier (1988) notes, the tools you choose depend on the goals you
wish to accomplish. So, as we proceed through the next sections, please keep in
mind that we do not advocate any of the main research traditions over the others
out of context. However, we will begin this chapter with concepts associated
primarily with research design in the psychometric tradition for two reasons:
(1) historically it has been (until recently) the dominant tradition in classroom
research, and (2) some of the vocabulary associated with quantitative data collection
and analysis is used in other approaches as well. So, the psychometric
tradition serves as a useful point of departure for discussions about several
approaches to classroom research. For this reason, we will begin with the concept
of hypotheses and hypothesis testing. We will then consider the sorts of
threats to research design that can invalidate a study. Finally, we will discuss the
ex post facto class of designs. This class consists of two important research
designs, one of which enables us to investigate correlations of two or more variables.
The other leads to nonexperimental comparisons of different groups.
These designs are useful in language classroom research, and there are several
examples of both in the published literature of our field.

HYPOTHESIS TESTING

In Chapter 2, we discussed writing clear research questions and operational definitions.
We noted that in some instances researchers use hypotheses in addition
to or instead of research questions. Hypotheses are especially important in experiments
conducted in the psychometric tradition, but the concept has valuable
applications to other types of research as well.
A hypothesis is a precisely worded statement about the expected outcomes of
a study. Hypotheses are most closely (but not exclusively) related to research in
the psychometric tradition. You will recall that this approach to research gets its
name from the fact that researchers seek to measure psychological constructs.
They do so largely to determine if there are causal relationships between variables.
As a result, hypotheses in the psychometric tradition are frequently statements
about possible causal relationships between variables, which are identified
by looking for substantial differences between groups of subjects who have undergone
different experiences.
In the psychometric approach, there are three main ways that hypotheses
are worded. Each choice of wording has a certain underlying logic that will help
you understand why researchers choose to articulate hypotheses as they do. The
first of these is called the null hypothesis because it predicts that there will be no
significant difference between groups of subjects.
The logic of posing a null hypothesis can be confusing, so we will examine it
in some detail before proceeding. The idea of the null hypothesis is based on the
concepts of proof and disproof. As the logic goes, in empirical research, we can
never actually prove anything: We can only disprove, or falsify, claims. This
philosophy of falsificationism in science was advanced by the philosopher Popper
(1968; 1972). Popper argued that we can never prove anything through observation;
we can only disprove tentatively established hypotheses. His famous example
is called the white swan argument. According to Popper, 1,000 sightings of
white swans do not entitle us to claim "all swans are white" as a scientific fact.
We can tentatively put forward the hypothesis "all swans are white," but we
should then go in search of a single disconfirming black swan. If your experience
of seeing swans is limited to North America or Europe, you might assume that
all swans are white. But you could never prove that all swans are white because
you could never possibly see all the swans in the world, nor could you see all the
swans that have ever lived, are living now, or will live in the future. In fact, if you
were to go to New Zealand and visit the lake at Rotorua, you would see thousands
of black swans there, but the existence of even one black swan disproves, or
falsifies, the hypothesis that all swans are white.
The null hypothesis in experimental research works with this same logic.
Researchers pose a null hypothesis and then set out to reject it. For example, in
designing the hypothetical study discussed in Chapter 2 (about using Total
Physical Response with learners of Japanese), we might pose the following null
hypothesis (symbolized by a capital H with the subscript zero, to represent "null"):
H0: There will be no statistically significant difference in the Japanese
vocabulary test scores of beginning students taught with Total Physical
Response and those taught with traditional methods.
In setting out to test this hypothesis, researchers would actually collect data and
conduct analyses that will allow them to reject (i.e., disprove) the null hypothesis.
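
One way to picture what "conduct analyses that will allow them to reject the null hypothesis" can mean is the sketch below. It is a minimal illustration under loud assumptions: the scores are invented, the 0.05 threshold is a common convention rather than a rule from this chapter, and the permutation test is simply one technique chosen here for transparency, not a procedure the chapter prescribes. The idea is that if the null hypothesis were true, the group labels would be interchangeable, so we reshuffle them many times and count how often chance alone produces a difference at least as large as the observed one.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the share of random regroupings whose absolute mean
    difference is at least as large as the observed one (a p-value).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel subjects at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Invented vocabulary test scores, for illustration only.
tpr_scores = [78, 85, 81, 90, 74, 88, 83, 79]
traditional_scores = [72, 80, 69, 77, 75, 70, 82, 71]

p_value = permutation_test(tpr_scores, traditional_scores)
alpha = 0.05  # conventional threshold (an assumption here)
reject_null = p_value < alpha
```

A small p-value lets us reject the null hypothesis; it never proves the alternative, which is the point of the black swan argument above.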
There are two other ways to word hypotheses. The first is called the alternative
hypothesis because it is an alternative to the null hypothesis. Such a statement
is, in fact, the position that the researcher would probably like to support but can
never really prove. (Remember the black swans!) The alternative hypothesis is
simply an affirmatively worded restatement of the null hypothesis. If we can reject
the null hypothesis, we can accept the alternative hypothesis. The alternative
hypothesis wording is often symbolized by an H (for hypothesis) with the
subscript A (for alternative), like this:
HA: There will be a statistically significant difference in the Japanese
vocabulary test scores of beginning students taught with Total Physical
Response and those taught with traditional methods.
Posing a null hypothesis is a conservative approach to getting started on research
design. It is typically used where there is no previous research or theory
to suggest that there will be a significant difference between groups. The null

Chapter 3 Key Concepts in Planning Classroom Research 57


hypothesis format is also used when there is existing research, but the results of
that research have been mixed or contradictory. (You would discover that situation
in reviewing the literature.)
In the case where we do have theory and/or research findings suggesting
that there will be a significant difference between groups, researchers often use
the third option and pose what is called an alternative directional hypothesis. In this
context, researchers feel a bit more confident about the likely outcomes and
therefore take the risk of making a specific prediction about the outcomes of the
study. Here is an alternative directional hypothesis based on the example above.
It is indicated with the symbols HA1:
HA1: There will be a statistically significant difference in the Japanese
vocabulary test scores of beginning students taught with Total Physical
Response and those taught with traditional methods, with the students
who had TPR instruction outscoring those who did not.
This wording is referred to as directional because the alternative directional
hypothesis predicts the direction the difference will take (i.e., which group will
perform better than the other in a two-group comparison).
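
The difference a directional prediction makes can be sketched in code. The example below is an illustration only: the scores are invented, the function name is hypothetical, and the one-sided permutation test is a technique chosen here, not one named in this chapter. Because the hypothesis names which group should score higher, the test keeps the sign of the difference instead of taking its absolute value.

```python
import random
from statistics import mean

def directional_permutation_test(higher, lower,
                                 n_permutations=10_000, seed=0):
    """One-sided permutation test of the prediction that
    mean(higher) exceeds mean(lower). Returns a p-value."""
    rng = random.Random(seed)
    observed = mean(higher) - mean(lower)  # signed, not absolute
    pooled = list(higher) + list(lower)
    n = len(higher)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel subjects at random
        if mean(pooled[:n]) - mean(pooled[n:]) >= observed:
            count += 1
    return count / n_permutations

# Invented vocabulary test scores, for illustration only.
tpr_scores = [78, 85, 81, 90, 74, 88, 83, 79]
traditional_scores = [72, 80, 69, 77, 75, 70, 82, 71]

# Prediction as worded in the directional hypothesis: TPR outscores.
p_predicted = directional_permutation_test(tpr_scores, traditional_scores)
# The opposite prediction, shown only for contrast.
p_opposite = directional_permutation_test(traditional_scores, tpr_scores)
```

With these invented scores the predicted direction yields a much smaller p-value than the opposite one, which shows why a directional hypothesis is a riskier and more specific claim.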

ACTION

Compare the null hypothesis, alternative hypothesis, and alternative
directional hypothesis discussed above. Circle or underline the specific
differences in their wording.

How a researcher chooses to word the hypothesis in a study is important.
The choice of the null hypothesis, the alternative hypothesis, or the alternative
directional hypothesis is part of what influences the choice of statistical analyses
of our data—a point we will return to later. In the meantime, just be aware that
(1) sometimes researchers (especially those working in the psychometric tradition)
pose hypotheses instead of (or in addition to) posing research questions,
and that (2) such hypotheses can be worded in various ways. Whether the researcher
poses a null hypothesis or an alternative hypothesis or an alternative directional
hypothesis is largely a function of how much existing research and/or
theory is available to guide the prediction.

ACTION

Skim a research report in a professional journal or anthology from our
field. Did the author(s) use research questions, hypotheses, or both? If they
used hypotheses, were the statements written as null hypotheses, alternative
hypotheses, or alternative directional hypotheses? (It is not uncommon
for researchers to pose the null hypothesis they wish to reject and also
the alternative hypothesis they wish to accept.)


MORE ABOUT VARIABLES AND RESEARCH DESIGN
A great deal of research is aimed at establishing some kind of causal association
between variables of the kinds we have just been discussing. For example, a
teacher may be interested in seeing if there is a causal relationship between
methods of grammatical instruction (inductive or deductive) and students'
test scores. His or her research question might be, "Do students who receive
inductive instruction do better on standardized grammar tests than those who
receive deductive instruction?" Of course, both inductive and deductive grammar
instruction would need to be operationally defined.
In this study, the method of instruction would be called the independent variable,
while the students' test scores would be called the dependent variable because they
presumably depend on, or are a result of, the independent variable. It is the independent
variable (method of grammatical instruction, in this case) that is the focus
of our attention in conducting an experiment because we want to know if manipulating
that variable (by using either an inductive or a deductive method of grammar
teaching) influences students' learning. If, for instance, one group of forty students
outscores another group of forty students, and if the experiment has been carefully
set up, the teacher-researcher can infer that the difference in scores "depended" on
the two different teaching methods. For this reason, the difference we are looking
for (by counting, rating, ranking, or measuring) is called the dependent variable.
The different subcategories, or levels, of the independent variable are what define
the groups being compared. In this example the independent variable is the
method of grammar instruction and it has two levels: inductive grammar teaching
and deductive grammar teaching. To return to our earlier discussion about variables
and data types, the independent variable here is nominal (or categorical) in nature:
the teaching method is either deductive or inductive. The dependent variable,
the standardized grammar test, yields interval data (the students' actual test scores).
This design can be depicted by a simple box diagram, as shown in Figure 3.1.

Deductive Grammar Teaching       Inductive Grammar Teaching
(n = 40 students)                (n = 40 students)

FIGURE 3.1 A box diagram of a study with two levels of the independent variable
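
The two-level design in Figure 3.1 can be illustrated with a minimal Python sketch. The scores below are invented purely for illustration, the function name compare_groups is hypothetical, and Welch's t statistic is only one common way to compare two group means; the discussion above does not prescribe a particular statistic.

```python
from math import sqrt
from statistics import mean, stdev


def compare_groups(deductive, inductive):
    """Return (mean difference, Welch's t) for inductive minus deductive."""
    mean_d, mean_i = mean(deductive), mean(inductive)
    # Sample variances for each level of the independent variable
    var_d, var_i = stdev(deductive) ** 2, stdev(inductive) ** 2
    # Standard error of the difference between the two group means
    std_err = sqrt(var_d / len(deductive) + var_i / len(inductive))
    difference = mean_i - mean_d
    return difference, difference / std_err


# Invented test scores (the dependent variable), six students per group
deductive_scores = [70, 72, 68, 75, 71, 69]
inductive_scores = [74, 78, 73, 77, 76, 72]

diff, t = compare_groups(deductive_scores, inductive_scores)
print(f"Mean difference: {diff:.2f}, Welch's t: {t:.2f}")
```

A large t value would still need to be interpreted against a significance threshold, and it says nothing about whether the groups differed for reasons other than the teaching method.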

ACTION

Write the null hypothesis, the alternative hypothesis, and a directional
alternative hypothesis for this example study about deductive and inductive
grammar teaching. (Note: If you were really going to do this study, there
would be two directions the directional alternative hypothesis could take.
How you would state that directional alternative hypothesis would depend
on your literature review.)

Chapter 3 Key Concepts in Planning Classroom Research 59


Of course, conducting research in real classrooms, which are full of lively
language learners with all sorts of interesting human characteristics, is seldom as
neat as conducting an experiment in a laboratory setting. Often the variable that
interests us most is some kind of important but hidden construct, such as
motivation, language learning experience, or aptitude. In any given class (as you know
from your own history as a language learner and/or teacher), the students may
have a wide range of aptitude, motivation, and experience with language
learning. So, we want to be careful about making claims that the treatment (one of the
levels of the independent variable in an experiment) is the thing that has caused
any difference we may observe in the students' scores rather than some
preexisting or uncontrolled factor.
Uncontrolled factors that influence the outcome of a study (in addition to or
instead of the levels of the independent variable) are called confounding variables
because they confuse or confound the interpretation of the results. In our
comparison of deductive and inductive grammar teaching, for instance, if one
group (say the treatment group) consists of more experienced language learners
or students with higher aptitudes for language learning or students who are
better test takers, those factors may cause them to excel on the grammar test rather
than (or in addition to) the teaching method used.
Often you will be able to identify some factors that might become
confounding variables in your study on the basis of your literature review. For
example, suppose you read that left-handed learners do better with inductive teaching
methods than with deductive teaching methods. If there were only a few
left-handed people in the groups you were teaching and investigating, you might
choose to eliminate their test results from the study. Doing so would make
handedness a control variable, that is, a variable whose possible influence is controlled
for by excluding it from the study. If researchers can anticipate possible
problematic influences like these and control for them by planning them out of the
research design, we say that those factors "have been controlled for."
Another way that researchers deal with possible influences on the
relationship between the independent and dependent variables in a study is to use what
are called moderator variables. These are factors that are identified at the outset of
the study and are intentionally built into the data analysis to see if they have an
influence on the outcome. Suppose you read some previous research on teaching
methods that suggests that girls do better with inductive teaching and boys do
better with deductive teaching. In that case, you might decide to build in gender
as a moderator variable. To do so, you would make sure that there were roughly
equal numbers of male and female students in the groups taught with the
inductive and deductive methods, as shown in Table 3.1.
Note that the level numbers for the levels of the independent and
moderator variables are often arbitrary. Inductive grammar teaching could just as easily
be listed before deductive grammar teaching. Male students could be considered
level 1 and female students level 2. In this situation, the numbers 1 and 2 are just
labels; they are not data or scores of any kind.
TABLE 3.1 A box diagram of a study with a moderator variable

                        Independent Variable: Teaching Method
Moderator Variable:     Level 1: Deductive Grammar    Level 2: Inductive Grammar
Gender                  Teaching (n = 40)             Teaching (n = 40)

Level 1: Female         (n = approximately            (n = approximately
                        20 students)                  20 students)
Level 2: Male           (n = approximately            (n = approximately
                        20 students)                  20 students)

When you add a moderator variable to your research plan and prepare to
determine the effects of that moderator variable in your data analysis, you have
created what is called a factorial study (i.e., one having a moderator variable). The
word factorial here refers to the idea that another factor has been added to the
research design. Some statistical procedures are equipped to handle factorial
designs (i.e., those with both independent and moderator variables). Other
statistics work only with designs that are limited to independent variables. (If you
want to try doing some statistical analyses, there are several good resources that
can help you do so, and we will list some later in this book. At this point, we are
just focusing on variables and research designs.)
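
One way to see what a factorial layout like Table 3.1 implies for data analysis is to compute the mean score in each cell of the table. The following is a minimal Python sketch under stated assumptions: the records are invented for illustration, and the function name cell_means is hypothetical rather than part of any statistics package.

```python
from collections import defaultdict
from statistics import mean


def cell_means(records):
    """Group scores by (teaching method, gender) and average each cell."""
    cells = defaultdict(list)
    for method, gender, score in records:
        cells[(method, gender)].append(score)
    return {cell: mean(scores) for cell, scores in cells.items()}


# Invented records: (independent variable level, moderator level, test score)
records = [
    ("deductive", "female", 70), ("deductive", "female", 74),
    ("deductive", "male", 78), ("deductive", "male", 80),
    ("inductive", "female", 82), ("inductive", "female", 86),
    ("inductive", "male", 71), ("inductive", "male", 73),
]

for cell, value in sorted(cell_means(records).items()):
    print(cell, value)
```

If the difference between teaching methods points in opposite directions for the two gender cells, as it does in these invented numbers, that pattern is the kind of influence a moderator variable is built in to detect. A full factorial analysis would go further than comparing cell means.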
In any experiment that tries to determine causality, the relationship between
the independent and the dependent variable should be clear enough that we
can say with confidence that the treatment caused any observed difference. For
example, where substantial differences occur between the control and the
experimental groups' performance on the dependent variable (the standardized
grammar test in our grammar teaching example), we want to know those differences
were the result of the type of instruction and were not caused by some other,
unknown or unforeseen factors. In other words, researchers working in this
tradition want to be able to say with great certainty that it was the treatment that
caused the observed differences in the scores.
Why does this issue matter so much? It's because decisions are often based
on research results, sometimes quite important decisions that affect the lives of
teachers and learners. Suppose you conducted the study of deductive and
inductive grammar teaching described above without imposing any control variables.
If you found differences in the students' scores, and if the confounding
variable(s) had influenced the outcome, you couldn't say with any degree of
certainty that it was the teaching method that had led to those differences. This
state of affairs could potentially lead you to claim that one teaching method is
superior to the other, when in fact the teaching method is not what caused the
observed differences. This problem leads us to the concept of validity, which is
an issue in all sorts of research (not just those approaches associated with
classroom research or the experimental method).
Validity means many different things in different contexts (and there are
many different types of validity). In research design, validity has to do with the
truth, or value, of our claims. Discussions of validity are linked to the concept of
reliability. In fact, it is often said that there can be no validity without reliability.
What these terms mean and why they are important in language classroom
research is the topic of the next section. Validity and reliability are both concepts
related to the quality (i.e., the value) of a study, particularly in the psychometric
tradition.

ENSURING QUALITY CONTROL IN YOUR RESEARCH


Researchers take many steps to ensure that the outcomes of their investigations
are useful. Two key concepts in this endeavor are reliability and validity. In fact,
these concepts are so important that they are almost always discussed in research
methods textbooks. Most researchers agree that data collection and
interpretation must be reliable and valid in order for the activity to be considered viable
research. So, what do these two terms mean?

Reliability as a Criterion of Quality


Reliability is a somewhat easier concept to grasp than validity, so we'll deal with
it first. Reliability has to do with consistency. A familiar example is that of a scale
used to measure one's own weight. If measurement on the scale changes each
time you step on it, and you have not gained or lost any weight in the interim,
the scale is not consistent. It is unreliable.
Similar problems can occur in language-related research. Suppose you and
your teaching colleagues were administering placement tests to a group of new
students entering your program, and you had agreed on a set of interview
questions to ask each student as well as on a scoring system for rating their speaking
skills. If one teacher was very strict in applying those standards and another was
very lenient, the students could get quite different evaluations, depending on
which teacher interviewed them. This issue is referred to as inter-rater reliability
in the language assessment literature. A parallel problem that occurs is
variability in one rater's application of a rating scale. Imagine that you began
interviewing at 9 A.M. and interviewed thirty students during the day. By the time you got
to the thirtieth student, you might be tired, bored, or rushed, and you might
therefore be more stringent (or more lenient) with the rating scale than you had
been with the first three or four students. The issue of a single rater's consistency
over time is called intra-rater reliability. Both problems can be crucial in research
projects that incorporate raters' assessments of students' skills.
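
Inter-rater consistency of the kind just described can be quantified in several ways. The minimal Python sketch below computes simple percent agreement and Cohen's kappa, a widely used chance-corrected agreement index that is brought in here as an illustration and is not named in the discussion above. The ratings are invented, and the function names are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of students to whom both raters gave the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Note: undefined (division by zero) if expected agreement equals 1
    return (observed - expected) / (1 - expected)


# Invented speaking ratings (1-5 scale) from two teachers for ten students
teacher_a = [3, 4, 2, 5, 3, 4, 3, 2, 4, 5]
teacher_b = [3, 4, 3, 5, 3, 3, 3, 2, 4, 4]

print(f"Percent agreement: {percent_agreement(teacher_a, teacher_b):.2f}")
print(f"Cohen's kappa: {cohens_kappa(teacher_a, teacher_b):.2f}")
```

The same two functions could be applied to one rater's scores for the same recordings on two occasions, which would give a rough picture of intra-rater consistency.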

ACTION

Skim a research report in our field. Did the author(s) take any steps to
ensure the reliability of data collection instruments, of ratings, or of data
coding processes?



A parallel problem can occur in classroom research. Suppose two different
observers are analyzing the frequency of turn taking in language classrooms. If
one observer chooses not to count utterances such as "Uh-huh" and "Okay" as
turns but the other does, they will end up with very different counts of the turns
taken in classroom discourse. In order to be reliable they need to start with the
same clear operational definition of a speaking turn.
Whether we are doing action research, some kind of naturalistic inquiry (like
a case study), or an experiment in the psychometric tradition, it is important to be
consistent in recording and analyzing our data. In subsequent chapters, we will
look in some detail at ways to enhance reliability in collecting and analyzing
qualitative data (such as inter-observer agreement and inter-coder agreement). For now,
let's just take an example. Look back at Table 2.3, Jepson's operational definitions
and coding categories in his study of repair moves in voice and text chats. One
reason for carefully operationally defining his terms and providing examples was
so that another researcher could independently code a portion of his data. That
process would allow Jepson to determine how sound his categories were.
In general, there are two types of reliability in research design: internal
reliability and external reliability. Reliability is generally established through
replication. If, in carrying out a study, you collect data twice (with the same students,
who have not learned or forgotten anything in between the two data collection
points) and get the same results both times, then you can claim that your data are
reliable, and your research has internal reliability. If, on the other hand, someone
else replicates your study in similar circumstances and with similar subjects and
gets similar results, then you can claim that your study has external reliability. In
fact, replicability is a desirable quality in the psychometric tradition. It is defined
as "the degree to which a study could be performed again on the basis of the
information provided by the author" (J. D. Brown, 1988, p. 43).
There are three types of replicability as defined by Mitchell and Jolley
(1988). The first is direct replication, or repeating the original study with virtually
the same procedures and population. The second is systematic replication, in which
the subsequent study differs from the original in some intentional and systematic
way. The third is conceptual replication, which intentionally uses different methods
(e.g., a different treatment) in order to improve upon the original study. All three
types of replicability are useful in language classroom research.

Validity as a Criterion of Quality


Validity is a little trickier than reliability to understand. There are two aspects to
the concept that we will deal with here: internal validity and external validity.
Internal validity, as described above, has to do with whether a study has been
designed in such a way that the claims made by the researcher can be confidently
upheld. In an experiment, for instance, the key question is, "Has the study been
designed in such a way that the researcher can claim that the results are due to
the causes manipulated by the researcher in administering the treatment?" That
is, are differences in the dependent variable attributable to the treatment (i.e.,
are they internal to the design) and not to some other confounding variable(s)?



Let's put the issue of internal validity in the context of the study described
above, which focused on the effect of inductive and deductive grammar teaching
methods on students' standardized grammar test scores. If the students in the
deductive classes get substantially higher scores than the students in the inductive
grammar classes (or vice versa), we want to make sure those differences are due
to the method of instruction and not due to other uncontrolled factors (such as
the students' aptitude for language learning, gender, handedness, or test-taking
skills). Saying that a study has internal validity means that the results on the
dependent variable are directly and unambiguously attributable to the treatment.

ACTION

Read the following vignette and identify why the study has weak internal
validity.
An ESL teacher in Australia wanted to see if using role plays would help
improve the speaking fluency of his intermediate students in his adult
evening class. So, he used role plays in class once a week for the second to
fourteenth weeks of the semester. There were twenty-three students at the
beginning of the term but only eighteen at the end of the semester, as two
of the older students had to drop out due to health reasons and three of the
younger ones had to stop, either because the class was too difficult for them
or because they had work-related issues that prevented them from coming
to class. At the end of the course, the teacher rated the students on their
speaking fluency during the final role-play activities. He tape-recorded the
role plays and asked another teacher to discuss with him her impressions of
the students' fluency. Based on his success with role plays in that semester,
he decided to incorporate role plays in his beginners' course the next term.

There is also a concern referred to as external validity, or generalizability. A
study is said to have external validity if the findings can be extrapolated from the
sample in the study to the broader population it represents. In classroom
research, this question refers to whether the results of an experiment can be
transferred to learning environments in the real world.
In order to deal with external validity, we must revisit two terms introduced
in Chapter 2: population and sample. A population is simply a group of individuals
who share a certain characteristic. We all belong simultaneously to multiple
populations, the most inclusive of which is the human race. Other examples of
populations would include Australian passport holders, billionaires, year-twelve
students, people who are parents, and so on. These populations are all fairly
easily operationalized.
Populations that are defined in terms of constructs are more abstract. For
example, in classroom research, we might be interested in studying the
population of "intermediate-level learners of Spanish" or the population of "highly
motivated language learners" or "Chinese-English bilinguals." We would need
to operationalize each of these constructs.
What does this idea of populations have to do with validity? When
researchers want to investigate some aspect of the behavior of a given population,
they need to work with a subset or sample rather than with an entire population.
(Most populations are far too large to be studied directly.) However, when those
researchers report their results, they usually want to make claims that are relevant
to the entire population, not just to the sample of individuals who took part in the
study. If the study has been set up in such a way that it is difficult or impossible to
make such an extrapolation, then the research has weak external validity.

ACTION

Read the following vignette and identify at least three issues that
contribute to the study's weak external validity.
A teacher of French as a foreign language at a prestigious secondary
school in a wealthy neighborhood believed her use of dialog journals was
helping the students in her two honors classes improve their spelling and
grammatical accuracy in French. (Dialog journals are personal writings by
the students in the target language, to which the teacher replies in an
ongoing exchange. Typically, the teacher responds to the content and does
not overtly correct the students' errors.) The students in the honors classes
were selected because of their high achievement in French the previous
year. To test her assumption, the teacher used dialog journals with the six
students in her 9 A.M. class, but not with the eight students in her 10 A.M.
class, for the entire fifteen-week semester. She read and responded to the
six students' dialog journals three times each week. At the end of the
semester, the students in both classes submitted twenty-page term papers
written in French. Three of this teacher's colleagues each read and rated
all the term papers without knowing about either the investigation or
that only one of the two honors classes had engaged in the dialog journal
project.

Making claims about a population on the basis of results from research on a
sample drawn from that population is called generalizing the findings. To
generalize the findings of an experiment is to say that the results of that experiment
should hold up in the real world. For this reason, external validity is also called
generalizability.
To summarize, we can think of internal and external reliability and internal
and external validity as different issues in quality control. Each of these terms deals
with a key question (Nunan, 1992). Internal reliability deals with the question,
"Would an independent researcher, on reanalyzing the data, come to the same
conclusions?" (p. 17). External reliability addresses the question, "Would an
independent researcher, on replicating the study, come to the same conclusion?"
(ibid.). Next, internal validity is involved with the concern of whether, based on the
research design, "we can confidently claim that the outcomes are a result of the
experimental treatment" (ibid.). Finally, external validity addresses the question, "Is
the research design such that we can generalize beyond the subjects under
investigation to a wider population?" (ibid.).
In setting up a study, researchers must deal with a tension between internal
and external validity. One of the interesting paradoxes of research design is that
strengthening internal validity weakens external validity. Why is this so? It is
because internal validity depends on carefully controlling all the variables that
might influence the outcomes of an experiment. But exerting too much control
over variables creates laboratory-like conditions that cannot be duplicated in
classrooms in the real world, so the more pristine and carefully controlled an
experiment is, the less it resembles actual classrooms. To the extent that the
treatment in an experiment cannot be applied in the real world with the same results,
the study is said to lack external validity. This tension, or trade-off, is represented
in Figure 3.2.
Let's return to our hypothetical investigation of the effectiveness of
inductive versus deductive approaches to grammar instruction. As a researcher
working in the psychometric tradition, you might want to generalize your findings as
widely as possible. However, if your review of previous research suggests
that girls often do better than boys in grammar learning, you might decide to
include only boys in your study. By screening out the girls, you would strengthen
the internal validity of the research because you would rule out gender as a
possible confounding variable. However, you would also be weakening the
external validity of the study because you could no longer extrapolate the results
to all learners, only to boys. In fact, you could only make claims of relevance to
boys in contexts that are similar to the one in which the study takes place. (This
problem illustrates the importance of the moderator variable.)

Increasing control over variables      Decreasing control over variables
leads to increasing                    leads to increasing
internal validity                      external validity
but lowering                           but lowering
external validity.                     internal validity.

FIGURE 3.2 The trade-off between internal and external validity.

REFLECTION

How important are reliability and validity to you when you are reading a
research report?

ACTION

Skim a report about a classroom research project in a professional journal
or anthology. Are the terms reliability and validity (or generalizability)
mentioned at all? Did the author(s) seem concerned about quality control in
any way?

The issue of generalizability (that is, extrapolating findings from samples to
populations) is controversial. Some researchers working in the naturalistic
inquiry or action research traditions, particularly those who are interested in
capturing insights from authentic classrooms, argue that generalizability is not a
criterion to which classroom researchers must aspire (Larsen-Freeman, 1996; van
Lier, 1988). Such researchers argue that if we are to capture insights into the ways
that real teachers and learners behave, then we need to study intact classrooms,
not those that have been specially set up with randomly selected subjects for
experimental purposes. For such researchers, insight rather than proof should be the
standard for research. (We will explore these issues when we discuss data
collection and analysis in the traditions of naturalistic inquiry and action research.)

THREATS TO VALIDITY

In the experimental tradition in psychometric research, the goals of internal and
external validity are very important. This tradition is so well developed that over
the years the kinds of problems that can occur have been described and
cataloged. These problems are called threats to validity. This term encompasses any
sort of difficulty that can cause problems in the interpretation of the results.
Here we will consider a few of the threats that are especially prevalent in
language classroom research. For the moment, we will continue discussing
experimental research since experiments that try to determine causality provide many
clear-cut cases where such threats can occur.



Threats to Internal Validity
You will recall that in an experiment, internal validity is the ability to claim that
your findings are due to the treatment. That is, if there is a significant difference
between the control and experimental groups, it should be unambiguously
attributable to the treatment that the researchers planned and carried out. There should
be no other possible explanation for the observed differences.
In fact, this degree of confidence is rare. Let's go back to the study of
deductive and inductive grammar teaching discussed above. We talked about control
variables (those potential problems you recognize before beginning your
research and therefore control by planning them out of the study) and confounding
variables. Any confounding variable that you don't discover before conducting
your research can influence the outcomes and therefore become a threat to
validity in interpreting your findings. Imagine that you wish to conduct a study
comparing inductive and deductive grammar teaching in the secondary school
where you teach, but you must consider the following conditions:
1. You don't have the power to randomly select and randomly assign students
to the deductive or inductive grammar teaching groups, so you choose
your 8 A.M. intermediate class for the deductive teaching and your 2 P.M.
intermediate class for the inductive teaching. When the term begins you
learn that, due to the scheduling of the auto-shop class, there are no males
in the 8 A.M. class, and, due to the schedule of the modern dance class,
there are no females in the 2 P.M. class.
2. The textbooks are purchased by the school and you have no control over
their selection. The required book for your intermediate classes is very
heavily grammar-oriented with regular grammar rules and explanations in
every chapter.
3. As the semester goes on, you get more and more enthusiastic about
teaching grammar inductively. Likewise, teaching grammar deductively seems
less interesting to you and less engaging to your students.
4. At the end of the semester, all the members of the 8 A.M. class ask if you will
coach them and be their language club advisor. They have decided they
like studying languages so much that they want to form a language club
and they need a faculty advisor to do so.
5. At the fifth week of the term, you counsel four of the male students to drop
out because they are doing so poorly they will certainly fail the course. As
a result, you end up with ten males in the 2 P.M. class and twenty-four
females in the 8 A.M. class.

Can you see how each of these conditions might influence the outcomes of any
sort of grammar test you could give the two classes? All of these conditions
represent threats to the internal validity of the study because each one could
influence how well the two groups do on the test (the dependent variable). There may
have been differences attributable to gender (1 above). All of the students got some
deductively oriented input from the textbook (2 above). The quality of your own
teaching may have differed in the two courses (3 above), and it appears that the
students in the deductively taught class may be more motivated and enthusiastic
than the others (4 above). Finally, the groups are of very unequal sizes, which
might have influenced the type and frequency of participation during lessons,
and some of the weaker male students have left one group (5 above).
In fact, this last condition (the unequal loss of research subjects from one
group) is such an important issue that it has a special name: mortality. The term
in this context doesn't mean that subjects actually die; it just means they
disappear from the sample. Mortality is particularly problematic for longitudinal
projects in which the passing of time can bring its own problems. For example,
subjects (either teachers or students) can leave the school or change their mind
about being involved in the project.
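
The unequal loss described in condition 5 can be expressed as an attrition rate for each group. The following minimal Python sketch uses the class sizes given in that condition. It assumes, beyond what the scenario states, that the four counseled males were the only students to leave either class. The function name attrition_rate is hypothetical.

```python
def attrition_rate(started, finished):
    """Proportion of subjects who left a group between start and end."""
    if started <= 0 or not 0 <= finished <= started:
        raise ValueError("need started > 0 and 0 <= finished <= started")
    return (started - finished) / started


# Condition 5: four males dropped out, leaving ten in the 2 P.M. class.
# Assumption: no one left the 8 A.M. class, which ended with twenty-four.
rate_2pm = attrition_rate(started=14, finished=10)
rate_8am = attrition_rate(started=24, finished=24)

print(f"2 P.M. (inductive) class attrition: {rate_2pm:.0%}")
print(f"8 A.M. (deductive) class attrition: {rate_8am:.0%}")
```

A marked gap between the two rates is the warning sign here, particularly when the students who left were the weaker ones, since their departure would tend to raise that group's average score.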
These and other kinds of problems can influence the performance of the
control and experimental groups on the dependent variable(s) and, thus, damage the
internal validity of a study. But there are also threats to the external validity, or
generalizability, of classroom research. We will consider just a few such problems.

Threats to External Validity


You will recall that the issue of external validity has to do with whether the
results of a study will apply to (generalize to) other people and other contexts
besides the research sample and the research conditions. Sometimes experimental
treatments are so specialized or so expensive that even if they were very effective,
it is unlikely that they could be implemented in the "real world." At other times,
the research subjects were unique in some way. Consider the following
hypothetical research outcomes and think about why these studies might be weak in
generalizability.

1. A kindergarten teacher finds that TPR is very effective with the bilingual
five- and six-year-olds in her morning class. There are twelve pupils in the
class and all twelve have tested quite high on a test of intelligence for children.
2. A pronunciation teacher works for a small company that handles the
telephone service requests for an international computer manufacturer. The
teacher conducts a study that shows that a particular software package used
by the individual employees of the telephone service firm is highly effective
in improving their English pronunciation. Each employee is required to
use the software package ten hours per week for a period of three months
as a condition of his or her employment.
3. A teacher at an international secondary school uses extended reading as a
way of building her pupils' target language vocabulary and reading skills.
The students are predominantly children of diplomats, and most of them
have lived in four or five countries even though they are just teenagers.
Each of these situations includes interesting issues and for each we can imagine
a viable research question that could be asked or a reasonable hypothesis that
could be posed, but each situation is problematic in terms of its generalizability.



REFLECTION

For each of the three situations described above, identify the threat(s) to
external validity inherent in the context. If you are working with a group,
compare your ideas to those of a classmate or colleague from a different group.

CORRELATIONS AND NONEXPERIMENTAL COMPARISONS

The comments above about experimental research design describe the strongest
true experimental design. It has two or more groups, at least one of which gets the
treatment and one of which serves as the control group. You will recall that to
qualify as a true experimental design, the study must involve random selection (from
the population to the sample) and random assignment (from the subject pool to the
groups). The groups in the experiment are determined by the research question
or hypothesis, and specifically by the levels of the independent variable. There
may also be groups defined by the levels of one or more moderator variables.
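
The two random steps just described (selection from the population, then assignment to groups) can be sketched in a few lines of Python. The population of 500 student ID numbers is hypothetical, the function name select_and_assign is illustrative, and the fixed seed is used only so that the sketch gives the same result each time it is run.

```python
import random


def select_and_assign(population, sample_size, seed=None):
    """Randomly select a sample, then randomly split it into two groups."""
    rng = random.Random(seed)
    # Random selection: draw the sample from the population without replacement
    sample = rng.sample(population, sample_size)
    # Random assignment: shuffle the subject pool before dividing it in half
    rng.shuffle(sample)
    half = sample_size // 2
    return sample[:half], sample[half:]


# Hypothetical population: ID numbers for 500 students
population = list(range(1, 501))
deductive_group, inductive_group = select_and_assign(population, 80, seed=1)

print(len(deductive_group), len(inductive_group))
```

In many schools a teacher-researcher cannot carry out either step, which is one reason intact classes are so often used instead.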
But some very important types of research do not involve a treatment and
cannot be called "experiments" in the strongest sense of the term. In some
situations, rather than comparing groups, we want to determine the relationship, or
correlation, between two or more variables as measured in one group of people.
The research design for this situation is called a correlation design, and there are
several statistical procedures that can be used to detect correlations across
variables. Still other studies involve the use of statistical logic or other analytic
procedures to look for differences among groups defined by preexisting conditions
rather than by experimental treatments. Such studies are called criterion groups
designs because they compare groups defined by some criterion.
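
As an illustration of a correlation design, the minimal Python sketch below computes the Pearson product-moment correlation coefficient for two variables measured in one group. Pearson's r is one of several possible procedures and is chosen here only as a familiar example. The two variables (weekly study hours and test scores) and their values are invented, and the function name pearson_r is hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products of deviations from each variable's mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Invented data for five learners in a single group
study_hours = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 64, 70]

print(f"r = {pearson_r(study_hours, test_scores):.3f}")
```

A coefficient near 1 or -1 indicates a strong linear relationship, but because no treatment has been administered, it does not show that one variable caused the other.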
Correlation designs and criterion groups designs are both part of the ex post
facto class of designs. The phrase "ex post facto" is used because the researchers
investigate the possible influence of conditions after the fact. In this section we will
examine both of these designs, but we will begin with the criterion groups design
because, like the other designs we have discussed so far, it allows us to compare
two or more groups (even though there is no formal experiment going on).

Criterion Groups Designs


Think about the following researchable issues:

1. Two secondary school teachers share the opinion that their female students
are naturally better at foreign languages than their male students. They
plan a study to investigate this possibility systematically.
2. The members of an ESL faculty observe that students from some first
language backgrounds seem to have less difficulty with English spelling than
do students from other first languages. They decide to test this hypothesis.

70 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


3. Over the years, a language teacher notices that her left-handed students
seem to have consistently better pronunciation than do most of her
right-handed students. She wants to do a study to determine whether or not
this is the case.

In the examples given above, the researchers did not make the students male or
female; the researchers did not cause them to be native speakers of particular
languages, nor did the researchers cause the subjects to be left-handed or
right-handed. Instead, the researchers will study the possible influences of
these conditions on some dependent variable(s) after the fact, that is, after
the conditions of interest in the comparison already exist. In other words, the
students are already male or female (Situation 1), they already come from a
particular L1 background (Situation 2), and they are already left-handed or
right-handed (Situation 3) before these studies begin. That is why we say that
the criterion groups design is part of the ex post facto class of designs.

ACTION

Write the null hypotheses being tested in the three criterion groups
designs described above.

You will recall one key characteristic of the true experimental designs is that
they use random selection from the population to the sample and random
assignment from the sample pool to the various groups in the sample (the
control and experimental groups). In criterion groups designs, we may have
random selection but we cannot have random assignment. Why? Think about it for
a moment. In the study comparing the pronunciation of left-handed and
right-handed language students (Situation 3 above), we might be able to randomly
select left-handed and right-handed students from the entire population of
foreign language learners. But we could not randomly assign them to groups
in the study because the groups are defined by the subjects' handedness. This
is, in fact, the defining characteristic of criterion groups designs: Subjects
are grouped according to the criterion of interest, not according to random
assignment.

When we use the ex post facto criterion groups design to test hypotheses or
answer research questions with quantitatively collected data in the
psychometric tradition, we often compare the average scores of the groups on
the dependent variable. Statistical tools are used to make inferences that help
us decide whether those differences are significant. In other words, statistics
can be used to compare groups defined by preexisting conditions just as they
can be used to identify significant differences between control and
experimental groups in formal experiments. We will return to this point in
subsequent chapters where we discuss quantitative data analysis.
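
As a rough illustration of what "comparing the average scores of the groups" can look like in practice, the following Python sketch computes the difference between two group means and a Welch's t statistic for Situation 3. The scores are invented for illustration only, and the choice of Welch's t is our assumption; the chapter does not prescribe a particular test. Deciding whether the difference is significant would still require comparing the statistic against the appropriate t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic for two independent groups of scores."""
    mean_a, mean_b = mean(group_a), mean(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical pronunciation scores (0-100), invented for illustration only.
left_handed = [78, 85, 82, 90, 74, 88]
right_handed = [72, 80, 75, 83, 70, 79]

mean_difference = mean(left_handed) - mean(right_handed)
t_statistic = welch_t(left_handed, right_handed)

print(f"Difference between group means: {mean_difference:.2f}")
print(f"Welch's t statistic: {t_statistic:.2f}")
```

Note that the groups here are defined by handedness, a preexisting condition, so nothing in the calculation involves random assignment.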

Correlation Designs
Sometimes researchers are interested in seeing to what extent variables
co-occur, or correlate. Correlation designs help us determine what sorts of
factors seem to "go together" in a patterned, predictable way. For example,
imagine yourself as a teacher of Spanish as a foreign language. Perhaps you
notice that among your intermediate students, those who seem more confident and
outgoing have better pronunciation. If you had a valid and reliable measure of
confidence and extroversion as well as a valid and reliable way to measure
Spanish pronunciation, you could administer both those measures to your
students and then correlate the results. If you found a notable tendency of
pronunciation scores to increase as extroversion scores increase, you could say
there was a positive correlation between these two variables.
There are also instances where two variables seem to run counter to each
other. You might notice, for example, that the students who have better German
speaking vocabularies seem to take less time finishing their reading
assignments. You could investigate this apparent pattern with a correlation
study by measuring the students' vocabulary and the time they need to finish a
reading assignment. If you found a negative correlation, it would mean that as
measures of one variable increase, measures of the other variable decrease.
That is, the finding that as students' vocabulary increases, their reading time
decreases would be an example of a negative correlation.
Note that positive correlations are not inherently good, nor are negative
correlations inherently bad. The words positive and negative just refer to the
direction of the relationship between the two variables. In positive
correlations, the measurements of the two variables increase/decrease together.
In negative correlations, the measurements of one variable increase as the
measurements of the other variable decrease.
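
To make the idea of direction concrete, here is a minimal Python sketch that computes a correlation coefficient for two small data sets. It assumes the Pearson product-moment coefficient as the statistic (one common option among the several procedures available), and all of the scores are invented for illustration. The function name pearson_r is ours, not a library call.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient for paired measurements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_squares_x = sum((a - mean_x) ** 2 for a in x)
    sum_squares_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(sum_squares_x * sum_squares_y)


# Hypothetical data: one group of six learners, each measured twice.
extroversion = [12, 15, 9, 20, 17, 11]
pronunciation = [65, 72, 58, 88, 80, 63]

# Hypothetical data: vocabulary size and minutes needed for a reading task.
vocabulary = [300, 450, 250, 600, 520, 380]
reading_minutes = [42, 35, 48, 25, 30, 40]

r_positive = pearson_r(extroversion, pronunciation)
r_negative = pearson_r(vocabulary, reading_minutes)

print(f"Extroversion and pronunciation: r = {r_positive:.2f}")
print(f"Vocabulary and reading time:    r = {r_negative:.2f}")
```

In the first pair the scores rise together, so the coefficient is above zero; in the second, one measure falls as the other rises, so the coefficient is below zero. The sign reports direction only and says nothing about whether the relationship is desirable.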
Like the research designs that compare two or more groups, correlation
designs can use either (or both) research questions or hypotheses. And, as we
saw earlier, those formal statements can be worded (1) as null hypotheses, or
(2) as non-directional, alternative hypotheses, or (3) as
alternative-directional hypotheses. Here are the three types of hypotheses for
the study described above regarding students' Spanish pronunciation and their
tendency toward extroversion.

H0: There will be no statistically significant correlation between students'
extroversion scores and their scores on a Spanish pronunciation test.

Ha: There will be a statistically significant correlation between students'
extroversion scores and their scores on a Spanish pronunciation test.

Ha-i: There will be a statistically significant positive correlation between
students' extroversion scores and their scores on a Spanish pronunciation test.

As we saw in studies that compared groups, the choice of the null, the
alternative, or the alternative directional hypothesis depends on your
literature review. Where previous research and/or theory suggests a clear
direction, you may pose the alternative directional hypothesis. Where the
evidence for a direction is not strong, the culture of psychometric research
requires a more conservative approach, so you would pose the null hypothesis.

REFLECTION

Look at the three hypotheses above. Notice the differences in their wording,
compared to one another. Underline the key words that distinguish the three
hypotheses.

ACTION

Write the null hypothesis, the alternative hypothesis, and the alternative
directional hypothesis for the correlation design used to investigate the
relationship between German reading speed and German vocabulary.

In summary, when using correlation designs, a researcher typically works
with one group of people and measures two or more variables, collecting
quantitative data. (There must be one group of subjects, each of whom is
measured on at least two variables, in order to do the statistical analyses
related to correlation.) The data are then analyzed to see if there is a
statistically significant correlation between the two variables under
investigation. That correlation could be either positive or negative, depending
upon the relationship of the variables.
One very important point to understand is that correlation does not equate
to causality. Determining that two variables co-occur is not the same as
determining that manipulating one variable, the independent variable, causes a
change in another, the dependent variable. In fact, in correlation studies, we
don't really use those terms. Instead, we simply talk about the X variable (for
example, good Spanish pronunciation) and the Y variable (e.g., confidence/
extroversion). If we did indeed find a positive correlation between measures of
good Spanish pronunciation and confidence/extroversion, we could not tell
whether (1) good pronunciation causes learners to be confident and extroverted,
or (2) being confident and extroverted leads to good pronunciation. In this
regard, correlation studies sometimes lay the groundwork for subsequent
research to investigate causality. However, correlation studies are also
valuable in and of themselves, as there are many variables in language learning
and teaching whose relationships we need to understand.
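
One way to see why a correlation cannot settle the direction of influence is to look at how a correlation coefficient is defined. Assuming the Pearson product-moment coefficient (a common choice, though not the only statistical procedure available), the correlation between the X and Y scores of n learners can be written as:

```latex
r_{XY}
  = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^{2}}\;
          \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^{2}}}
  = r_{YX}
```

Swapping X and Y leaves the value unchanged, so the coefficient alone cannot indicate which variable, if either, influences the other.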
So far, we have briefly discussed four different types of designs that are used
in classroom research. These are the true experimental designs (actually, a class
of designs), intact groups designs, criterion groups designs, and correlation
designs. Table 3.2 compares these designs on three categories.

TABLE 3.2 A comparison of true experimental, intact groups,
criterion groups, and correlation designs

                                   True          Intact   Criterion  Correlation
                                   Experimental  Groups   Groups     Design
                                   Designs       Designs  Designs

Comparing two groups (or more)     Yes           Yes      Yes        No
Treatment for at least one group   Yes           Yes      No         No
Random assignment to groups        Yes           No       No         No

A CHECKLIST OF INITIAL DESIGN CONSIDERATIONS


In reading this chapter and the previous chapters, you have encountered a great
deal of information and many new terms. Here we hope to pull some of that
information together in a manageable way.

A checklist of questions for evaluating the design of a research project is
presented in Table 3.3. Some of these questions have been covered in this
chapter and previous chapters. Others will be dealt with in greater detail in
the chapters to come.

You may use the questions in this checklist as you design your own research.
You can also use them for reflecting on and analyzing research that you read.

SAMPLE STUDY

In this section, we will briefly summarize one of the earliest published
classroom research studies conducted by a teacher in our field. Although this
study was published many years ago, it dealt with an issue that still concerns
language teachers today: language learners' classroom participation.

Hearing her colleagues discuss the difficulty of getting Asian students to talk
in English classes at a university in the United States, a teacher decided to
compare in-class participation patterns of Asian and non-Asian learners of
English. (The researcher, Charlene Sato, was herself a Japanese-American ESL
teacher and was very interested in ethnic styles in classroom discourse.) To
investigate this perception, Sato (1982) decided to conduct some research to
determine whether "ethnic patterns of participation were observable, as
reflected in aspects of turn-taking" (p. 14).
To address these issues, Sato videotaped her own class during three
fifty-minute lessons while she was teaching. (The students were told that the
videotape process was a regular part of teacher training in that program, which
was true.) In another teacher's class, three lessons were tape-recorded as Sato
observed and took notes to record which students took speaking turns. In Sato's
class, there were fifteen Asian students and eight non-Asian students, while in
the other class, there were four Asians and four non-Asians. The two classes
were at the same level of English proficiency in the ESL program.

TABLE 3.3 A checklist for evaluating research designs in classroom research

Area                Evaluative Questions

Research Question   Is my question worth investigating?
                    Is answering my question feasible?
                    What are the constructs underlying my question?
                    How can these constructs be operationalized?

Design              Does the research question imply a causal relationship
                      between two or more variables, or does it suggest some
                      other research focus?
                    Does the question suggest an experimental or a
                      nonexperimental design? (Am I after insight or evidence?)

Research Method     What methods are available for investigating my question?
                    Which of these are feasible, given available resources
                      and expertise?
                    Is it desirable and possible to use more than one data
                      collection method?
                    What are the possible threats to the reliability and
                      validity of my research? How can I deal with these
                      threats?

Analysis            Will my data require me to do statistical analyses,
                      interpretive analyses, or both?
                    Will I have to quantify qualitatively collected research
                      data? How might I do this?
                    What skills do I have for doing the analyses? What skills
                      do I need?

Presentation        What is the best way for me to 'publish' my research?
                    Should I try to speak at a conference?
                    Should I try to submit the study as an article?

ACTION

Using the description above, identify the following variables in Sato's study:


Independent variable:
Dependent variable:
Control variable:
Possible confounding variable(s):

To analyze the data, Sato (1982) coded the turn-taking behavior of the
students in the six lessons. Her coding categories included general solicits
(questions or tasks posed by the teacher to the whole class), the students'
responses to general solicits, personal solicits (questions or tasks posed by
the teacher to an individual), the students' responses to personal solicits,
and self-selection by the students. She also coded whether the speaking turns
were taken by Asian or non-Asian students.

REFLECTION

What do you think about the comparability of the data from the two
classes in Sato's study? In her own class, the data consisted of the
videotapes of three lessons. In the other teacher's class, Sato tape-recorded
the three lessons and took observational notes.

Sato's (1982) findings are interesting and complex. We will summarize only
a few of them here. She found that, although the Asian students outnumbered
the non-Asian students in the two classes combined (nineteen Asians versus
twelve non-Asians), the non-Asian students took 63.5% of the total speaking
turns. Further scrutiny revealed that the Asian students only self-selected a
third of the time, while the non-Asian students self-selected two-thirds of the
time. In terms of teacher-allocated turns, the Asian students were selected for
turns by the teachers 40% of the time while the non-Asians were selected 60% of
the time. (All these differences were statistically significant.) Sato's
comment about her own turn distribution patterns is particularly insightful:

[T]he Asian American teacher behaved no differently than did the
Caucasian American teacher on this measure (i.e., the teachers' nomination
of students for turns). Whatever ethnic ties the former may have
felt toward the Asian students, she nevertheless called upon them less
often than she did the non-Asians. (p. 15)

We see this study as a very interesting effort by a teacher to investigate a
commonly held view. Sato's study has influenced our thinking about turn taking
in language classes, both as researchers and as teachers. Her work also
influenced further research in our field.
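
The contrast between class composition and turn-taking reported above can be worked through with a few lines of arithmetic. The Python sketch below uses only the figures given in this summary; the "participation ratio" (share of turns divided by share of learners) is our own illustrative index rather than part of Sato's analysis, and the Asian students' share of turns is derived on the assumption that every student turn was coded as either Asian or non-Asian.

```python
# Class composition as reported in the summary of Sato (1982).
asian_students = 15 + 4       # Sato's class plus the other teacher's class
non_asian_students = 8 + 4
total_students = asian_students + non_asian_students

# Reported share of speaking turns taken by non-Asian students.
non_asian_turn_share = 0.635
# Assumption: all remaining student turns were taken by Asian students.
asian_turn_share = 1 - non_asian_turn_share

asian_student_share = asian_students / total_students
non_asian_student_share = non_asian_students / total_students

# Illustrative index (not Sato's): a value of 1.0 would mean that a group's
# share of turns exactly matched its share of the learners.
asian_ratio = asian_turn_share / asian_student_share
non_asian_ratio = non_asian_turn_share / non_asian_student_share

print(f"Asian students: {asian_student_share:.1%} of learners, "
      f"{asian_turn_share:.1%} of turns, ratio {asian_ratio:.2f}")
print(f"Non-Asian students: {non_asian_student_share:.1%} of learners, "
      f"{non_asian_turn_share:.1%} of turns, ratio {non_asian_ratio:.2f}")
```

A ratio below 1.0 indicates that a group took fewer turns than its numbers alone would predict, and a ratio above 1.0 indicates the opposite.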

PAYOFFS AND PITFALLS

In this chapter we have dealt with key concepts in planning language classroom
research. By now, it is probably apparent that there are many payoffs and pitfalls
involved in conducting classroom research, and your choice of research design
can lead to both.

There are several payoffs involved in choosing a research design that is part
of the psychometric tradition in language classroom research. These procedures
are well codified: There are accessible training courses and numerous textbooks
to help you learn about them. The threats to validity and reliability are well
understood. In addition, research using these designs is often highly valued in
academic contexts.
The pitfalls, of course, are the mirror image of the payoffs. In order to use
the statistical analyses associated with the designs of the psychometric
tradition, you may need special training (or at least guidance). There is a
fair amount of jargon associated with this approach to language classroom
research, so novice researchers sometimes feel overwhelmed by the amount of
information to be learned. Just remember our research-as-culture metaphor. We
often feel overwhelmed when we enter a new culture until we learn some basic
vocabulary, the cultural norms and rules for behavior, and the important
traditions and stories. But even though cultural adjustments take time, they do
happen and they are worth doing. This is the case in learning these key
concepts and mastering these design issues as well.
There are research design issues associated with naturalistic inquiry and
action research as well, but there is considerable interplay among those two
research traditions and the psychometric tradition at the level of design. In
fact, Sato's (1982) study is a criterion groups design using observational
methods of data collection and analysis. In the chapters to come, we will
examine many such blendings. Our goal in doing so is to help you understand
your choices as a well-informed researcher and consumer of research.

CONCLUSIONS

In this chapter, you read about hypotheses and hypothesis testing. We saw that
there are specific wordings as well as reasons why researchers pose null,
alternative, and alternative directional hypotheses. We noted the point that
the research questions and/or hypotheses determine what sort of variables and
research designs are involved in a study.
We looked at different sorts of threats to the validity and reliability of a
study and discussed the ex post facto class of designs: the correlation design
and the criterion groups design. These were then compared with the true
experimental designs and the intact groups designs, which had been introduced
earlier. The following questions and tasks, as well as the suggestions for
further reading on these topics, should help you consolidate your understanding
of these important issues in language classroom research.

QUESTIONS AND TASKS


1. It is often said that there can be no validity without reliability. What does
this phrase mean? Do you agree?

2. Look back at the summary of Jepson's research, which was the sample
study in Chapter 1. Based on the information provided, identify the
following variables. (Hint: There can be more than one dependent variable
in a study.)
Independent Variable:
Dependent Variable(s):
Control Variable(s):
Possible Confounding Variable(s):
3. Read the following description of a research context. Identify the possible
threats to external validity in this study.

Teachers at a university in Hong Kong conduct research to determine
the effectiveness of the school's new self-access English center. The
teachers believe that using the self-access center will improve
students' English as well as their attitudes toward using English. The
self-access English center has a library of 1,500 recent movies in
English, as well as computer packages that allow students to practice their
pronunciation, build their vocabulary knowledge, and increase their
reading speed in English. Every student who successfully completes a
four-hour orientation program about the self-access center is issued
his or her own laptop computer. In addition, as students leave the
self-access center, they can enjoy free food and beverages at a snack bar
connected to the center. After two semesters of collecting data, the
researchers find that the students who used the self-access center have
very positive attitudes toward using English.

4. Think of a way to replicate but improve upon Sato's (1982) research. You
could compare the speaking turns of Asian and non-Asian students as she
did. Or you could compare the speaking turns of male and female students.
In fact, you could combine these two issues and use a factorial criterion
groups design, using ethnicity as the independent variable and gender as the
moderator variable. Draw the box diagram for a factorial criterion groups
design study investigating the influence of ethnicity and gender on
students' speaking turns in a language classroom.
5. In the replication of Sato's study that you envision, are you planning a
direct replication, a systematic replication, or a conceptual replication? What
do you see as the value of replicating a previous study?
6. Perhaps you have noticed the tendency of two variables to co-occur, or to
run counter to each other. Brainstorm some research questions about these
situations. This sort of research calls for a correlation design. Choose a
research question you have posed and identify the X and Y variables.

SUGGESTIONS FOR FURTHER READING

D. Allwright and Bailey (1991) provide an introduction to language classroom
research, including ideas about how to get started. The studies cited in that
book are somewhat dated now, but the general approach to classroom research is
still sound.

J. D. Brown (1988) wrote a book entitled Understanding Research in Second
Language Learning. It is a good introduction to research design and basic
statistics in the psychometric tradition.

Nunan's (1992) book Research Methods in Language Learning provides more
detail on the concepts of internal and external reliability and validity.

PART II

Research Design Issues


Approaches to Planning and
Implementing Classroom Research

Although this section and the section that follows overlap somewhat, we
see this particular section as a big-picture treatment. The section begins
with a chapter on experimental methods, which is one of the two "pure"
research paradigms (Grotjahn, 1987). The section also covers the other "pure"
research paradigm: ethnography. Other approaches that are dealt with in this
section include survey research, case study research, and action research.

For many people, the experimental method is synonymous with research, and
other approaches that draw on more naturalistic forms of inquiry are often seen
as ground-clearing operations designed to yield preliminary data and to set the
scene for experimental research. We don't see it that way. Case study research,
surveys, and action research all have their value and, depending on the
research questions and the overall intention of the research, can generate
useful information where experiments may well be inappropriate.

Chapter 4: The Experimental Method

By the end of this chapter, readers will
• understand the main classes of research designs as well as the specific
  designs commonly used in this tradition;
• describe threats to internal and external validity;
• understand the basic assumptions underlying the logic of inferential
  statistics;
• differentiate between descriptive and inferential statistics;
• be familiar with some basic statistical tools (e.g., frequency distributions).

Chapter 5: Surveys

By the end of this chapter, readers will
• define survey research and explain its basic uses;
• differentiate between survey research and the experimental method;
• describe different kinds of sampling strategies for obtaining subjects;
• discuss the basic principles of questionnaire design;
• recognize potential problems in using various questionnaire item types;
• be familiar with various ethical issues in survey research.

Chapter 6: Case Study Research

By the end of this chapter, readers will
• discuss the characteristics of case studies;
• differentiate between case studies and experimental research;
• describe different types of case studies;
• state the advantages of case studies in language classroom research;
• discuss issues to consider in selecting a case;
• understand the factors that contribute to the quality of a case study.

Chapter 7: Ethnography

By the end of this chapter, readers will
• define ethnography and differentiate it from case study research;
• understand the distinction between emic and etic perspectives;
• articulate the principles that guide ethnography;
• describe four types of triangulation;
• discuss concerns about reliability and validity of ethnography.

Chapter 8: Action Research

By the end of this chapter, readers will
• define action research and differentiate among classroom research, action
  research, and teacher research;
• describe the steps in the action research cycle;
• articulate the value of action research for classroom practitioners;
• describe changes teachers have documented as a result of doing action
  research;
• discuss problems in action research and identify potential solutions.

CHAPTER 4

The Experimental Method

Any time you use phrases like: "On average, I cycle about 100 miles a
week" or "We can expect a lot of rain at this time of year" or "The earlier
you start revising, the better you are likely to do in the exam" you are
making a statistical statement, even though you may have performed no
calculations. (Rowntree, 1981, p. 13)

INTRODUCTION AND OVERVIEW

In this chapter, we will look more closely at the experimental method, one of
the two 'pure' research paradigms (Grotjahn, 1987). In Grotjahn's terms, this
paradigm involves (1) experimental designs, (2) quantitative data, and (3)
statistical analyses. The experimental method is basically a collection of
research designs, guidelines for using them, principles and procedures for
determining statistical significance, and criteria for determining the quality
of a study. The experimental method is part of the psychometric tradition, and
it is also referred to as the scientific method. For some researchers, the
experimental method is the premier method, all others being 'ground clearing'
operations, that is, preliminary data collection and interpretation exercises
to prepare for a formal experiment.
We will begin this chapter by adding to our earlier discussion of possible
confounding variables. Then we will add more research designs to those you
have read about in earlier chapters. We will systematize this discussion by
analyzing and exemplifying the research designs, dividing them into classes,
and explaining their relationships to one another. Then we will use an extended
example to look at the issue of extrapolating from samples to populations.
This extrapolation is based on the logic of the normal distribution, which will
be discussed as well.

As you read this material, keep in mind that different forms of research have
different cultures. The experimental method has one of the most strictly
codified sets of values and procedures of any of the main methods we will
study. It also involves a fair amount of jargon, which can sometimes be a bit
intimidating. But just imagine that you are learning new vocabulary, as you
would when entering any new culture.

REFLECTION

What do you picture when you read the phrase "the experimental method"?
What images does it evoke for you?

In this section, we will build on key concepts that were introduced earlier.
These concepts included samples, populations, variables, reliability, and
validity. As we saw earlier, experiments are generally conducted in order to
test the strength of relationships between variables. We also saw that when the
researcher is testing the influence of one variable on another, the variable
doing the influencing is called the independent variable, while the one being
influenced is called the dependent variable. For example, in a study of the
effect of two different methods for teaching grammar, the teaching method would
be the independent variable, and the students' performance on a test of grammar
knowledge would be the dependent variable.
In Chapter 3, we discussed confounding variables: those factors that might
negatively influence the interpretation of your results. In the experimental
method, one of the researcher's key goals is to control and systematically
manipulate variables in order to determine cause-and-effect relationships. This
goal has such a high value in the culture of the experimental method that
people have written extensively about the things that can go wrong. These types
of confounding variables are also called extraneous variables or threats to
validity.

QUALITY CONTROL ISSUES: THREATS TO INTERNAL VALIDITY

Quality control in the experimental method is largely a matter of understanding
the many things that can go wrong and taking steps to prevent or minimize those
threats. Many of these safeguards are embodied in the various research designs
described here. As we saw in Chapter 3, there are threats to both internal and
external validity. We will revisit these issues now in more detail.
The threats to internal validity can be divided into three categories
(Tuckman, 1999). These are sources of bias based on experience, participants,
and instrumentation. Experience bias factors are those "based on what occurs
within a research study as it progresses" (p. 134). Participant bias is a
result of the "characteristics of the people on whom the study is conducted"
(ibid.). And instrumentation bias has to do with "the way the data are
collected" (ibid.).

Threats to Internal Validity Based on Experience


There are three experience bias factors: history, testing, and expectancy. In
this context, history refers to events: things that happen during an
experiment, which may influence the results. For example, if classes in a study
are disrupted due to natural disasters or political unrest, the research will
be affected. History can also have an unintended influence on the outcomes of
the treatment. Imagine you were teaching Japanese to secondary school students
in an English-speaking country and running an experiment in which one group got
to see films about Japan during class and one group did not. But if all the
students go to see some popular new action film about Japanese samurai warriors
outside of school, the treatment could be compromised by that event since both
groups would have been exposed to the film.
Testing (also called the practice effect) refers to the fact that taking a
pre-test may influence the subjects' performance on a post-test. That is, in
addition to learning from the treatment, learners may do better on the
post-test because the pre-test alerted them to what was being investigated in
the study.
The third issue, expectancy, is an interesting psychological problem. Tuckman
(1999) explains it this way:

A treatment may appear to increase learning effectiveness as compared
to that of a control or comparison group, not because it really boosts
effectiveness but because either the experimenter or the subjects believe
that it does and behave according to this expectation. (p. 135)

These two threats are called researcher expectancy and subject expectancy,
respectively.
There are steps researchers can take to overcome or minimize these problems.
For example, if you use a design with a pre-test, it is important that the
post-test be a different form of the test, instead of the same test the
subjects encountered at the beginning of the study. (We will read about other
safeguards later.)

Threats to Internal Validity Based on Participants


There are five participant bias factors, and you have already read about some
of them. They are (1) selection, (2) maturation, (3) statistical regression,
(4) experimental mortality, and (5) the interactive combinations of factors.
Selection as a threat to internal validity is the idea that somehow the groups
to be compared turn out to be different before the treatment. Random selection
and random assignment are used to combat this problem. The logic here is that
randomly selecting subjects and then randomly assigning them to different
conditions distributes 'contaminating' participant factors across both the
experimental and the control group. You would thus be able to argue that any
differences observed in terms of the dependent variable are due to the experimental
treatment because the other variables that might have had an effect
presumably exist in equal quantities in both the experimental and control groups,
and therefore cancel one another out.
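
The logic of randomization can be illustrated with a small simulation. The
sketch below is only an illustration under stated assumptions: the
"motivation" scores, the number of subjects, and the function name
random_assignment_balance are all hypothetical and do not come from any study
discussed here. It shows that, when subjects are assigned at random, a
participant factor the researcher has not measured tends to end up at a
similar average level in both groups.

```python
import random
from statistics import mean

def random_assignment_balance(n_subjects=2000, seed=42):
    """Randomly assign subjects to two groups and report the mean of a
    hypothetical 'contaminating' factor (motivation) in each group."""
    rng = random.Random(seed)
    # Hypothetical participant factor that the researcher cannot control directly.
    motivation = [rng.uniform(0, 100) for _ in range(n_subjects)]
    # Random assignment: shuffle the subjects, then split them in half.
    rng.shuffle(motivation)
    half = n_subjects // 2
    experimental, control = motivation[:half], motivation[half:]
    return mean(experimental), mean(control)

exp_mean, ctrl_mean = random_assignment_balance()
print(f"experimental: {exp_mean:.1f}  control: {ctrl_mean:.1f}")
```

With groups this large the two averages should be close to each other. With
the small groups typical of classroom research, chance differences can be
larger, which is one reason a pre-test is sometimes used as an additional
check.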
After randomly selecting your subjects from the population, you can use
pre-test data elicited from all the subjects in the experiment before you assign
them to groups. This step allows you to make sure, for instance, that the
intermediate learners in two different groups are at roughly the same level of
language development to begin with. Of course, using a pre-test introduces the
possibility of the testing threat, so there are some trade-offs in the decisions
you must make.
Maturation refers to the normal development people undergo whether or
not they are receiving a treatment of some kind. This threat is particularly
relevant in longitudinal studies involving children. If we see syntactic development
in five-year-olds taught with a certain method for a school year, can we be sure
that that development was due to the treatment, or was it due to the normal
linguistic changes that small children experience in their first language, or both?
This problem is addressed through the use of a control group, which is at the
same developmental level as the treatment group and goes through the same
experiences (except for the treatment itself) for the same period of time.
Statistical regression is the name of a tendency for people's test scores to
change whether or not the knowledge, skill, or ability being measured truly
changes. Tuckman (1999) gives as an example the situation in which

a group of students take an IQ test, and only the highest third and the
lowest third are selected for the experiment, eliminating the middle
third. Statistical processes would create a tendency for the scores on any
post-test measurement of the high IQ students to decrease toward the
mean, while the scores of the low IQ students would increase toward the
mean. Thus, the groups would differ less in the post-test results, even
without experiencing any experimental treatment. (p. 136)

Tuckman explains that this pattern happens because "chance factors are more
likely to contribute to extreme scores than to average scores, and such factors
are unlikely to reappear during a second testing" (ibid.). Using subjects who
represent an entire range of ability levels in your design is one way to avoid this
problem.
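
Statistical regression can be demonstrated with simulated scores. The sketch
below is an illustration only: it assumes that each observed score is a stable
ability plus an independent chance component at each testing, and the
distributions, the number of students, and the function name
simulate_regression are hypothetical choices. No treatment is applied between
the two testings.

```python
import random
from statistics import mean

def simulate_regression(n_students=3000, seed=0):
    """Return (first-test mean, second-test mean) for the lowest and
    highest thirds of students, selected on the first test only."""
    rng = random.Random(seed)
    # Hypothetical stable ability plus independent chance factors per testing.
    ability = [rng.gauss(100, 10) for _ in range(n_students)]
    first = [a + rng.gauss(0, 10) for a in ability]
    second = [a + rng.gauss(0, 10) for a in ability]
    # Rank students by their first score and keep only the extreme thirds.
    ranked = sorted(range(n_students), key=lambda i: first[i])
    third = n_students // 3
    low, high = ranked[:third], ranked[-third:]
    return {
        "low": (mean(first[i] for i in low), mean(second[i] for i in low)),
        "high": (mean(first[i] for i in high), mean(second[i] for i in high)),
    }

results = simulate_regression()
for group, (first_mean, second_mean) in results.items():
    print(f"{group}: first test {first_mean:.1f}, second test {second_mean:.1f}")
```

Under these assumptions the high group's average falls and the low group's
average rises on the second testing, even though nothing was done to either
group.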
Experimental mortality (or just mortality) is the problem of losing subjects
from the study. It can be especially worrisome if the groups end up being of quite
different sizes because people dropped out. To deal with this threat, researchers
often try to recruit more people for a study than they may actually need.
Researchers must sometimes also try to locate subjects who took the pre-test and
experienced the treatment (or were in the control group) but then moved away
or were absent when the post-test was administered.
As the name suggests, the interactive combination of factors happens when
more than one threat is present in a study. For example, if you conduct a study of
reading readiness and choose two intact groups for a study (say, two classes of
kindergartners at two different schools), you may end up with children from
widely divergent socioeconomic backgrounds. It is possible that the more affluent
children have better nutrition and more opportunities to be read to than do
the poorer children. This is, in part, a selection issue, but if those children are
developing at different rates as a result of their nutrition, then you may also have
a maturational issue. Careful planning is needed to avoid these sorts of problems.

Threats to Internal Validity Based on Instrumentation


The term instrumentation refers to the "measurement or observation procedures
used during an experiment" (Tuckman, 1999, p. 137). Instrumentation
includes tests, questionnaires, observation systems, elicitation devices, audio-
and videotaping; in short, any means of collecting data. The threat of instrumentation
is also sometimes called instability of measures (Brown, 1988, p. 39),
because it occurs if the measurement or recording processes change during the
experiment.
Instrumentation is related to reliability since it involves consistency in data
gathering. To minimize this threat, data collection procedures must "remain
constant across time as well as constant across groups (or conditions)" (Tuckman, 1999,
p. 138). The most likely instrumentation problem in classroom research involves
studies with human observers taking notes or using coding systems during classroom
interaction. If the observers are not consistent as they collect data, the data
will not accurately represent the events being observed. Observer training, rater
training, and the careful piloting of all questionnaires and data collection devices
are the best safeguards against instrumentation problems.

REFLECTION

In Chapters 1, 2, and 3, find two examples of instruments used in research
that might be subject to the instrumentation threat.

QUALITY CONTROL ISSUES:
THREATS TO EXTERNAL VALIDITY

In Chapter 3, we saw that external validity (or generalizability) is the extent to
which the findings of an experiment will generalize to nonexperimental contexts.
There are four issues of concern about external validity: (1) the reactive effects of
testing, (2) the reactive effects of experimental arrangements, (3) the interaction
effects of selection bias, and (4) multiple-treatment interference (Tuckman,
1999).
The external validity threat called reactive effects of testing is related to the
testing threat (or practice effect) to internal validity. In this situation, the presence
of a pre-test may give the effects of the treatment a boost. Then, when the outcomes
of the experiment are transferred to the real world where no pre-test is
involved, that extra boost will be lacking. This problem is also applicable to attitude
questionnaires (ibid.). If a researcher administers a questionnaire at the start
of an experiment in order to see if the subjects' attitudes change as a result of the
treatment, the questionnaire may sensitize the subjects to the fact that attitude is
an issue in the study. As a result, they may respond more positively to the questionnaire
when it is administered following the treatment, leading the researcher
to conclude that the treatment improved students' attitudes. However, that improvement
may not be present (or may not be as pronounced) in nonexperimental
conditions where there is no pre-test attitude questionnaire.
The reactive effects of experimental arrangements threat is a very interesting
problem. This idea refers to the fact that sometimes just knowing that one is in
an experiment is enough to cause a difference that may be captured by the dependent
variable, whether or not one is in the treatment group! You may come
across the term the Hawthorne effect as a label for this threat because of a famous
experiment carried out at the Western Electric Company in Hawthorne, Illinois,
in the 1920s:

The researchers wanted to determine the effects of changes in the physical
characteristics of the work environment as well as in incentive rates
and rest periods. They discovered, however, that production increased
regardless of the conditions imposed, leading them to conclude that the
workers were reacting to their role in the experiment and the importance
placed on them by management. (ibid., p. 140)
In other words, simply being included in a study can influence subjects' behavior.
If you conduct observational research in a class other than your own, the
teacher may say to you, "The students were on their best behavior because you
were here," or "The students were really rowdy because you were here." These
are examples of the reactive effects of experimental arrangements in language
classroom research.

REFLECTION

Think about the times that you have observed a class or have been observed
when you were teaching a class. Did any reactive effects occur?
What were they? Who was affected? What might be done to counteract
such problems when an observer visits a language class?

The threat known as the interaction effects of selection bias occurs when the
sample in an experiment is not really representative of the population from
which it was drawn. This is a major issue for language classroom researchers because
it hinges upon first defining the population we wish to study and then upon
sampling from that population appropriately. For example, assume you are interested
in cognitive style and you wish to investigate the effects of analytic versus
holistic teaching styles on language learners. You conduct a study with a hundred
students who are divided into two groups, one of which is taught analytically
while the other group is taught holistically. Unfortunately, after you finish your
study, you read a research report about left-handed people tending to be more
holistically oriented and right-handed people tending to be more analytically
oriented. When you check your records, you see that only one of the hundred
subjects was left-handed. This is unfortunate because left-handed people make
up about thirteen percent of the population. So, left-handed people have been
underrepresented in your sample.
A way to cope with this threat is a process called stratified random sampling.
This term means that before we select people from the population to be in the
sample, we determine what the relevant characteristics of the population are (like
handedness) and make sure the levels (strata) in the sample represent the population
appropriately. In this case, we would make sure that the sample included
about thirteen percent left-handed people. If there were enough left-handed
people included in the sample, you could even build in handedness as a moderator
variable to check its effects in your research on holistic and analytic teaching
styles.
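
Stratified random sampling can be sketched in a few lines. The example below
is a minimal illustration under stated assumptions: the population of 1,000
learners, the "hand" field, and the function name stratified_sample are
hypothetical. It draws a random sample within each stratum in proportion to
that stratum's share of the population.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportional random sample from each stratum of the population."""
    rng = random.Random(seed)
    # Group the population into strata (here, by handedness).
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        # Proportional quota; with less convenient proportions, rounding may
        # make the total differ slightly from sample_size.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population of 1,000 learners, 13 percent of them left-handed.
population = [{"id": i, "hand": "left" if i < 130 else "right"} for i in range(1000)]
sample = stratified_sample(population, lambda p: p["hand"], 100, seed=7)
left_handed = sum(1 for p in sample if p["hand"] == "left")
print(len(sample), left_handed)  # 100 13
```

Which individuals are chosen still depends on chance, but the proportion of
left-handed people in the sample is fixed in advance rather than left to
chance.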
Finally, there is a threat known as multiple-treatment interference. This threat
is hard to manage in classroom research, especially in second language settings
where students have access to the target language outside of class. It can be a
problem in foreign language settings, too. Imagine that you are using an intact
groups design, in which your 9 A.M. class of secondary school French students
is taught with the traditional materials, but you supplement the lessons for your
1 P.M. class with recordings of French popular music. The students in the 1 P.M.
class respond enthusiastically and even share the French music recordings with
their friends in the 9 A.M. class. In effect, the comparison group has now gotten
the treatment!

RESEARCH DESIGNS IN THE EXPERIMENTAL
METHOD

In order to deal with these threats, the experimental method includes many different
research designs to counteract the possible confounding variables that
could influence the internal and external validity of a study. The various designs
have different strengths and weaknesses. Anyone who chooses to do an experiment
must balance the focus of the research question against the time and resources
available for conducting the study in order to choose the best design.
In this section, we will review some research designs discussed in Chapters
1, 2, and 3 (the true experimental designs, the intact groups design, and two in
the ex post facto class: correlation and criterion groups designs). We will also
introduce some other designs that are used in the experimental method. To show
you the differences among these designs, we will start with a research situation
and develop it through a series of evolving scenarios. (There are many more research
designs in the experimental method. We are just describing some of the
most important ones here.)

Scenario 1: The One-Shot Case Study Design


Suppose you are an EFL program administrator. You want to determine whether
or not the students taking the TOEFL preparation course in your program are
benefiting from the course. You decide to conduct a study in which the twenty
students in the course are tested on the TOEFL at the end of the fifteen-week
term.

REFLECTION

When you have completed this study, what information will you have
about the students? What information will you lack?

This research design is called a one-shot case study. (This phrase does not
mean the same thing as the term case study does in naturalistic inquiry, a point
we will explore in Chapter 6.) The one-shot case study is a weak design because
of the problems inherent in the interpretation of the results. Since there is no
pre-test, we don't know how proficient the students were in English at the beginning
of the TOEFL preparation class. As a result, we can't really say, on solid
empirical grounds, whether the course helped them (though the students and
teacher may be sure that it did). And since only one set of data is available, no
comparisons are possible.

Scenario 2: The One-Group Pre-Test Post-Test Design


You still want to determine whether or not the students taking the TOEFL
preparation course are benefiting from the class, but you realize some things that
could be done to improve the study. You decide to add a pre-test to your design,
so the twenty students in the course are tested on the TOEFL at the beginning
and at the end of the fifteen-week term.

REFLECTION

What information will you have at the end of this study that you wouldn't
have had after conducting the one-shot case study described in Scenario 1?

This design is called a one-group pre-test post-test design. It is a weak design,
but it is an improvement over the one-shot case study because you can at least
tell if the students made progress by comparing their scores at the beginning of
the class with their scores at the end. (Hence the name pre-test post-test design.)
Comparing the pre-test and post-test scores allows you to determine whether
they made progress during (but not necessarily because of) the TOEFL preparation
course. (Perhaps they got extra tutoring outside of class or had some
English-speaking friends.)
The difference between a student's pre-test score and post-test score is called
the gain score. If the students' post-test scores are higher than their pre-test
scores, then we can conclude that their English (as measured by the TOEFL) has
improved. But be forewarned: There are also negative gain scores, which occur
when the post-test scores are lower than the pre-test scores. This situation can be
discouraging for both teachers and students. It may mean the students have
experienced some loss of proficiency, or that they didn't test as well the second
time, or that there is just some flux in the measurement process. (We will return
to this point later.)
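
The arithmetic behind gain scores is simple subtraction. The sketch below is
an illustration only: the five pairs of scores are invented for the example,
and the function name gain_scores is hypothetical. It assumes the two lists
give the same students in the same order.

```python
def gain_scores(pre, post):
    """Return post-test minus pre-test for each student (same order in both lists)."""
    if len(pre) != len(post):
        raise ValueError("pre and post must list the same students in the same order")
    return [after - before for before, after in zip(pre, post)]

# Hypothetical TOEFL-style scores for five students.
pre_test = [480, 500, 520, 450, 510]
post_test = [500, 495, 550, 470, 510]

gains = gain_scores(pre_test, post_test)
mean_gain = sum(gains) / len(gains)
print(gains)      # [20, -5, 30, 20, 0]
print(mean_gain)  # 13.0
```

Note that the second student has a negative gain score and the fifth has no
gain at all, even though the group's average gain is positive.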

REFLECTION

What steps could you take to refine this design so that you could confidently
say that the students' measured improvement was in fact due to the
TOEFL preparation course?

Scenario 3: The Intact Groups Design


Suppose you wonder whether or not the TOEFL preparation course helps the
students increase their TOEFL scores. You conduct a study in which the twenty
students in the course are tested on the TOEFL at the end of the fifteen-week
term. You will compare their end-of-term TOEFL scores to those of twenty
similar students in another class that meets at the same time of day and for the
same number of hours every week. However, the students in that class are not
studying specifically to prepare for the TOEFL.

ACTION

Answer these questions about the study described above.


1. What is the research question or hypothesis for this study?
2. What is the independent variable and how many levels does it have?
3. What is the dependent variable?
4. What are the two control variables?
5. Identify one problem inherent in this study.
6. What are two things that could be done to improve this study?
Compare your answers with those of a classmate or colleague.

As you will recall from Chapter 2, this research design is called an intact
groups design. It is known as a weak design because of the problems inherent in
the interpretation of the results, but it is stronger than the one-shot case study or
the one-group pre-test post-test design.

REFLECTION

What are the weaknesses inherent in the intact groups design? (Think back
to our discussions of randomization in Chapters 2 and 3.)

The main problems with the intact groups design stem from the fact that the
subjects in the groups being compared were not randomly selected from the
population, nor were they randomly assigned to groups. (Hence, the name "intact
groups design.") Without randomization (and without a pre-test), we cannot
be certain that the groups being compared were identical (or at least quite similar)
to begin with. Perhaps the students in one group are more motivated than
those in the other group, or have greater language aptitude or higher language
proficiency to start with. As a result, we cannot be sure that any differences we
find are truly due to the treatment (the TOEFL preparation course). Therefore,
when we use an intact groups design we must be conservative when we report the
results.
These three designs (the one-shot case study, the one-group pre-test post-test
design, and the intact groups design) all belong to the pre-experimental class
of designs. They are pre-experimental in that they lack some of the defining
characteristics of the true experimental designs.

ACTION

Decide which of the three designs discussed above is being used in each of
the following research situations.
1. An Arabic teacher wanted to know whether pronunciation exercises improved
her students' pronunciation of difficult words. She gave the students
a pronunciation test before doing the pronunciation exercises, and
then she tested them again after the class.
2. A teacher used two different methods to teach vocabulary with her two
intermediate French classes. One group got a list of randomly ordered,
unrelated words to memorize. The second group got the same vocabulary
lists, but in addition, the teacher created jazz chants using the
words. At the end of the term, both classes were tested over the vocabulary
presented in the course.
3. A teacher wanted to see if listening to Spanish radio programs would
help the pronunciation of his beginning Spanish students. He consistently
assigned three hours per week of listening to Spanish radio for
homework. At the end of the course, the teacher asked a native Spanish
speaker to judge the students' pronunciation.
Compare your ideas with those of a classmate or colleague.

Scenario 4: The Nonequivalent Control (Comparison) Groups Design


Suppose you are an EFL program administrator who wants to determine
whether or not the students taking the TOEFL preparation course in your program
are benefiting from the course. You conduct a study in which the twenty
students in the course are tested on the TOEFL at the beginning and at the end
of the fifteen-week term. You will compare their TOEFL scores to those of
twenty similar students in another class that meets at the same time and for the
same number of hours each week. These students are not studying for the
TOEFL, but they are also tested at the beginning and end of the fifteen-week
term with the same pre-test and post-test that the TOEFL preparation students
take.

REFLECTION

What element has been added to this scenario that was not present in Scenario 3,
which described the intact groups design?

ACTION

Reread the paragraph above and answer these questions.


1. What is the research question or hypothesis for this study?
2. What is the independent variable and how many levels does it have?
3. What is the dependent variable?
4. What are the two control variables?
5. Identify one problem inherent in this study.
6. What are two things that could be done to improve this study?
Compare your answers with those of a classmate or colleague.

This research design is called a nonequivalent control groups design, or (more
properly) a nonequivalent comparison groups design. (The name comes from the
fact that the groups were not randomly sampled so we cannot claim they are
conceptually equal.) This design is not as strong as the true experimental designs
because it lacks randomization, but it is stronger than the one-shot case study,
the one-group pre-test post-test design, or the intact groups design.
The benefit of the nonequivalent comparison groups design over the intact
groups design is the data from the pre-test. Those data allow you to say whether
or not the groups were identical (or quite similar) at the beginning of the term.
You can also compare the groups' gain scores instead of just their post-test
scores.
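
The two comparisons made possible by the pre-test can be sketched as follows.
This is an illustration only: the scores for four students per group are
invented, and the names compare_groups, prep, and comp are hypothetical. The
sketch reports how far apart the groups were at the start and how much more
one group gained than the other; it does not test whether a difference is
statistically significant.

```python
def mean(values):
    return sum(values) / len(values)

def compare_groups(prep_pre, prep_post, comp_pre, comp_post):
    """Summarize baseline similarity and mean gains for two intact groups."""
    prep_gain = mean(prep_post) - mean(prep_pre)
    comp_gain = mean(comp_post) - mean(comp_pre)
    return {
        # How different were the groups before the course began?
        "baseline_difference": mean(prep_pre) - mean(comp_pre),
        "prep_gain": prep_gain,
        "comparison_gain": comp_gain,
        # How much more did the TOEFL preparation group gain?
        "gain_difference": prep_gain - comp_gain,
    }

# Hypothetical pre-test and post-test scores for four students per group.
summary = compare_groups(
    prep_pre=[480, 500, 520, 460], prep_post=[510, 520, 545, 485],
    comp_pre=[485, 495, 515, 465], comp_post=[495, 500, 525, 480],
)
print(summary)
```

In this invented data set the groups start at the same average, so the
difference in mean gains is easier to interpret than a difference in post-test
scores alone would be.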

However, because the groups being compared were intact (i.e., they were
not randomly sampled or randomly assigned), there are still limitations on the
claims you can make. In fact, that is why we prefer the term comparison groups in
the design's name rather than control groups. By definition, a control group is one
made up of people randomly selected from the population and randomly assigned
to the groups in the study. In addition, in the strongest designs (those in
the true experimental class), which group serves as the control group is often
randomly determined, perhaps by the flip of a coin. This step is a further safeguard
to ensure that there are no known preexisting differences that might influence
the outcome of the study.

Scenario 5: The Time Series Design


At this point, in order to introduce two different research designs, we want to
change our focus a bit. Let's look at the TOEFL preparation course from the
point of view of the teacher. In this situation, there is only one group of students,
a very familiar situation in language classroom research.
Imagine that you are the teacher for this course and that there are twenty
students. You want to help the students prepare for the TOEFL. Given the importance
of academic vocabulary on the TOEFL, at the end of every week you
give the students a twenty-item quiz that tests the vocabulary for that week. The
average scores on the quizzes tend to be around 13 or 14 points each week.
During the eighth week of the semester, you administer a practice TOEFL.
Several students are unhappily surprised by their low scores. The experience of
taking the practice TOEFL seems to motivate them, and, thereafter, they exert
more effort in studying for the weekly vocabulary quizzes. In the weeks after the
practice TOEFL, you notice that the quiz scores are higher than they were before
the practice exam, with the average ranging between 16 and 18 points. The
average scores on the weekly vocabulary quiz are indicated by the asterisks (*) in
Figure 4.1. The vertical gray bar indicates the administration of the practice
TOEFL.
FIGURE 4.1 Average scores on weekly twenty-point vocabulary quizzes
before and after the administration of the practice TOEFL
(vertical axis: average score, 12 to 20; horizontal axis: week, 1 to 15)

This scenario is an example of the time series design. In this design there is
only one group, so comparison with another group is not possible. However, it
is possible to compare the group's average scores before the practice TOEFL,
which we can think of as a treatment, with their average scores after the practice
TOEFL. We can use this design, for instance, to help answer the question of
whether or not administering the practice TOEFL motivated the students to
study harder. From the data above, it appears that that effort resulted in higher
vocabulary quiz scores.

FIGURE 4.2 Possible outcomes in a time series design (Tuckman,
1999, p. 169)
In the time series design, the group under investigation serves as its own
control. That is, before the treatment, the students are functioning as a control
group, but after the treatment, they are analogous to an experimental group.
This situation is far from ideal, but it can be informative. We need to be careful,
though, about interpreting the results, as shown in Figure 4.2. In this figure,
T stands for time.

The inference that the treatment caused an effect is most justified in cases 1 and
2 above. It is least justified in cases 3 and 4 (Tuckman, 1999).

REFLECTION

Why does Tuckman (1999) assert that cases 3 and 4 suggest that the treatment
has not caused an effect?

These four sets of data all provide different information about the possible
effects of the treatment. The first line in Figure 4.2 tells us that the scores were
higher after the treatment than they were before, and that they remained consistently
high. The second line tells us that the scores improved after the treatment
but that the improvement tapered off later. The third line indicates that the
scores were higher after the treatment and continued to increase, but we cannot
say that the treatment caused this improvement because the scores had already
begun to increase before the treatment. It appears that this developmental trend
might have continued with or without the treatment being administered. Finally,
in the fourth line, the scores are somewhat erratic. They improve immediately
after the treatment and then drop back down again, but the scores were relatively
high at one point before the treatment as well.
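
The comparisons involved in reading a time series can be sketched in code.
This is an illustration only: the weekly averages below are invented to
resemble the pattern described in Scenario 5 (they are not the values plotted
in Figure 4.1), the function names are hypothetical, and leaving out the
treatment week itself is an assumption made for the example. The sketch
compares the averages before and after the treatment and also checks whether
the scores were already rising beforehand, as in the third case above.

```python
def mean(values):
    return sum(values) / len(values)

def slope(values):
    """Least-squares slope of values against their position (1, 2, 3, ...)."""
    xs = range(1, len(values) + 1)
    x_bar, y_bar = mean(xs), mean(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def summarize_time_series(scores_by_week, treatment_week):
    """Compare scores before and after the treatment week (that week is left out)."""
    ordered = sorted(scores_by_week.items())
    before = [score for week, score in ordered if week < treatment_week]
    after = [score for week, score in ordered if week > treatment_week]
    return {
        "mean_before": mean(before),
        "mean_after": mean(after),
        # A clearly positive value would suggest a trend that predates the treatment.
        "trend_before": slope(before),
    }

# Hypothetical weekly quiz averages for weeks 1 to 15; practice TOEFL in week 8.
weekly = dict(zip(range(1, 16),
                  [13, 14, 13, 14, 14, 13, 14, 15, 16, 17, 17, 18, 17, 18, 18]))
summary = summarize_time_series(weekly, treatment_week=8)
print(summary)
```

Here the averages are higher after the treatment and the scores were nearly
flat beforehand, which resembles the first case in Figure 4.2. A steep
positive trend before the treatment would call for the more cautious
interpretation given for the third case.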

Scenario 6: The Equivalent Time Samples Design


Another design that is used in contexts where there is no comparison group or
control group is called the equivalent time samples design. It is related to the time
series design. To see how this design works, we will use another scenario related
to the TOEFL preparation course.
Once again, please imagine that you are the teacher for the course. You have
implemented the practice of giving weekly vocabulary quizzes to encourage the
students to study the academic vocabulary covered in class. You read a research
article about the importance of providing a meaningful context for the study of
vocabulary and you decide to test the author's ideas. For every other week of the
fifteen-week term, starting with the first week, you provide a story at the beginning
of the week that offers a clear, memorable context for the vocabulary the
class will study that week. On alternate weeks, you simply provide the week's vocabulary
list without contextualizing the vocabulary items in a story. At the end
of every week you administer a twenty-point quiz. The results are depicted as a
bar graph in Figure 4.3.

REFLECTION

What can you infer from the results depicted in Figure 4.3?

FIGURE 4.3 Average scores on weekly twenty-point vocabulary
quizzes for weeks with contextualization (gray bars) and
weeks without contextualization (white bars)
(vertical axis: average score, 12 to 20; horizontal axis: week, 1 to 15)

Once again, when we work with the equivalent time samples design, we say
that the group serves as its own control. There is no separate control or comparison
group, but we are able to compare the scores on the dependent variable (the
twenty-point vocabulary quiz) for the weeks when contextualization was provided
with the vocabulary scores for the weeks when it was not. In effect, the
contextualization is the treatment, and it is alternately provided and withheld
from the same group of students. This design is stronger than the time series design
because multiple comparisons are possible. That is, since the treatment has
been given and withheld several times, we have a better chance of detecting its
effect (if any).
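
The comparison at the heart of this design can be sketched as follows. This is
an illustration only: the fifteen weekly averages are invented (they are not
the values shown in Figure 4.3), and the function name
compare_alternating_weeks is hypothetical. As in Scenario 6, the sketch
assumes the treatment was provided in the odd-numbered weeks, starting with
week 1.

```python
def mean(values):
    return sum(values) / len(values)

def compare_alternating_weeks(weekly_scores):
    """weekly_scores[0] is week 1; odd-numbered weeks received the treatment.

    Returns (mean for treatment weeks, mean for non-treatment weeks).
    """
    treated = weekly_scores[0::2]    # weeks 1, 3, 5, ...
    untreated = weekly_scores[1::2]  # weeks 2, 4, 6, ...
    return mean(treated), mean(untreated)

# Hypothetical quiz averages for weeks 1 to 15.
scores = [17, 14, 18, 13, 17, 14, 18, 15, 17, 14, 18, 14, 17, 14, 18]
with_story, without_story = compare_alternating_weeks(scores)
print(with_story, without_story)  # 17.5 14.0
```

Because the treatment is given and withheld repeatedly, each pair of adjacent
weeks offers a separate comparison, and a consistent pattern across the term
is more persuasive than a single before-and-after difference.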
These three designs (the nonequivalent comparison group design, the time
series design, and the equivalent time samples design) all belong to a class called
the quasi-experimental designs. This class is characterized by (1) the possibility of
making comparisons on the dependent variable, but also by (2) the lack of a
randomly sampled and randomly assigned control group.

REFLECTION

Think of a research question that you could address with a time series design
and one that could be addressed using the equivalent time samples
design. Remember that these are designs that can be used when no control
group (or even a comparison group) is available. Given that fact,
what could you confidently say about the possible outcomes of your two
studies?

ACTION

Identify the particular quasi-experimental design involved in each of the
following situations. Compare your ideas with those of a colleague or
classmate.

1. An EFL teacher in Turkey wanted to see if having a party where only
English was spoken would have an impact on the conversational fluency
of her students. The party was arranged for the middle of the semester.
Every week before the party, she recorded a brief conversation with
each student. After the party, she continued to record the weekly
conversations to try to determine whether the party had had an impact
on the students' conversational fluency.
2. A teacher of Swedish as a second language wondered whether implementing
drama techniques would improve her students' pronunciation.
The teacher tested the pronunciation of her three intermediate speaking
classes at the beginning of the term. With the 9 a.m. group, she
used just regular conversation practice. With the 11 a.m. class, she used
role plays in addition to conversation practice. And with the 2 p.m.
class, she used role plays and conversation practice, but she also had the
students perform scripted plays in Swedish. At the end of the term, she
tested all the students again to see whether their pronunciation had
improved and, if so, whether those gains differed across the three
groups.
3. An EFL teacher in Shanghai wanted to determine the effect of translating
new vocabulary items into Chinese for the students (instead of just
talking about their meanings in English). She gave vocabulary quizzes
every week and tried the translation approach to vocabulary teaching
every other week. At the end of the semester, she compared the students'
quiz averages for the weeks following the translation approach with
those from the weeks in which only English explanations were used.

Scenario 7: Post-Test Only Control Group Design


We are now moving into a different class of designs: the true experimental designs.
These are characterized by (1) random selection of subjects from the population,
(2) random assignment of subjects to groups, and (3) the presence of an
actual control group.
Let's return to the situation about the TOEFL preparation class. Please
imagine once more that you are an EFL program administrator who wants to
determine whether taking the popular TOEFL preparation course in your program
helps the students. You conduct a study in which the twenty students in the
course are tested on the TOEFL at the end of the semester. You will compare
their TOEFL scores to those of twenty similar students who are tested at the
same time but who are not studying for the TOEFL.
In order to make sure the two groups of students are as similar as possible,
you randomly draw forty names from the list of over a hundred students who
wish to take the TOEFL preparation course. Then you randomly assign twenty
of those names to one group and twenty to another group. Finally, you flip a coin
to see which group will enroll in the TOEFL preparation class this term and
which group will wait until next term. This second group is placed in a grammar
review course, which meets at the same time and for the same number of hours
as the TOEFL preparation course.
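
The selection and assignment steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of 120 student names is hypothetical, as are the function name and the fixed seed (used so that the example is repeatable).

```python
import random

def select_and_assign(waiting_list, sample_size=40, seed=None):
    """Randomly select a sample, split it into two equal groups,
    and use a simulated coin flip to decide which group takes the course."""
    rng = random.Random(seed)
    # Random selection: draw sample_size names without replacement.
    sample = rng.sample(waiting_list, sample_size)
    # Random assignment: the sample is already in random order, so split it in half.
    half = sample_size // 2
    group_a, group_b = sample[:half], sample[half:]
    # Coin flip: decide which group enrolls in the TOEFL course this term.
    if rng.random() < 0.5:
        return group_a, group_b
    return group_b, group_a

# Hypothetical waiting list of 120 students (the text says "over a hundred").
students = [f"Student {i:03d}" for i in range(1, 121)]
toefl_group, grammar_group = select_and_assign(students, seed=7)
print(len(toefl_group), len(grammar_group))
```

Because every name has the same chance of being drawn and of landing in either group, differences between the groups before the course begins should be due to chance alone.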

ACTION

Answer the following questions about this study.


1. What is the research question or hypothesis for this study?
2. What is the independent variable and how many levels does it have?
3. What is the dependent variable?
4. Name a problem inherent in this study.
5. What is one thing that could be done to improve this study?
Compare your ideas with those of a classmate or colleague.

This research design is called the post-test only control group design. It is one of
the true experimental designs because of the presence of a control group and the
random selection and random assignment of subjects.

Scenario 8: Pre-Test Post-Test Control Group Design


You have probably already realized from the name of this design what additional
change could be made. If we simply add a pre-test to the study in Scenario 7, we
will have a pre-test post-test control group design. The independent variable
and its levels remain the same.

REFLECTION

Now what is the dependent variable? What is one possible threat to validity
inherent in this design?

The pre-test post-test control group design is one of the true experimental
designs because of the presence of a control group and the random selection and
random assignment of subjects. In one sense, it is stronger than the post-test
only control group design because you can measure improvement (through the
gain scores). However, it is also susceptible to the testing threat.

COMPARISON OF MAIN RESEARCH DESIGNS
IN THE EXPERIMENTAL METHOD

So far in this chapter, we have studied eight research designs. In Chapter 3, we
discussed the criterion groups design and the correlation design, which are the
two members of the ex post facto class. The four main classes of designs and the
various specific designs that comprise them are depicted in Table 4.1.
If you start with the upper left box in Table 4.1 and draw a large Z through
the four boxes, you will have a way of remembering the increasing power of
these designs. That is, the pre-experimental designs are the weakest, followed
by the quasi-experimental designs and then the ex post facto designs. The true
experimental designs are the strongest. Within the culture of experimental
research, this relative strength is a function of increasing control over variables.
The stronger designs are those with the greatest internal and external validity.
Three designs are marked with asterisks in Table 4.1. The addition of one or
more moderator variables in these designs makes them factorial. For example, a
factorial post-test only control group design is one in which there are one or more
moderator variables in the study. Theoretically, it would also be possible to add
a moderator variable to an intact groups design or a nonequivalent comparison
groups design, but this is rarely done because adding a moderator variable adds
complexity to the statistical analysis. Since these designs are relatively weak, it's
hardly worth the extra effort to add a moderator variable.

TABLE 4.1 Major designs and classes of research designs in
experimental research (after Shavelson, 1981, pp. 30-44)

Pre-Experimental Class                     Quasi-Experimental Class
1. One-Shot Case Study                     1. Nonequivalent Control (or Comparison) Groups Design
2. One-Group Pre-Test Post-Test Design     2. Time Series Design
3. Intact Groups Design                    3. Equivalent Time Samples Design

Ex Post Facto Class                        True Experimental Class
1. Criterion Groups Design*                1. Post-Test Only Control Group Design*
2. Correlation Design                      2. Pre-Test Post-Test Control Group Design*

Another way to contrast these designs is to list their defining characteristics.
Table 4.2 does so by answering the following yes/no questions:
Column 1: Does the design involve more than one group of subjects?
Column 2: Does the design involve administering a treatment?
Column 3: Does the design involve a randomly selected, randomly assigned
control group to compare with the experimental group(s)?
Column 4: Does the design involve some other kind of comparison group?
Column 5: Is a pre-test given?
Column 6: Is random selection used to constitute the sample?
Column 7: Is random assignment used to constitute the groups?

TABLE 4.2 Comparison of ten experimental research designs

Design                              1     2     3     4     5     6     7
One-shot case study                 No    Yes   No    No    No    No    No
One-group pre-test post-test        No    Yes   No    No    Yes   No    No
Intact groups                       Yes   Yes   No    Yes   No    No    No
Nonequivalent comparison groups     Yes   Yes   No    Yes   Yes   No    No
Time series                         No    Yes   No    No    Yes   No    No
Equivalent time samples             No    Yes   No    No    Yes   No    No
Criterion groups                    Yes   No    No    Yes   No    Yes   No
Correlation                         No    No    No    No    No    Yes   No
Post-test only control group        Yes   Yes   Yes   No    No    Yes   Yes
Pre-test post-test control group    Yes   Yes   Yes   No    Yes   Yes   Yes

There are some caveats to remember when interpreting this table. First, it is
important to distinguish between the presence of an actual control group
(Column 3) and some other sort of comparison group (Column 4). Secondly, in
correlation studies, the statistics used to perform the correlation analyses are always
based on two (or more) sets of data from one group of people. However, some
studies include different sets of correlation statistics if correlations are sought in
more than one group. (We will deal more with the statistics used in correlation
designs in Chapter 13.)

ACTION

With a classmate or colleague, talk through the yes/no responses in
Table 4.2. Make sure you understand how these seven defining characteristics
can help you identify these research designs in the experimental
method.
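
One way to talk through Table 4.2 is to treat each row as a seven-letter profile and look designs up by their answers. The sketch below simply transcribes the table; the dictionary and function names are our own illustrative choices, not part of any established tool.

```python
# Answers to the seven yes/no questions (Columns 1-7) in Table 4.2.
# "Y" = yes, "N" = no. Names here are illustrative only.
DESIGN_PROFILES = {
    "One-shot case study": "NYNNNNN",
    "One-group pre-test post-test": "NYNNYNN",
    "Intact groups": "YYNYNNN",
    "Nonequivalent comparison groups": "YYNYYNN",
    "Time series": "NYNNYNN",
    "Equivalent time samples": "NYNNYNN",
    "Criterion groups": "YNNYNYN",
    "Correlation": "NNNNNYN",
    "Post-test only control group": "YYYNNYY",
    "Pre-test post-test control group": "YYYNYYY",
}

def designs_matching(answers):
    """Return every design whose seven answers equal the given string."""
    return [name for name, profile in DESIGN_PROFILES.items() if profile == answers]

print(designs_matching("YYYNYYY"))  # a single true experimental design
print(designs_matching("NYNNYNN"))  # three designs share this profile
```

Notice that three designs share one profile in the table, so the seven questions narrow the possibilities but do not always single out one design.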

RESEARCH DESIGN ISSUES AND INFERENTIAL
STATISTICS

To review these concepts, let's consider a situation in which an experiment might
be an appropriate way of gathering data. Imagine that you have developed some
innovative listening materials based on authentic radio and television programs.
You have used these materials with good results in your secondary school EFL
classes. Although you feel strongly that your innovative materials are superior to
the school's traditional listening program, your colleagues are skeptical. Your
challenge is to test the possible superiority of your materials. You have several
options here. You could obtain the opinions of the students through surveys and
questionnaires. Alternatively (or additionally), you could ask a sympathetic
colleague to observe your classes and make an observational record of the teaching
and learning that goes on. However, you may feel that these steps are unlikely to
sway your more skeptical colleagues, who are only likely to be convinced by
superior test scores.
Your first thought is to test your students' listening comprehension at the end
of the semester, and, assuming that the results are favorable, present these findings
to your colleagues. However, you come across the following criticism of such an
approach (which you now recognize as the one-shot case study design):

Much research in education today conforms to a design in which a single
group is studied only once, subsequent to some agent or treatment
presented to cause change. Such studies might be diagrammed as follows:
X O
[X = the treatment administered to the subjects, and O = the observation.]
[Unfortunately] ... such studies have such a total absence of control as to
be of almost no scientific value ... It seems well-nigh unethical ... to
allow as theses or dissertations in education, case studies of this nature
(i.e., involving a single group observed at one time only). (Campbell and
Stanley, 1963, pp. 176-177)

If you are convinced by this argument, your next inclination might be to test
two of your classes: one taught with the innovative materials and one taught with
the traditional approach. (Doing so would involve using the intact groups design.)
However, you quickly realize that it is no good simply testing the students at
the end of the semester and comparing their scores because the groups might not
have been at the same level to begin with. The solution would seem to be to test
both groups at the beginning and at the end of the semester. Then, if the group
that has been taught through the innovative materials makes greater gains in
listening comprehension than the traditional group, you can presumably ascribe
the superior results to the innovation. (Here you will recognize the
nonequivalent comparison groups design.)
Your research design is becoming more rigorous, but it is not yet rigorous
enough to allow you to make claims that there is a causal relationship between
the independent variable (the innovative materials) and the dependent variable
(the students' listening comprehension test results). There is always the possibility
that factors other than the innovative materials are responsible for any observed
differences in the scores.

REFLECTION

Make a list of possible issues that might be responsible for differences in
the groups' scores in the study described above. Do these issues affect the
internal validity or the external validity of the study?

There are many possible influences that can affect the outcome of a study
such as this one. If different teachers are involved in teaching the different
groups, then it could be the teachers rather than the materials that make a
difference. If one teacher works with both groups, you will have controlled for teacher
style as a factor, but the teacher's enthusiasm for (or boredom with) one type of
materials could influence the results. Even the time of day at which a class is held
can affect learning outcomes. These issues weaken the internal validity of the
study because it is not possible to state categorically that the treatment brought
about any differences observed in the students' test scores.
While factors such as those mentioned above may impinge on research
outcomes, participant factors (such as the selection threat) are the most pervasive. For
example, you may have happened to select a group of fast-track or high-aptitude
students as the recipients of the experimental authentic materials, and a group of
slow learners that used the traditional materials. In order to guard against the
possibility that factors such as age, motivation, or aptitude might influence the
research outcomes, sound experimental design in the psychometric tradition
suggests that you assign subjects randomly to the control and experimental conditions.
Using randomization puts you in a better position to argue that any
observed differences on the end-of-course test are due to the innovative materials
because possible confounding variables that might have had an effect (such as
intelligence and aptitude) are presumably evenly distributed in the experimental
and control groups. You can also test both groups of students before the
experiment just to make sure that the groups really are the same, though this step
introduces the possibility of the testing threat. (Doing so would entail the use of
the pre-test post-test control group design.)
Unfortunately, in ongoing programs, it is not always practical to rearrange
students and randomly assign them into different groups or classes. In many
schools, if an experiment is to be conducted, it will have to be with classes to
which students have been preassigned. That is why the intact groups design and
the nonequivalent comparison groups design are so often used in language classroom
research: they involve groups that were already established (by
means other than randomization) without the researcher's control. In these
circumstances, while the internal validity of the experiment is weakened, the
study may still be worthwhile.

REFLECTION

Imagine that you are able to carry out the experiment described above. You
randomly assign ten final-year secondary school students to the control
group and ten to the experimental group. A 100-point listening pre-test
indicates that the groups are at the same level of proficiency. You teach both
groups for a semester, using the innovative authentic materials with the
experimental group and the traditional materials with the control group.
At the end of the semester, the groups are retested with an alternate form
of the 100-point listening test. You calculate the averages:
                      Control Group    Experimental Group
Post-test average:         80                  85
The experimental group has thus outscored the control group.
Are you entitled to claim that the innovative materials are superior to the
traditional materials? If so, why? If not, why not?

The answer to the question is "Not yet!" You have selected a sample, or
subset, of all the possible students in the final year of secondary school as your
experimental subjects. If you retested them again tomorrow, or if you selected a
different group of subjects and tested them, it's highly unlikely that you would
get exactly the same scores. The students might be more tired (or more
energetic), or the weather could affect their performance. In short, a whole range of
factors could be responsible for test score variation. What you need to decide is
whether the variation in scores between the control and experimental groups
might have happened by chance, or whether the differences were a result of the
experimental treatment. In order to do this, you must make inferences based on
statistical procedures.

FROM SAMPLES TO POPULATIONS: THE LOGIC OF
INFERENTIAL STATISTICS

The aim of this section is to introduce you to the logic of statistical inference.
The information presented here will probably not equip you to carry out your
own statistical analyses, but it should help you to understand and appreciate the
logic behind the statistical procedures that enable researchers to make claims
about an entire population based on a sample or subset of subjects from that
population.
In most research, it is not possible to collect data from the entire population
in which we are interested. Consider your investigation of the authentic
materials in secondary school EFL classes. Although not impossible, it would be
extremely time-consuming and cumbersome to obtain data on all the secondary
school EFL students in your country, state, province, county, or prefecture. (In
fact, that's what national examination boards and international testing companies
try to do.) Normally, someone who wanted to carry out such an investigation
would select a sample of students (say 20, 100, or 2,000 or more) from the wider
population and test them.
However, a problem immediately arises. The problem has to do with
deciding the extent to which the data obtained from the sample are representative of
the population as a whole and, in fact, what that population is. In everyday life,
overgeneralizations are common (witness the prevalence of "dumb blond"
jokes). But overgeneralizations also occur in research when investigators fail to
recognize that their subjects are not fully representative of the population being
investigated.
Here is an anecdote from Rowntree (1981) that illustrates the complicated
issue of matching samples and populations:
During the Second World War, gunners in bombers returning from
raids were asked from which direction they were most frequently
attacked by enemy fighters. The majority answer was "from above and
behind." (p. 23)

REFLECTION

Why might it have been unwise to assume, supposing you were a gunner,
that this claim would be true of attacks on gunners in general?

Rowntree provides the following answer to the question:


The risk of a false generalization lay in the fact that the researcher was
able to interview only the survivors of attacks. It could well be that
attacks from below and behind were no less frequent, but did not get
represented in the sample because (from the enemy's point of view) they
were successful. (ibid.)

He goes on to say that this issue of matching samples and populations is a
paradox of sampling:
A sample is misleading unless it is representative of the population; but
how can we tell it is representative unless we already know what we need
to know about the population and therefore have no need of samples!
The paradox cannot be completely resolved; some uncertainty must
remain. Nevertheless, our statistical methodology enables us to collect
samples that are likely to be as representative as possible. This allows us
to exercise proper caution and avoid over-generalization. (ibid.)

To help protect against overgeneralization, researchers use several procedures
based on descriptive and inferential statistics. (Explaining all these procedures is
beyond the scope of this book, but we will deal with some in Chapter 13.)

TABLE 4.3 EFL students' scores on a listening comprehension test
(n = 20)

Control Group               Experimental Group
Student ID    Scores        Student ID    Scores
C-1           80            E-1           85
C-2           82            E-2           87
C-3           78            E-3           83
C-4           77            E-4           82
C-5           83            E-5           88
C-6           80            E-6           85
C-7           76            E-7           81
C-8           84            E-8           89
C-9           75            E-9           84
C-10          85            E-10          86
Mean          80                          85

Descriptive Statistics
To understand the logic behind the procedures that enable extrapolation from
samples to populations, you need to be familiar with a number of statistical
concepts. The two most important of these are the mean and standard deviation.
These are two of the descriptive statistics, so labeled because they describe the
sample.
For experimental researchers, two particularly interesting features of
numerical data sets are the extent to which individual items in the data set are
similar and the extent to which they differ or are dispersed. The most important
measure of similarity is the numerical average, or mean (symbolized by a capital
X with a horizontal bar above it and called X-bar). The average is obtained by
adding the individual scores together and dividing the sum by the total number
of scores. To illustrate, let's look at the post-test scores from the students in the
control and experimental groups in the (hypothetical) study about innovative
listening materials.
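
As a quick check on the means reported in Table 4.3, the "add the scores and divide by the number of scores" rule can be written out directly. The variable and function names below are our own; the scores are those in the table.

```python
# Post-test scores from Table 4.3 (hypothetical study data).
control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

def mean(scores):
    """Add the individual scores and divide by the number of scores."""
    return sum(scores) / len(scores)

control_mean = mean(control)
experimental_mean = mean(experimental)
print(control_mean, experimental_mean)  # 80.0 85.0
```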
The scores presented in Table 4.3 are simply listed in order of the students'
identification codes. They are not organized in any particular fashion. It can be
quite useful to rank order these scores, in order to see existing patterns more
clearly. When we rank order these scores, we find the data in Table 4.4.

TABLE 4.4 Ranking of EFL students' scores on a listening
comprehension test (n = 20)

Ranking of Control    Control Group    Ranking of Experimental    Experimental Group
Group Scores          Scores           Group Scores               Scores
1                     85               1                          89
2                     84               2                          88
3                     83               3                          87
4                     82               4                          86
5.5                   80               5.5                        85
5.5                   80               5.5                        85
7                     78               7                          84
8                     77               8                          83
9                     76               9                          82
10                    75               10                         81
Average               80                                          85

The lowest score in this entire data set is 75 and the highest score is 89. The
difference between the highest score and the lowest score is the range. This is
one of the descriptive statistics. (Ranking the scores in this way is useful because
it allows us to see the range quickly.) The range in the control group was 75 to
85 and the range in the experimental group was 81 to 89. So, just by looking at
the difference in the ranges, we can see that the experimental group did better.
However, range is also reported in terms of its absolute value. That is, we
sometimes subtract the lowest score from the highest score. Using this procedure, we
can say that the range for the control group is ten, and the range for the
experimental group is eight. From these two values alone, we cannot tell which group
did better; we can only tell that there was more variability in the control group's
scores than in the experimental group's scores.
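
Both ways of reporting the range can be expressed in a line or two of code. The following sketch uses the Table 4.3 scores; the function names are illustrative only.

```python
# Post-test scores from Table 4.3.
control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

def range_endpoints(scores):
    """Report the range as (lowest score, highest score)."""
    return min(scores), max(scores)

def range_value(scores):
    """Report the range as highest score minus lowest score."""
    return max(scores) - min(scores)

print(range_endpoints(control), range_value(control))            # (75, 85) 10
print(range_endpoints(experimental), range_value(experimental))  # (81, 89) 8
```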

REFLECTION

In Table 4.4, in the columns that provide the ranking of the two groups'
scores, there are two places where the rank is given as 5.5—one for the
control group and one for the experimental group. Why do you think these
ranks are given as 5.5 instead of 5 and 6?

The answer to the question in the reflection box above has to do with
principles of decision making. Let's use the prize money in a golf tournament as a
metaphor. One golfer is the first place winner, and he receives a check for
$100,000. Two golfers are tied for second place. The prize money for second
place is $50,000, and the prize money for third place is $30,000. What is a fair
way to decide which golfer was second and which was third, since their scores
were tied? The answer is to add the prize money for second place and third place
and divide that total amount by two (for the two tied golfers). So, instead of
flipping a coin to decide who gets $50,000 and who gets $30,000, we add these two
sums and get $80,000. We then divide that amount by two and the two second
place contestants each receive $40,000.
The same logic applies in a set of ordinal data when you have tied ranks.
Look at the ranks that the scores would have covered, had they not been tied. In
Table 4.4, these are the fifth and sixth ranks. Since the tied scores are identical,
instead of calling one "fifth" and the other "sixth" they are both assigned the
rank of 5.5—halfway between the fifth and sixth ranks.
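
The tied-rank rule can be sketched as a short function: sort the scores from highest to lowest, note the positions each score value occupies, and give every tied score the average of those positions. The function name is our own, and this is just one simple way to carry out the procedure.

```python
def rank_with_ties(scores):
    """Rank scores from highest (rank 1) to lowest; tied scores share
    the average of the ranks they would otherwise have occupied."""
    ordered = sorted(scores, reverse=True)
    shared_rank = {}
    for score in set(ordered):
        # Positions (1-based) that this score value occupies in the ordering.
        positions = [i + 1 for i, s in enumerate(ordered) if s == score]
        shared_rank[score] = sum(positions) / len(positions)
    return [(shared_rank[s], s) for s in ordered]

control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
for rank, score in rank_with_ties(control):
    print(rank, score)
```

For the control group, the two scores of 80 occupy the fifth and sixth positions, so each receives the rank (5 + 6) / 2 = 5.5, as in Table 4.4.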

Frequency Polygons
Another way to look at these test data is to see how many students obtained each
particular score. Let's start with the scores of all twenty students combined. We
can list the score values across the bottom axis of a chart and the number of
people who obtained each particular score on the vertical axis. The term frequency
here refers to how often each score was obtained. In Figure 4.4, each asterisk
shows how many people out of the twenty subjects in the study received each
possible score value.
If you were to draw bars down from each of these asterisks to the horizontal
axis, you would have a bar graph, or histogram. If you were to draw a line
connecting these asterisks, you would have what is called a frequency polygon, a chart
of the frequency with which each score value was obtained. Notice that no one
got a score of 79. At this point the line connecting the asterisks would drop down
to the horizontal axis, to indicate that no one received that score.
The frequency polygon is a very important conceptual tool as well as an
informative visual aid. In fact, the frequency polygon is the basis for many of
the most important statistical procedures that we use in language classroom
research.
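
The counting that lies behind a frequency polygon can be sketched as follows, using the twenty scores from Table 4.3. The text chart printed at the end is only a rough stand-in for a drawn figure, and the variable names are our own.

```python
from collections import Counter

# All twenty post-test scores from Table 4.3, control and experimental combined.
all_scores = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85,
              85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

counts = Counter(all_scores)
# Include every score value from 75 to 90, with a frequency of zero
# for any value that no student obtained.
frequencies = {score: counts.get(score, 0) for score in range(75, 91)}

for score, freq in frequencies.items():
    print(f"{score}: {'*' * freq}")
```

In this data set the score of 85 was obtained three times, and the scores of 79 and 90 were not obtained at all.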

[Figure: frequency chart. Horizontal axis: Score (75 to 90). Vertical axis: number of students.]

FIGURE 4.4 Frequency of 20 EFL students' scores on a listening
comprehension test

ACTION

In Figure 4.4, we combined the scores of the control and experimental
group, but you can also draw a frequency polygon that contrasts two or
more sets of data. Using the chart framework below, plot the scores for the
control and experimental groups separately. Remember that for any score
that did not appear in the data, the frequency is zero.

[Chart framework: horizontal axis labeled Score, marked from 75 to 90; vertical axis marked for frequencies up to 3.]

With very small samples like this, frequency polygons can be nearly flat. But
an interesting thing happens when large data sets are plotted on a frequency
polygon. Imagine for a moment that ninety-five students actually took this
100-point test. If we had that many scores to enter in the frequency polygon, it
might look something like this:

[Figure: frequency chart. Horizontal axis: Score (75 to 90).]

FIGURE 4.5 Frequency of EFL students' scores on a listening
comprehension test (n = 95)
If you connect the asterisks in Figure 4.5, you can see the rough shape of a
bell emerging. In fact, a frequency polygon with this shape is called a bell curve or
a bell-shaped curve because it is typically high in the middle with sloping sides
tapering off to tails on the left and right. It is also called the normal distribution
because it is such a common pattern when variables are measured in large groups
of people. That is to say, the characteristic being measured (whether it is height,
intelligence, language aptitude, etc.) is distributed normally throughout the
population. Relatively few people score very low on the measurement and relatively
few people score very high. Most people score somewhere in the middle of the
range.
Keep in mind that "we never get a completelynormal data distribution. The
normal distribution is an idealized concept" (Hatch and Lazaraton, 1991,
p. 194). But with large samples, the pattern does appear, and this fact leads to
some interesting opportunities for analyzing data. In very large data sets, the
normal distribution is predictable. The shape of the bell is smooth and regular,
and its sides are very symmetrical, like this:

FIGURE 4.6 Standard normal distribution (downloaded from
http://www.tushar-mehta.com on August 14, 2006)
The vertical line drawn straight down from the apex of the bell in Figure 4.6
represents the midpoint or median, defined as the middle score in a data set. In
the normal distribution, it also represents the mean (the average) and the
mode (the most frequently obtained score in that data set). That the line
represents the median makes sense because the bell is symmetrical. The fact that it
represents the mode also makes sense because the bell's apex (the high point)
shows the most frequent score. These three descriptive statistics are collectively
referred to as the measures of central tendency because they all provide information
about the tendency of scores to cluster in the middle range of the curve.
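
To see the three measures of central tendency side by side, the sketch below applies Python's standard `statistics` module to the twenty combined scores from Table 4.3. In a small sample like this one the three values need not coincide, as they would in an idealized normal distribution.

```python
import statistics

# All twenty post-test scores from Table 4.3, control and experimental combined.
all_scores = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85,
              85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

mean_score = statistics.mean(all_scores)      # the arithmetic average
median_score = statistics.median(all_scores)  # the middle score
mode_score = statistics.mode(all_scores)      # the most frequent score

print(mean_score, median_score, mode_score)
```

Here the mean is 82.5, the median is 83, and the mode is 85: close to one another, but not identical.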
The three other descriptive statistics are the range and standard deviation
discussed above, and also variance. (We will discuss variance in Chapter 13.)
Together these make up the measures of dispersion because they provide information
about the variability in a set of scores, that is, how dispersed the scores in the data set
are. The standard deviation is the most important measure of dispersion. It tells
us the average amount by which scores in the data set vary from the mean. In
other words, it tells us how spread out the scores are. In Figure 4.6, the numbers
running along the horizontal axis (from negative four to four) represent standard
deviations.

REFLECTION

Why is the number zero directly under the line that represents the mean,
the median, and the mode in Figure 4.6? (This issue is a matter of logic and
definition rather than of mathematics.)

Calculating the standard deviations for the scores in Table 4.3 gives us the
values reported in the last row of Table 4.5. (We won't go into the formula for
calculating the standard deviation here. We will work with it in Chapter 13.)
We said earlier that the ranges for the scores of the control and experimental
groups were ten and eight, respectively. Can you see how the range is
reflected in the standard deviations reported in Table 4.5? Since standard deviation
is an index of how spread out the scores are in a given data set, it makes sense that
where there is a wider range there will be a bigger standard deviation.

TABLE 4.5 Scores, means, and standard deviations of two groups of
EFL students on a listening comprehension test (n = 20)

Control Group               Experimental Group
Student ID    Scores        Student ID    Scores
C-1           80            E-1           85
C-2           82            E-2           87
C-3           78            E-3           83
C-4           77            E-4           82
C-5           83            E-5           88
C-6           80            E-6           85
C-7           76            E-7           81
C-8           84            E-8           89
C-9           75            E-9           84
C-10          85            E-10          86
Mean          80                          85
Range         10                          8
              (75-85)                     (81-89)
Standard      3.25                        2.58
deviation
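
For readers who would like to see a standard deviation computed, here is a minimal sketch using Python's standard `statistics` module. It assumes the common sample formula (dividing by n - 1); this chapter does not give the formula, so that choice is our assumption. Under it, the experimental group's value rounds to the 2.58 shown in Table 4.5, while the control group's value comes out at about 3.46 rather than 3.25, so the table may rest on a different formula or rounding.

```python
import statistics

# Post-test scores from Table 4.5.
control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

# statistics.stdev uses the sample formula (n - 1 in the denominator).
control_sd = statistics.stdev(control)
experimental_sd = statistics.stdev(experimental)

print(round(control_sd, 2), round(experimental_sd, 2))
```

Whichever formula is used, the control group, with the wider range, has the larger standard deviation.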

REFLECTION

To put the concept of standard deviation in a practical context, think about
the situation where you are about to start teaching three different classes of
intermediate English students. For all three classes, the mean score on the
program's 100-point placement exam was 70 points. But the standard
deviations for the three classes were 15, 10, and 5 points. What do these values
tell you about the composition of the three classes? As a teacher, what can
you expect as a result of this variability?

Inferential Statistics

At this point, we want to remind you of some things that you already know about
percentages. We want to build on your existing knowledge and confidence to
introduce a new concept.

ACTION

Look at the pie charts below and decide what percentage of the area of
each circle is indicated by the various sections.

Circle 1 Circle 2 Circle 3

We are quite sure that you recognize the percentages represented by these
divisions. In Circle 1, the sections of the bisected pie chart represent 50% and
50%. In Circle 2, the pie chart is divided into four equal portions. Each segment
represents 25% of the whole circle. And in Circle 3, we have three equal
sections, each of which represents 33⅓% of the whole circle. You recognize these
percentage values because you have seen charts like this ever since you started
school.
We can use the image of the normal distribution to indicate percentages,
too. As noted above, when the bell curve is bisected by the vertical line
representing the mean, the median, and the mode, 50% of the scores fall above that
line and 50% fall below it. (Remember the frequency polygons where you
connected the asterisks. A bell curve is just a symmetrical frequency polygon.)

[Figure: normal curve divided by vertical lines at −3σ, −2σ, −1σ, μ, 1σ, 2σ, and 3σ.]

FIGURE 4.7 The areas under the normal curve, as indicated by
standard deviations (downloaded on August 13, 2006
from Wikipedia)

The image of the bell curve can also be divided in predictable, recognizable
ways.
When you see a chart of the normal distribution, it usually has numbers
written on the horizontal axis. These numbers range from negative four to
positive four, as mentioned above, to indicate the location of the standard deviations
in the diagram. But, for ease of interpretation, such diagrams also often have
vertical lines drawn that represent the standard deviations. These lines help us
see percentage divisions, as shown in Figure 4.7. (Just think of this as a different
shaped pie chart, one with which you may not be very familiar at this time.)
Here and elsewhere the Greek symbol σ (the lower-case sigma) stands for
standard deviation. The symbol μ (the Greek letter mu) represents the
population mean.

REFLECTION

Look at Figure 4.7. Can you see why we said that 68.2% of the scores fall
within one standard deviation of the mean? (Another way to say this is that
68.2% of the scores fall in the area between one standard deviation below
the mean and one standard deviation above the mean.)

ACTION

Look up normal distribution in a statistics text or on the Internet. There
will probably be more examples and more detail than we have been able to
provide.
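
If you would like to check the percentage divisions of the normal curve for yourself, the following sketch uses the `NormalDist` class from Python's standard `statistics` module (available in Python 3.8 and later). The helper function name is our own.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The first value is approximately 68.3%, which charts such as Figure 4.7 typically show as two 34.1% sections (68.2%) because each half is rounded separately. Roughly 95% of scores fall within two standard deviations of the mean and more than 99% within three.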

Means and standard deviations are important when it comes to comparing
different sets of interval data, such as test scores from a control group and an
experimental group. We can also think about the standard deviation and the mean
of a population. Imagine you are reviewing reading test scores for nine-year-olds
in 100 different primary schools. Those scores are likely to be reported as
schoolwide averages because looking at individual scores for all the nine-year-old
pupils at 100 schools would be a daunting task. Even looking at the individual
class means for every school would be very time-consuming, since there
could be five to ten classes of nine-year-olds at every school. Seeing the scores
reported as schoolwide averages is economical and it also allows us to make
comparisons across the various schools.
If we were to draw a frequency polygon using the 100 schools' reading averages
as the data to be entered in the polygon, we would once again get a bell-shaped
curve. This time the data points in the frequency polygon would not represent
individual students' scores but rather the mean reading scores of the 100 schools.
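The idea of plotting school averages rather than individual scores can be tried out with a small simulation. The sketch below is only an illustration under stated assumptions: the number of pupils per school (200), the population mean (70), and the population standard deviation (15) are invented for the example, and every simulated school is drawn from the same population, which real schools would not be.

```python
import random
import statistics
from collections import Counter

random.seed(42)  # fixed seed so the sketch gives the same numbers each run

N_SCHOOLS = 100          # the 100 schools mentioned in the text
PUPILS_PER_SCHOOL = 200  # assumption for illustration only
POP_MEAN = 70.0          # assumed population mean reading score
POP_SD = 15.0            # assumed population standard deviation


def school_mean(n_pupils):
    """Simulate one school's pupils and return their average reading score."""
    scores = [random.gauss(POP_MEAN, POP_SD) for _ in range(n_pupils)]
    return statistics.mean(scores)


school_means = [school_mean(PUPILS_PER_SCHOOL) for _ in range(N_SCHOOLS)]
mean_of_means = statistics.mean(school_means)
sd_of_means = statistics.stdev(school_means)

print(f"mean of the {N_SCHOOLS} school averages: {mean_of_means:.1f}")
print(f"standard deviation of the school averages: {sd_of_means:.2f}")

# Crude text version of the frequency polygon: one row per whole-number score.
counts = Counter(round(m) for m in school_means)
for score in sorted(counts):
    print(f"{score:>3} | {'*' * counts[score]}")
```

In a run of this sketch the school averages cluster tightly around the population mean, and their standard deviation is far smaller than the spread of individual pupils' scores. That is one reason schoolwide averages are easier to compare than raw scores.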
We have already discussed the concept of a population. Statistically, a
population is defined in terms of means and standard deviations. If the means and
standard deviations for two sets of test scores are quite similar, then the
subjects can be said to be drawn from the same population. If they are very
different, then they are drawn from different populations. The key question here is,
"How different do they have to be for us to be confident in claiming that they
come from different populations?" The phrase statistically significant has to do
with this question of how different a set of scores must be (or how strong a
correlation must be) in order to consider the difference (or the correlation)
important and trustworthy.
Over time, researchers using the experimental method have agreed that
results can be considered statistically significant if there is a less than 5% chance
that they are wrong (that is, that the results are due to chance rather than to an
actual relationship between the variables being investigated). (There are some
situations where more stringency is required and the level is set at a 1% chance
instead.) Another way to say this is that researchers generally want to have 95%
(or 99%) confidence that their results are trustworthy before they reject the null
hypothesis and accept the alternative hypothesis. So, when you read that a
difference or a correlation was statistically significant, it means that the values
obtained met the statistical requirements for having confidence that the results
were not due to chance.

REFLECTION

The mean, median, mode, standard deviation, range, and variance are the
descriptive statistics. What do you think is meant by inferential statistics?

Let's return to our hypothetical investigation into the relative effectiveness
of the innovative versus traditional listening materials. On the
end-of-the-semester test, we discovered that the mean score for the traditional (control)
group was 80, and the mean score for the innovative (experimental) group was
85. The control group continues to represent the population from which it was
drawn. What we want to know now is whether, through our experimental
treatment, we have 'created' a different population, roughly defined as "listeners
taught through innovative materials based on authentic input." In order to
answer this question, we need to know not only the mean for each group, but also
the standard deviation. The reason is that the further an individual score is from
the group mean, the less likely it is to occur by chance, and statistical tools can
tell us fairly precisely how likely or unlikely this is.
How likely? Here's where the standard deviation comes in. Please take our
word for this temporarily (or read ahead in Chapter 13). As shown in Figure 4.7,
for any given set of normally distributed scores, 68% of all scores will be within
one standard deviation of the mean, 95% of scores will be within two standard
deviations, and 99% will be within three standard deviations.
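If you would like to check these rounded percentages yourself, the short sketch below computes the proportion of a normal distribution that lies within one, two, and three standard deviations of the mean. It uses the standard error function from Python's math module. The function name share_within is our own, chosen for this illustration.

```python
import math


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k) * 100:.1f}%")
```

The computed values are slightly more precise than the rounded figures quoted in the text, but they tell the same story: nearly all scores in a normal distribution lie within three standard deviations of the mean.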
When there is a large difference between the means for the control and
experimental groups, we can say that the difference is statistically significant. How
large the difference must be is determined by the standard deviations in the
bell-shaped curve. For instance, if the mean for the control group falls between two and
three standard deviations below the mean for the experimental group, we can say
that there is less than a 5% possibility that this difference would occur by chance.
If our significance level is set at 5%, then we can say that the difference is
statistically significant. Therefore, we can conclude that the innovative materials
were significantly superior to the traditional materials in enhancing listening
skills. Keep in mind, however, that there is still a 5% possibility that the
difference could have occurred by chance.
Given that the control group remains 'untreated,' it represents the
population of interest. The experimental group represents a changed version of that
original population: a group that benefited (we hope!) from the treatment. And
this is what is meant by inferential statistics: We use the data from the sample to
make inferences about what would happen to the population if the treatment
were implemented there (i.e., if it were generalized). We do that, quite often, by
comparing the means and standard deviations of the sample groups to one
another, using particular statistical formulae.
Now let's go back to our data in which the control group mean was 80 and
the experimental group mean was 85. We can use an inferential statistic called a
t-test (the t here always being printed in lower case) to see if these differences are
statistically significant. The t-test is specially designed for comparing two means
of small groups where the means are based on interval data. (We will learn more
about t-tests and other inferential statistics in Chapter 13.) And in fact, when we
conduct the t-test on these data, we find that there was a statistically significant
difference between the control and experimental groups. Using the innovative
listening materials did improve the students' scores on the listening
comprehension test. Now you can confidently show your skeptical colleagues your results!
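To make the logic of the t-test a little more concrete, here is a minimal sketch of an independent-samples t-test computed from group means, standard deviations, and group sizes. Only the two means (80 and 85) come from our hypothetical study. The group sizes (20 each) and the standard deviations (6, and then 10) are assumptions invented for this illustration. The critical value of 2.024 is the usual tabled value for a two-tailed test at the 5% level with 38 degrees of freedom.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Tabled two-tailed critical value for alpha = .05 with 38 degrees of freedom.
CRITICAL_T = 2.024

# Assumed values: 20 learners per group, standard deviation of 6 in each group.
t_tight, df = pooled_t(85, 6.0, 20, 80, 6.0, 20)
tight_significant = abs(t_tight) > CRITICAL_T

# Same means and group sizes, but a wider assumed spread of scores (SD = 10).
t_wide, _ = pooled_t(85, 10.0, 20, 80, 10.0, 20)
wide_significant = abs(t_wide) > CRITICAL_T

print(f"SD = 6:  t({df}) = {t_tight:.2f}, significant: {tight_significant}")
print(f"SD = 10: t({df}) = {t_wide:.2f}, significant: {wide_significant}")
```

Under these assumptions, the same five-point difference between the means is statistically significant when the scores are tightly clustered but not when they are widely spread. This is why we need the standard deviations as well as the means.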
In this section, we have oversimplified things somewhat. (If you are really
curious about these issues, you can read ahead in Chapter 13.) Our purpose
here is simply to provide a very basic introduction to the logic behind statistical
inferences and to show you the basis upon which researchers working in the
experimental tradition make their claims for significance.
We will use these statistical concepts in future chapters to illustrate the ways
classroom researchers have analyzed their data. At this point, we will summarize
a study that illustrates many issues related to research design and threats to
validity.

A SAMPLE STUDY

Many years ago, Kathi Bailey was involved as an observer in a process-product
study, the results of which were never published. We will describe it here because
it is a beautiful example of a formal experiment in language teaching and because
there is much to be learned from the way the project evolved.
The research project was conducted at a large military school in the United
States, which had a program in Russian as a foreign language. The students were
young adults. The study was designed to determine whether the school's regular
method of teaching Russian was more effective, or whether Suggestopedia was
more effective.
Suggestopedia is a language teaching method that was originally developed
by Lozanov (1979; 1982). It uses music to relax the students, who are given new
target culture identities in the class. There is little or no formal error correction
and students learn by hearing and seeing lengthy dialogs in the target language.
The method develops speaking fluency, reading skills, and vocabulary.
Suggestopedia normally requires extensive teacher training, and there is a Suggestopedia
institute that provides that training.
The military school's usual teaching method, in contrast, emphasized
grammatical accuracy. There were frequent tests, and error treatment was regularly
used to help the students improve their grammar, pronunciation, word
knowledge, etc.
The researcher set up a classic textbook experiment. He was able to
randomly select a group of forty people from the incoming Russian students. All
forty were true beginners in Russian. He then randomly assigned these students
to two groups, each of which comprised two Russian classes. There were ten
students in each class. Two classes were taught with Suggestopedia and two with the
regular teaching method. (See Figure 4.8.) The two Suggestopedia classes were
taught by certified instructors who had been trained in the method. The other
classes were taught by regular employees at the school, all experienced Russian
teachers.
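The two steps of random selection and random assignment can be sketched in a few lines of code. This is an illustration only, not a record of how the researcher actually proceeded. The roster of 120 incoming students and the student labels are hypothetical, since we do not know how many students were in the incoming group.

```python
import random

rng = random.Random(2009)  # seed fixed only so the sketch is repeatable

# Hypothetical roster: the real number of incoming students is not known.
incoming = [f"student_{i:03d}" for i in range(1, 121)]

# Step 1: random selection of forty students from the incoming group.
sample = rng.sample(incoming, 40)

# Step 2: random assignment, done by shuffling and then cutting into four classes.
rng.shuffle(sample)
groups = {
    "control (regular method)": [sample[0:10], sample[10:20]],
    "experimental (Suggestopedia)": [sample[20:30], sample[30:40]],
}

for group_name, class_lists in groups.items():
    for number, members in enumerate(class_lists, start=1):
        print(f"{group_name}, class {number}: n = {len(members)}")
```

Because chance alone decides who lands in which class, any pre-existing differences among the students should be spread evenly across the two groups, at least in principle.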
The students were all true beginners of the Russian language, so no pre-test
was necessary. But the researcher did administer an attitude questionnaire to all
forty students before the classes began, so that he could compare their attitudes
before and after the Russian course. At the end of the course, all forty students
were given the attitude questionnaire again, and their Russian ability was also
tested.



Experimental Sample
    Control Group: Regular Teaching Method (n = 20)
        Class 1 (n = 10)
        Class 2 (n = 10)
    Experimental Group: Suggestopedia (n = 20)
        Class 3 (n = 10)
        Class 4 (n = 10)

FIGURE 4.8 The control and experimental groups in the sample study

ACTION

Answer the following questions based on what you know about this study.
1. What were the likely research questions in this study?
2. What is the independent variable and how many levels does it have?
3. What are the dependent variables? (Hint: There are two.)
4. What is one control variable?
5. What is the design of this study? (Hint: This is a trick question. Think
about the dependent variables.)
6. Can you anticipate any possible threats to validity in this situation?
Compare your ideas with those of a classmate or colleague.

The researcher clearly understood the differences among process studies,
product studies, and process-product studies. (See Chapter 1.) For this reason,
he arranged for trained classroom observers to be present (one at a time) in some
of the Suggestopedia classes and some of the regular classes. The observers were
to take notes on the sessions so that if any significant differences were found
between the results yielded by the two teaching methods, those outcomes could
be linked to the actual teaching and learning processes documented by the
observers.
Several interesting problems arose as the study progressed. We report these
problems with great respect for the researcher, who had set up a well-designed
experiment.



First, the Suggestopedia teachers—who were visitors from out of town—
looked at the fifteen-week curriculum and said that they could cover that amount
of material in ten weeks instead of the fifteen weeks the regular classes would
take. Therefore, the Suggestopedia classes ended five weeks before the regular
classes, even though they had covered the same curriculum. At that point, the
students who had been in the Suggestopedia classes were tested and then
integrated into the regular classes.
Secondly, due to the normal scheduling patterns at the school, the students
taught with the regular method had many different teachers during a typical day.
The two Suggestopedia classes each stayed with one teacher for the entire day.
There were never substitutes in the Suggestopedia classes, but due to
administrative requirements of meetings and testing, substitute teachers were frequent
in the regular classes. (In fact, one day when Kathi Bailey entered a classroom
quite early, before the regular teacher had arrived, a student said, "Oh, no! Not
another one!" When the observer introduced herself and asked him what he
meant, he said he had thought she was another substitute teacher.) In addition,
one of the teachers using the regular method found it too stressful to be observed
so often and chose to drop out of the study.
On a different occasion, when Bailey was observing a Suggestopedia class,
another visitor was present. That person was introduced as a representative of
the Suggestopedia training program. When it was time for a break, the students
all left the classroom. That observer approached the teacher and said, "What are
you doing? This isn't Suggestopedia!" The teacher replied, "I know, but what
can I do? The students want grammar rules and error correction."
The students in both the control and experimental groups were young
military personnel, who lived together in large dormitories. They had been assured
at the outset of the experiment that their performance on the post-test would in
no way influence their subsequent job postings. But the students in the
Suggestopedia classes apparently doubted this promise, and—knowing that
they would face the school's regular accuracy-oriented testing at the end of the
experiment—several of them began to study at night in the dormitories with
their friends who were in the regular classes. In fact, the two groups of students
regularly exchanged their Russian class materials.

REFLECTION

What threats to validity can you identify in the description above?

PAYOFFS AND PITFALLS

As the sample study illustrates, there are many pitfalls in using the experimental
method to conduct language classroom research. Many of these problems
stem from the difficulty of controlling all the possible confounding variables
associated with research on human subjects. Human beings have agency, desires,
anxieties, and goals that are often far beyond the ability of the researcher to
manage in any comprehensive way. As a result, even in the strongest research designs,
threats to validity sometimes arise that compromise the interpretation of the
results. In classroom research,

the security of isolating variables and defining them operationally, a
security obtained by laboratory-like experiments and statistical inferences, is
largely lost, as the researcher is forced to look for determinants of
learning in the fluid dynamics of real-time contexts. (van Lier, 1998, p. 157)

Indeed, classroom research "entails a very large number of human and
institutional factors that can affect research design and outcomes in many unforeseen and
unforeseeable ways. It is not for the timid" (Rounds and Schachter, 1996, p. 108).
There is also a more philosophical problem with the experimental method.
For some people, if a phenomenon is not measurable, it is not worth studying.
For them, the psychometric tradition, and the experimental method in
particular, may be the only valuable ways of conducting research. However, in an effort
to quantify phenomena of interest we may miss important issues. We may not
investigate key variables because they are not easily quantified, or we may focus
on trivial issues that are easy to quantify. Furthermore, the data collected in
language classroom research are often collected from people, but the need to
quantify and the widespread use of group averages sometimes make it seem that
individual learners are represented only by test scores. And the seeming
dehumanization of participants is also noticeable when researchers talk about the
sample in a study as "experimental subjects."
Another problem relates to the issue of objectivity and subjectivity. The
experimental method emphasizes objectivity in hopes of counteracting the threats
of researcher and subject expectancy. As a result, teachers (and learners) have
typically not been seen as potential collaborators in language classroom research
conducted with the experimental method. Teachers have had important roles,
but often primarily as deliverers of a particular treatment in an experiment.
However, for teachers, it is sometimes very difficult to be simply a treatment
deliverer when students' needs and desires run counter to the prescribed treatment.
(Remember the Suggestopedia teacher's comment, "I know, but what can I do?
The students want grammar rules and error correction.")
There are also several payoffs associated with the experimental method.
First, it is a well-documented, highly codified approach to conducting
educational research with well-developed quality control procedures. There are many
textbooks available and it is relatively easy to locate courses if you wish to get
further training.
Secondly, the experimental method is an internationally recognized way of
conducting research. If you conduct an experiment in Jakarta and publish your
findings in Prospect, the TESOL journal of Australia, readers in Germany, Iran,
India, Canada, and Brazil will all understand the report (provided they have been
trained in the language and culture of the experimental method).



Third, there are clear criteria for interpreting the outcomes in such
research. The concept of statistical significance provides the field with ways to
understand whether a correlation is powerful enough or whether a difference in
scores is big enough to warrant generalizing the results of the study.
Fourth, because of the emphasis on operational definitions and controlling
variables, it is sometimes possible to make comparisons across studies conducted
at different sites. The widely understood research designs and numerous
statistical procedures allow researchers to replicate studies conducted by other people.
Finally, because it has high prestige in many contexts, using the experimental
method may enable researchers to obtain grant money or get their reports
published in venues that are not as accessible to those trying to publish action
research reports or the findings of naturalistic inquiry. (Of course, the research has
to be done well in order to be published. Simply using a prestigious research
method is no guarantee that your report will be accepted.)

CONCLUSIONS

The experimental method has been very important, historically, in all sorts of
research, in both the physical and social sciences. It has often been used in
language classroom research, but with varied success, since it is so difficult to
control all the possible confounding variables that can arise in research with real
people. It is, however, a valuable approach to understanding teaching and
learning, and it has influenced many other approaches, as we will see in future
chapters.

QUESTIONS AND TASKS


1. Identify the specific design in each of the following research situations.
A. At the beginning of an advanced composition course for international
college students, the instructor had each student write an in-class essay
about the differences between the educational systems of their host
country and their home countries. After ten weeks, the teacher had the
students write on the same topic. Then the two sets of essays were
mixed together at random and rated by another teacher who did not
know about the experiment.
B. A Greek teacher in Australia wanted to test the hypothesis that exposing
his students to the cultural setting of the target language would enhance
their language learning. The first three weeks of the class he tested them
every week on basic grammar points from the lessons discussed in class.
The fourth week, he took them to a Greek festival, complete with food,
music, dancing, and traditional Greek clothing. The following three
weeks, he gave them weekly tests on the grammar lessons.
C. A French teacher in a secondary school wanted to know if using
new vocabulary words in a hands-on experience would help students
remember them. So, she performed an experiment using her two
beginning French classes as the groups to be compared. One group met
in the home economics classroom kitchen and made crepes while
learning the words for the ingredients. The other group met in class as
usual, read about crepes, and made vocabulary lists of the ingredients.
After this experiment, the teacher tested both classes with a vocabulary
quiz to see which group did better.
D. The teacher of an elementary Spanish class wanted to know how well
the students would perform at the end of the course since he had
changed to a new textbook. He used the students' final exam scores at
the end of the semester as an indication of the success of the new
textbook.
E. A language teacher wanted to determine the relationship between oral
proficiency and grammatical accuracy of his students. (He knew that
his teaching method stressed speaking skills but that his students would
be faced with standardized tests of grammatical accuracy.) He designed
a study in which he could plot the students' scores on an oral
proficiency interview against their scores on a 100-point grammar test.
F. An ESL teacher in Scotland wanted to compare the listening
comprehension of his students from Asia, Latin America, Europe, and Africa.
He administered a test of English listening comprehension and then
computed the mean scores for the students from these four regions.
G. The ESL teacher in Scotland wondered whether those students who
had traveled in the United Kingdom had better listening comprehension
than those who hadn't. For this reason, he added U.K. travel
experience as a moderator variable in his study comparing the listening
comprehension scores of students from Asia, Latin America, Europe,
and Africa.
H. A German language teacher wanted to know if viewing videotapes
about various aspects of German culture would increase her students'
vocabulary acquisition. So, she taught her German course for sixteen
weeks with the curriculum divided into four four-week modules. She
gave the students a vocabulary test at the end of every second week. In
the third week of each module, she showed a German film about art,
theatre, music, sports, etc. At the end of the semester, she compared the
four average scores from the weeks with films and the four average
scores from the weeks without films.
2. What is/are the research questions for each of the situations described
above? Can you state the hypothesis (or hypotheses) for each situation?
3. Not all the designs we have studied are represented in the paragraphs
above. Which ones are not represented here? Choose one of those missing
designs and write a brief scenario that provides sufficient detail for your
colleagues or classmates to be able to identify which specific design you are
describing. Exchange your paper with a classmate or colleague.



4. Turn back to Chapter 3 and reread the summary of Sato's (1982)
investigation of the turn-taking behaviors of the Asian and non-Asian students in
two ESL classes. How might the different data collection procedures used
in those two classes lead to the instrumentation threat in her study?
5. Look back at the vignettes of the sample studies in Chapters 2 and 3.
Identify the threats to validity in those studies, using the vocabulary introduced
here.

6. The increasing strengths of the research designs enable researchers to cope
with the threats to validity, but such threats are often present. For each
design discussed above, identify the specific threat(s) that may influence your
interpretation of the findings.
7. Read the following situation and decide how you could design a study to
address your friend's request.

You have a friend who has been teaching English in Japan for many
years. Your friend noticed that those of her students who often sang
English songs at karaoke clubs seemed to have better pronunciation,
better listening skills, better speaking fluency, and better English
vocabulary skills than those students who did not participate in karaoke
singing.
Your friend began to use karaoke singing regularly as an activity in
her English conversation classes for adult students. This experience
was so successful that she created a Web site, called "Karaoke Corner."
The Web site provides music and lyrics so students can sing along with
English songs in the privacy of their own homes. The Web site also
offers follow-up activities with the English song lyrics—cloze passages,
vocabulary quizzes, and reviews of grammar structures used in the
songs. The Web site has been running for over three years, and your
friend now has about four thousand clients (both adults and teenagers)
who visit the site regularly and pay to sing along in English. Your friend
also has two employees who request formal copyright permission to use
the songs and who regularly update the Web site with new songs and
activities.
Now your friend wants to expand the market for "Karaoke Corner"
by showing how successful its regular users are in improving their
English skills. She wants you to collect some data that would encourage
companies to pay for "Karaoke Corner" subscriptions for their
employees. Since she knows you are learning about conducting research, she
asks you to design a study that would provide evidence that would show
potential corporate clients the effectiveness of the products and services
offered by the "Karaoke Corner" Web site.

A. Pose the research question(s) you wish to ask. You could write a formal
hypothesis if you wish.



B. Identify the independent variable in this context. How many levels of
the independent variable will you use in your investigation? Why?
C. Determine what the dependent variable would be.
D. Do you want to incorporate a moderator variable? If so, what is it?
E. What are some possible confounding variables you should be aware of
in carrying out this research?
F. What are some possible control variables you would like to impose?
G. Think about how you could operationally define all those variables.
H. Select a research design from among those that we have discussed.
I. Identify the strengths and weaknesses of the study as planned.
J. Identify the practical difficulties in carrying out this study as planned.
8. We noted at the beginning of this chapter that sometimes the jargon asso-
ciated with the experimental method can be a bit confusing. This is partly
the case because there are many synonyms. Look at the two columns of
words below. Draw a line matching each item in the left column with its
synonym in the right column.

Average                    Experimental method
Bar graph                  Practice effect
Normal distribution        Generalizability
Testing threat             Threats to validity
External validity          Histogram
Confounding variables      Mean
Scientific method          Bell (shaped) curve
Hawthorne effect           Reactive effects of experimental arrangements

SUGGESTED READINGS

For more information on research designs, see Mitchell and Jolley (1988),
Shavelson (1981, 1996), and Tuckman (1999).
J. D. Brown (1988) provides a very good description of the normal
distribution. He also provides a clear introduction to the descriptive statistics (1988;
2001).



CHAPTER

Surveys

You can't catch an elephant with a butterfly net. Then again, you can't
catch a butterfly with an elephant net. (K. M. Bailey on questionnaire design
to her research students)

INTRODUCTION AND OVERVIEW

This chapter is devoted to surveys and, more specifically, to questionnaires. Here
we will define surveys and questionnaires and then expand on the concepts of
populations and samples introduced earlier. Next, we will look at designing and
piloting questionnaires, both in paper format and online. We will consider
several quality control issues, including the wording of items, the translation and
back-translation of questionnaire items, and ethical issues in questionnaire
design. The chapter ends with a summary of a study in which EFL learners
responded to an online survey, followed by a discussion of the payoffs and pitfalls
of survey research utilizing questionnaires.
Surveys belong to a disparate assortment of data collection techniques under
the rubric of elicitation devices. In this context, to elicit means to cause people to do or
say something, so an elicitation device in second language research is a procedure
for getting research subjects to do or say something in response to a stimulus.
Studies using elicitation are extremely common in language teaching research.
In fact, all research techniques can be classified as either elicitation or nonelicitation.
In a survey of the research literature reported in 1991, Nunan found that around
half of the studies analyzed used elicitation techniques. Such techniques have been
common in second language acquisition research, as far back as the original
morpheme order studies of the 1970s. Not surprisingly, elicitation techniques vary
enormously in scope, aim, and purpose. They include the use of artifacts designed
to stimulate language production, such as pictures, diagrams, and even
standardized tests, as well as surveys, which collect data through questionnaires and
interviews. (Interview procedures are sometimes classified under the rubric of survey
research, but we will treat interview procedures in a separate chapter.)
In this chapter, we will look specifically at survey research using
questionnaires, which are written data elicitation devices. Questionnaires have often been
used in classroom research, but perhaps more in classroom-oriented studies than
in classroom-based studies. Questionnaires administered to teachers, parents,
and/or students can amplify and improve upon a classroom-based study.
For example, a very early and important published example of language
classroom research was a study by Seliger (1977). He wanted to investigate an
element of turn taking in the target language. He believed that some learners
could be characterized as "high input generators," that is, language learners
who participated in conversations in ways that generated input from their
interlocutors (the people they were talking with). He hypothesized that the high input
generators would outscore "low input generators" on a test of language
achievement. However, since he was studying a group of ESL learners in the United
States, he realized that they had opportunities for English input outside of class
as well. So, he included a questionnaire in his study called the "Language
Contact Profile." It asked students about their use of English outside the classroom.
When Seliger analyzed his results, he found that the high input generators
outscored the low input generators on both the measure of English achievement
and also on the Language Contact Profile. The data from this questionnaire
provided an added dimension to the research in terms of understanding how
interaction in the target language can improve students' learning.

Defining Survey Research


Surveys are widely used for collecting data in most areas of social inquiry, from
politics to sociology, from education to linguistics. In general education, their
use ranges from large-scale demographic studies of community attitudes and
expectations to small-scale studies carried out by a single researcher (Cohen and
Manion, 1985). The overall purpose of a survey is to obtain a snapshot of
conditions, attitudes, and/or events of an entire population at a single point in time by
collecting data from a sample drawn from that population.
Surveys have also been used frequently in applied linguistics to gather data
on a range of issues (D. Johnson, 1992):
Survey methods have been used by second language, bilingual
education, and foreign language researchers to study a wide variety of issues
that impinge on language learning. These include the changing
demographic context, the institutional settings in which L2
professionals function, the policies that affect learning and teaching, program
administration, teacher preparation, attitudes of teachers and professors
toward language varieties, classroom practices, target language norms,
and student language use and growth. (p. 105)



According to Dörnyei (2003), surveys are especially well suited for asking factual
questions, behavior questions, and attitudinal questions. The overriding
consideration, however, is matching the data collection procedure to the research
question you are posing or the hypothesis you are testing.

Questionnaires and Experimental Research


Questionnaires are defined as "any written instruments that present respondents
with a series of questions or statements to which they are to react, either by writing
out their answers or selecting from among existing answers" (J. D. Brown, 2001,
p. 6). (Respondents are the people who complete and return the questionnaires to the
researchers.) Many kinds of questionnaires elicit numeric responses, so surveys are
sometimes grouped in quantitative approaches to research. They are part of the
psychometric tradition in that they try to measure psychological constructs.
However, the term survey research is not identical to experimental research
even though questionnaires are often used as dependent variables in
experiments. For instance, researchers sometimes try to determine whether a
particular experimental treatment causes a change of attitude among the research
subjects, so an attitude questionnaire may be administered at the beginning and
the end of the experiment, and the two sets of responses compared. (This was
what the researcher did in the sample study at the end of Chapter 4 to ascertain
the students' attitudes before and after their first Russian course.)

REFLECTION

Survey research is distinguished from experimental research in several
respects. Think of at least one way that a survey might differ from an
experiment.

The main difference between conducting an experiment and conducting a
survey is that, while the experimental researcher intervenes and manipulates
variables to test relationships, the survey researcher typically does not.
The researcher doesn't "do" anything to the objects or subjects of
research, except observe them or ask them to provide data. The research
consists of collecting data on things or people as they are, without
trying to alter anything. A survey researcher might want to know about
teachers' honest attitudes towards their school principals, unaltered by
the act of asking. The more intrusive a survey, the lower the chances
that it will accurately reflect real conditions. (Jaeger, 1988, p. 307)
So, the broad goal of survey research is to elicit subjects' ideas, attitudes,
opinions, and so on without influencing those data in any way. The challenge for
survey researchers is to design questionnaires that capture the information they
wish to elicit without unduly shaping that information.

126 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


TABLE 5.1 Steps in carrying out a survey

Step                                 Key Question

1. Define objectives                 What do we want to find out?
2. Identify target population        Who do we want to know about?
3. Carry out a literature review     What have others said/discovered about the issue?
4. Determine sample                  How many subjects should we survey and how
                                     will we identify them?
5. Identify survey instruments       Will the data be collected through
                                     questionnaires, interviews, or both?
6. Design survey procedures          How will the data collection actually be carried
                                     out?
7. Identify analytical procedures    How will the data be assembled and analyzed?
8. Determine reporting procedure     How will the results be presented?

Whether you use a questionnaire as part of an experiment, in naturalistic
inquiry, or in action research, there are eight key steps that need to be carried out
in conducting a survey. These are set out in Table 5.1.
In the remainder of this chapter, we will focus on questionnaire research.
(Readers should see J. D. Brown [2001] and Dornyei [2003] for more detailed
discussions of these steps.) We begin with further information about populations
and samples.

POPULATIONS AND SAMPLES

One of the most important questions a survey researcher must confront is the
following: What is the population I wish to learn about by way of the survey?
Political surveys, particularly those carried out in the run up to an election,
generally purport to reflect the entire population of eligible voters (although they also
report a margin of error, usually between 2 and 5%). It would not, of course, be
practical to obtain data from the entire population. In fact, that's what the
election itself is meant to do. A major task for the survey researcher, therefore, is to
select a representative sample from the population as a whole. The idea that the
sample represents the population is important if the predictions based on the
opinion poll are to be accurate.

REFLECTION

If you wished to study voting intentions, one way to collect data for a
large-scale survey, which claims to represent the entire population of a voting
district, would simply be to go into the street and question people at random.
Why do you think this may not be a sound way of collecting the data?



TABLE 5.2 Strategies for survey sampling (adapted from Cohen and
Manion, 1985)

Strategy                     Procedure

1. Simple random samples     Select subjects at random from a list of the
                             population.
2. Systematic sampling       Select subjects in a systematic rather than random
                             fashion (e.g., select every twentieth person).
3. Stratified sampling       Subdivide population into subgroups (e.g.,
                             male/female) and randomly sample from subgroups.
4. Cluster sampling          Restrict one's selection to a particular subgroup
                             from within the population (e.g., randomly
                             selecting a school from within a particular school
                             district rather than the entire state or country).
5. Convenience sampling      Choose the nearest individuals and continue the
                             process until the requisite number has been
                             obtained.
6. Purposive sampling        Subjects are handpicked by the researcher on the
                             basis of his/her own estimate of their typicality.

Sampling Strategies
In large-scale surveys, the major problem with simple random sampling is that it
may mask differences between underlying subgroups within the population. For
example, men and women often have different voting patterns and preferences in
elections. Therefore, it is important to be sure that men and women are
represented in the sample in proportion to the ratio of men and women in the voting
population. Table 5.2 shows some of the different sampling strategies available
to the researcher.
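
The first three strategies in Table 5.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the voter list, the 600/400 split between women and men, the 10% sampling fraction, and all function names are hypothetical and are not taken from any study discussed here.

```python
import random
from collections import defaultdict

def simple_random_sample(population, n, seed=0):
    """Strategy 1: pick n members at random from a list of the population."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, start=0):
    """Strategy 2: pick every step-th member (e.g., every twentieth person)."""
    return population[start::step]

def stratified_sample(population, key, fraction, seed=0):
    """Strategy 3: split into subgroups, then sample the same fraction of each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[key(person)].append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

# Hypothetical voter list: 600 women and 400 men.
voters = [{"id": i, "sex": "F" if i < 600 else "M"} for i in range(1000)]

random_sample = simple_random_sample(voters, 100)
every_twentieth = systematic_sample(voters, 20)
proportional = stratified_sample(voters, key=lambda v: v["sex"], fraction=0.1)

women_in_sample = sum(1 for v in proportional if v["sex"] == "F")
men_in_sample = sum(1 for v in proportional if v["sex"] == "M")
```

In this sketch only the stratified sample is guaranteed to mirror the 60/40 ratio of the hypothetical population; the simple random sample will usually come close but can drift by chance.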

REFLECTION

A researcher at a college wished to gather students' opinions about a
proposed change from the fifteen-week semester system to the ten-week
quarter system. To gather data, she stood outside the library and asked
every fifth student who left the building whether they were for or against
the proposed change.

1. What kind of sampling strategy was she using? (See Table 5.2.)
2. What kind of data was she gathering: nominal, ordinal, or interval?
3. What threat(s) to validity may be present in using this procedure?
(Remember the gunners discussed in Chapter 4.)



Sample Size
Sample size is an ongoing issue in any research in which the investigator wants
to make inferences from samples to populations. Common sense suggests that
the larger the sample size, the more accurate the inferences that can be made
about the population. In addition, for reasons explained in Chapter 4, where we
discussed the normal distribution, when researchers work with quantitative data
analyses, larger numbers are desirable because of the properties of the
bell-shaped curve. Most statistical procedures based on the normal distribution work
better with larger datasets. That is why researchers working in the experimental
method and/or with questionnaire data often conduct "large N studies."
However, accurate estimates can be obtained from relatively small numbers
of subjects. Fowler, who produced one of the standard textbooks on survey
design, dismisses the common misperception that the adequacy of a sample
depends on the fraction of the population included in that sample, arguing that
"a sample of 150 people will describe a population of 15,000 or 15 million with
virtually the same degree of accuracy assuming all other aspects of the sample
design and sampling procedure were the same" (Fowler, 1988, p. 41).
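
Fowler's point can be checked with a rough calculation. The sketch below uses the standard margin-of-error formula for a proportion with a finite population correction. It assumes simple random sampling, a 95% confidence level (z = 1.96), and the most conservative case of an evenly split opinion (p = 0.5); none of these assumptions comes from Fowler's text, and the function name is hypothetical.

```python
import math

def margin_of_error(n, population_size, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from n respondents."""
    standard_error = math.sqrt(p * (1 - p) / n)
    # The finite population correction shrinks the error only when the
    # sample is a sizeable fraction of the population.
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error * correction

small_population = margin_of_error(150, 15_000)
large_population = margin_of_error(150, 15_000_000)
difference = abs(small_population - large_population)
```

Under these assumptions both margins come out at roughly eight percentage points, and the difference between them is a small fraction of one point. What drives precision is the number of people sampled, not the fraction of the population they represent.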
A moment's thought will show why this is so. You will recall from Chapter 4
that the further a score or measurement is from the overall population mean, the
less likely it is that that particular score or measurement would have been
obtained by chance. (Really high scores are usually due to ability or proficiency, and
really low scores are usually due to a lack thereof.) If we were to tell you that
Subject A scored 72 on a test, and then asked you to predict the mean for the
whole class of twenty students, this would clearly be impossible. The best you
could do would be to say, "Around 72." However, if we said that Subjects B and
C scored 69 and 73, you could then begin to make predictions. It is unlikely, for
example, that the mean would be 36.
A general rule of thumb for survey research is to get as many people as
possible to complete the questionnaire if you are working with quantitative data that
you plan to analyze quantitatively. Most of the inferential statistics work better
with more than thirty subjects in the sample, and their results become more
trustworthy with much larger numbers. Also, if you are working with qualitative
data, you will be able to see patterns more clearly if you have a larger dataset.
On the other hand, if you are a teacher conducting research with your own
class and you have twenty students, surveying them is just fine, depending on
your research question(s). In future sections we will read about an action
research project in which a teacher gathered some very interesting data by using a
questionnaire with a class of only five learners.

REFLECTION

Do you see a role for a questionnaire in any of the research topics that
interest you?



QUESTIONNAIRE DESIGN
In Table 5.1, we noted that the first step in survey research is deciding what we
want to find out in the process of administering our questionnaire and analyzing
the results. Doing so hinges on posing clear and appropriate research questions
and operationally defining all constructs and key terms.
The idea of operationalizing constructs (discussed in Chapter 2) is extremely
important in any discussion of questionnaire design. This is because the
questionnaire embodies the attitudes, beliefs, and practices that you wish to
document by administering it to your respondents. The questionnaire becomes your
main research instrument. In effect, it operationalizes the constructs you wish to
measure.

Getting questionnaires 'right' can be notoriously difficult, and constructing
an instrument that will actually give you the data you need for your research
requires considerable skill. In this section, we will consider some of the
fundamentals of questionnaire design, looking first at question types and question
wording, then at organizing questionnaires. (In later chapters, we will consider
ways of quantifying qualitative data.)

Closed Items on Questionnaires


Questionnaire items can be either closed or open-ended. A closed item is one in
which the range of possible responses is determined by the researcher and the
respondents select from or evaluate the options provided. For example: "Foreign
languages should be compulsory in secondary school. Agree/neutral/disagree."
An open-ended item is one in which the subject can decide what to say and how
to say it. For example: "What do you think about the proposal that foreign
language should be compulsory in secondary school?" Questionnaires can use only
closed questions, only open-ended questions, or a mixture of closed and
open-ended questions. One frequently used format is to have closed items followed by
a space for open-ended comments.
For example, Springer and Bailey (2006) conducted research about teachers'
experience with and attitudes regarding nineteen reflective teaching practices
(e.g., discussing teaching with trusted colleagues, keeping a teaching journal,
observing colleagues, being observed, and so on). They constructed and distributed
their questionnaire using SurveyMonkey®, an Internet-based program that
allowed them to build the items, collect the data, and do preliminary analyses
electronically. Each reflective teaching practice was addressed in two closed
items: one asking about their respondents' experience with the practice and one
asking them about how appealing the practice is. The statements on the closed
items were rated on a nine-point scale. After each pair of items, a space was
provided for optional open-ended comments. Figure 5.1 below shows an example
of the questionnaire layout. The two paired items for each reflective practice
address teachers' experience and attitudes. The paired sequential closed items
are followed by a space for optional open-ended comments.



8. In my experience as a teacher, to reflect on my teaching, I have discussed my
teaching with trusted colleague(s) ...
   [nine-point scale: 1 = never, 9 = very frequently]

9. In general, I find the idea of discussing my teaching with trusted colleague(s) ...
   [nine-point scale: 1 = not at all appealing, 9 = very appealing]

10. Comments (optional):

FIGURE 5.1 Sample items using a computer-delivered Likert scale

REFLECTION

Make a list of the pros and cons of using closed and open-ended questions.
Compare your list with that of a classmate or colleague.

There are several advantages of using closed items. These include
practicality in terms of the ease and speed with which people can respond to the
questionnaire. Also, providing the options for respondents to select or evaluate provides
the great benefit of comparability because it constrains the variation you can get
in the responses. This factor enhances the data analysis process greatly. All in all,
the benefits of questionnaires have made them very important tools in survey
research in psychology, sociology, sociolinguistics, general applied linguistics
research, and, to some extent, language classroom research.
A wide range of closed item formats can be used. Some of the question types
used in closed question surveys are presented in Table 5.3.

TABLE 5.3 Closed question types in survey questionnaires (adapted
from Youngman [1986], as cited in Bell [1987])

Question Type           Example

1. List                 Indicate your qualifications by circling any of the following
                        that apply to you:
                        Certificate, Diploma, B.A., M.A., Ph.D.
2. Category             Indicate the grade you achieved on the
                        Use of English Examination
                        A, B, C, D, E, F
3. Ranking              Rank the following from 1 to 4 in order of preference.
                        "In class, I like to learn best by studying..."
                        ___ with the whole class.
                        ___ in small groups.
                        ___ in pairs.
                        ___ by myself.
4. Scale                Circle one of the following phrases to indicate your
                        attitude to the following statement:
                        "In class, I like to learn by having the
                        teacher explain everything to me."
                        Strongly Agree, Agree, Disagree, Strongly Disagree
5. Quantity/frequency   Circle one of the following answers.
                        How many hours did you spend practicing
                        English outside of class last week?
                        1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more
6. Grid                 How many students of the following age groups are
                        enrolled in the four levels of the Intensive English Program?

                        Level/     18-24       25-30       31-40       41-50
                        Number     years old   years old   years old   years old
                        Level 1
                        Level 2
                        Level 3
                        Level 4

It is important to remember that the question format you choose will
constrain the type of data that you get. For example, in the grid format shown in
Table 5.3, a program director completing this grid would indicate in each cell
the number of people in that age range (e.g., there may be twelve students in
Level 1 who are between the ages of eighteen and twenty-four). But we cannot
tell, from that datum alone, how many of those twelve students are eighteen
years old, how many are nineteen years old, and so on. Likewise, this particular
grid does not permit the possibility that there are any students younger than
eighteen or older than fifty enrolled in the program.
For these reasons, you should think carefully about your research question in
designing your questionnaire. The research question determines the analysis you
wish to do with your data, and that will influence the decisions you make about
the format of the questions. Remember the elephant and the butterfly: you can
hunt for either one, but you have to have the proper net to capture your quarry.
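
The way the grid in Table 5.3 constrains the data can be made concrete with a small sketch. The enrollment list below is invented purely for illustration, and the function name is hypothetical. The code tallies students into the four age bands of the grid and sets aside anyone the grid has no cell for.

```python
# Age bands taken from the grid in Table 5.3.
AGE_BANDS = [(18, 24), (25, 30), (31, 40), (41, 50)]

def tally_grid(students):
    """Count (level, age) pairs per grid cell; collect students no cell can hold."""
    grid = {}
    uncounted = []
    for level, age in students:
        for low, high in AGE_BANDS:
            if low <= age <= high:
                cell = (level, (low, high))
                grid[cell] = grid.get(cell, 0) + 1
                break
        else:
            # No band matched: the grid format cannot record this student.
            uncounted.append((level, age))
    return grid, uncounted

# Hypothetical enrollment data as (level, age) pairs.
students = [(1, 18), (1, 19), (1, 24), (2, 17), (3, 52), (4, 33)]
grid, uncounted = tally_grid(students)
```

Once the tally is made, the three Level 1 students appear only as a count of 3 in the 18-24 cell, so their individual ages cannot be recovered. The seventeen-year-old and the fifty-two-year-old do not appear in the grid at all.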



ACTION

Look at the item types in Table 5.3. Decide what sort of data—nominal,
ordinal, or interval—each question type will elicit from respondents.

There are two very important closed-item formats that have been used in
applied linguistics research. One is called the semantic differential scale and the
other is called a Likert scale (pronounced like "LICK-ert"). Both have important
potential applications in second language classroom research.
Likert scales are named after their originator, a psychologist named Rensis
Likert (1932), who developed the technique for measuring people's attitudes
about various social concerns (Busch, 1993). Likert scales are often used in
applied linguistics research (including language classroom research) "to investigate
how respondents feel about a series of statements" (J. D. Brown, 2001, p. 40). In
this format, "the respondents' attitudes are registered by having them circle or
check numbered categories (for instance, 1, 2, 3, 4, or 5), which have descriptors
above them" (ibid.).
Let us take as an example an issue that was introduced earlier in this chapter.
A Likert scale could be used to investigate people's attitudes about the
desirability of foreign languages being required subjects in secondary school. To gather
information about this issue, a researcher could ask the following question:

Should foreign languages be
compulsory in secondary school?     Yes   No

Or, if more precise information were needed about the respondents' attitudes,
the researcher could word the item as follows:

Foreign languages should be compulsory in secondary school.


Agree Neutral Disagree

You may also see this format, in which SA represents "strongly agree," A stands
for "agree," and N represents "neutral." D and SD represent "disagree" and
"strongly disagree," respectively. Respondents circle their choice:

Foreign languages should be
compulsory in secondary school.     SA   A   N   D   SD



The benefit of the Likert scale item is that it allows researchers to gather
more fine-grained information about attitudes in the form of numerical data.
In a Likert scale, the issue would be presented like this and respondents would
circle a number:

                                    Strongly Agree        Strongly Disagree
Foreign languages should be
compulsory in secondary school.     1     2     3     4     5

REFLECTION

What types of data (nominal, ordinal, or interval) are elicited by these
four response formats about the issue of whether or not foreign languages
should be compulsory in secondary school?
1. Should foreign languages be
   compulsory in secondary school?     Yes   No
2. Foreign languages should be
   compulsory in secondary school.     Agree   Neutral   Disagree
3. Foreign languages should be
   compulsory in secondary school.     SA   A   N   D   SD
4. Foreign languages should be         Strongly Agree        Strongly Disagree
   compulsory in secondary school.     1     2     3     4     5

The semantic differential format is a variation on the Likert scale format in
which pairs of opposite adjectives are placed at the two ends of a continuum. An
item using a semantic differential scale might look something like this:

Intelligent : : : : Unintelligent
Impolite : : : : Polite
Educated : : : : Uneducated

Respondents see or hear some sort of stimulus material and react to it by
placing an X or a check mark between these polar opposites at a point that
indicates their opinion of the stimulus material. This format has been used
frequently in sociolinguistic research to assess people's reactions to accented
speech using recorded speech samples as the stimulus material. Students' writing
samples can also be used as the stimulus material.



Teachers can use this format in conducting needs assessments with their
students or getting feedback on lessons. Dornyei (2003, p. 41) gives the following
example:

Listening comprehension tasks are
Difficult  ___:___:___  Easy
Useless    ___:___:___  Useful

The semantic differential concept was used by Thorpe (2004) in an action
research project on the use of authentic news broadcasts. He wanted feedback
from the Korean learners of English in his class about the various kinds of news
broadcasts and teaching activities he used. Here are just two examples from a
simple questionnaire he gave the students at the end of each lesson:

Please draw a single straight line (I) on the horizontal continuum (—) to
indicate your opinion.
The news story was
Very Easy ______________________________ Very Difficult
I found the teaching activity
Not At All Helpful ______________________________ Very Helpful

Thorpe made sure that all the lines were of equal length, both on all the
items for the questionnaire after a particular lesson and on the questionnaires
he gave the students after all the lessons. This strategy allowed him to measure
the distance where the students' marks were placed on each line. He was
therefore able to compare students' opinions, both to one another and over time.
(We will return to Thorpe's data in Chapter 8 when we study action research.)
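
The measuring step can be sketched as follows. Everything in this example is an assumption made for illustration: the 100 mm line length, the measured distances, and the function name are hypothetical, and the sketch is not a record of Thorpe's actual procedure or data. Each mark is measured from the left end of the line and divided by the line length, which gives a score between 0 and 1 that can be compared across students and lessons.

```python
def mark_to_score(distance_mm, line_length_mm=100.0):
    """Convert a mark's distance from the left end of the line to a 0-1 score."""
    if not 0 <= distance_mm <= line_length_mm:
        raise ValueError("the mark must fall on the printed line")
    return distance_mm / line_length_mm

# Hypothetical distances (in mm) measured on the "teaching activity" item,
# where the left end is "Not At All Helpful" and the right end is "Very Helpful".
marks_by_lesson = {
    "lesson 1": [80, 65, 90],
    "lesson 2": [55, 40, 70],
}

mean_by_lesson = {
    lesson: sum(mark_to_score(m) for m in marks) / len(marks)
    for lesson, marks in marks_by_lesson.items()
}
```

The conversion is meaningful only because every line has the same printed length. If the lines differed, the same physical distance would stand for different opinions.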
Perhaps you noticed that in both examples above, sometimes the more
positive adjective is on the left and sometimes it is on the right. This placement is
intentional. Researchers use it to avoid what is called a response set: a habitual or
patterned way of responding to items that is independent of the items (Mitchell
and Jolley, 1988). For instance, a respondent may rush through a questionnaire
and simply mark all the positive options without really thinking about their
content. Switching the positive and negative adjectives breaks up this tendency to
some extent.
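
Switching the adjectives has a practical consequence at the analysis stage: the marks must be recoded so that a higher number always means a more positive rating. The sketch below illustrates one common way of doing this. It assumes five positions numbered 1 to 5 from left to right and uses the three adjective pairs from the earlier example; the function names and the respondent's marks are hypothetical.

```python
POINTS = 5  # assumed number of positions on each scale

# True when the positive adjective is printed on the left-hand side.
ITEMS = {
    "Intelligent/Unintelligent": True,
    "Impolite/Polite": False,
    "Educated/Uneducated": True,
}

def recode(position, positive_on_left, points=POINTS):
    """Return a score on which a higher number always means a more positive rating."""
    if not 1 <= position <= points:
        raise ValueError("position is outside the scale")
    return points + 1 - position if positive_on_left else position

def looks_like_response_set(raw_positions):
    """Flag a respondent who marked the same position on every item."""
    return len(set(raw_positions.values())) == 1

# A hypothetical respondent who marked the leftmost position on every item.
raw = {
    "Intelligent/Unintelligent": 1,
    "Impolite/Polite": 1,
    "Educated/Uneducated": 1,
}
scores = {item: recode(position, ITEMS[item]) for item, position in raw.items()}
flagged = looks_like_response_set(raw)
```

Because the polarity alternates, this respondent's recoded scores are inconsistent (very intelligent, very impolite, very educated), and the identical raw marks are easy to spot. A flag of this kind is only a prompt to look more closely, since a careful respondent could legitimately hold that combination of views.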
You will recall from Chapter 3 that we distinguished among nominal,
ordinal, and interval data. The differences among these types of data are important,
partly because the type of data you work with determines the types of analyses
you may use.



There is a great deal of discussion about whether Likert scale items provide
us with interval data or ordinal data. Some researchers feel that Likert scale
items with numbers, as shown above, "are assumed to yield interval data"
(Mitchell and Jolley, 1988, p. 403). Tuckman (1999) says,
A Likert scale lays out five points separated by intervals assumed to be
equal distances. It is formally termed an equal-appearing interval
scale. . . . Because analyses of data from Likert scales are usually based
on summated scores over multiple items, the equal-interval assumption
is a workable one. (p. 216)

Tuckman makes this point because of the value and nature of true interval scales.
We said in Chapter 2 that interval scales are those on which the unit of measure
is a constant interval.
However, there is another important feature of true interval scales and that
is that their properties are known and widely used. For instance, an inch is an
inch whether you are using a ruler in Australia, Canada, Egypt, Kenya, or
Brazil. A kilometer is the same distance whether you are riding a bike in
France, Thailand, or China. The same cannot be said of Likert scales since we
do not know if "agree" means the same thing to each person using the scale. If
two students "strongly disagree" with a statement on a Likert scale, does that
mean that they are equal in the vehemence with which they disagree? We
cannot know because there is no standardized measure of agreement as there is
with inches and kilometers. (See Busch [1993] and Turner [1993] for further
discussion of this position.)
You may have noticed that in the example screen shot from Springer and
Bailey (2006) above, the Likert scale was nine points long. The researchers
used a nine-point scale in order to provide respondents with more choices
since the computer-delivered questionnaire would not allow them to mark
pluses or minuses, or to circle two numbers. In addition, "from a statistical
viewpoint, longer scale lengths of seven or more categories are more desirable
because of the grain in score variability" (Busch, 1993, p. 735). Hatch and
Lazaraton (1991) state that Likert scales become interval-like when the length
of the scale is increased.
Whether Likert scale data should be treated as interval data or ordinal
data hinges on many factors. If you are a graduate student trying to complete a
research project or meet graduation requirements, you should consult your
professor or research advisor. In any case, you need to show that you understand the
issue of interval versus ordinal data when you choose the statistical procedures to
use in your data analysis. (See Chapter 13.)

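The practical difference between the two treatments can be illustrated with a short sketch. The four respondents and their ratings on three five-point items are invented for illustration only. Treating a single item as ordinal, we summarize it with the median; treating summated scores over several items as interval-like, in the way Tuckman describes, we can also report a mean.

```python
from statistics import mean, median

# Hypothetical ratings: each respondent answered three five-point Likert items.
responses = {
    "R1": [4, 5, 4],
    "R2": [2, 3, 2],
    "R3": [5, 5, 4],
    "R4": [3, 3, 3],
}

# Ordinal treatment of one item: report the middle of the ranked responses.
item_1 = [ratings[0] for ratings in responses.values()]
item_1_median = median(item_1)

# Interval-like treatment: sum each respondent's ratings across the items,
# then average the summated scores.
summated = {name: sum(ratings) for name, ratings in responses.items()}
summated_mean = mean(list(summated.values()))
```

Neither summary is the single correct one. The choice depends on whether the equal-interval assumption is defensible for your instrument, which is exactly the issue to raise with your professor or research advisor.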

Open-Ended Items on Questionnaires


Open-ended questions are "items where the actual question is not followed by
response options for the respondent to choose from but rather by some blank
space (e.g., dotted lines) for the respondent to fill" (Dornyei, 2003, p. 47). As
Mackey and Gass (2005) note, there is a trade-off between the control and
convenience of closed items and the depth of response in open-ended items:

Closed-item questions typically involve a greater uniformity of
measurement and therefore greater reliability. They also lead to answers that
can be easily quantified and analyzed. Open-ended items, on the other
hand, allow respondents to express their own thoughts and ideas in their
own manner, and thus may result in more unexpected and insightful
data. (p. 93)

While responses to closed questions are easier to collate and analyze, researchers
often obtain more useful information from open-ended questions.
It is also likely that responses to open-ended questions will more accurately
reflect what the respondent wants to say. It is not uncommon on paper-delivered
questionnaires to find that respondents have circled both 3 and 4 on a five-point
scale, or that they have written in "3+" instead of simply circling 3 or 4.
(Electronically delivered questionnaires avoid this problem by forcing the respondent
to click on a particular numeral.) Open-ended items also allow
respondents to express mixed feelings and shades of meaning.
J. D. Brown (2001) says that open-ended items come in two basic formats:
fill-in questions and short-answer questions. Fill-in questions require very
specific information, such as the following questions intended for the adult children
of immigrant parents in England:

Please provide the following background information:
Country where you were born: ____________
Language(s) spoken in your home by your family: ____________
Your age upon arrival in England: ____________

Short-answer questions are more open-ended and "usually require a few
words or phrases" (ibid., p. 39). Some examples are printed in the box below:

What are your earliest memories of living in England?

What are your earliest memories of speaking English in England?



Sometimes open-ended questions are posed in a yes/no format, but if you use
this approach, it is important to pose an explicit follow-up question as well:

Did you ever translate documents or interpret English for your parents?
Yes   No

If you checked "Yes," please give an example:

Do you feel that English is easy to learn? Why or why not?

Notice that when you use these item formats, the space you provide indicates to
the respondents how long a response you are hoping to receive.
Dornyei adds a third type of open-ended question: sentence completion. In
this format, the questionnaire designer provides the opening clause of a sentence
and then provides space for the respondent to complete the idea. In this format,
the item is usually worded from the perspective of the respondent (i.e., it uses I,
my, and mine instead of you, your, and yours).

Please complete the following sentences:

As a child, one thing I liked about attending school in England was

As a child, one thing I disliked about attending school in England was

A particular kind of completion item that has been used in second language
acquisition research is called a discourse completion task. In a discourse completion
task, part of a conversation is provided in written form and the respondents are
asked to write what they would say if they were actually in this conversation.
This format is especially useful if you wish to "investigate speech acts such as
apologies, invitations, refusals, and so forth" (Mackey and Gass, 2005, p. 89).
Discourse completion tasks frequently specify potentially important factors,
such as the age and status of the interlocutors. Here is an example that involves
making a request to a person of higher status:

You are a student in a linguistics course. Your backpack is stolen and your
linguistics textbook was in it. You cannot afford to buy a new copy, but you
need a copy of the book over the weekend to prepare for the midterm
exam, which will be given on Monday. You decide to ask your professor if
you may borrow her copy. You go to her office and say to her,

For addressing some research questions, it can be useful to constrain the
response by providing more of the conversation.

You ask your linguistics professor if you can borrow her copy of the
textbook. Yours has been stolen and the library copy is checked out. You want
to use her textbook over the weekend because the linguistics midterm
exam is scheduled for Monday.

The professor says, "Very well, but I'll need it back Monday morning at
nine o'clock. Oh, and please don't mark in it."

Using discourse completion tasks can save substantial time compared to
waiting for people to use particular speech acts, such as the request above, in
natural speech and hoping that you will be around to record the data when
such requests do occur. Sometimes discourse completion tasks are posed in a
multiple-choice format, but this makes them closed items that show something
about the learners' ability to recognize appropriate speech acts but that do not
show the learners' actual productive abilities. For example:

You ask your linguistics professor if you can borrow her copy of the
textbook. Yours has been stolen and the library copy is checked out. You want
to use her textbook over the weekend because the linguistics midterm
exam is scheduled for Monday. You go to the professor's office and say,
A. "Hey, can I borrow your textbook?"
B. "Excuse me, Professor, may I borrow your textbook over the weekend?"
C. "My textbook was stolen. Can I use yours over the weekend?"
D. "Excuse me, Professor, but unfortunately my textbook was stolen and
the library copy has been checked out. May I please borrow your copy?"



We are now back to the trade-off between closed items and open-ended
items on questionnaires. Whether you use one or the other or a combination of
the two, open-ended questions should be easy to answer. Alreck and Settle
(1985) advise that open-ended items should be direct, brief, and clear. They
provide the following questions for evaluating the effectiveness of survey questions:
1. Does the question focus directly on the issue or topic to be measured? If
not, rewrite the item to deal with the issue as directly as possible.
2. Is the question stated as briefly as it can be? If the item is more than a few
words, it may be too long and should be restated more briefly.
3. Is the question expressed as clearly and simply as it can be? If the meaning
will not be clear to virtually every respondent, the item should be
reformed. (p. 101)

Bailey (1992) conducted research with a very short, simple questionnaire
consisting of only three open-ended items. She wanted to investigate the issue of
innovation at the level of the individual teacher rather than at the level of
departmental or programmatic innovations, or large-scale systemic change. She asked
teachers to respond in writing to the following prompt:

Think of a positive change you have made in your own teaching. It could
be a change in content, in philosophy, or procedure. The important thing
is that it be a change for the better which you have made and which has
remained with you. I am interested in learning about changes that last in
your work as a language teacher—that is, I am trying to understand how
teachers bring about their own professional development. (p. 263)

Bailey then asked the respondents to describe what they changed, to explain
why they had made the change, and how they had made it. As it turned out, the
open-ended questions about what and why teachers had changed were relatively
straightforward to analyze. But the how question had not been specific enough.
Bailey had meant to gather information about the actual processes teachers had
used to bring about the changes they desired. Unfortunately, in addition to (or in
some cases, instead of) that information, she got comments about the difficulty
and speed of the process. For instance, in response to the how question, teachers
wrote things about the process being very slow, or being very demanding of their
time and energy, without really explaining what steps enabled them to carry out
the change.
This problem did not arise in the piloting of the questionnaire. It was only
in the process of struggling to analyze the data that Bailey realized she should
have been more specific in posing the question. The sort of responses she
anticipated receiving would have been more about actual strategies and processes. For
instance, one teacher talked about the innovation of providing authentic
materials (what she changed) because the required textbook (which had been written by
a senior member of the department) was dry, boring, and out-of-date. In
response to the question of how she brought about the use of authentic materials,
the teacher wrote that she and her students agreed that the authentic materials
were more helpful and more interesting, so they agreed to keep secret from the
administration the fact that they were not using the required text. This detailed
answer to the how question contrasted sharply with comments such as "very
slowly" or "with great difficulty."
It can often be helpful to provide respondents with examples as guidance. If
you do this, you must be careful to provide examples that will counteract (or at
least not trigger) subject expectancy.
The real power of open-ended items is explained by Dörnyei (2003), who
says that in spite of questionnaires' limitations,
open-ended items still have merits. Although we cannot expect any
soul-searching self-disclosure from the responses, by permitting greater
freedom of expression, open-format items can provide a far greater
"richness" than fully quantitative data. The open responses can offer
graphic examples, illustrative quotes, and can also lead us to identify
issues not previously anticipated. (p. 47)
An additional issue is that researchers sometimes "need open-ended items for
the simple reason that we do not know the range of possible answers and
therefore cannot provide pre-prepared response categories" (ibid.).

Organizing Your Questionnaire
Once you have settled on the questions that you want to ask, and you have
decided how you want to frame them, the next step is to organize the
questionnaire. The ordering of the questions should make sense to the respondents, and
there should be an ordered progression from one question to the next. Table 5.4
provides guidelines for organizing a questionnaire.

QUALITY CONTROL ISSUES IN QUESTIONNAIRE RESEARCH

As we have seen in previous chapters, quality control is a major concern no
matter what approach to research we espouse. In this section we will review some
important advice about crafting questionnaire items. (We use the word crafting
intentionally because there is a fair amount of skill involved and practice is
needed to produce questionnaire items that actually measure what you want
them to measure.) We then discuss the importance of piloting questionnaires
and of using translation as needed. Finally, we will consider some of the ethical
issues associated with the collection and analysis of questionnaire data.



TABLE 5.4 Steps in organizing a questionnaire (adapted from Alreck
and Settle, 1985, p. 160)
Step 1 Picture the questionnaire in three major parts: initiation, body, and
conclusion.
Step 2 Begin with the most general questions and avoid those that might be
threatening or difficult to answer.
Step 3 Remember the initial portion sets the stage and influences the respondents'
expectations about what is to come.
Step 4 Be sure the body of the questionnaire flows smoothly from one issue to
the next.
Step 5 List items in the body in a sequence that is logical and meaningful to
respondents.
Step 6 Save the most sensitive issues and threatening questions for the concluding
portion when rapport is greatest.
Step 7 List demographic or biographical questions last so that if some respondents
decline, most of the data will still be usable.

Crafting Questionnaire Items
Getting the wording of the questions right is crucial in questionnaire design.
One of the potential problems with any type of elicitation device is that the
responses we get may well be artifacts of the elicitation devices themselves. To
guard against this problem, the researcher should not ask 'leading' questions that
reveal his or her own biases, such as, "Do you think that the concept of
learner-centeredness is impractical and unrealistic?" Dörnyei (2003) gives examples of
leading questions that start with phrases such as, "Isn't it reasonable to suppose
that. . . ?" or "Don't you believe that. . ." (p. 54).
Furthermore, questions should not be complex and confusing, nor should they
ask more than one question at a time. Here is an example that fails on both counts:

Would you prefer face-to-face, online, or 'blended' instruction in intensive
full-time or part-time mode with a single instructor or multiple instructors?

Such a question is likely to be highly confusing to respondents, and the answers
are almost certain to be uninterpretable by the researcher.

ACTION

Look at the question quoted in the box above. How many constructs is it
attempting to measure? How could you rewrite this question so that it is
clearer?



This question should be rewritten as three separate items because it is trying to
capture three different constructs: (1) the delivery mode, (2) the time devoted to
study, and (3) the preferred number of teachers. Here is one possible rewording:

As a language learner, which do you prefer? Use an X to choose only one
option for each line:
1. ___ Face-to-face classes OR ___ Online classes OR ___ Blended classes
2. ___ Intensive full-time study OR ___ Part-time study
3. ___ Having a single teacher OR ___ Having more than one teacher

It is also important to use vocabulary that is familiar to the respondents. In
the revision above, teacher has been substituted for instructor and the phrase more
than one has been substituted for multiple. In addition, you should make sure
that the respondents understand key terms you are using. For instance, it might
be important to indicate that blended in this context means a combination of
face-to-face and online classes.
In research into language teaching and learning, another danger to avoid is
culturally biased questions. Specialists on questionnaire research have pointed
out that there is considerable intercultural variation concerning the type of
information that can legitimately be sought by a stranger. (When Kathi Bailey first
started teaching EFL in Hong Kong, she was very surprised by some of the
questions she was asked by her students. After she explained the syllabus on the first
day of class, she invited questions. In both of her speaking/listening classes,
students asked her how much money she made as a teacher and what kind of man
she liked.)
Major differences may exist between the culture of the researcher and that
of the respondents, and these differences may affect the responses given. If you
are planning to survey people from cultures other than your own, it is important
to check with a native of that culture to make sure the items on your survey are
appropriately worded and not offensive to the respondents.

REFLECTION

Can you think of any examples of cultural bias in questionnaire design in
studies you have read or in questionnaires you have seen?

When you are constructing a questionnaire, it is important to take factors
such as the following into consideration: the willingness of the respondents to
make critical statements or to criticize, for example, their teacher or teaching
institution; and the willingness of the respondents to discuss certain personal
topics, such as age, salary, or opinions on political and social issues. Another
concern is the extent to which shared values can be assumed, such as the concept of
freedom of the press or free access to the Internet. There may also be differences
in attitudes that can influence how people respond to questionnaire items. For
example, the commonly held belief among many Western educators that
classroom learning should be an enjoyable experience is not necessarily a universally
held view. (In many contexts, fun and enjoyment are associated with
entertainment, not education.)
Yet another trap to avoid is asking too many questions. Our own personal
rule of thumb is to restrict the questionnaire to between thirty and thirty-five
questions, although this will depend on the types of questions being asked, that
is, whether they are closed or open-ended. If you are using a paper
questionnaire, you may get a better rate of return if you limit its length to the front and
back of one sheet of paper. (If you do use the back of the page, it is important to
add a note at the bottom of page 1 asking the respondents to turn the page over
and continue on the back.)
It is also important to determine in advance how the data to be gathered will
be analyzed. A trap for the inexperienced researcher is to collect the data and
then realize that the question was asked in a way that yielded data that
cannot be analyzed to answer the question. Because of these and other pitfalls, it is
imperative that questionnaires be piloted before the main study is carried out.
Sometimes researchers want to ask parallel questions of two populations
involved in the same activity. In the context of second language classroom
research, this pairing is most often teachers and students. As an example, Radecki
(2002) conducted a study in the United Arab Emirates to "identify and compare
student and teacher preferences in a laptop environment" (p. 2). The study used
interviews and parallel questionnaires to "determine (1) student and teacher
preferences in the areas of materials, grouping and activities in high-tech
environment; (2) how these preferences differ for a high-tech and a low-tech
environment; and (3) how teacher and student preferences differ" (ibid.).
To address these issues, parallel questionnaires were devised for the students
and teachers in the study. The questions were reworded to elicit the perspectives
of the two groups of respondents. For example, Question 21 on the student
questionnaire said:

Complete this sentence: "In classes with laptops, I usually like to ____."
a. listen to the teacher and take notes
b. work on exercises or homework
c. discuss issues in groups or with the whole class
d. work on a research paper or a project
e. do many different things, like listen to lectures, do exercises, have
discussions, and do projects



Question 21 on the teacher questionnaire read:

Complete this sentence: "In classes with laptops, I usually like to ____."
a. lecture while the students take notes
b. have students work on exercises or homework
c. discuss issues in groups or with the whole class
d. have students work on a research paper or a project
e. do many different things, like lecture, do exercises, have discussions,
and do projects

Finally, as we have already indicated, constructing a reliable questionnaire
that will tell you what you want to know is challenging and time-consuming.
Before beginning the process, it is important to be clear about the objectives of the
study, and each item should be referenced against one or more of the research
objectives. There is no point in including items that don't elicit data addressing
your research question(s). Likewise, you must make sure that there are no 'holes'
in the questionnaire that lead to gaps in the data. So, as a final step before you
pilot your questionnaire, cross-reference the items against the research
questions and objectives of the study. Make sure the draft instrument satisfies the
purposes of the research you have planned.
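
One way to carry out this cross-referencing step is sketched below in Python. It is an illustration only: the item numbers, the labels RQ1 to RQ3, and the function name cross_reference are hypothetical and are not taken from any particular study. The sketch lists the items that address no research question and the research questions that no item addresses (the 'holes').

```python
# Hypothetical mapping: questionnaire item number -> research questions it addresses.
item_objectives = {
    1: {"RQ1"},
    2: {"RQ1", "RQ2"},
    3: set(),        # this item addresses no research question
    4: {"RQ2"},
}
research_objectives = {"RQ1", "RQ2", "RQ3"}


def cross_reference(item_objectives, research_objectives):
    """Return (items addressing no objective, objectives no item addresses)."""
    orphan_items = sorted(
        number
        for number, objectives in item_objectives.items()
        if not objectives & research_objectives
    )
    covered = set().union(*item_objectives.values())
    holes = sorted(research_objectives - covered)
    return orphan_items, holes


orphan_items, holes = cross_reference(item_objectives, research_objectives)
print("Items to drop or revise:", orphan_items)
print("Research questions with no items:", holes)
```

In this invented example, item 3 would be dropped or revised, and at least one new item would be needed for RQ3 before piloting.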

Piloting Your Questionnaire
The concept of piloting a questionnaire (or any other data collection procedure)
is like a dress rehearsal in the theater. By administering the questionnaire before
the actual data collection, you can locate any unclear items, misnumbered items,
confusing instructions, and so on.
Piloting a questionnaire is at least a two-stage process. First, after you have
carefully organized and proofread your questionnaire, you should pre-pilot it
with a few colleagues, especially those who are familiar with the population you
wish to sample. You may need to revise the questionnaire somewhat based on
their feedback. Then you pilot the questionnaire by administering it to a small
number of people who are part of the population you wish to sample but who
will not be in the sample themselves. You should be physically present when you
pilot the questionnaire with this group so that you can ask them for feedback and
answer their questions. Doing so will give you valuable input about possible
problems in the questionnaire before you actually administer it to the sample
from whom you wish to collect the data.

Translation Issues
If your questionnaire is going to be completed by people who are not native
speakers of the questionnaire language, you need to take special steps to make
sure it does not become a reading test. For instance, if your questionnaire is
going to be administered to ESL or EFL students enrolled in an English
program, you should show the questionnaire to some teachers and/or administrators
before you pilot it with a small group of the students. The teachers and
administrators are likely to be familiar enough with the students' proficiency so they
can tell you immediately if the language of the questionnaire is too difficult for
the intended respondents. Keep in mind that syntactic, vocabulary, and
discourse-level factors in the questionnaire can all introduce difficulties for the
non-native reader. If your respondents do not fully understand the language of
the items, then you cannot be sure that they have responded truthfully or
completely to the questionnaire.

REFLECTION

Have you ever completed (or been asked to complete) a questionnaire in a
language other than your first language? How did you respond? Was your
proficiency sufficient to allow you to understand the questionnaire and
express yourself well? Or did your proficiency constrain your understanding
and/or your output in some way? If you have never been in this situation,
try to imagine what it would be like.

One solution that is sometimes used if the respondents are language
learners is to translate the questionnaire into the learners' native language. This
solution can work well if you are sampling from one or a few first-language groups,
but it becomes very unwieldy if many different first languages are represented in
your sample.
If you do decide to translate your questionnaire, there is an important
procedure for checking on the accuracy of the translation. This procedure is called
back translation and it works like this: First, the original questionnaire is drafted,
piloted, and revised. When the questionnaire is in its final form, it is translated
into the respondents' first language by a competent bilingual translator. Next,
another bilingual translator, working from the translation and without seeing the
original version of the questionnaire or speaking to the first translator, puts the
questionnaire back into the language it was originally written in. Then you and
the two translators (if you are not one of them) compare the original version of
the questionnaire with the back-translated version. If there are any differences in
wording, the translators try to resolve the ambiguity in the translated version
and help you clarify your intended meaning before you administer the
questionnaire. At this point, you are ready to pilot your questionnaire.
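
The comparison of the original and back-translated versions can be given a rough first pass by computer before the translators meet. The Python sketch below is only an illustration under stated assumptions: it uses the standard library's difflib.SequenceMatcher to flag items whose two wordings differ noticeably, and the sample items, the function name flag_mismatches, and the 0.9 threshold are all hypothetical. String similarity cannot judge meaning, so both flagged and unflagged items still need the translators' judgment.

```python
from difflib import SequenceMatcher


def flag_mismatches(original, back_translated, threshold=0.9):
    """Return the numbers of items whose two versions fall below the similarity threshold."""
    flagged = []
    for number, text in original.items():
        other = back_translated.get(number, "")
        # ratio() is 1.0 for identical strings and approaches 0.0 as they diverge.
        ratio = SequenceMatcher(None, text.lower(), other.lower()).ratio()
        if ratio < threshold:
            flagged.append(number)
    return flagged


# Hypothetical items: the original wording and the back-translated wording.
original = {
    1: "How many hours per week do you use English outside of class?",
    2: "Do you enjoy learning English?",
}
back_translated = {
    1: "How many hours per week do you use English outside of class?",
    2: "Do you like studying English a lot?",
}

print("Items to discuss with the translators:", flag_mismatches(original, back_translated))
```

Here item 2 would be flagged, which is useful: "enjoy learning" and "like studying a lot" may not capture the same construct.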
Doing a back translation is time-consuming and can be costly. However,
making sure you have a proper translation of your questionnaire from the outset
is an important professional step in enhancing the reliable and valid
measurement of the constructs you wish to capture. In the long run, back translation may
save you time, money, and work because you will get interpretable results from
more people if the questionnaire is fully understandable.

ACTION

Look at a questionnaire that was used in a published research article in our
field. Was it translated into the learners' first language? If not, do you think
the questionnaire was appropriate for the subjects' level of proficiency as
described in the article? Cite specific examples to support your opinion.

Ethical Issues
Questionnaires elicit a kind of self-report data, that is, data in which the
respondents are providing information about themselves. As a result, in order to get
truthful data, it is important to promise your subjects confidentiality whenever
you can. Usually confidentiality is accomplished by treating the questionnaire
data anonymously. That is, the respondents are not identified in any way that
will indicate who gave what opinions, ideas, information, etc., in responding to
the questionnaire. Here is an example of how one group of researchers promised
confidentiality to the students who completed their questionnaire:
Your answers to any or all questions will be treated with the strictest
confidence. Although we ask for your name on the cover page, we do so
only because we must be able to associate your answers to this
questionnaire with those of other questionnaires which you will be asked to
answer. It is important for you to know, however, that before the
questionnaires are examined, your questionnaire will be numbered, the same
number will be put on the section containing your name, and then that
section will be removed. By following a similar procedure with the other
questionnaires, we will be able to match the questionnaires through
matching numbers and avoid having to associate your name directly
with the questionnaire. (Gliksman, Gardner, and Smythe, 1982, p. 637,
as cited in Dörnyei, 2003, p. 23.)
The main points here are (1) you are likely to get better data if you promise your
respondents anonymity, but (2) if you can't promise complete anonymity, you
must guarantee confidentiality in the handling and reporting of the data.
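
For questionnaires collected electronically, the numbering procedure described in the quotation above can be sketched as follows. This is a minimal Python illustration, not the procedure Gliksman et al. actually used: the function name separate_identities, the field names, and the sample responses are hypothetical. Each response receives a number, the name is moved to a separate key, and only the numbered answers are passed on for analysis.

```python
import random


def separate_identities(responses, seed=None):
    """Split named responses into an ID-to-name key and anonymous records."""
    rng = random.Random(seed)
    ids = list(range(1, len(responses) + 1))
    rng.shuffle(ids)  # shuffled so the ID does not reveal the order of submission
    key, anonymous = {}, []
    for respondent_id, response in zip(ids, responses):
        key[respondent_id] = response["name"]
        answers = {field: value for field, value in response.items() if field != "name"}
        anonymous.append({"id": respondent_id, **answers})
    return key, anonymous


# Hypothetical responses, each with a name and two answers.
responses = [
    {"name": "Student A", "q1": 4, "q2": "I read English news online."},
    {"name": "Student B", "q1": 2, "q2": "I watch films in English."},
]
key, anonymous = separate_identities(responses, seed=1)
print(anonymous)  # numbered answers only; the key is stored separately and securely
```

The key plays the role of the removed name section: it is needed only to match a respondent's later questionnaires and should be kept apart from the data being analyzed.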

ACTION

How could the comments about confidentiality quoted above (from
Gliksman et al., 1982) be rewritten in a simpler fashion? Produce a version
of these comments that would be appropriate for lower-intermediate
English learners.



Based on his review of the literature on questionnaire design, Dörnyei
(2003) has synthesized five ethical principles of data collection. We agree that
these principles are valuable, so we have excerpted from them here.
Principle 1. No harm should come to the respondents as a result of their
participation in the research....
Principle 2. The respondent's right to privacy should always be respected
and no undue pressure should be brought to bear....
Principle 3. Respondents should be provided with sufficient initial
information about the survey to be able to give their informed consent
concerning participation and the use of data....
Principle 4. In the case of children, permission to conduct the survey
should always be sought from some person who has sufficient
authority....
Principle 5. It is the researcher's moral and professional (and, in some
contexts, legal) obligation to maintain the level of confidentiality that was
promised to the respondents at the outset. (pp. 91-92)
We believe these principles offer sound advice, no matter what constructs you
are trying to measure with your questionnaire. (For the full discussion of these
principles, see Dörnyei, 2003, pp. 91-92.)

A SAMPLE STUDY

In this section, we present a sample study based on a survey. While surveys are
widely used in applied linguistics, they are less common in classroom research.
The study we have selected is classroom-oriented rather than classroom-based in
that the data were collected outside of the classroom, although the study was
carried out in order to inform classroom action.
The survey was conducted by Nunan and Wong (2006) to investigate the
learning styles and strategies of good and poor language learners. The study
was carried out among undergraduate students who were native speakers of
Cantonese. The subjects were 110 undergraduate students drawn from all the
faculties at the University of Hong Kong.
The aim of the study was to explore whether there are identifiable differ
ences between good and poor learners. Language proficiency was defined in
terms of grades obtained on the Hong Kong Examinations Authority Use of
English Examination—a high-stakes English language examination that all
students have to take in order to graduate from high school. The aim of the
research was to investigate whether there were any common practices among
language learners who did well in English within the Hong Kong education
system as compared with those who did not. Ultimately, the research was
intended to provide practical guidelines for teachers to add a learning-how-to-
learn dimension to their teaching.



REFLECTION

From what you know so far, what do you think the design of this study is?
What class of designs does it belong to?

Research Questions
Seven aspects of language learning and use were investigated in the study. The
following research questions were posed about the two groups of learners (those
who did well on the Use of English Examination and those who did not):
1. Are there any differences between the good and poor language learners in
terms of their overall learning style?
2. Are there any differences between the good and poor language learners in
terms of their individual learning strategy preferences?
3. Are there any differences between the good and poor language learners
in terms of their target language practice outside of class?
4. Are there any differences between the good and poor language learners in
terms of their areas of academic specialization?
5. Are there any differences between the good and poor language learners in
terms of their perceptions of the importance of English?
6. Are there any differences between the good and poor language learners
in terms of their perception of language ability?
7. Are there any differences between the good and poor language learners in
terms of their enjoyment of learning English?

REFLECTION

What is the independent variable in this study? How did the researchers
operationalize it? How many levels does it have and what are they?

Research Procedures
The independent variable was whether the students could be characterized as
being good or poor English learners. This construct was operationalized using
the grades the students obtained on the Hong Kong Exams Authority Use of
English Exam. There were two levels of the independent variable. 'Good'
learners were defined as those who obtained an A on the examination. 'Poor' learners
were those who obtained an E or F. The dependent variable consisted of the
students' responses to a questionnaire on strategy preferences, learning practices,
and attitudes.



A thirty-item questionnaire developed by Willing (1988; 1990) was adapted
for the purposes of the research. This questionnaire was selected because it has
been a robust instrument in numerous studies over the years (see, for example,
Nunan, 1999, and Nunan and Wong, 2003). In addition to identifying individual
learning strategy preferences, this questionnaire also enables the identification
of four overall learning style orientations: (1) communicative, (2) concrete,
(3) authority-oriented, and (4) analytical.
In addition to the thirty items in the original questionnaire, which asked
learners to indicate their preferred ways of learning in and out of class, the
questionnaire collected the following data:
Home faculty (academic specialization)
Grade on the Use of English Exam
Number of hours per week using English out of class
Importance of English to the student personally
The student's self-rating of English level
Extent to which the student enjoys learning English
The questionnaire was placed on a Web site, and a message was posted to the
students' list inviting them to take part in the study.
There were two main advantages to distributing the survey electronically. In
the first place, it saved an enormous amount of time and paper. More
importantly, the Web program was set up to provide detailed analyses and collation of
student responses. This procedure also eliminated costly and time-consuming
effort by the researchers in having to tabulate paper-and-pencil responses and
then having to enter those data for computer analysis.

REFLECTION

From a research perspective, do you see any disadvantages to distributing
the survey electronically?

Subjects in the Sample


One of the major problems with the study from a research perspective has to do
with sampling and defining the population for the research. As the researchers
did not actively recruit subjects, there is no way of knowing whether those who
chose to respond were representative of the student population as a whole.
In all, 674 students responded to the survey. Of these, 77 reported that they
had received an A on the Use of English Exam. Another 33 reported that they
had received an E or F on that exam. Thus, the two groups being compared in
this criterion groups design consisted of the "good learners" (n = 77) and the
"poor learners" (n = 33), in terms of their self-reported grades on the Use of
English Exam. So, these 110 students made up the two comparison groups used
in this ex post facto criterion groups design.
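
The arithmetic behind these group sizes is worth making explicit, because it shows how small a share of the respondents entered the comparison. The short Python sketch below uses only the figures reported above; the variable names are our own.

```python
total_respondents = 674   # all students who responded to the survey
good_learners = 77        # reported an A on the Use of English Exam
poor_learners = 33        # reported an E or F on that exam

compared = good_learners + poor_learners
share_compared = round(100 * compared / total_respondents, 1)
share_good = round(100 * good_learners / compared, 1)
ratio_good_to_poor = round(good_learners / poor_learners, 2)

print(compared)            # 110 students in the two comparison groups
print(share_compared)      # 16.3 percent of all respondents
print(share_good)          # 70.0 percent of the compared students were 'good' learners
print(ratio_good_to_poor)  # 2.33 'good' learners for every 'poor' learner
```

In other words, about five of every six respondents (those who reported grades between A and E) fell outside the two criterion groups.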



Results
The results relating to the seven research questions are summarized below.
1. Overall learning style
The majority of the good learners (53%) were labeled as 'communicative'
learners. The dominant orientation on the part of poor learners was
'authority-oriented' (36%).
2. Learning strategy preferences
There was a marked difference in the learning strategy preference
between the two groups. The five most popular preferences for each group
were as follows:

The good language learners
"I like to learn by watching/listening to native speakers."
"I like to learn English words by seeing them."
"At home, I like to learn by watching TV in English."
"In class, I like to learn by conversation."
"I like to learn many new words."
The poor language learners
"I like the teacher to tell me all my mistakes."
"I like to learn English words by seeing them."
"I like the teacher to help me talk about my interests."
"I like to have my own textbook."
"I like to learn new English words by doing something."
3. Amount of time spent using English out of class
Forty percent of good learners reported spending between one and five
hours a week on English outside of class. Twenty-nine percent spent more
than ten hours a week on English outside of class. In contrast, no poor
learners spent more than ten hours a week outside of class, and 70 percent
spent less than an hour a week on English outside of class.
4. Faculty
The majority of the good learners were studying in the faculties of Arts,
Law, and Medicine. Poor language learners tended to come from
Engineering and Science.
5. Perceptions of the importance of English
On this question, responses were identical between good and poor
learners. Ninety-seven percent of respondents in both groups rated
English as either 'important' or 'very important.'
6. Self-rating of English level
This was an interesting finding. Subjects rated themselves on a
five-band proficiency rating scale, and the results were tabulated against their
reported Use of English Exam scores. Fifty-six percent of good learners
rated themselves in the two top levels of the scale, while only six percent of
poor learners rated themselves at this level.
7. Enjoyment of learning English
There was also a statistically significant difference between higher and
lower proficiency students in terms of their enjoyment of learning English.
Forty percent of higher proficiency students reported enjoying English a
great deal, while only six percent of the lower proficiency students did. In contrast
(sadly and not surprisingly), twenty-four percent of lower proficiency
students reported that they did not like learning English at all.

Implications
Nunan and Wong (2006) identified a number of implications, particularly for
teachers working with poorer language learners. The main implication was to
encourage such learners to see language as a tool for communicating rather than
as a body of content to be memorized. Developing independent learning
strategies and reducing students' dependence on the teacher were also recommended.
Learners should also be encouraged to develop a greater range of strategies and
to activate their language outside of the classroom. Following Christison (2003),
the researchers suggested that teachers audit their own classroom practices to
identify the strategies that they themselves favor.

Limitations of the Study
As these authors acknowledge, the greatest single weakness of the study is in
identifying the population from which the sample is drawn. The researchers did
not actively draw the sample from the undergraduate population but rather
invited all students to take part. As is often the case in questionnaire research, it is
by no means certain that those who volunteered are representative of the
undergraduate population as a whole. This issue is an example of a very common threat
to external validity in survey research. That is, if we don't know the characteristics
of the population represented by the survey sample, we won't know how
generalizable the results of the survey may be.
We can see by the numbers of subjects that more than twice as many good
students completed the questionnaire as did poor students (seventy-seven versus
thirty-three). Fortunately, there are inferential statistical procedures that work
well with unequal numbers of subjects in the groups being compared. But, here
again, we do not know the parameters of the population in this case. That is, of
all the students who take the Use of English Examination (the population), we
do not know what proportion of those students get As and what proportion get
Es and Fs. It may be that students who did not do well on that exam were
reluctant to complete the questionnaire.



PAYOFFS AND PITFALLS
Written surveys are powerful tools for collecting specific information,
potentially from a large group of people. Carefully designed questionnaires allow
researchers to gather data about people's attitudes, beliefs, and practices. They can
consist of any combination of open-ended and closed questions. The structure
provided by closed questions gives researchers a certain amount of control over
the type of information they wish to elicit, while open-ended questions allow for
more creativity and variety on the part of the informants.
One benefit of questionnaires is that collecting data with them is more
efficient than with interviews, and they "elicit comparable information from a
number of respondents" (Mackey and Gass, 2005, p. 94). In addition, surveys can
be administered orally (face-to-face, by telephone) or in writing (via e-mail,
through the mail, or in person; ibid.). Respondents can complete questionnaires
individually or in groups, with or without the researcher present.
The payoffs of questionnaire research include the possibility of getting a
wide range of data from many different people. Those data can be analyzed
quantitatively to provide percentages, means, and standard deviations about
different respondents' views. Open-ended comments can also be analyzed
qualitatively to identify key themes and unique responses.
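
As a rough illustration of the quantitative side, the sketch below summarizes the responses to one closed item as percentages, a mean, and a standard deviation. The function name `summarize_item` and the responses are hypothetical. Reporting a mean and standard deviation also assumes that the coded responses can be treated as interval data, which is a debated assumption for Likert-type items.

```python
import statistics
from collections import Counter


def summarize_item(responses):
    """Summarize closed-item responses that are coded as integers."""
    counts = Counter(responses)
    total = len(responses)
    # Percentage of respondents choosing each response option.
    percentages = {
        value: 100 * count / total for value, count in sorted(counts.items())
    }
    return {
        "n": total,
        "mean": statistics.mean(responses),
        "sd": statistics.stdev(responses),  # sample standard deviation
        "percentages": percentages,
    }


# Hypothetical answers to one five-point item (1 = never, 5 = very often).
responses = [4, 5, 3, 4, 2, 5, 4, 3]
summary = summarize_item(responses)
print(summary)
```

If the responses are better regarded as ordinal, the percentages remain meaningful, and a median could be reported in place of the mean.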
The pitfalls of survey research using questionnaires cannot be ignored,
however. One was alluded to above: Typically, people who complete questionnaires
are those who wish to complete surveys. Such volunteers may or may not be
representative of the population they are supposed to represent. They may be more
confident, for instance, or more proficient in the language of the questionnaire,
or perhaps they simply have more time on their hands or care more about the
topic than do those who choose not to respond.
Another potential problem is that of self-report. This term applies in any
context where the research subjects are reporting on their own behavior or
attitudes. It is a natural human tendency to represent oneself in a good light, so
sometimes people leave out or downplay negative factors, and increase or
emphasize positive factors. For example, in a questionnaire about how many hours
per week students use the target language outside of class, if the respondents feel
it is desirable to use the target language outside of class, they may inflate their
estimates of the hours they spend in such activities. Another issue related to
self-reporting is that people may simply be unaware of their practices and, thus,
report them from an uninformed position.
A related problem is subject expectancy (introduced in Chapter 4). In this
context, the term refers to the idea that survey respondents may think they know
what the researcher is expecting and, therefore, respond accordingly. This
problem arises from the natural urge to please. It can be particularly important if you
are a teacher surveying your own students or a program administrator asking
students in your program to respond to a questionnaire.
With open-ended items, another difficulty is that respondents may be
uncomfortable expressing themselves in writing, and probably even more so if
they are writing in a second or foreign language. As Mackey and Gass (2005)
point out, people may "provide abbreviated, rather than elaborative,
responses" (p. 96). These authors suggest providing sufficient response time,
letting people respond in their first language, and letting people respond orally
(especially if they have limited literacy skills) as ways to counteract this problem.
Having drafted, organized, constructed, edited, pre-piloted, piloted, and
administered a questionnaire, the next task is to collate and interpret the responses.
As we have indicated, the great advantage of closed questions is that they yield
responses that can be readily collated, particularly if you are using a
computerized statistical package. Alternatively, if (as in the sample study for this
chapter) the data are collected through the Internet, much of the tabulation and
analysis can be done electronically. Free-form responses from open-ended
questions, which may result in more useful and insightful data, are much more
difficult to analyze (although there are ways of doing this, as we will see in later
chapters).
The sample study presented here (Nunan and Wong, 2006) and the
study by Springer and Bailey (2006) mentioned above both used electronically
delivered questionnaires to gather data. There are great advantages to using
computer-delivered questionnaires (e.g., speed of distribution and return,
savings in terms of postage and photocopying, ease of tabulating results, etc.). But
one disadvantage that cannot be overlooked is that of the digital divide. If you
are administering a questionnaire via the Internet, you can only distribute it to
and expect responses from people who have cheap and easy Internet access. For
example, Springer and Bailey (2006) wished to get a wide international sample of
respondents and asked a colleague in Vietnam to encourage her colleagues to
complete the questionnaire. However, the colleague responded that they really
could not do so because none of them had personal computers, the school's
computer was reserved for administrative and instructional purposes, and using an
Internet cafe to respond to a lengthy questionnaire would have been
prohibitively expensive. This situation may change gradually in the future as Internet
access becomes more widespread, but for now it is an issue to be aware of.
One of the limitations inherent in questionnaire research is "the relatively
short and superficial engagement of the respondents" (Dörnyei, 2003, p. 47).
Because questionnaires typically provide a 'snapshot,' they "cannot aim at more
than obtaining a superficial, 'thin' description" (ibid.) of the issue under
investigation. However, there are situations when getting a good clear snapshot by
eliciting data from a number of people in a uniform format can be a very useful
accomplishment.

CONCLUSIONS

In this chapter, we have looked at survey research. The bulk of the chapter is
devoted to the issue of questionnaire design and issues of quality control. While
there are limitations to questionnaires, they can be very useful in language
classroom research, whether we are working in the psychometric tradition,
doing naturalistic inquiry, or conducting naturalistic research.

QUESTIONS AND TASKS


1. Brainstorm a list of research questions relating to classroom-based or
classroom-oriented research that could be investigated through some type
of questionnaire.
2. Use one of the research questions you've brainstormed to create a draft
research plan using the eight-step procedure set out in Table 5.1.
3. Locate an empirical study in our field that used a questionnaire as (part of)
its data collection procedure. (Questionnaires are typically printed in the
appendix of a published research report.)
A. What construct(s) was the questionnaire designed to capture?
B. What kinds of item format(s) does the questionnaire use? (Refer to
Table 5.3 above.)
C. In that same questionnaire, does it appear that the researcher(s)
followed the advice of Alreck and Settle (1985) about organizing a
questionnaire? (See Table 5.4 above.)
D. Were the instructions clear and appropriate?
E. If the questionnaire was to be completed by language learners, were the
instructions and items written appropriately for the respondents'
proficiency level?
F. What sampling procedures did the researcher(s) use to identify the
respondents?
G. Did the questionnaire utilize open-ended items, closed items, or both?
H. What sort of data did the questionnaire items elicit? (If the items were
closed, were the data nominal, ordinal, interval, or some combination
of these?)
I. Based on your reading of the article, can you identify any possible
threats to validity?
J. Take the questionnaire yourself, imagining you were a member of the
intended sample. What insights do you gain by completing the
questionnaire?
4. If you were to replicate the study you read, what improvements would you
make to the research design? What changes, if any, would you want to
make to the questionnaire itself?

SUGGESTIONS FOR FURTHER READING

If you are going to work with questionnaires, we recommend Dörnyei's (2003)
book Questionnaires in Second Language Research. It has many clear examples and
suggestions for formatting questionnaires. It is written in a "user-friendly" style.
Likewise, J. D. Brown's (2001) book Using Surveys in Language Programs was
written specifically for people in our field. It includes chapters on planning a
survey project, designing a survey instrument, gathering and compiling survey data,
analyzing survey data statistically, analyzing survey data qualitatively, and
reporting on a survey project. The appendices also contain helpful examples of several
entire questionnaires. Consulting this volume will certainly help you design
better questionnaires.
For interesting discussions about the issue of whether Likert scales yield
interval or ordinal data, see Busch (1993), Davidson (1998), Hatch and Lazaraton
(1991), and Turner (1993). See also Gu and Wen (2005).
Reid (1990) has written about the problems of conducting survey research
with second language learners as the respondents.

CHAPTER 6

Case Study Research

If you study grains of sand, you will find each is different. Even by
handling one, it becomes different. But through studying it and others
like it, you begin to learn about a beach. (Larsen-Freeman, 1996, p. 165)

INTRODUCTION AND OVERVIEW


The focus of this chapter is the case study in second language research and its
applicability to classroom research. Methodologically speaking, the case study is
a 'hybrid' in that almost any data collection and analytical methods can be used.
In this chapter, we define case studies and explore issues and problems associated
with case study research. We will see that, while there are potential problems,
especially with the traditional concept of external validity, case studies have
considerable value.
The case study has played an important role in applied linguistics research,
especially in studying first and second language acquisition. In fact, case studies
have a long history in research on language learning, so we will review some of
the early work that influenced the development of this research method in our
field. However, in spite of their importance, defining case studies is not
particularly easy, as we will see in the next section.

REFLECTION

In Chapter 4, we talked about one-shot case studies as a weak research
design in the experimental tradition. You may have already guessed that we
are using the term case study somewhat differently in this chapter. What do
you expect the differences will be?

A case study is a detailed, often longitudinal, investigation of a single
individual or entity (or a few individuals or entities). In applied linguistics research, case
studies can best be classified as a type of naturalistic inquiry in that they typically
do not involve any sort of treatment. Instead, researchers working in the case
study tradition set out to learn what is happening, whether it is with a child
learning his first language, an adult developing literacy skills, a particular class of
preschool children, a novice teacher, or any other entity of interest.
In experimental research, case studies (or "one-shot case studies" as they are
often called) have traditionally been seen as having limited value. This is because
their lack of control over variables prohibits researchers from making strong causal
claims (a problem of internal validity). With only one or a few subjects, the
argument goes, we can never be sure the population is well represented, so generalizing
the findings of a case study to a population is a dubious undertaking (a problem of
external validity). Given these concerns, the perceived value of case studies in the
experimental research tradition is that they may generate hypotheses that can later
be tested in experiments. However, from the perspective of naturalistic inquiry, the
case study method is very important in other ways and in its own right.
In the one-shot case study design of the experimental tradition, the
researcher applies a treatment and observes its apparent effects in a post-test or
some other form of after-treatment measurement. There is no pre-test, nor is
there any comparison group. In naturalistic inquiry, however, "the researcher
usually does not provide experimental treatments or interventions that might
modify the process of change" (Duff, 2008, p. 41). Nor does the researcher exert
control over variables in a naturalistic case study:

rather, the data reflect natural changes in the learner's behavior and
knowledge, influenced by numerous possible factors, such as the
environment, physical maturation, cognitive development, and schooling,
which the researcher must also take into account in order to arrive at
valid conclusions concerning learning processes and outcomes. (p. 41)

Indeed, when we compare case study research in naturalistic inquiry with the
experimental research tradition, we see that "the strengths of one approach tend to
be the weaknesses of the other" (ibid., p. 42).
Case study research assumed a great deal of importance in education in the
1970s, when it was embraced by a group of researchers and evaluators in
Cambridge, England. Three members of the Cambridge Action Research
Network (CARN), Adelman, Jenkins, and Kemmis (1976), produced an
important position paper in which they argued that case studies are not merely
pre-experimental and that case study is not a term for a standard methodological
package. The claim that case studies are 'pre-experimental' (mere
ground-clearing operations for more 'rigorous' experimental research) was
challenged by Adelman et al.
Although case studies have often been used to sensitize researchers to
significant variables subsequently manipulated and controlled in an
experimental design, that is not their only role. The understandings
generated by a case study are significant in their own right. It is
tempting to argue that the accumulation of case studies allows
theory-building via tentative hypotheses culled from the accumulation of single
instances. But the generalizations produced in a case study are no less
legitimate when about the instance, rather than the class from which the
instance is drawn (i.e., generalizing about the case rather than from it).
(Adelman et al., 1976, p. 140)
In this extract, the authors assert that the investigation of a single instance is a
legitimate form of inquiry and that the case study researcher need not feel bound
to report the instance as an exemplar of a class of objects, entities, or events.
Having determined what a case study is not, Adelman et al. go on to suggest
that it is the study of an "instance in action" (1976, pp. 2-3). In other words, the
researcher selects a single entity from a class of objects or phenomena, which
could be 'bilingual speakers,' 'second language classrooms,' etc., and investigates
the way that the entity functions in context.
To give you a feel for this "single instance" idea and why it's important to see
how an "entity functions in context," here is an example of a description of a few
minutes of classroom time. The focus is largely on a single child. The
description below (quoted from Carrasco, 1981, p. 168) is based on a videotape of
children in a bilingual classroom. The "case" here is a Hispanic girl called "Lupita,"
or perhaps it is a brief series of interactions that she engages in (a classroom
episode). Her teacher had thought Lupita was not a very capable student. The
excerpt describes what the observers saw when they watched the videotape,
focusing on Lupita's interactions with her classmates.

The Taped Scene: Briefly, what they saw was Lupita performing and
interacting outside of teacher awareness during free time. After having
finished the Spanish Tables instructional event task—she was the first to
finish the task—Lupita decided to work on a puzzle in the rug area. She
was soon joined by two other bilingual girls, each successful in placing a
few pieces back on the puzzle template. Then Marta, a bilingual child
assessed by the teacher as a very competent student, asked Lupita for help in
placing her first piece on the template. Lupita not only helped her but also
taught her how to work with it by taking Marta's hand and showing her
where and how it should be placed. Lupita continued to help her for a
short while and then returned to her own puzzle. A few moments later,
Lupita disengaged herself from her puzzle work and became interested in
what a boy, who had just entered the scene, was doing with a box of toys.
Lupita asked him if she could play with him when, suddenly, the boy was
interrupted by the classroom aide who asked him if he had finished his
work at the Spanish Tables, trying to convey to him that he shouldn't be
there. The boy did not quite understand, perhaps, what was being asked of
him. Lupita turned around and told him that he should go back to finish
his work. He left and Lupita continued working on her puzzle, which was
almost completed. Marta again asked for help after not having
accomplished very much since the last bit of assistance from Lupita. Lupita
helped Marta for a short time, then returned to her own task. The teacher
then entered the scene, moved past them toward the piano, and played a
few notes to cue the children that the event would be over in two
minutes and to begin to prepare for the next event. Lupita sped up her effort
while Marta continued to have trouble. Lupita finished her puzzle, then
helped Marta with hers, while the third child in the scene, who was
sitting next to Marta, approached Lupita's side with the puzzle and
nonverbally indicated that she also needed help. Lupita told Marta to continue to
work on hers alone while she helped the third child and asked if she could
help her finish the puzzle. Lupita, working quickly with the third child's
puzzle, directed Graciela to go help Marta since she needed help. After
Lupita and the third child put away their finished puzzles, Lupita looked
around and noticed that Graciela and Marta were still at theirs. She quickly
approached them, knelt down in front of them, took command, and helped
them finish in time for the next lesson. This taped scene clearly showed
Lupita's competence as a leader and teacher as well as her ability to work
puzzles. Moreover, it seemed to reveal how Lupita's peers perceived her as
compared to how the teacher perceived her.

REFLECTION

If you were a classroom observer who watched Lupita and later described
her interactions during this lesson to a colleague who hadn't been there,
what could you say about her? What can you infer about Lupita from this
description?

Defining Case Studies
Different authors have defined case studies in different ways. Those who come
from a background of naturalistic inquiry see this method much differently from
those who come from the experimental tradition. Because it is so difficult (and
some would argue, not helpful) to try to control variables in naturally occurring
classroom settings, in this book, we will consider case studies from the
perspective of the naturalistic inquiry tradition. The definitions printed below reflect
this orientation.

ACTION

Study the following definitions, identify commonalities, and then come up
with your own definition of a case study.
1. A case study is an empirical inquiry that investigates a contemporary
phenomenon within its real-life context; when the boundaries between
the phenomenon and the context are not clearly evident; and in which
multiple sources of evidence are used (Yin, 1984, p. 23).
2. A case study is defined in terms of the unit of analysis. That is, a case
study is a study of one case. A case-study researcher focuses attention on
a single entity, usually as it exists in its naturally occurring environment
(D. Johnson, 1992, p. 75).
3. [A case study] is a phenomenon of some sort occurring in a bounded
context (Miles and Huberman, 1994, p. 24).
4. The qualitative case study can be defined as an intensive, holistic
description and analysis of a single entity, phenomenon, or social unit.
Case studies are particularistic, descriptive, and heuristic and rely
heavily on inductive reasoning in handling multiple data sources (Merriam,
1988, p. 16).
5. A case is a single instance of a class of objects or entities, and a case study
is the investigation of that single instance in the context in which it
occurs (Nunan, 1992, p. 79).
6. The most common type of . . . [case study] involves the detailed
description and analysis of an individual subject, from whom observations,
interviews, and (family) histories provide the database. . . . [Case study
methodology] may involve more than one subject. . . . It may be based
on particular groups (e.g., group dynamics within a classroom);
organizations (e.g., a summer intensive language learning program at a
university); or events (e.g., a Japanese language tutorial . . . where one could
examine the amount of time a teacher speaks in either Japanese or
English for class management purposes) (Duff, 1990, p. 35).
7. "A case study is what you call a case, in case, in case you don't have
anything else to call it" (unidentified student cited in Jaeger, 1988, p. 74).

Commonalities Across Definitions

Despite their differences, these definitions have two main commonalities. The
most important of these is the notion that a case is a 'bounded instance.' By
bounded we mean "defined" or "having boundaries"—whether those boundaries
are physical (a certain school site, a child), or temporal (as in a lesson, which has
a beginning and an end). You can think of the metaphor of a fenced-in area as
being bounded. In classroom research, the bounded instance can be a single
learner or teacher, one classroom, a school, or even a particular school district.
The second commonality is that the phenomenon is studied in context.
Unlike formal experiments, which control and manipulate variables and look for
causality, case studies are centered on description, inference, and interpretation.
They also contrast with surveys, in which the researcher asks standardized
questions of large representative samples of individuals. The case study
researcher typically observes the characteristics of an individual entity (a child,
a clique, a class, an educational program, or a community) in that entity's
naturally occurring situation. As a result, "case studies clearly have the potential for
rich contextualization that can shed light on the complexities of the second
language learning process" (Mackey and Gass, 2005, p. 172).
This issue of contextualization is very important, because "each human case
is complex, operating within a constellation of linguistic, sociolinguistic,
sociological and other systems, and the whole may be greater than—or different
from—the sum of its parts" (Duff, 2008, p. 37). Experimental research typically
attempts to neutralize contextual differences through the use of control
variables. (See Chapter 3.) Case studies, in contrast, explore and describe the
context as an essential part of understanding the phenomenon under investigation.
An example of language classroom research that illustrates both the
bounded instance and the contextualized nature of case studies is found in
Donato and Adair-Hauck (1992). They reported on two secondary school
French teachers' lessons about the future tense. The two teachers are called
Elizabeth and Claire in the report. They used different but clearly patterned
strategies for teaching the future tense. Elizabeth took eight lessons to cover
this structure and Claire took ten. The researchers videotaped and then
transcribed these lessons, and analyzed the transcripts in order to document the
two teachers' styles. Elizabeth's orientation was more monologic and Claire's
was more dialogic.
These authors used a unit of analysis called the instructional episode: "a
detachable piece of instructional material having a recognizable beginning and end
point for both teacher and students. ... In this study, the instructional episode
for analysis consisted of a unit containing the target structure the future tense"
(ibid., p. 77). The classroom data about the two teachers' styles are very
convincing. Part of what makes this study compelling reading for teachers is that as
readers we can relate to the choices Elizabeth and Claire make about how to teach
the future tense.

REFLECTION

Based on what you have read so far, and on your previous reading, what do
you see as the advantages of case study research? What might be some
disadvantages?

OTHER CHARACTERISTICS OF CASE STUDIES
There are some other key characteristics that many case studies have in
common. In addition to boundedness and contextualization, according to Duff
(2008), these characteristics include multiple perspectives or triangulation,
particularity, and interpretation. In addition, longitudinality is a characteristic of
many, but not all, case studies.

Longitudinality, Multiple Perspectives, and Triangulation


One tremendous value of case studies is that a researcher normally studies the
case for a long period of time. A longitudinal case study "examines development
and performance over time" (Duff, 2008, p. 40). Longitudinal case studies
provide "multiple observations or datasets, as information is collected at regular
intervals, over the course of a year or longer" (ibid.). For example, Leopold's
(1978) research on his daughter's language acquisition covered three years. Not
all case studies are longitudinal in nature, but this characteristic is one of the
main strengths of the approach.
One key characteristic of case studies is the detailed nature of the data. As
you can see from the description of Lupita, the observational record derived from
the videotape shows us exactly how the child behaves. We come away from the
description with a clear understanding of her interactions with other children.
This clear understanding occurs in reading well-written case studies because
by concentrating on the behavior of one individual or a small number of
individuals (or sites) it is possible to conduct a very thorough analysis
("thick" or "rich" description) of the case and to include triangulated
perspectives from other participants or observers. (Duff, 2008, p. 43)
A related characteristic of case study reports is that they often include detailed
presentations of primary data, including transcript excerpts, speech and/or
writing samples, journal entries, and so on (ibid.).
The concept of multiple perspectives relates to the idea that many points
of view can be brought to the analysis of case study data. Usually, this is
accomplished through a process called triangulation. This term is a metaphor borrowed
from navigation, surveying, and astronomy. Hammersley and Atkinson (1983)
explain triangulation using the analogy of people wanting to locate their position:

A single landmark can only provide the information that they are situ
ated somewhere along a line in a particular direction from that landmark.
With two landmarks, however, their exact position can be pinpointed by
taking bearings on both landmarks; they are at the point where the two
lines cross. (p. 198)

In qualitative data collection, the metaphor refers to a quality control strategy. In
social research, if "diverse kinds of data lead to the same conclusion, one can be
a little more confident in that conclusion" (ibid.). (This concept is widely used in
naturalistic inquiry, and we will revisit it in more detail in Chapter 7 when we
discuss ethnography.)
An example arose in Chapter 1 where we cited the study by Donato,
Antonek, and Tucker (1994). Their investigation of a Japanese FLES (foreign
language in the elementary school) program captured numerous perspectives in
data from questionnaires completed by parents and learners, reflections from the
Japanese teacher, questionnaires from other teachers at the school, interviews,
and an observation system.

Particularity
The concept of particularity as a characteristic of case studies is related to the
boundedness of the case. In other words, "a single case or nonrandom sample is
selected precisely because the researcher wishes to understand the particular in
depth, not to find out what is generally true of the many" (Merriam, 1998, p. 208).
Here, the analogy of a camera that uses different lenses will be helpful.
Survey research, as described in Chapter 5, takes a wide-angle view. It captures the
landscape: a panorama of mountains, streams, and trees. Case study research, in
contrast, uses a close-up lens. It examines the individual wildflower or provides a
detailed study of a leaf. Although the flower and the leaf are part of the
landscape, looking at the photo shot with the wide-angle lens will not allow us to see
the petals of the flower or the delicate veins of the leaf.
In second language classroom research, we may choose to focus on a
particular student, or a particular teacher, or perhaps one particular conversational
exchange among three pupils doing group work. It is the close examination of the
particular phenomenon that allows case study researchers to go into great detail
in terms of data collection and analysis.

Interpretation
To interpret something is to construe or attach meaning to it, that is, to
understand it. When we look at case study data, we analyze those data, and that
analysis can be either qualitative or quantitative or both. Interpreting results in data
has to do with explaining what they mean. This comment is true of statistical
analyses as well as of more qualitative analyses, and case study research employs
interpretation in both contexts.
An example of interpretation in language classroom research is found in
Ulichny's (1996) investigation of an interaction in an intermediate adult ESL
conversation class. Ulichny documented a particular classroom speech event,
which contained three different discourse activities. One student, Katherine, is
talking about why she decided not to continue with her volunteer work—a role
she undertook in order to practice her English. So, one discourse activity
consisted of the actual conversation among Katherine, another student, and the
teacher, with the rest of the class listening. But, as Katherine's story goes on,
the teacher soon "exits from the conversation to work on specific elements of the
language" (p. 756). The teacher exits her role as a listener to the story for a
"correction move or a conversational replay" (p. 756). She also offers "instruction, in
which [the teacher] involves the whole class in language work" (p. 754). Through
a detailed interpretation of the transcript, Ulichny shows how the conversation
is subjugated to correction and instruction. In the process, Katherine is gradually
rendered "silent in the telling of her story" (p. 754).

TYPES OF CASE STUDIES

Deciding whether any given investigation is or is not a case study is not always
easy or straightforward. As noted above, the term case study has been defined in
various ways, and it is probably easier to say what a case is not than what it is.
While it seems reasonably clear that the study of an individual learner or an
individual classroom is a case, what about an investigation of an entire school or
even a whole school district? Any of these could be the focus of a case study.
In addition to focusing on a variety of topics for possible investigation, case
studies can serve a range of purposes and displayvarious characteristics. Sten-
house (1983), one of the 'fathers' of case study research in education, developed
a typology of case studies. The first type he identified as the neo-ethnographic,
which is the in-depth investigation of a single case by a participant observer.
Next, is the evaluative, which is a "single case or group of cases studied at such
depth as the evaluation of policy or practice will allow (usually condensed field-
work)" (p. 21). In contrast with these first two, the multi-site case study consists of
"condensed fieldworkundertaken by a team of workers on a number of sites and
possiblyoffering an alternative approach to research to that based on sampling
and statistical inference" (ibid.). Such research probably approaches ethnogra
phy (see Chapter 7), particularly if it attempts to capture a wide range of issues
and questions. The final type consists of action case studies. These are "school case
studies undertaken by teachers who use their participant status as a basis on
which to build skills of observation and analysis" (ibid.). A typology based on
Stenhouse is set out in Table 6.1.

TABLE 6.1 The case study: A typology (following Stenhouse, 1983)

Type                Description

Neo-ethnographic    The in-depth investigation of a single case by a
                    participant observer
Evaluative          An investigation carried out in order to evaluate
                    policy or practice
Multi-site          A study carried out by several researchers on
                    more than one site
Action              An investigation carried out by a classroom
                    practitioner in his or her professional context



Other researchers have categorized case studies in different ways. For
instance, Yin (2003, as cited in Duff, 2008, pp. 31-32) discusses exploratory,
descriptive, and explanatory case studies. He sees defining questions and
hypotheses as the main purpose of exploratory case studies. A descriptive case
study, as the name suggests, provides a contextualized and detailed description of
the entity under investigation. An explanatory case study is intended to reveal
causal relationships.

REFLECTION

What do you see as the advantages of these different types of case studies?
What might be some disadvantages?

THE VALUE OF CASE STUDIES

Adelman et al. (1976) argue that there are six principal advantages of case study
research in educational settings. In the first place, in contrast with some other
research methods, case studies are 'strong in reality,' and therefore likely to
appeal to classroom teachers who will be able to identify with the issues and
concerns raised. Secondly, they claim that one can generalize from an instance
to a class. (For example, you may recognize in R. L. Allwright's (1980) case
study of a conversation between Igor and his teacher a number of garrulous
students you have known.) A third strength of the case study is that it can
represent a multiplicity of viewpoints and can offer support to alternative
interpretations. Fourth, if they are properly presented, case studies can also provide a
database of materials that may be reinterpreted by future researchers. Fifth,
insights yielded by case studies can be put to immediate use for a variety of
purposes, including staff development, within-institution feedback, formative
evaluation, and educational policy making. Finally, case study reports are often
written in a more accessible style than are conventional experimental research
reports, and are, therefore, capable of serving multiple audiences. Because they
are 'user-friendly,' case studies can contribute to the democratization of
decision making in ways that studies based solely on quantitative data and statistical
analyses may not.

REFLECTION

Look back to the brief description of Lupita's interactions with her
classmates. Which of the six advantages of case studies identified by Adelman
et al. (1976) are discernable if we consider that excerpt to be a "mini" case
study, or some data from a longitudinal case study?



Case studies have played an important role in applied linguistics, where
they have principally been employed as a tool to trace the linguistic
development of first and second language learners. A classic in the field of first language
acquisition is R. Brown's (1973) longitudinal investigation of the semantic and
grammatical development of three children acquiring their first language.
Another case study that has had considerable influence is Halliday's (1975)
research on the language development of his own son. Studies such as these
have played an important part in enhancing the status of the case study in
applied linguistics.
In second language acquisition, case studies "have generated very detailed
accounts of the processes and/or outcomes of language learning for a variety of
subjects" (Duff, 1990, p. 34). The types of subjects studied, according to Duff,
range "from young children in bilingual home environments, to adolescent
immigrants, adult migrant workers, and university-level foreign language learners"
(ibid.). Case study methodology can also embrace a wide variety of research
questions. Duff discusses this point with regard to the field of second language
acquisition (SLA):

Recent questions addressed in [case studies] in SLA research have
included ... How do children manage to function with two linguistic
systems at a time when most children are attempting to master one?
Why do some learners fossilize in their acquisition of a second
language (in some or all domains), while some continue to progress?
In what ways do the forms and functions of constructions in a learner's
interlanguage (IL) differ? What features characterize the prototypical
"good language learner"? How do learners react to and/or benefit
from different methods of instruction? Is there a critical period for
SLA? (ibid.)
Thus, case studies can be used to address a range of research questions about
both instructed and uninstructed language acquisition.
We concur with Duff, who says that case studies are attractive for a number
of reasons:

When done well, they have a high degree of completeness, depth of
analysis and readability. In addition, the cases may generate new
hypotheses, models and understandings about the nature of language
learning or other processes. In addition, longitudinal case study
research helps to confirm stages or transformations proposed on
the basis of larger (e.g., cross-sectional) studies and provides
developmental evidence that can otherwise only be inferred. (Duff, 2008,
p. 43)

Thus, among researchers in the naturalistic inquiry tradition, there is wide
recognition of the value of case studies. We turn now to the practical issue of
how to select a case.



SELECTING THE CASE

As noted above, the case selected for investigation may be one person or a few
people. For instance, Carless (1999) conducted a case study of three primary
school teachers in Hong Kong, and Harklau (2000) studied three ESL community
college students. Case studies have also been conducted about single or multiple
classrooms, one or a group of schools (see, e.g., Wang's [2003] case study of
English language teaching in China).
Reasons for selecting the particular case(s) to be studied are varied. Ideally, a
case can be chosen that embodies the phenomenon the researcher wishes to
investigate:

The individual case is usually selected for study on the basis of specific
psychological, biological, sociocultural, institutional, or linguistic
attributes, representing a particular age group, a combination of first
and second languages, ability level (e.g., basic or advanced), or a skill
area such as writing, a linguistic domain such as morphology and syntax,
or a mode or medium of learning, such as an online computer-mediated
environment. (Duff, 2008, pp. 32-33)

In some of the early literature on second language acquisition, cases were
chosen because the individuals were especially interesting or unusual. For
instance, one of the classic early studies in the second language acquisition
literature is J. Schumann's (1978a, 1978b) investigation of acquisition and
acculturation. Schumann was part of a team that carried out a ten-month study of two
adults, two teenagers, and two children who were all Spanish speakers acquiring
English. One of the adults, Alberto, a thirty-three-year-old Costa Rican, made
very little progress in learning English in comparison to the others despite a
period of intensive instruction. Schumann studied Alberto and concluded that
his lack of linguistic development could be attributed to his social and
psychological distance from the target culture and the fact that the limited amount of
English he had managed to acquire was sufficient for him to fulfill his
communicative needs. In fact, this point of view became a testable hypothesis that other
researchers later investigated.
In other instances, an inviting, accessible context has prompted the choice
of the case. For instance, Peck (1980) looked at the role of language play in the
English development of two Mexican boys whose parents were graduate
students in the United States. The family lived in their university's
married-students' housing complex, which meant that Peck had ample opportunities
to record the boys' developing language in the context of them playing
with other children in English. Thus, the situation provided an ideal
opportunity and a compelling context for studying language play in second language
acquisition.
Other people have selected cases because of the ease of access to the
individual(s) the researcher wished to study. Many parents have studied their own
children's language acquisition. For example, Celce-Murcia (1978) investigated
her daughter's bilingual acquisition of English and French over a period of years,
in both California and France. Burling (1978) documented his son's acquisition
of Garo and English. The opportunity to investigate child bilingualism in this
way arose when the family went to live in India when the boy was a year and four
months old. In conducting this case study, the researcher had ready access to
his son in a very interesting situation—total immersion in a new language and
culture.
Sometimes, a case seems to jump out of the data for a different sort of study.
For example, in an early and influential classroom research project, R. L.
Allwright (1980) investigated the turns, topics, and tasks in two lower-level ESL
classes in California. For a period of ten weeks, both classes were tape-recorded
for two of their twenty instructional hours per week, while an observer took
notes. Allwright analyzed the data by transcribing the audiotapes and counting
the number and types of turns taken by the various students and the two
teachers. One student, whose pseudonym was "Igor," got many more turns than did
his classmates. This numerical discrepancy caused Allwright to look more
closely at Igor's interactions with the teacher, to see how he got so many turns.
One conversation was analyzed in great detail because it showed how Igor's
discourse moves caused the teacher to give him more turns through her repeated
attempts to understand his message.
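
The turn-counting step described above can be sketched in a few lines of Python. This is only an illustration of the tallying logic, not Allwright's actual procedure: the transcript is invented, the speaker "Maria" is hypothetical, and the name `count_turns` is an assumption made for this example.

```python
from collections import Counter

# Invented transcript for illustration only: (speaker, utterance) pairs in order.
# These are NOT data from Allwright (1980).
transcript = [
    ("Teacher", "What did you do on the weekend?"),
    ("Igor", "I go to the beach."),
    ("Teacher", "You went to the beach?"),
    ("Igor", "Yes, went, with my friend."),
    ("Maria", "I stayed home."),
    ("Teacher", "Who was your friend, Igor?"),
    ("Igor", "He is from my country."),
]


def count_turns(turns):
    """Return a Counter mapping each speaker to the number of turns taken."""
    return Counter(speaker for speaker, _ in turns)


turn_counts = count_turns(transcript)

# Compare students only, since the teacher's turns are not the focus here.
student_counts = {s: n for s, n in turn_counts.items() if s != "Teacher"}
most_active = max(student_counts, key=student_counts.get)

print(turn_counts)
print("Student with the most turns:", most_active)
```

A tally like this can only flag a numerical discrepancy. Explaining how one learner gets so many turns still requires the close reading of the transcript that the case study supplies.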
In another classroom study, Block (1996; 1998) investigated the perceptions
of six adult English learners and their teacher in Spain. They all kept oral diaries
(i.e., they made tape-recorded diary entries) on a daily basis. Block wanted to
compare the teacher's and the students' views of salient activities during the
lessons. He himself observed some of the lessons. In the data analysis, it became
clear that one student, whose pseudonym was "Alex," provided detailed
comments in which he "questioned and criticized what was going on in class" (Block,
1996, p. 183). Block chose to focus part of his report on Alex's perspectives
because "if we are to understand classroom culture better, we must examine not
only harmony but conflict" (ibid.). The case study of Alex is embedded in Block's
larger discussion of the classroom research as a whole. In discussing this choice,
Block cites D. Allwright's (1988) comment in deciding to focus on Igor: "This is
the starting point for the case-study approach, where one learner stands out as of
particular interest" (p. 178).

REFLECTION

Think about the reasons given above for selecting a particular entity as a
"case" for investigation. Given the research topics that interest you, what
would be a case that you could study? For what reason(s) would you select
that entity?



ACTION

Read two or three case studies in books or professional journals from our
field. Do the authors explain how and why they selected the case(s)? If you
are working with a group, compare the reasons found in the case studies
you read with those found by your classmates or colleagues.

QUALITY CONTROL ISSUES IN CASE STUDIES


As case studies are concerned with the observation, documentation, and analysis
of a single instance, many of the quality control issues we looked at in earlier
chapters of the book will be revisited here from the perspective of case study.
However, since we are viewing case studies from the perspective of naturalistic
inquiry, we will also introduce other issues that do not typically arise in
discussions of the experimental approach to language classroom research.

External and Internal Validity in Case Studies


In relation to validity, there are two points of view. On the one hand, there are
the researchers who feel that internal validity is important in any kind of
research, but external validity may be irrelevant (see, e.g., Larsen-Freeman, 1996;
van Lier, 2005). In fact, Larsen-Freeman (1996) questions "whether
generalizability has ever been attainable in classroom research" in general—not just case
studies (p. 164). For many qualitatively oriented researchers, according to Duff
(2008), "the term generalizability itself is considered a throw-back to another era,
paradigm, ethos, and discourse in research" (p. 50).
On the other hand, some researchers believe that the purpose of such
observation is to make generalizations from the entity to the wider population to
which it belongs (see, for example, Cohen and Manion, 1985, p. 120). People
working in this perspective argue that tests of validity ought to be as stringently
applied to the case study as to any other type of research. Yin (1984), for
example, believes that reliability and validity are just as important for case study
research as they are for other kinds of research. He suggests that four critical
tests confront the case study researcher:

• reliability (demonstrating that the study can be replicated with similar
results)
• construct validity (establishing correct operational measures for the
concepts being studied)
• internal validity (establishing a causal relationship, whereby certain
conditions are shown to lead to other conditions, as distinguished from spurious
relationships)
• external validity (establishing the domain or population to which a study's
findings can be generalized)



REFLECTION

Having read these arguments, where do you stand? Should a case study be
presented as an exemplar of a broader population from which
generalizations can be drawn, or should it be seen as a valid object of study and
reporting in its own right?

In relation to the internal validity of case study research, Yin (1984) claims
that this is a matter of concern only in

causal or explanatory studies, where an investigator is trying to determine
whether an event x led to event y. If the investigator incorrectly concludes
that there is a causal relationship between x and y without knowing that
some third factor—z—may actually have caused y, the research design
has failed to deal with some threat to internal validity. (p. 38)

Another problem relating to internal validity is the frequent necessity for
case study researchers to make inferences (which they have to do every time
they deal with an event that cannot be directly observed). Thus, an investigator
will 'infer' that a particular event resulted from some earlier occurrence, based
on interview and documentary evidence collected as part of the study. Other
researchers argue that this is not just a concern in case studies: Internal validity
is a matter of concern in all types of research because it deals with the question
of whether investigators are really observing what they think they are observing.

REFLECTION

Which do you think is the more important in case study research—internal
or external validity? Why?

Particularization and Transferability


Some researchers argue that internal validity has to take precedence over
external validity on the grounds that without internal validity, the study is
meaningless, and it makes no sense to attempt to apply meaningless outcomes to broader
populations. This is a matter of logic: If a researcher claims that a certain
variable caused learning or improved teaching but is wrong about that conclusion,
then it is problematic to try to generalize that finding to a wider context.
A major concern for some researchers has to do with the extent to which one
can extrapolate from a given case to the class of entities from which it is drawn.



Generalization has been a serious stumbling block for case study researchers
who see the need to argue from the single instance to the general. However,
Stake (1988) argues for the particularity of the case and rejects the need for
generalizability:

The principal difference between case studies and other research
studies is that the focus of attention is the case, not the whole population of
cases. In most other studies, researchers search for an understanding
that ignores the uniqueness of individual cases and generalizes beyond
particular instances. They search for what is common, pervasive, and
lawful. In the case study, there may or may not be an ultimate interest in
the generalizable. For the time being, the search is for an understanding
of the particular case, in its idiosyncrasy, in its complexity. (p. 256)

A similar position is taken by van Lier (2005). He contrasts case studies with
experimental research and process-product studies. (See Chapter 1.)

In the past, case studies have often been accorded less status than more
rigorously controlled experimental or process-product studies because,
as the argument often goes, case studies are not generalizable. However,
this criticism is unwarranted. It is probably true that it is difficult to
generalize from an individual (or a group) to an entire population without
the presence of strict controls to account for environmental variables.
However, there is also a form of generalization that proceeds not from
an individual case to a population, but from lower-level constructs to
higher-level ones. Furthermore, in the practical world in which case
studies are conducted, particularization may be just as important—if not
more so—than generalization. (p. 198)

According to van Lier, particularization means that "insights from a case study
can inform, be adapted to, and provide comparative information to a wide
variety of other cases" (ibid.). However, readers and researchers must take contextual
differences into account when doing so.
This idea has also been discussed by Larsen-Freeman (1996, citing Clarke,
1995). She says that particularizability involves helping teachers find connections
between research results and the particulars of their own classroom realities. In
her view, those sorts of connections are more valuable than the statistical concept
of generalizing findings from studies using samples randomly drawn from a
defined population.
A related concept is the transferability (also called comparability) of
hypotheses, principles, and/or findings (Duff, 2008; Lincoln and Guba, 1985). In this
idea, it is up to the readers of case studies to decide for themselves "whether
there is a congruence, fit, or connection between one study context, in all its
complexity, and their own context, rather than have the original researchers
make that assumption for them" (p. 51).



REFLECTION

Have you ever read a case study about a learner that reminded you strongly
of someone you had known? What is your opinion of particularizability
and transferability?

Yin also argues that construct validity is especially problematic in case study
research. This problem is due to the frequent failure of case study researchers to
develop a sufficiently operational set of measures and because 'subjective'
judgments are used to collect their data. This point leads us to the issues of
subjectivity and objectivity.

Subjectivity and Objectivity


A subjective stance is not automatically considered to be negative in naturalistic
inquiry even though objectivity is a hallmark of (or at least a desideratum in)
experimental research. Various naturalistic inquiry methods have their own means
of establishing quality control in the objectivity-subjectivity dichotomy. (We will
revisit this point in some detail in Chapter 7, where we discuss ethnography.)
What is sometimes seen as subjectivity may be a by-product of involvement.
Well-documented case studies are valuable because of the illuminating insights
and vivid exemplars they provide. For example, Schmidt (1983) conducted a
longitudinal case study of Wes, an adult learner of English in Hawaii. Wes and
Schmidt were close personal friends. As a result, Schmidt could observe and
record Wes's speech frequently and in a wide range of contexts, and the data
collection was sustained over a period of time. Regarding his interpretation of the
data, Schmidt notes, "The judgments given here, as in most case studies, are ...
ultimately subjective, deriving their validity only from close personal friendship
and familiarity with the subject, observations of his behavior, and discussions
with him and others who know him well" (p. 142). (See also Schmidt, 1984.)
While experimental researchers would see this intense involvement as posing a
threat to objectivity, well-written case studies are valuable precisely because this
familiarity and involvement enable the author to convincingly portray the
individual or site under investigation.
For the most part, qualitative researchers "do not see subjectivity as a major
issue, as something that can or should be eliminated. Rather, they see it as an
inevitable engagement with the world in which meanings and realities are
constructed (not just discovered) and in which the researcher is very much present"
(Duff, 2008, p. 56). In fact, Duff quotes Stake (1995) as saying that subjectivity is
"an essential element of understanding" (p. 45).

REFLECTION

How do you feel about the subjectivity issue in case study research? Can
you find research results convincing if they are not totally objective?



Case Studies and Hypothesis Testing
Yin (1984) makes an interesting defense of the case study from the perspective of
external validity. He argues against drawing an analogy between case study and
survey research, suggesting, in fact, that it is a false one. (See Chapter 5 for an
introduction to survey research.) He says that critics of case study research

are implicitly contrasting the situation to survey research, where a
'sample' (if selected correctly) readily generalizes to a larger universe.
This analogy to samples and universes is incorrect when dealing with
case studies. This is because survey research relies on statistical
generalization, whereas case studies ... rely on analytical generalization. (p. 39)

Yin's argument here is somewhat obscure. He seems to be arguing that a single
case can be deployed to falsify an assertion or hypothesis rather than to support
it. In fact, that has happened in second language acquisition research.
Here is the logic. You will recall from Chapter 3, where we discussed
hypothesis testing, that the philosopher Popper (1968; 1972) argued that we can
never 'prove' anything through observation; we can only disprove tentatively
established hypotheses. His famous example is the 'white swan' argument—that is,
a thousand sightings of white swans do not entitle us to claim all swans are white
as a scientific fact. We can tentatively put forward the hypothesis that all swans
are white, but this hypothesis can be falsified, or disproved, by a single
disconfirming black swan.
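
The asymmetry between confirming and falsifying can be made concrete with a small Python sketch. It is an illustration of the logic only, under the assumption that each observation is recorded as a color string; the names `find_counterexample` and `all_swans_white` are invented for this example.

```python
def find_counterexample(observations, hypothesis):
    """Return the first observation that violates the hypothesis, or None."""
    for obs in observations:
        if not hypothesis(obs):
            return obs
    return None


def all_swans_white(swan_color):
    """The tentative hypothesis, applied to a single observed swan."""
    return swan_color == "white"


# A thousand confirming sightings: no counterexample is found, so the
# hypothesis survives for now, but it has not been proved.
sightings = ["white"] * 1000
before = find_counterexample(sightings, all_swans_white)

# One disconfirming sighting is enough to falsify the hypothesis.
sightings.append("black")
after = find_counterexample(sightings, all_swans_white)

print(before)  # None
print(after)   # black
```

However many white swans are added to the list, the first call can never return more than "no counterexample yet." A single case, by contrast, settles the matter in the second call.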
To exemplify this issue, let's return to J. Schumann's (1978a; 1978b) study of
Alberto, the Spanish speaker who made very little progress with his English over
the ten-month period of the study. Alberto was able to get along and fulfill his
communicative needs with very limited English. Schumann concluded that
Alberto's lack of linguistic development was due to his high social and
psychological distance from the target culture. This point of view became a testable
hypothesis, which suggested that low social distance and psychological distance are
required for language acquisition to occur.
Later, Schmidt (1983; 1984) challenged this hypothesis through his case
study of Wes, which we summarized above. Although Wes's English was very
limited (like Alberto's), he had very low social and psychological distance from
English speakers. Schmidt thus refuted Schumann's hypothesis about low social
and psychological distance being the keys to successful language acquisition. In
effect, Wes was the black swan.
In summary then, we can see three positions that may be taken with regard
to case studies and the issue of generalizability:
Position 1: Case studies can achieve the status of generalizability when
findings from many studies are aggregated.
Position 2: Generalizability is not necessarily the only end of research. The
particular and the unique might be just as worthwhile to document.
Position 3: Generalizability is unnecessary in those case studies that set out
to falsify a hypothesis.



REFLECTION

Which of the three positions listed above could be used to justify the
following study?
Janet Allbright is investigating the different stages of acquisition that
learners go through as they acquire English. She uses a six-stage model of
acquisition that she has come across in the literature. This model argues
that all the grammatical structures of English can be placed into six groups,
or stages, and that learners must pass through each of these stages in turn.
Her chosen methodology is case study. She records and analyzes the
conversations of an immigrant learner of English over a two-year period and
compares the learner's stages of language development from beginner- to
intermediate-level of proficiency.

This study could be justified by Position 1 above. Janet could argue that she
is adding one more case to the growing number of cases in the second language
acquisition literature. On the other hand, if she were able to document evidence
of her learner 'skipping' one of the six hypothesized stages of acquisition, which
would disconfirm the hypothesis, she could justify her case study in terms of
Position 3.

A SAMPLE STUDY

Sometimes, case study research is criticized for being atheoretical, and it is true
that case studies are sometimes more data-driven than theory-driven (Duff,
2008). Nevertheless, "much case study research is embedded within a relevant
theoretical literature and is motivated by the researcher's interest in the case and
how it addresses existing knowledge or contributes new knowledge to current
debates or issues" (p. 57). In this section, we will summarize a case study that has
a very strong tie to theory.
Nassaji and Cumming (2000) analyzed the interactions of an ESL teacher
and a young Farsi speaker learning English in Canada. The child, whose
pseudonym was Ali, was six years old and had moved to Canada from Iran. The
researchers examined dialog journal exchanges between Ali and his teacher, Ellen
(also a pseudonym), over a period of ten months. It might be argued that this
study is more an example of classroom-oriented research than
classroom-based research (see Chapter 1) since it did not investigate classroom interaction
per se. However, the interactions between Ali and Ellen were part of their
natural ongoing relationship as learner and teacher; it's just that the interactions
under investigation were written instead of spoken.
The authors quote Peyton's (1990) definition of a dialog journal as "a written
ongoing interaction between individual students and their teacher in a bound
notebook" (p. 100). Ellen told her students to write about things that interested
them personally. The authors say, "The journals were written every few days as
part of routine classroom activities, forming a continuous flow of exchanges in
single notebooks (comprising four notebooks in total)" (p. 101).

Data Collection and Analysis


The article is richly illustrated with excerpts from the dialog journal exchanges.
Ali's earliest dialog journal entries were his first attempts at writing English.
Here is an example that consists of Ali's entry, Ellen's response, and Ali's reply to
her. These data are reproduced (from Nassaji and Cumming, 2000, p. 109) with
Ali's own spelling, punctuation, and capitalization:

Ali: Today is Tuesday.Dec.19th 1995. The Temperature is 11 A.M. It is a
Cold Day. I very very miss kathryn.
Ellen: Kathryn was a very kind student. We will all miss her very much. Do
you think she will miss us?
Ali: yes

One key point is that all of these dialog entries were written before either
the teacher or the student was aware that they would be used for research.
For this reason, the dialog journal data "represent naturalistic classroom data"
(Nassaji and Cumming, 2000, p. 101).

REFLECTION

Have you ever used the dialog journal procedure, either as a teacher or a
language learner? What do you think about this idea for encouraging
learners to communicate their ideas in writing in the target language?
What do you think about using dialog journal entries as data in language
classroom research?

To analyze the data, the researchers employed a coding scheme of fourteen
language functions. These categories were first developed by Shuy (1993), who
also studied dialog journal interactions. The fourteen categories are listed below
(from Nassaji and Cumming, 2000, p. 102):
1. Reporting personal facts
2. Reporting general facts
3. Reporting opinions
4. Requesting personal information
5. Requesting academic information
6. Requesting general information
7. Requesting opinions
8. Requesting clarification
9. Thanking
10. Evaluating
11. Predicting
12. Complaining
13. Apologizing
14. Giving directives

As it turned out, only eleven of these categories were used in this study because
neither the teacher nor the student requested academic information (5) or
complained (12), and thanking (9) happened very rarely or was embedded as part of
other functions.
In addition to the function coding described above, the researchers divided
the dialog journal entries into T-units. This is a unit of syntactic complexity
defined by Hunt (1970) as an independent clause and any attached subordinate
clause(s). So, a subordinate clause alone is not a T-unit. A full sentence is a
T-unit, whether it is a simple sentence or a complex sentence. A compound
sentence is categorized as two (or more) T-units, as shown below:
1. It's raining. (One T-unit)
2. It's raining, which is unfortunate. (One T-unit)
3. It's raining and there is lots of lightning. (Two T-units)
4. It's raining, there is lots of lightning, and I hear thunder. (Three T-units)
For both the function coding and the T-unit analysis, Nassaji and Cumming
(2000) checked their inter-coder agreement. They reported strong inter-coder
indices:

Our inter-coder agreement on a sample of 20 per cent of the data
selected from every 5th journal entry (101 T-units over 10 exchanges), was
100 per cent (i.e., full agreement) for the segmentation of the data into
T-units and 92 per cent for the coding of the language functions. The
few discrepancies in coding were resolved through discussion. (p. 103)
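
A percent agreement figure of the kind quoted above is simple arithmetic: the number of units on which two coders assigned the same label, divided by the total number of units coded. The Python sketch below is a minimal illustration under the assumption that each coder assigns exactly one function label to each T-unit. The function name `percent_agreement`, the label strings, and the sample of 25 units are all hypothetical; they are not Nassaji and Cumming's data or procedure.

```python
def percent_agreement(codes_a, codes_b):
    """Return the percentage of units on which two coders chose the same label."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same number of units.")
    if not codes_a:
        raise ValueError("At least one coded unit is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical sample: 25 T-units coded independently by two coders.
coder_a = ["report_fact"] * 10 + ["request_info"] * 10 + ["evaluate"] * 5
coder_b = list(coder_a)
coder_b[3] = "report_opinion"          # first invented disagreement
coder_b[17] = "request_clarification"  # second invented disagreement

agreement = percent_agreement(coder_a, coder_b)

# Units where the coders differ, to be resolved through discussion.
disagreements = [i for i, (a, b) in enumerate(zip(coder_a, coder_b)) if a != b]

print(agreement)      # 92.0
print(disagreements)  # [3, 17]
```

Simple percent agreement does not correct for agreement that would occur by chance, so some researchers also report a chance-corrected index such as Cohen's kappa.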

REFLECTION

Based on what you have read so far, what do you think were the research
questions that Nassaji and Cumming wished to address?

Sociocultural Theory
The first paragraph of Nassaji and Cumming's (2000) report begins with the
following questions (which, at first glance, may seem more like attention getters
and topic nominators than explicit research questions): "What does a ZPD (zone
of proximal development) look like? How might we recognize one? How do we
know whether it is happening or not?" (p. 95). They say that "answers to these
questions are fundamental to guide—indeed should form a rationale for—the
practices of language learning and teaching" (ibid.).
The zone of proximal development (often called the ZPD) is a feature of
sociocultural theory, which originated with Vygotsky (1978, 1986). This theory holds
that all learning is socially constructed and that learning occurs in the ZPD. But
what exactly is this concept? Vygotsky (1978) defined the zone of proximal
development as "the distance between a child's actual development level as
determined by independent problem solving and the level of potential development as
determined through problem solving under guidance or in collaboration with
more capable peers" (p. 86).
Another important concept in sociocultural theory is scaffolding, which is
"generally understood in cognitive psychology as progressive help provided by
the more knowledgeable to the less knowledgeable" (Nassaji and Cumming,
2000, p. 98). Scaffolding is not just providing an answer—it is helping someone
arrive at an answer. Think of the term in its metaphoric sense: A scaffold is
erected around a building that is either being renovated or being built. The
scaffold is there to help the workers reach the problem areas or unfinished areas that
need attention. When those areas have been dealt with, the scaffolding is
removed. It is an intentionally temporary structure. Through scaffolded
interaction with others, learners move from what they can currently do independently
to another level of capability, and this process happens within the ZPD.
Nassaji and Cumming (2000) say, "At the heart of the sociocultural perspective
is defining the dialogic nature of teaching and learning processes within the
ZPD as well as designing research that exemplifies its nature" (p. 97). They
decided that "in order to study the ZPD in detail, [they] needed to look to dialogue
journals—a situation where language teaching and learning are organized so that
communication is systematically dialogic" (p. 99). They identified their 'research
gap' (see Chapter 2) in this way:
Little of the previous inquiry into dialogue journals with second
language learners, however, despite its taking a functional approach to
the analysis of communication, has adopted an explicitly Vygotskian
theoretical framework. ... Likewise, previous research about the ZPD
in second language education has mostly focused on spoken, rather than
written, interactions. (ibid.)
These authors set out to examine in detail the dialog journal exchanges between
Ali and Ellen in terms of what the entries could reveal about the ZPD. The
longitudinal data collection process allowed the researchers to see how the
systematic written interactions between the teacher and the child changed over time.
Nassaji and Cumming used the T-unit analysis and the function coding to
analyze any possible changes. In terms of the quantitative and qualitative dimension
discussed in Chapter 1 (see Table 1.1), this case study involves qualitative data
collection—the dialog journal entries. These data were then analyzed both
qualitatively (through categorization) and quantitatively.

The findings of this study are interesting and varied. We will note just a few
of them here, starting with the quantitative results. These were discussed in terms
of the frequency of the eleven different functions that appeared in these data as
percentages of the total T-units the authors identified in the journal entries.
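Expressing each function as a percentage of the total T-units is a matter of counting how often each code occurs and dividing by the number of coded T-units. The following sketch shows that arithmetic. It assumes one function code per T-unit, and the list of codes is invented for illustration; it is not the study's data. The function name function_percentages is a hypothetical label.

```python
from collections import Counter


def function_percentages(codes):
    """Map each function label to its share of all coded T-units, in percent."""
    if not codes:
        raise ValueError("At least one coded T-unit is required")
    counts = Counter(codes)
    total = len(codes)
    return {label: 100 * count / total for label, count in counts.items()}


# Invented codes, one per T-unit (not data from the study).
codes = (
    ["reporting general facts"] * 5
    + ["reporting personal facts"] * 3
    + ["reporting opinions"] * 2
)

shares = function_percentages(codes)
# List the functions from most to least frequent.
for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label}: {share:.1f}%")
```

Because every T-unit receives exactly one code in this sketch, the percentages sum to 100. If a T-unit could carry more than one function, the totals would need to be interpreted differently.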

Quantitative Results
The majority of Ali's entries (58%) involved reporting general facts. Twenty
percent were reporting personal facts, and 18% were reporting opinions. The other
eight functions coded were requesting personal information (0.2%), requesting
general information (2%), requesting opinions (0.2%), requesting clarification
(0.2%), evaluating (0.7%), predicting (0.7%), and apologizing (0.7%).
The functions coded in Ellen's entries were reporting general facts (13%),
personal information (23%), requesting general information (4%), requesting
opinions (10%), requesting clarification (2%), evaluating (10%), predicting
(9%), apologizing (0.7%), and giving directives (2%).
Regarding these quantitative findings, Nassaji and Cumming (2000) say that
the variety and value of Ellen's language functions have to be
recognized, not simply as proportional frequencies, but for the ways in which
she pitched her discourse to match Ali's basic 'reporting' mode. We
assume Ellen was striving to scaffold their written interactions to prompt
Ali's potential for learning English in this context. (p. 104)
The numerous examples of dialog journal entries that the researchers provide
help the reader interpret and verify this interpretation.

Qualitative Results
In repeatedly reviewing the dialog journal exchanges, the researchers found five
patterns that "display salient aspects of their mutual process of constructing a
ZPD" (Nassaji and Cumming, 2000, p. 106). These five patterns "of
complementary asymmetry" (p. 111) are discussed and some will be exemplified below:
1. Questioning: "In the early weeks of the journals, Ellen posed simple routine
questions seemingly to engage Ali in the discourse and to show Ali how to
interact" (ibid.).

Example:
Ali: Today is +14 A.M. May 1995 is23th. Yestbday ismy borday. I am 7yors
old. I love myMom and my Dad. TodayisTeusday. I lovemyTeacher. I love
ApplTREE.
Ellen: Did you have a birthday cake?
Ali: Yes.
2. Give and Take: "At times when Ali increased the frequency of language
functions he used, Ellen correspondingly decreased hers, seemingly to allow
Ali greater 'voice.' ... Conversely, every few weeks, Ellen increased her
frequency of language functions, possibly to prompt Ali to increase his. In many
cases when Ali wrote shorter journal entries with fewer language functions,
Ellen tended to produce comparatively lengthier responses and more
language functions" (ibid., pp. 108-109). Here are examples of both types of
exchanges:

Example (ibid., p. 108):


Ali: Today is Mounday.June 1995 5th. Yestoday is god Day. Yeslbday I aM
going To park. Today is teperature is +19 A.M. I love grass. I love park.
I Love My scool. I Love My Teacher. I am seven yors old. I love you
Mr. [Ellen]. I Lovebabydish and tadpole. I lovemyclas RooM. Do you laik
me. I love frog. I love sole dog. I love my mom and my Dad.
Ellen: Yes, I like you Ali.

Example (ibid., p. 109):


Ali: Today is Thursday.Nov.the 2nd 1995. The temperature is +15 a.m.
today is a raning Day. Halloweenis ower. I Go checo or churet last Night,
and I saw tisha and Mayer and bengeneh. the End.
Ellen: What did you get for Trick-or-Treat? What was your costume? Dear
[Ali], I am glad you listened to and obeyed your mother at 12:00. She is
trying to help you learn. We will see you tomorrow with a big smile! From
Mrs [Ellen].
Ali: Notheng.
3. Reporting and Requesting: The data showed that reporting and requesting
were the two language functions that Ali and Ellen used the most frequently.
Nassaji and Cumming felt that over the course of the journal entries, Ellen
pitched her journal entries to match both Ali's comments and his language
proficiency. They report that "by the final weeks of the journal exchanges, the two
had reached a relatively harmonious balance in terms of their communicating"
(ibid., p. 110). They make the point that, initially, Ellen frequently requested
Ali's opinions, but "as the journals progressed, Ali came to produce more
opinions and Ellen gradually refrained from requesting them" (ibid., p. 111). This
interpretation would not have been possible if the researchers had not collected
longitudinal data.
4. Evaluations: Over time, as Ali used more language functions, "Ellen
provided more evaluative functions, mostly as praise to Ali for his accomplished
output" (ibid., p. 111). Conversely, she produced fewer evaluative remarks when Ali
wrote less.
5. Appropriating Spoken into Written Forms: In analyzing the journal
entries, Nassaji and Cumming (2000) found that "Ali often spontaneously
added a third, closing response to the written interactions, forming the triadic
structure of 'request for information-answer-acknowledgment' common to
classroom and conversational spoken discourse" (p. 112). Ali added a reacting
move after Ellen's comment in forty-seven out of the ninety-five journal
exchanges, thus making them triadic. This pattern occurred even in the first
week of Ali's journal:

Example:
Ali: Today isManday, Mayis 8th.The Yestoday is6th. I lave Mrs.[Ellen] My
frand shawN. May 1995.Today +6.
Ellen: Shawn and I love you too, Ali.
Ali: Me too.

The authors say the use of this typically spoken triadic speech structure in Ali's
writing may suggest that "Ali was learning new mediational means from a
variety of sources around him, such as classroom or conversational discourse" (ibid.,
p. 113). His English abilities were developing through the process of "extending
what is appropriate in one domain to another" (ibid.).
Earlier in this chapter, we cited Duff's (2008) characterization of case
studies as being interpretive. The comments from Nassaji and Cumming (2000), as
they discussed these five patterns in the qualitative analysis, illustrate the
interpretive nature of case studies.

Implications for Understanding the ZPD


Nassaji and Cumming (2000) show how the patterns in Ellen's and Ali's dialog
journal exchanges "sustained—in a complementary, dynamic, and evolving
manner over nearly a year—conditions for an ESL student's learning English
literacy, scaffolded by his teacher" (p. 113). They note that although Ali's
spelling, vocabulary, grammar, and penmanship were faulty, they were
emerging gradually and Ali and Ellen were able to communicate in the dialog journal
exchanges. Writing English in this way "seemed to help Ali perform in his
second language, while Ellen demonstrated an ongoing sensitivity to his doing
that" (p. 114).
One charge that is sometimes leveled at case studies is that they are
data-driven rather than theory-driven, but this is not always the case (Duff, 2008). In
fact, one of the strengths of the case study by Nassaji and Cumming is this strong
connection to sociocultural theory. Duff (2008) states, "It is up to the researcher
to articulate the theoretical framework guiding the study, the relationship
between the study and other published research, the chain of reasoning underlying
the study, and the theoretical contribution the study makes" (p. 49). Nassaji and
Cumming have both conducted their research and reported their findings in
such a way that readers new to sociocultural theory can understand the concept
of the ZPD through their discussion and the many examples of dialog journal
entries that they include.

ACTION

As a teacher, how do you scaffold learners' language development? Think
of three clear instances where you feel you were able to help a learner move
ahead in the ZPD. Share them with a colleague or classmate. (If you don't
yet have language teaching experience, think of some sort of scaffolding
you have done in another context.) The purpose of this task is to help you
check your understanding of scaffolding.

PAYOFFS AND PITFALLS

You will recall from Chapter 3 that one threat to validity in experimental
research is mortality—the loss of subjects from the sample. In case study research,
this issue is usually called attrition (Duff, 2008). This is one of the biggest pitfalls
of conducting a case study on an individual. If you lose access to that person, you
can no longer collect data—a problem reminiscent of the familiar warning not
to put all your eggs in one basket. This problem has occurred when the family of
a child being studied has moved or when new job opportunities have taken adult
learners out of the researcher's reach. Sometimes, individuals decide they do not
want to be the subjects of an investigation and remove themselves from the study
entirely.
Another pitfall, perhaps especially for novice researchers, is that conducting
a longitudinal case study takes time, commitment, and a systematic approach. Months
or years may be needed to track changes and answer particular research
questions. Embarking on a longitudinal case study also requires the researcher
to be disciplined and dedicated in recording and managing the data. For
instance, if you choose to examine the development of English speaking fluency
among seven-year-old pupils over the course of a school year, you must observe
these children regularly. You will also need to carefully and consistently
label and store any audiotapes or field notes documenting the children's
interactions for an entire year. Finally, you will also have to do a substantial amount of
transcription.
In spite of these pitfalls, there are considerable payoffs in carrying out a
well-conducted case study. For example, for novice researchers, working with
one subject or one site may be much more manageable than trying to implement
"large N" research or to collect data at several sites. Looking closely at one
learner or a few learners may allow the researcher to notice and appreciate small
changes occurring over time that might not be noticeable in a cross-sectional
study of many subjects. Likewise, the varied types of data collection used in case
studies, including close and prolonged observation, may reveal important
changes that are not captured by language tests and other sorts of measurements
used in experimental research.

In this chapter, we have reviewed the role of the case study approach as a viable
and important research method in applied linguistics, and especially in
investigations of first and second language acquisition. We have contrasted the strengths
of case studies in the naturalistic inquiry tradition with the relative weakness and
low status of the one-shot case study in experimental research. We also discussed
quality control issues and the value of case studies in our field, and we
summarized some classroom-based case studies. In general, we find the case study
approach to be extremely valuable, provided the data are collected and analyzed
with sufficient care. We believe the case study by Nassaji and Cumming
illustrates this sort of careful investigation. The following questions and tasks are
intended to help you review and solidify the concepts presented here.

QUESTIONS AND TASKS


1. In Chapter 2, we talked about the importance of posing research questions
and choosing appropriate means to answer them. Write three to five
research questions that could be addressed using case studies. Think of the
kinds of cases that would be appropriate to investigate these research
questions. Share your list with your classmates or colleagues.
2. Does the following discussion (Stake, 1988) revolve around the issue of
internal reliability, external reliability, internal validity, or external validity?
How convincing do you find the arguments made?
Many people criticize case study research because there is too
little indication of the degree to which the case is representative of
other cases. Usually it is left to the reader to decide. Of course it
is easy to argue that a sample of 'size one' is never typical of
anything, except itself.
For some research purposes, it will be essential that the 'case
examined' be representative of some population of cases.
Presumably, the case could be so unique that it might be unwise to
consider any finding as true of other cases. However, ... the unique
case helps us to understand the more typical cases. Whether or
not a case should be representative of other cases depends on the
purpose of the research. It would be presumptuous to dismiss all
findings as invalid because the case was not demonstratively
representative. Some findings—for the purposes some readers
have—do not depend on generalizing to a population of cases.
A case study is valid to the reader to whom it gives an accurate
and useful representation of the bounded system. Accuracy of
observing and reporting is not a matter of everyone seeing and
reporting the same thing. Observers have different vantage
points. ... Readers have different uses for research reports. One
reader expects an exact facsimile of the 'real thing'. Another
reader is attending to a new type of problem that had not
previously been apparent. The validity of the report is different for
each, according to the meaning the reader gives to it. (p. 261)
3. Can you think of any studies in which generalizability might be
unimportant? Under what circumstances would you find particularity and
transferability to be sufficient criteria for quality control?
4. Look back at the description of Lupita. Try to find some examples of
Lupita scaffolding other children's learning or performance of tasks.
5. If you could look at all the dialog journal entries between Ali and Ellen,
what would you want to look for? What do you think you would find?
6. In our summary of Nassaji and Cumming's (2000) study of dialog journal
exchanges, we have reported the percentages from the function coding in a
prose paragraph. The original authors used a bar graph to report these
findings. Using the data provided above about the quantitative results,
draw a bar graph comparing the functions that appeared in Ali's and in Ellen's
dialog journal entries. What are the advantages and disadvantages of
reporting these data in a graph or in a paragraph?
7. We have included five excerpts from Ali's and Ellen's dialog journal
exchanges in this chapter. In each excerpt, Ali provided the date. If you
arrange these excerpts in chronological order, can you find any evidence of
Ali's developing English language proficiency? What data would you
consider to be evidence of such development?
8. Which characteristics of case studies are apparent in our brief summary of
the study by Nassaji and Cumming (2000)?
9. Read one of the case studies cited in this chapter or in the suggestions for
further reading below. Then see which of the following advantages
discussed by Adelman et al. (1976) are present in the study you chose. For
example, in describing Donato and Adair-Hauck's (1992) study of two
secondary school French teachers' lessons, we said that teachers can relate
to the choices Elizabeth and Claire made about teaching the future tense—
an example of A below.
A. Case studies are 'strong in reality,' and, therefore, likely to appeal to
classroom teachers who will be able to identify with the issues and
concerns raised.
B. One can generalize from a case, from an instance, to a class.
C. Case studies can represent a multiplicity of viewpoints and can offer
support to alternative interpretations.
D. Case studies can also provide a database of materials that may be
reinterpreted by future researchers.
E. Insights yielded by case studies can be put to immediate use for a
variety of purposes, including staff development, within-institution
feedback, formative evaluation, and educational policy making.
F. Case study data are usually more accessible than conventional research
reports and, therefore, capable of serving multiple audiences.
10. Look at three or four published case studies from our field. Would you
characterize them as data-driven or theory-driven? For those you consider
to be theory-driven, what theory or theories do they address? Which of the
six principal advantages identified by Adelman et al. (1976) apply to these
case studies?

SUGGESTIONS FOR FURTHER READING


We recommend Duff's (2008) book Case Study Research in Applied Linguistics as
well as van Lier's (2005) chapter on case studies.
An early collection of second language acquisition case studies can be found
in the volume edited by Hatch (1978). The chapters are not technically
classroom research studies, but they are interesting to teachers and researchers
nevertheless. We also recommend all the case studies cited in this chapter.
Recently, case studies have been done about teachers or teachers-in-training.
For good examples, see K. E. Johnson's (1996) study of a student teacher
completing her ESL practicum and Tsui's (2003) case study of four teachers in
Hong Kong.

CHAPTER 7

Ethnography

Stories told well and compellingly can have an unparalleled power to
create and make a difference to the world. And that is what our job is, to
make a difference. (Christiane Amanpour, 2005, CNN Commentary).

INTRODUCTION AND OVERVIEW

Ethnography is a very important research method in the naturalistic inquiry
tradition. It is considered a form of qualitative research; however, not all qualitative
research is ethnography. In Grotjahn's (1987) terms, ethnography represents the
second of the 'pure' forms of research in that (1) the data are collected
nonexperimentally, (2) the data themselves are usually qualitative in nature, and (3) they
are typically subjected to interpretive analysis. (See Chapter 1 above.) In other
words, in terms of Grotjahn's framework, ethnography contrasts with
experimental research on all three points.
Ethnography is "the study of a people's behavior in naturally occurring,
ongoing settings, with a focus on the cultural interpretation of behavior"
(Watson-Gegeo, 1988, p. 576). Its goal is to provide "a description and an
interpretive explanatory account of what people do in a setting (such as a classroom,
neighborhood or community), the outcome of their interactions, and the way
they understand what they are doing" (ibid.).
Ethnography is a particularly valuable research method, but the term also
refers to the results of such research. Watson-Gegeo (1988) explains that "as
product, ethnography is a detailed description and analysis of a social setting and
the interaction that goes on within it" (p. 582). In addition,
as method, ethnography includes the techniques of observation,
participant observation, ... informal and formal interviewing of the participants
observed in situations, audio- or videotaping of interactions for close
analysis, collection of relevant or available documents and other materials
from the setting, and other techniques as required to answer researcher
questions posed by a given study. (p. 583)
In this chapter, we will first summarize the background and some key
characteristics of ethnography. Then we will consider four principles of ethnography
and the typical stages of an ethnographic study. Next, we will focus on the role
of the researcher within the research process, including participant and
nonparticipant observation, and developing emic and etic perspectives. We will then
grapple with several thorny issues related to quality control in ethnography, first
by examining ethnography in terms of the traditional criteria of reliability and
validity. However, we will see that ethnography can be more appropriately
evaluated by its own criteria.
As you read this chapter, keep in mind our metaphor about research
cultures. If your main background in research is the experimental method, you will
find that ethnography has a very different culture. Its values, goals, norms, and
vocabulary are not the same as those of the experimental method in the
psychometric approach.

REFLECTION

Based on your earlier readings, both in this book and elsewhere, what do
you already know about ethnography? If you see a book or article whose
title begins, "An Ethnography of ...," what do you expect to read about
the topic?

Background to Ethnography
Like many other research methods, ethnography entered applied linguistics
from another field, in this instance, anthropology. In fact, anthropology is one of
the parent disciplines of linguistics, and many eminent early linguists were
trained as anthropologists.
Anthropology is the study of cultures and societies. Anthropologists
typically spend long periods of time living and working among the people they are
studying. Originally, this immersion was among little known and so-called
'primitive' groups, such as the indigenous peoples of New Guinea, Africa, and
North and South America.
However, researchers came to see that they could apply anthropological
research techniques to the investigation of groups and subgroups within their own
cultures. For example, in a classic study carried out in the 1930s, Whyte (1981)
portrayed the street corner societies of the urban poor. Smith and Geoffrey
(1968) conducted a yearlong study of a secondary school classroom in the United
States using ethnographic procedures. In the field of applied linguistics, Heath
(1983) spent almost ten years documenting the lives and patterns of interaction
among people in two rural communities in the southern United States.

REFLECTION

Based on what you already know, what would you predict might be some of
the differences between experimental research and ethnography?

Because it stems from anthropology, ethnography is understandably
different—in both philosophy and practice—from research based on the
scientific method. In fact, ethnography contrasts markedly with experiments in its
assumptions, procedures, and attitudes toward evidence. In principle, there is no
reason why a research program should not combine experimental and
ethnographic procedures, and, in fact, there are increasing calls for 'hybrid' research.
However, a tension between ethnography and the experimental method persists.
This tension reflects the fact that the two traditions are underpinned by different
beliefs about what counts as evidence, how that evidence should be interpreted,
and what the role of the researcher is within the research process. In short, these
two research cultures reflect two different ways of looking at the world.
This tension is apparent in the following quote from Heath (1983), who
wrote a classic ethnographic account of the role that language plays in the
educational process. In the introduction to her book, she writes,

Educators should not look here for experiments, controlled conditions,
and systematic score-keeping on the academic gains and losses of
specific children. Nor should psycholinguists look here for data taped at
periodic intervals under similar conditions over a predesignated period
of time. What this book does do is record the natural flow of
community and classroom life over nearly a decade. The descriptions here [are]
of the actual processes, activities, and attitudes involved in the
enculturation of children. (pp. 7-8)

Heath's stance is one that appeals to many applied linguists, including language
classroom researchers. The growing interest in finding alternatives to formal
experiments has been fuelled by skepticism on the part of some leading researchers
over the ability of controlled experiments to "produce the definitive answers that
some researchers expect" (Ellis, 1990a, p. 67). As a result, in the past three decades,
more researchers have turned to the naturalistic inquiry tradition—ethnography
in particular—in order to understand language use, as well as the practices and
beliefs of people involved in language teaching and language learning.
Language classroom ethnographies include van Lier's (1996a) report of a
bilingual (Quechua and Spanish) program in Peru; Duff's (1995; 1996) research
on dual-language secondary school programs in Hungary; Harklau's (1994)
three-and-a-half-year study of Chinese immigrant children in California;
Canagarajah's (1993) account of university EFL students in Sri Lanka; Cleghorn
and Genesee's (1984) report of a French immersion project in Canada; Shaw's
(1996; 1997) investigation of graduate-level, content-based language instruction
in the United States; Lin's (1999) comparison of four English classrooms in
Hong Kong whose students came from different socioeconomic environments;
and Crago's (1992) contrast of Inuit family communication styles and school
communication patterns in Northern Quebec. Two early collections of articles
based on ethnography were edited by Green and Wallat (1981) and Trueba,
Guthrie, and Au (1981).

Characteristics of Ethnography
Ethnographies fall in the naturalistic inquiry tradition, but they differ from other
forms of qualitative research in three main ways: they are longitudinal, they are
comprehensive, and they view people's behaviors in cultural terms.

REFLECTION

If you have already read an ethnography, were these three characteristics
evident in that study? Try to provide an example of each way from your
reading.

In order for ethnographers to capture a society in all of its complexity,
ethnography must of necessity be long-term. In ethnography, data are collected
through "intensive detailed observation over a long period of time"
(Watson-Gegeo, 1988, p. 583). As noted above, Heath spent nearly ten years conducting
her research, and Canagarajah's research on his own EFL students in Sri Lanka
took a full year. Duff, the author of the sample study we summarize at the end of
this chapter, made several different extended trips to Hungary to collect her
data, and she also lived there for a period of time.
Secondly, the aim of ethnography is to provide a descriptive and interpretive
account of all aspects of the society, from language and communication to
kinship patterns, from mating rituals to festivals. This type of research is by
definition comprehensive because everything that is observed has to be accounted for.
Third, the rules and norms of interaction are described in cultural terms.
Cultures are characterized by (often largely implicit) rules and norms of
interaction. Ethnographers try to identify these rules and norms, describe them, and
use them to calibrate the behavior they are describing. For example, when we
were younger, there was an implicit rule that when a second-generation elder
(e.g., a grandparent) was present in a social situation, children "didn't speak until
they were spoken to." This rule was never spelled out, but an ethnographer
could have identified this particular aspect of behavior and described it. The
ethnographer could probably also explain why the cultural rule evolved and
describe its effect on the discourse and the interactional behavior of the group.

TABLE 7.1 A comparison of case studies and ethnographies

Characteristic: Person(s) studied
    Case Studies: One or a few individuals
    Ethnographies: Groups of people

Characteristic: Main goal of research
    Case Studies: To document and describe in detail one case (or a few
        cases) that may (or may not) be representative of a larger class of cases
    Ethnographies: To document and describe in detail the behaviors,
        speech patterns, and cultural values of a group

Characteristic: Focus of the study
    Case Studies: Specific focus on the bounded case
    Ethnographies: Broad focus on a group as a whole

Characteristic: Duration of study
    Case Studies: Can be lengthy (e.g., in a case study of language
        acquisition) or brief (e.g., where an interaction alone is the case)
    Ethnographies: Longitudinal by definition—often lasting for years of
        intense immersion in the culture

Characteristic: Types of data collected
    Case Studies: Often qualitative (audio or video recordings,
        observational notes, interview data); sometimes quantitative
    Ethnographies: Often qualitative (observational field notes, interview
        data, documents, life histories); sometimes quantitative

Characteristic: Types of data analysis
    Case Studies: Often qualitative; also quantitative
    Ethnographies: Often qualitative; sometimes quantitative

It is these three dimensions that distinguish ethnographies from other
forms of naturalistic inquiry, particularly case studies. While it is typical to have
longitudinal case studies, it is not so common to find case studies that extend
over many years. Case study researchers also deal with one or a few cases rather
than with culturally defined groups. They are usually selective in what they want
to document. They do not attempt to describe all aspects of behavior
comprehensively. Finally, they rarely attempt to frame their accounts in cultural terms.
However, case studies are sometimes embedded in ethnographies to illustrate
and vivify aspects of observed behavior. Table 7.1 contrasts case studies and
ethnographies.

PRINCIPLES OF ETHNOGRAPHY

There are certain principles that guide ethnographic research, whether it is
carried out by anthropologists documenting little-known cultures or by applied
linguists trying to understand the very familiar culture of language classrooms.
Watson-Gegeo (1988) has discussed four key principles of ethnographic
research. These four key principles can apply to classroom ethnographies as well
as to wider ranging studies.

190 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


Focus on Cultural Patterns in Groups
First, "ethnography focuses on people's behavior in groups and on cultural
patterns in that behavior" (Watson-Gegeo, 1988, p. 577). An example is Cleghorn
and Genesee's (1984) report on a French immersion program in Canada. One of
the findings was that

interaction among the staff was conflictual and that the underlying
tension could be related to societally based group conflict. . . . [T]he
teachers used two main interaction strategies to minimize interpersonal
conflict and to maintain a semblance of professional harmony: (1) avoidance
of social interaction and (2) the predominant use of English in
cross-group communication. (p. 595)

In other words, although a goal of the French immersion program was to
promote bilingualism and intergroup communication, the teachers themselves
frequently did not communicate across groups, and when they did, the code was
usually English.

Ethnography is Holistic
The second principle is that "ethnography is holistic; that is, any aspect of a
culture or behavior has to be described and explained in relation to the whole
system of which it is a part" (Watson-Gegeo, 1988, p. 577). Duff's comment
about the Cleghorn and Genesee (1984) ethnography of the immersion program
in Canada illustrates this principle. In summarizing their interpretation, Duff
(1995), herself a Canadian, says that their study

focused on interactions among anglophone and francophone teachers in
a Montreal school with both early French-immersion and regular
English-stream programs at a time of rather acute provincial political/
linguistic tensions and misgivings. Participant observation revealed
that, ironically—and indeed contrary to the publicized objective of
immersion to foster harmony, understanding and bilingualism across
Canada's largest linguistic communities—the teachers from the two
ethnolinguistic groups avoided contact with one another, resented each
other's presence and resorted to English, the dominant language of
the country but not of that province, in cross-group discussions.
(pp. 507-508)

Thus, the findings were couched in the context of the much broader political and
linguistic issues affecting the whole country at the time.

Theoretical Frameworks as Starting Points


Third, "ethnographic data collection begins with a theoretical framework
directing the researcher's attention to certain aspects of a situation or certain
research questions" (Watson-Gegeo, 1988, p. 578). For instance, Canagarajah's
(1993) study of his students' motivation for studying English in postcolonial
Sri Lanka utilized Giroux's (1983) theory about pedagogies of resistance and the
ideological properties of social institutions to help investigate and explain the
ambiguities of his students' (sometimes conflicting) attitudes toward learning
English.

Ethnography is Comparative
Fourth, "ethnographic research is comparative" (Watson-Gegeo, 1988, p. 581).
That is, it allows for comparisons across cultures and across settings. An apt
example of this principle is found in Watson-Gegeo's own work. Based on her
research in Hawaii, she wrote the following:

First-grade Hawaiian children's reading scores on nationally normed
tests improved dramatically after the introduction of reading lessons
based on "talk-story" speech events in the Hawaiian community. A key
characteristic of talk story is co-narration, the joint presentation of
personal experiences, information, and interpretations of events by two or
more storytellers.

When Watson-Gegeo later studied rural communities in the Solomon Islands,
she was hoping to find an analog in that culture that could be used in elementary
schools there because there was a very high failure rate of children in English
immersion classrooms. She wrote,

As an ethnographer, my expectation had not been that I would find an
exact equivalent to talk story (part of a Hawaiian emic framework) in the
Solomons but rather that I might discover a corresponding speech event
that, like talk story, could be adapted for classroom use. It now appears
that a Solomon Islands speech event called "shaping the mind" may
be the right candidate. As a speech event, shaping the mind involves the
intensive teaching of language, proper behavior, forms of reasoning and
cultural knowledge in special sessions characterized by a serious tone, a
formal register of speech, and tightly argued discussion. (p. 582)

Clearly "shaping the mind is based on an emic teaching framework different
from both Hawaiian talk story and from American/Western models of
education" (pp. 581-582). It is Watson-Gegeo's comparison of the two teaching
speaking events and their potential as teaching strategies that illustrates the
comparative principle.

STAGES OF THE RESEARCH PROCESS

Three stages of ethnography have been described by Watson-Gegeo (1988). In
the comprehensive stage, the researcher studies "all theoretically salient aspects of
a setting" (p. 584) by conducting broad observations, interviewing members of
the culture, taking a census, and so on. Next is the topic-oriented stage, which
involves clarifying and narrowing the topic. This stage generates focused research
questions. Finally, the hypothesis-oriented stage entails testing hypotheses and
answering research questions. Further focused observations and in-depth
interviews occur at this stage, and some quantification may be used.

[Figure: Descriptive Observations, Focused Observations, and Selective
Observations shown in sequence along a timeline running from "Project Begins"
to "Project Ends"]
FIGURE 7.1 Changes in the scope of observation in ethnographic
research (adapted from Spradley, 1980, p. 34)

The emergent nature of ethnographic research affects the type of
observational focus ethnographers use. Spradley (1980) has discussed how the focus
of the researcher's observations changes during the various stages of ethnographic
research:

Participant observation begins with wide-focused descriptive observations.
Although these continue until the end of the field project. . . , the
emphasis shifts first to focused observations and later to selective observations.
(p. 34)

Figure 7.1 depicts the stages of observations Spradley describes.
These stages are well documented in an ethnographic study by Shaw (1983),
who investigated the communication patterns in engineering classes at a
university in California. He did so, first, by actually enrolling in one course for
an entire semester. This choice allowed him to be a participant observer in that
course. In addition, Shaw also regularly observed seventeen other engineering
courses in which he was a nonparticipant observer.
Shaw identified several stages in his study. He began by establishing the
conditions and limitations of the study and choosing the research sites (in this
case, the engineering courses). During this phase, he conducted a pilot study and
then started his longitudinal data collection. Next, he selected a particular
engineering class to tape-record and began interviewing professors. After that, a
general picture emerged as Shaw observed and began to transcribe his audiotapes.
Thereafter, hypotheses were generated by reviewing the transcribed engineering
lessons. Then the transcripts were completed, reviewed and analyzed, and the
data were compared to the hypotheses that had been generated. Finally, a model
of engineering discourse emerged. (See Chaudron [1988, p. 48] for the
reproduction of a helpful table from Shaw's doctoral dissertation, which
summarizes this sequence in more detail.)
If you think back to the earlier chapters of this book, you will see that the
order of stages in ethnography is very different from the way experiments
are planned and carried out. In formal experiments, the researchers first review
the literature and pose research questions and/or testable (i.e., falsifiable)
hypotheses. Experimental researchers attempt to identify all the relevant variables
in advance so that they can control and manipulate those variables in order to
determine what causes certain observed effects. The data collection and analyses
are clearly and carefully planned in advance in order to test the hypotheses with
optimal internal validity.
However, ethnographers often take what's called grounded theory as their
orientation. Grounded theory is "the practice of deriving theory from data rather
than the other way around" (Nunan, 1992, p. 57). Grounded theories are those
"based in and derived from data and arrived at through a systematic process of
induction" (Watson-Gegeo, 1988, p. 583). This sequence leads to the hypothesis
posing happening later in an ethnographic study than it does in an experimental
study, as illustrated by Shaw's (1983) research. Richards (2003) reminds
us, however, that "while the [grounded theory] tradition does insist that the
research process works from data to theory, rather than vice versa, it also insists
that the aim of the process is to generate a theory" (p. 16). He adds that "this
process is open to every researcher and not limited to a tiny group of influential
thinkers who develop 'grand' theory" (ibid.).

REFLECTION

Do you have a personal preference? Would you rather pose a hypothesis
first and then collect the data against which to test it? Or would you prefer
to start with a more general idea and let the specific hypothesis emerge
from the data?

ETHNOGRAPHERS' ROLES

Ethnographers can take a variety of roles within the given research context.
Originally, anthropologists were outsiders who went into a culture to study it.
They were clearly not members of the culture. As a result, two of their
challenges were to gain the trust of the members of that culture and to learn how to
participate in that society so that they could collect valid and reliable data. As
they gained entry into the field, anthropologists conducted observations along a
continuum of involvement. They were often engaged in the activities of the
group they were studying as participant observers. In other instances, they would
observe but not be actively engaged, in which case they were functioning as
nonparticipant observers.



Later, when ethnographers began to study social groups that were more
accessible, there were situations where researchers investigated groups of which
they were already a part. In these cases, there was no need to gain entry into the
field or to gain the trust of the members of an exotic culture. Under these
circumstances, the ethnographer was already a member of the culture and could
therefore readily function as a participant observer.

REFLECTION

What do you think would be the advantages and disadvantages of conducting
an ethnography of a group of which you are a member? Try to imagine
yourself collecting data on a sports team, a teaching staff, a community
network, a religious organization, or any other kind of club or organization
(official or unofficial) to which you may belong.

Participant and Nonparticipant Observation


Participant observation, one of the hallmarks of ethnography, is defined as
"observing while interacting with those under study" (Watson-Gegeo, 1988, p. 583).
The following statement by Fox (2004) captures the essence of the approach and
illustrates the delicate balancing act that an ethnographer must manage between
objectivity and subjectivity, as well as the tensions, dilemmas, and contradictions
that result:

Anthropologists are trained to use a research method known as
"participant observation," which essentially means participating in the life and
culture of the people one is studying to gain a true insider's perspective
on their customs and behavior, while simultaneously observing them as
a detached, objective scientist. Well, that's the theory. In practice it
often feels like that children's game where you try to pat your head and
rub your tummy at the same time. It is perhaps not surprising that
anthropologists are notorious for their frequent bouts of
'field-blindness'—becoming so involved and enmeshed in the native culture
that they fail to maintain the necessary scientific detachment. (p. 4)

As Richards (2003) notes, "Adopting this perspective enables the researcher to
move from outsider to insider status" (p. 14). However, he adds that it is not the
ethnographer's goal "to become a complete insider, because this would mean
taking for granted the sorts of beliefs, attitudes, and routines that the researcher
needs to remain detached from in order to observe and describe" (pp. 14-15).
What Fox (2004) calls "field-blindness" is also referred to as "going native."
It can be a problem even for researchers working in familiar cultures, and
even when they are in the role of nonparticipant observers. For example, Bailey
(1982; 1984) did an observational study of international teaching assistants
(TAs), who were non-native speakers of English, teaching in an English-medium
university in California. The research question was, "What are the classroom
communication problems of non-native speaking teaching assistants?" The
question had been spurred by numerous complaints from American undergraduate
students who claimed their TAs did not speak English well enough to be
teaching at the university. After conducting a ten-week pilot study, Bailey
observed twenty-four physics and mathematics classes (taught by both native and
non-native teaching assistants) three times each over the course of a ten-week
term. As the term was drawing to a close, she began to relate more and more to
the non-native speaking TAs and to feel that the American undergraduate
students in those TAs' classes were largely unsophisticated, xenophobic, spoiled
children who were not taking responsibility for their own educations. She had to
remind herself regularly that her purpose was to gather data and to understand
the viewpoints of both the students and the TAs in addressing the research
question.

REFLECTION

If you were a researcher trying to collect data and you found yourself
identifying too strongly with a subset of your research population, what steps
could you take to overcome this problem in your data collection phase?
What could you do in the data analysis phase?

Another interesting issue has to do with the covertness or overtness of data
collection. That is, when the ethnographer is already a member of the culture
under investigation, it is not necessarily obvious to others that he or she is
conducting research. In this context, data are sometimes collected covertly as a way
of overcoming the observer's paradox (Labov, 1972). This is the interesting
problem that by observing, we may change the very thing we wished to observe in its
natural occurrence. (In Chapter 4, we saw two parallel problems in experimental
research: the reactive effects of testing and the reactive effects of experimental
arrangements.) The observer's paradox is likely to be triggered if the
ethnographer is an outsider to the culture. Over time, an ethnographer may gain some
degree of acceptance as people get used to his or her presence. You will recall that
Heath (1983) spent nearly ten years in the community she studied, visiting
families, engaging in chores, going to church services, and attending community
events. This long-term, diverse involvement helped her overcome the observer's
paradox.
Figure 7.2 shows that there are different possible stances to take on the
continuum of overt-versus-covert data collection as well as participant versus
nonparticipant observation. The stance chosen depends on the ethnographer's
purpose in studying a culture and how sensitive the behavior is that he or she is
observing.



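[Figure: two crossed axes forming four quadrants; the vertical axis runs from
Overt Observation to Covert Observation, and the horizontal axis runs from
Participant Observation to Nonparticipant Observation]
FIGURE 7.2 Overtness and participation in conducting observations
(adapted from Bailey, Curtis, and Nunan, 2001, p. 161)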

REFLECTION

Imagine a research situation for each of the four quadrants in Figure 7.2.
When would it be appropriate to conduct covert observations? How do you
feel about the ethics of observing people when they are unaware that data
collection is going on?

Developing and Documenting Emic and Etic Perspectives


Ethnographers' deep and long-term involvement in the culture under study
allows them to develop two main perspectives on the behavior they document,
called the "emic" and the "etic" perspectives. These terms are borrowed from
the distinction between phonemic and phonetic characteristics of speech sounds
developed by Pike (1964). The contrast is explained by Watson-Gegeo (1988) as
follows. In using an emic perspective, the ethnographer discovers the "culturally
specific frameworks used by the members of a society/culture for interpreting
and assigning meaning to experiences" (p. 579). The emic perspective captures
the culture members' point of view. Emic analyses incorporate the participants'
perspectives and interpretations in the descriptive language they themselves use.
In developing an etic perspective, on the other hand, the "researcher's
ontological or interpretive (external) framework" (Watson-Gegeo, 1988, p. 579)
comes into play. This is the outsider's perspective, and it can be informed by
academic theories. We do not mean to suggest that an emic view is right and an etic
view is wrong, or vice versa. They are simply two points of view that may differ
and should both be documented.
Sometimes the emic view and the etic view are parallel, but at other times
they contrast. As an example, you will recall that Bailey was a nonparticipant
observer in the Russian teaching methods experiment that served as the sample
study in Chapter 4. One of the teachers in that research had a very consistent
pattern of treating the learners' oral errors immediately and explicitly. As a
teacher, teacher educator, and language learner, Bailey felt that the teacher's
approach to error treatment was somewhat heavy-handed. But, as a classroom
observer collecting research data, she had to document her own personal reaction
simply as an opinion and then set it aside in her mind while she observed the
classes. Most of the young military students clearly had a different attitude
toward this teacher. They often responded gratefully and enthusiastically to the
teacher's correcting moves. In fact, one day after class, a student told Bailey that
he really appreciated the way the teacher corrected their errors because it helped
him and his classmates prepare for an important accuracy-oriented test they
would be facing soon.
This anecdote illustrates the importance of the researcher being aware of
and documenting differences between the emic perspective and the etic
perspective. Doing ethnography requires that researchers actively seek out and
understand the participants' explanations of the world. As van Lier (1990a) notes,

Working with both emic and etic categories, the ethnographer continually
walks a fine line between naive observation and externally imposed
interpretation. However, this is perfectly acceptable, so long as the
researcher remains aware of and committed to the requirement to analyze
all observations and scrutinize all interpretations and inferences
rigorously. (p. 43)

The connection between the longitudinal nature of ethnography and developing
the emic perspective is that gaining entry into the field, developing the trust of the
participants, and documenting their point of view often take a tremendous
amount of time (Richards, 2003).

QUALITY CONTROL IN ETHNOGRAPHY


Ethnography is no less rigorous in its quality control procedures than other
kinds of research, but the criteria for quality may differ somewhat. For heuristic
purposes, we will first consider quality control in ethnography through the
traditional psychometric criteria of reliability and validity. We will then turn to
more appropriate criteria for writing and judging ethnographies. This discussion
necessarily leads us to a reconsideration of causality as a motivating concern in
conducting research.
As with case studies, major questions have been raised over the reliability
and validity of ethnographic research. Most of these criticisms stem, either
explicitly or implicitly, from the fact that ethnographies involve the detailed,
longitudinal investigation of a particular situation or context. In contrast to the
control over variables exerted by experimental researchers, ethnographers try not
to alter the naturally occurring contexts they study. Also, historically, the
qualitative collection and analyses of data have been paramount in ethnographies.
As a result, the usual criticisms about reliability and validity of all sorts of
qualitative data collection and analysis procedures are often raised about
ethnographic research.

REFLECTION

List other criticisms that you think might be leveled against ethnographic
research in terms of reliability and validity. How might an ethnographer
respond to these criticisms?

Challenges Specific to Ethnography


Challenges to the reliability and validity of ethnographic research include the
quantity of data involved, the descriptive nature of the research, the uniqueness
of much ethnographic research, and the role of the researcher. We will address
each of these points in turn.
Ethnographies produce huge quantities of data. For example, a relatively
focused study into the lifelong learning experiences of sixty learners of English as
a second language yielded many hundreds of pages of interview data transcripts
(Benson and Nunan, 2005). For practical reasons, it is only possible to include a
small quantity of the original data in any published account of the research, even
in a book-length manuscript. This constraint makes it impossible for outsiders to
either reanalyze the data themselves or to replicate the study.
Because ethnographies involve the detailed description and interpretations
of the behavior of particular people in specific contexts, each ethnography is, in
a sense, unique. Ethnographers are careful about selecting the sites and the
groups for their research, but they would seldom, if ever, be concerned about
randomization. For many ethnographers, these facts render irrelevant the issue
of external validity (i.e., there is no attempt to generalize the results of the study
to a broader population).

ACTION

Turn back to Table 5.2, which refers to sampling strategies. Which of the
strategies listed there do you think ethnographers might use to identify
their observation sites and the people they wish to study?

One key feature of ethnographies and, indeed, of much naturalistic research,
is the presence of the researcher as an active participant in the research site.
Because the researcher is interacting with his or her informants over a prolonged
period of time, he or she will establish a relationship with the informants that
will be unique and, therefore, nonreplicable by another researcher. Likewise, as
we saw in Chapter 6 with regard to case studies, there is often a worry about
subjectivity in situations where researchers have close relationships with people
involved in the study.
The active involvement of the ethnographer with the research site and his or
her possible influence on the behavior of the informants in the study is captured
by Heath (1983) in the following vignette. It also shows how careful she was to
minimize her impact on the community she was studying.

I spent many hours cooking, chopping wood, gardening, sewing and
minding children by the rules of the community. For example, in the
early years of interaction in the communities, audio and video
recordings were unfamiliar to community residents; therefore I did no taping
of any kind then. By the mid-1970s, cassette players were becoming
popular gifts and community members used them to record music,
church services, and sometimes special performances in the community.
When such recordings became a common community-initiated
practice, I audiotaped, but only in accordance with community practices.
Often I was able to write in a field notebook while minding children,
tending food, or watching television with families; otherwise, I wrote
fieldnotes as soon as possible afterwards when I left the community on
an errand or to go to school. In the classrooms, I often audiotaped; we
sometimes videotaped; and both the teachers and I took fieldnotes as a
matter of course on many days of each year. (pp. 8-9)

This purposeful, prolonged involvement with the participants is a very different
stance from the intentionally detached objectivity of experimental researchers.

Concerns about Traditional Reliability in Ethnographic Research


Reliability in the experimental sense has to do with the consistency of
measurement. There are parallel concerns about consistency in ethnographic research
as well. In Chapter 3, we distinguished between internal reliability and external
reliability (replicability). That distinction will guide this discussion as well.
A detailed and considered analysis of problems associated with the
traditional notions of reliability and validity in ethnographic research was provided
in an influential paper by LeCompte and Goetz (1982). Although they use the term
ethnography rather loosely to cover a range of qualitative methods, their analysis
is helpful and we will summarize it here.
LeCompte and Goetz define external reliability in terms of replication.
They point out that at first sight, ethnography may seem beyond replication in
comparison with laboratory experiments. Given the naturalistic setting, the fact
that the researcher may be attempting to record processes of change over time,
and the possible uniqueness of the situation and setting, the use of standardized
controls may be impossible. In reporting the research, constraints of time and
space may preclude the presentation of data in a way that would enable other
researchers to reanalyze those same data and come to similar conclusions.



LeCompte and Goetz's study finds that replicability can be enhanced if
researchers are explicit about five key aspects of the research (as cited in Nunan,
1992). They should provide details on the following: (1) the status of the
researcher(s) within the research site; (2) who the informants are and how they
were selected; (3) the social situations and conditions in place when the research
was carried out; (4) the analytic constructs and premises underlying the research;
and (5) the methods of data collection and analysis. Let us examine these five
points in more detail.
First, attending to their own status in the research context requires
researchers to be explicit about the social position they hold within the group
being investigated. LeCompte and Goetz (1982) say that no ethnographer can
exactly replicate the findings of another because even if an exactly parallel
context could be found, the second researcher would be unlikely to hold the same
status in the second social situation.
A related problem is finding parallel informants in the second research
context. This issue emerges as a major concern when we consider that the
knowledge to be gathered will be largely shaped by those who provide it. It is therefore
imperative for researchers to describe their informants very carefully and to
explain the criteria they used in selecting particular informants for interviews,
detailed observation, embedded case studies, and so on.
Third, the social situation and conditions in which the data are obtained also
need to be described explicitly. These contextual issues include both the physical
conditions and the economic and sociopolitical conditions that affect the group
being studied. Having access to this information is central to the readers' ability
to understand the context and to evaluate the ethnography appropriately.
Fourth, precise definitions of constructs and premises are also crucial. As
LeCompte and Goetz (1982) note,

Even if a researcher reconstructs the relationships and duplicates the
informants and social contexts of a prior study, replication may remain
impossible if the constructs, definitions, or units of analysis which
informed the original research are idiosyncratic or poorly delineated.
Replication requires explicit identification of the assumptions and
meta-theories that underlie choice of terminology and methods of
analysis. (p. 39)

Ethnographers may define the constructs themselves and/or cite definitions
found in relevant literature. They may also define constructs in terms of the emic
uses of the participants in the study. The key point is to be explicit and precise.
Finally, the methods of data collection and analysis must be clearly
explained, as much for the credibility of the results as for concerns about
reliability. For example, readers should be told if field notes were made during or
after the events that were observed. If the notes were made afterwards, how
much time had elapsed between the event and recording the notes? How often
were the field notes written? Such questions will affect readers' confidence in
the data.



REFLECTION

What other sorts of details regarding the data collection procedures would
you want to see while reading an ethnography?

Internal reliability (often just called reliability) is a tricky concept in
ethnography and other types of qualitative research. In quantitative psychometric
research, it refers to the consistency with which constructs are measured. In
experiments, internal consistency has to do with the certainty with which the
results can be attributed to the treatment. But in ethnography it has more to do
with how well things are described. We will consider four issues related to
internal reliability in ethnography: (1) using low-inference descriptors, (2) involving
more than one researcher or collaborator, (3) utilizing peer examination or
cross-site corroboration, and (4) recording data mechanically.
A low-inference descriptor describes behavior that can be readily identified by
an independent observer. For example, wait time (the length of time that a
teacher pauses after asking a question) is a low-inference descriptor because it
can be verified; that is, it can be independently observed and quantified. A
high-inference descriptor, on the other hand, requires the observer to make inferences
about what is going on. Comments such as "student lacks interest in activity," or
"students were on task during the activity" are examples of high-inference
descriptors. While it might be tempting to document only low-inference
descriptors in order to increase reliability, the most interesting phenomena in
classrooms are likely to be highly inferential in nature.
R. L. Allwright (1980) has described high-inference descriptors as "only a
very weak representation of some underlying criteria that cannot yet be
satisfactorily described" (p. 169). He adds that "the main defense of the use of such
high-inference categories must be that, if they are tolerably workable, they
capture things that are interesting; whereas low-inference categories, though easy to
use and talk about, are liable to capture only uninteresting trivia" (ibid.).

REFLECTION

Think of a study you have read or one that has been described in this book.
Did that study use high-inference or low-inference descriptors, or both?

In all forms of naturalistic research—ethnographies included—one very
effective means of guarding against threats to internal reliability is to use more
than one researcher. (See the discussion of triangulation below.) However, for
long-term studies, this approach can be extremely expensive. One way of getting
around this problem is to enlist the aid of local informants to validate the
interpretations of the ethnographer. For example, Canagarajah (1993) conducted a
yearlong ethnographic study in Sri Lanka of students' motivation for studying
English in college. He was the teacher in that context. As such, he involved his
students by soliciting their written impressions of English in the first week of
class, administering a questionnaire about the students' social and linguistic
backgrounds, and conducting oral interviews in Tamil, the students' home language.
He also made field notes about the course as a participant observer. Canagarajah
(1993) wrote the following about his own role as a teacher researcher:
My daily interaction with the students in negotiating meanings through
English and participating in the students' successes and failures, with
the attendant need to revise my own teaching strategy, provided a
vantage point to their perspectives. Moreover, I enjoyed natural access
to the daily exercises and notes of the students and the record of their
attendance without having to foreground my role as researcher. As the
teaching progressed, I stumbled into other naturalistic data that
provided insights into students' own point of view of the course, such as the
comments students had scribbled during class time in the margins of the
textbook (which, due to frequent losses, was distributed before each
class and collected at the end). (p. 606)

REFLECTION

What do you think would be the advantages and disadvantages of conducting
a classroom ethnography as a teacher researching your own class,
program, or school?

According to van Lier (1990a), "scientists in all walks of life need to conform
to certain standards by which the peer group evaluates them. This is no different
in ethnographic research" (p. 44). Peer examination involves the corroboration
by other researchers who have had experience in similar settings. LeCompte and
Goetz (1982) say this corroboration can proceed in three ways. First, the
researchers may utilize outcomes and findings from other field workers in their
report. Secondly, findings from studies carried out concurrently may be
integrated into the report. (This step provides a form of cross-validation.) Third, if
sufficient primary data are included in the published report, these data may be
reanalyzed by other researchers.
The final strategy that researchers can use to increase internal reliability is
to have available mechanically recorded data in the form of audio and/or video
recordings. The strategy allows others to analyze the primary database for the
research. Electronically recorded data also allow researchers to check and
augment their field notes. (The corollary is equally important: Having good
field notes helps readers to interpret and contextualize their electronically
recorded data.)
While these four strategies are valuable ways of increasing the internal
reliability of a study, they may not all be practical for someone who has limited
resources. The use of multiple sites and additional researchers will almost
certainly be very expensive. As we have already indicated, the predominant use
of low-inference descriptors is also problematic because behaviors that are not
directly observable are often very interesting. For example, some of the most
important studies in classroom research have looked at learner characteristics
such as motivation, interest, power, anxiety, authority, and control. (See, for
example, Canagarajah's (1993) study of students' motivation for English learning
in Sri Lanka.) These are all high-inference phenomena.
Table 7.2 summarizes some important questions about internal and
external reliability in ethnographic research. The strategies embodied in these
questions can be used to guide ethnographers in conducting research and writing
their reports. They can also be used by readers of ethnography to evaluate such
reports.

TABLE 7.2 Guarding against threats to the reliability of ethnographic
research (adapted from Nunan [1992], following
LeCompte and Goetz [1982])

Type                   Questions

External Reliability   • Is the status of the researcher made explicit?
(Replicability)        • Does the researcher provide a detailed description
                         of the subjects?
                       • Does the researcher provide a detailed description of
                         the context and conditions under which the research
                         was carried out?
                       • Are constructs and premises explicitly defined?
                       • Are data collection and analysis methods presented
                         in detail?

Internal Reliability   • Does the research utilize low-inference descriptors?
                       • Does it involve more than one researcher/collaborator?
                       • Does the researcher invite peer examination or
                         cross-site corroboration?
                       • Are data mechanically recorded?

Concerns about Traditional Validity in Ethnographic Research


As we have seen, validity is a more challenging concept than reliability since it is
underpinned by that most intangible of constructs—truth. We will divide this
discussion into two parts, focusing first on internal validity and then on external
validity.
Internal validity in experimental research is the extent to which a researcher
could attribute the outcomes of an experiment to the treatment. In ethnography,
internal validity involves the question of whether the researcher's conclusions
were justified by the data collected and the analyses thereof.
LeCompte and Goetz (1982) argue that ethnographies are strong on
internal validity. They base this claim on four characteristics of ethnographic
research:

First, the ethnographer's common practice of living among participants
and collecting data for long periods provides opportunities for
continual data analysis and comparison to refine constructs and to ensure the
match between scientific categories and participant reality. Second,
informant interviewing, a major ethnographic data source, necessarily is
phrased more closely to the empirical categories of participants and is
formed less abstractly than instruments used in other research designs.
Third, participant observation, the ethnographer's second key source of
data, is conducted in natural settings that reflect the reality of the life
experiences of participants more accurately than do contrived settings.
Finally, ethnographic analysis incorporates a process of researcher
self-monitoring ... that exposes all phases of the research activity to
continual questioning and reevaluation. (p. 43)

(In this context, LeCompte and Goetz are using the term internal validity
somewhat more broadly than it is normally used in experimental research.)
Some of the concerns about internal validity in experimental research arise
in doing ethnography. Because the ethnographer typically studies a group for a
long period of time, maturational changes or the attrition of informants
occurring during the course of the research might affect the outcomes. Normally, both
these concerns are addressed by the very nature of ethnographic inquiry. That is,
the longitudinal data collection procedures capture change as the daily life of the
group is documented. In addition, movement in and out of groups (through
birth, death, relocation, and so on) is a normal part of the cultural existence that
ethnographers strive to document.
Threats to internal validity in experimental research are sometimes related
to bias in the selection of subjects in a study—a threat that is normally dealt with
through randomization. In ethnographic research, informants are often chosen
precisely because they meet a certain criterion. For instance, if you wish to study
the identity construction of recent immigrants in secondary schools, you would
choose a group of recently arrived teenagers to observe. In writing an
ethnography, it is important to explain exactly why and how an observation site and group
were chosen, but it is not a requirement that they be randomly selected.
We saw in experimental research that the reactive effects of the testing and
the reactive effects of experimental arrangements can be matters of concern. The
parallel question in ethnography is whether the outcomes are due in part to the
presence of the researcher in the context. This issue, which we have already
labeled the observer's paradox, is also called reactivity. It can be dealt with by
building the trust of the participants, by prolonged immersion in the research context,
and by behaving ethically and appropriately in that context.



Because ethnography intentionally does not impose artificial controls, people
who come from the culture of experimental research often wonder whether
alternative explanations for phenomena might not be posed. In fact, ethnographers
work rigorously to examine alternative explanations for phenomena as they
develop the emic and etic perspectives in their research. When sufficient data from
different sources rule out alternative explanations, they can be excluded.

REFLECTION

You have now seen several contrasts between experimental research and
ethnography in terms of internal validity. Which of these differences did
you predict? Which, if any, do you find troublesome? In which situation
described above (if any) do you prefer the ethnographic stance over the
experimental stance, or vice versa?

Concerns arise about external validity as well. The results of a study cannot
be generalized if the phenomena being investigated are unique to a particular
group or site. Someone who wishes to generalize the findings of an ethnography
must ask if the historical experiences of a particular group make it so unique that
findings about that group cannot be legitimately extended to other groups. There
may also be a worry that constructs and terminology are not common to different
cultures and research sites. These concerns are summarized in Table 7.3.

TABLE 7.3 Guarding against threats to the validity of ethnographic
research (adapted from Nunan [1992, p. 63], following
LeCompte and Goetz [1982])

Type                Questions

Internal Validity   • Is it likely that maturational changes occurring during
                      the course of the research will affect the outcomes?
                    • Is there bias in the selection of informants?
                    • Is the increase or attrition of informants over time
                      likely to affect outcomes?
                    • Have alternative explanations for phenomena been
                      rigorously examined and excluded?
                    • Are outcomes due in part to the presence of the
                      researcher?

External Validity   • Are some phenomena unique to a particular group or site
                      and therefore noncomparable?
                    • Are cross-group comparisons invalidated by unique
                      historical experiences of particular groups?
                    • To what extent are abstract terms and constructs shared
                      across different groups and research sites?



If ethnography is judged from the experimental perspective, external
validity can sometimes be seen as a major weakness. Some would argue that it is
impossible to generalize interpretations from the particular site in which the data
were collected to other sites. For this reason, some researchers (e.g.,
Watson-Gegeo, 1988) suggest that comparison rather than generalization is a more
appropriate goal for ethnographies. By aggregating and comparing the insights
from a series of ethnographies, generalizations may ultimately emerge, but it is
comparisons and contrasts across cultures that are illuminating.
Other researchers argue that external validity is not something that
ethnographic researchers need to be concerned with (see, for example, Richards, 2003;
van Lier, 1990a). They say that generalizability is an important ground rule of
experimental research because the experimental method was devised specifically
to allow generalizations to be made from samples to populations. They argue,
however, that this is not a rule to which other research paradigms need
subscribe. In effect, requiring an ethnographer to adhere to the rules of
experimental research is like asking a basketball player to obey the rules of volleyball or
vice versa—it simply does not make sense.

REFLECTION

What is your stance on this issue? Do you think that ethnographies should
strive to meet the 'rules' of external validity?

Alternative Criteria for Evaluating Ethnography


In a sense, we can think of traditional reliability and validity as comprising an etic
framework—one borrowed from experimental research. A more emic
perspective would be to judge ethnographies through the criteria developed by the
members of that particular culture—the ethnographers themselves.
Quality control in ethnography goes beyond debates about the traditional
criteria of reliability and validity. Indeed, much has changed since LeCompte
and Goetz (1982) wrote their landmark treatise on validity and reliability in
ethnographic research. These days some qualitatively oriented researchers argue
quite convincingly that there are other standards by which ethnography should
be judged. Ethnography is subject to its own quality control mechanisms, and
van Lier describes four of these (1990a, p. 45; italics in the original):
1. Every study needs to be scrutinized for its adherence to the emic and
holistic principles.
2. The notion of context needs to be examined in great detail and the role of
context in interpretation must be made explicit.
3. Ethnographic research must be open, that is, it must examine and report its
own processes of inferencing and reasoning, so that its procedures can be
profitably discussed.
4. Ethnography must be either broad (longitudinal) or deep
(micro-ethnographic). . . . Ethnography requires intensive immersion in the data,
whether this is the daily language use of an entire culture, or one small
story told by a child.

Micro-ethnography is the in-depth study of interaction following emic
principles. That is, an ethnographer might record a story about an adventure and
do a detailed analysis of the discourse, relating it to the culture from which it
comes. This is what Watson-Gegeo was referring to when she wrote about
"audio- or videotaping of interactions for close analysis" (1988, p. 583).
In terms of evaluating ethnographic research, van Lier states that the notion
"of quality as a superordinate concept is crucial. It encompasses both reliability
and validity. However, since these latter terms are associated with experimental
and statistical norms, I prefer to use the terms adequacy (of argumentation and
evidence) and value (within a theory, i.e., internal, and to human affairs in
general, i.e., external)" (1990a, p. 35). Figure 7.3 depicts the relationship among
these terms.
FIGURE 7.3 Concepts for determining quality in ethnography
(adapted from van Lier, 1990a, p. 35)
[Diagram: Quality divides into Adequacy and Value. Adequacy comprises
Argument and Evidence; Value comprises Internal Value and External Value.]

Qualitative research in general comes under fire from people who are
committed to experimental research as the only appropriate way to discover
truth. Writing about what he calls "qualitative inquiry," Richards (2003) says that
the key issue in evaluating qualitative inquiry is trustworthiness—a concept
developed by Lincoln and Guba (1985). Following these authors, Richards
(2003) discusses three criteria for evaluating qualitative inquiry, including
ethnography. In the quotes below, he juxtaposes these concepts with their
experimental cousins in brackets:

• Credibility [internal validity] depends on evidence of long-term exposure to
the context being studied and the adequacy of data collected (use of
different methods, etc.).
• Transferability [external validity] depends on a richness of description and
interpretation that makes a particular case interesting and relevant to those
in other situations.
• Dependability [reliability] and confirmability [objectivity] are to be assessed
in terms of the documentation of research design, data, analysis, reflection
and so on, so that the researcher's decisions are open to others. (p. 286)

(You will recall that transferability and other similar issues arose in Chapter 6
when we discussed quality control in case studies.)
In fact, research of all kinds raises quality control issues, and each tradition
demands that researchers show that they have taken the appropriate steps to
insure both reliability and validity. As van Lier (1988) has noted, "The
blacksmith cannot criticize the carpenter for not heating a piece of wood over a fire.
However, the carpenter must demonstrate a principled control over the
materials" (p. 42).

ACTION

Fill in each blank below with one of these three sets of criteria for judging
ethnography. (Each criterion is followed by its experimental analog in
brackets.)

Dependability [reliability] and Confirmability [objectivity]
Transferability [external validity]
Credibility [internal validity]

1. __________ depends on evidence of long-term exposure to the
context being studied and the adequacy of data collected (use of
different methods, etc.).
2. __________ depends on a richness of description and
interpretation that makes a particular case interesting and relevant to those in
other situations.
3. __________ should be assessed in terms of the documentation
of research design, data, analysis, reflection, and so on, so that the
researcher's decisions are open to others.



A central part of this argument hinges on whether a study is seeking to
establish causal relationships or not and whether, in fact, causal relationships can
be identified in areas as complex as social behaviors, such as language learning
and teaching. The problem is this: "If it is assumed a priori that L2 learning is
caused by certain sufficient conditions, the researcher's job is to circumscribe
those conditions so that, whenever they obtain, the occurrence of L2 learning
can be accurately predicted" (van Lier, 1990a, p. 36). This point is nicely
illustrated by considering the event of a tree blowing down in a storm:

It is clear that not every time the wind blows against a tree, that tree will
fall down. When we study the phenomenon, we must add qualifications
and amendments which are endless: the wind must blow hard enough,
the tree must be fragile enough, the roots must grip the soil
insufficiently, the soil must be loose enough, etc. In addition, we must take
into account the position of the tree among buildings, other trees, and
so on. It would probably be impossible to lay down all the conditions
which would ensure a guaranteed tree-falling-down event. So even if we
are able to say: "The wind caused that tree to fall down," we are still not
able to specify exactly what it will take for another tree, say the orange
tree in the back yard, to fall down. (p. 37)
This comparison concludes with the important caveat that "it is obvious that L2
learning is an event which is vastly more complex than a tree blowing down"
(ibid.).
How does the tree-falling-down analogy relate to this discussion? In
experimental research, internal validity has to do with creating a research situation (by
controlling and manipulating variables) in which the researcher can confidently
claim a causal relationship between the independent and dependent variables.
However, establishing causal relationships is not an issue if the researcher is
primarily concerned with providing a detailed description, analysis, and
interpretation of the chosen context and situation (as is the case with ethnography) rather
than with applying treatments and establishing causality. In fact, van Lier
(1990a) argues that

a simple causal view is inappropriate in classroom research for one very
uncontroversial reason, namely that teaching does not cause learning.
Many times learning takes place without teaching, and, perhaps equally
often, the teaching event is not followed by a learning event. (p. 38)
He concludes that "most of our efforts at doing experiments or quasi-experiments,
with all the attempted controls of variables and randomizations of treatment may
be doomed to failure (especially given the complexity of language learning
processes)" (ibid.).
In classroom ethnographies, van Lier (1990a) says that taking a causal view
is not necessary: "It is sufficient to say that the people involved can make an
effort to create optimum conditions so that learners can get on with the business
of learning in the best way that they see fit, and can help each other in the
process" (p. 40). He calls this approach to research interpretive, while causally
oriented experimental research is often called normative.

REFLECTION

Now that you have read about quality control issues in both ethnography
and experimental research, do you have a preference for one or the other?
These are the two "pure" research paradigms in Grotjahn's (1987)
framework. They are apparently juxtaposed to each other in (1) the research
design, (2) the nature of the data, and (3) the type of analysis used. Do you
feel more comfortable with one research culture or the other?

TRIANGULATION IN ETHNOGRAPHIC RESEARCH

Ethnographies are strengthened and can be judged in part by the extent to which
they use a number of procedures that fall under the general heading of
triangulation. This concept is best known in ethnography, but in all forms of qualitative
research it provides a way for researchers working with nonquantified data to
check on their interpretations of those data. By incorporating multiple points of
view, researchers can check one perspective against another. If more than one
type or source of data leads to the same conclusion, researchers have more
confidence in those conclusions. (See van Lier, 1988, for further discussion of these
points.)
Triangulation is a geometric concept borrowed from navigation, astronomy,
and surveying. Hammersley and Atkinson (1983) state that if people wish to
locate their position on a map,

a single landmark can only provide the information that they are
situated somewhere along a line in a particular direction from that
landmark. With two landmarks, however, their exact position can be
pinpointed by taking bearings on both landmarks; they are at the point
where the two lines cross. (p. 198)

In qualitative data collection, the triangulation metaphor refers to a quality
control strategy. In social research, if "diverse kinds of data lead to the same
conclusion, one can be a little more confident in that conclusion" (ibid.).
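
The map-reading image behind the metaphor can be made concrete with a little
geometry. The sketch below is only an illustration of the navigational idea
that Hammersley and Atkinson describe, not part of any research procedure
discussed in this chapter. The function name, the coordinate convention
(x = east, y = north), and the bearing convention (degrees clockwise from
north, measured from the observer toward the landmark) are all assumptions
made for the example.

```python
import math


def locate_position(landmark_a, bearing_a_deg, landmark_b, bearing_b_deg):
    """Return the observer's (x, y) position from bearings to two landmarks.

    Hypothetical helper for illustration. Coordinates use x = east and
    y = north; each bearing is the direction from the observer to the
    landmark, in degrees clockwise from north.
    """
    ax, ay = landmark_a
    bx, by = landmark_b

    # Unit vectors pointing from the observer toward each landmark.
    ra = math.radians(bearing_a_deg)
    rb = math.radians(bearing_b_deg)
    dax, day = math.sin(ra), math.cos(ra)
    dbx, dby = math.sin(rb), math.cos(rb)

    # The observer P satisfies A = P + ta * da and B = P + tb * db.
    # Subtracting gives ta * da - tb * db = A - B, a 2x2 linear system.
    det = dbx * day - dax * dby
    if abs(det) < 1e-12:
        # Parallel bearing lines never cross at a single point, so the
        # two sightings cannot pin down a position.
        raise ValueError("bearings are parallel; position cannot be fixed")

    # Distance from the observer to landmark A along the first bearing.
    ta = (dbx * (ay - by) - dby * (ax - bx)) / det
    return (ax - ta * dax, ay - ta * day)


# One landmark alone only places the observer somewhere on a line;
# a second bearing that crosses that line fixes the point.
position = locate_position((0.0, 10.0), 0.0, (10.0, 0.0), 90.0)
```

In this example the observer sees one landmark due north and another due
east, so the two bearing lines cross at the origin. When the two bearings
are parallel, the function refuses to give an answer, because the lines
never cross at one point.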
In anthropology, triangulation is used as a process of verification that gives
researchers confidence in their observations. Denzin (1978) described four types
of triangulation. The first is data triangulation, in which different sources of data
(teachers, students, parents, etc.) contribute to an investigation. Secondly, theory
triangulation is used when various theories are brought to bear in a study.
(Theory triangulation is probably the least commonly used in our field.) The third
form is researcher triangulation, in which more than one researcher contributes to
the investigation. Finally, methods triangulation involves the use of multiple
methods (e.g., interviews, questionnaires, observation schedules, test scores, field
notes, etc.) to collect data.
These concepts come to us from anthropology, but they can be applied to
classroom research conducted with a range of methods. Here is an example
from Springer (2003). As you may recall from the sample study at the end of
Chapter 2, she was interested in what factors promoted second language
acquisition in language classrooms. After having taught a project-based course and
experienced the students' learning in that context, she began to reconsider her
earlier teaching experiences using grammar-based syllabi. Her evolving research
focus led to these questions:

What implications does a much broader view of language and the
language learning process have for my role and responsibilities as a
language teacher? What changes come about in my expectations
concerning the roles and responsibilities of the students? What impact
does the shift to content- or project-based syllabi have on these
classroom roles and on the language produced by course participants—both
teachers and learners? (pp. 2-3)

To address these questions, Springer arranged to observe another
project-based course, taught by a different teacher, for academically oriented ESL
students in California. The course revolved around the students carrying out a
series of four photojournalism projects. Springer took the role of an observer in
the course, switching from a nonparticipant stance to a participant stance as
needed, with the full cooperation of the teacher. In describing the context, she
wrote,

In addition to being project-based, this course also entailed some
measure of student contact with the target language community (both in
person or via newspapers or the Internet) and involved the use of computer
technology for the completion of tasks and presentation of the final
product. (p. 3)

The final project for the course consisted of a magazine that the students
produced through site visits, interviews, and a variety of computer tools.
In conducting this study, Springer became interested in scaffolding and
contingent language use. In Chapter 6, we defined scaffolding as "progressive help
provided by the more knowledgeable to the less knowledgeable" (Nassaji and
Cumming, 2000, p. 98). In naturally occurring conversations, contingency is the
tendency of one speaker's turn to be influenced by a preceding speaker's turn.
That is, a subsequent utterance is contingent upon one or more preceding
utterances. (See van Lier, 1989.) Springer's (2003) literature review led her to the
following descriptions of these concepts:

For Aljaafreh and Lantolf (1994), three significant characteristics of
scaffolding [italics added] are that it be graduated, dialogic and contingent.
This means that assistance should be offered along a scale from least to
most explicit only to the extent to which it is needed—no more and no
less—and that the only way to determine this point is through dialogue
with the other person. (p. 18)

Contingency [italics added] is the quality of language use that can most
directly be associated with engagement and learning (van Lier, 1996b,
p. 171). Contingencies draw upon what we know and connect this to
what is new. It is thus part of the essence of learning. (p. 74)

To understand the roles of scaffolding and contingent language use in the
photojournalism class, Springer used a range of data collection and analysis
procedures. The combination of these procedures made her research very strong in
terms of triangulation. The types of triangulation she used are documented in
Table 7.4.

TABLE 7.4 Sources of triangulation in Springer's (2003) study of a
project-based ESL course on photojournalism (p. 46)

Methods Triangulation     • running field notes during the photojournalism class
(entails the use of         (researcher in the role of an overt nonparticipant
multiple methods to         observer and participant observer)
collect data)             • journal entries by the researcher based on immediate
                            and delayed retrospection on photojournalism class
                            observations (as well as thoughts on research
                            questions and notes on discussions with the
                            classroom teachers)
                          • audio recordings of the photojournalism class
                            (digital minidisc [MD] recordings using a stereo
                            microphone, which was normally placed in the center
                            of the table; also some recordings from an
                            additional recorder placed near individual groups
                            during group/pair work)
                          • copies of student-generated coursework: four major
                            assignments (biography of a peer, photo manipulation
                            & commentary, restaurant review, news story)
                          • copies of materials used in class: (1) T-generated
                            worksheets; (2) T-collected authentic materials;
                            (3) T-led group-generated lists and brainstorms;
                            (4) S-collected authentic materials and homework
                            assignments; (5) S-generated pair-work tasks
                          • group interviews with the students (documented using
                            field notes and digital audio recording):
                            (1) background biographical information on each
                            student; (2) prior study of English; (3) reactions
                            to statement: "The photojournalism course looks like
                            fun, but some teachers would be skeptical that it's
                            not really related to learning English."
                          • interviews with the classroom teachers (most are
                            optional and would be documented using field notes
                            and digital audio recording; retrospective journal
                            entries on three preliminary meetings):
                            (1) reactions to the class, general +/— impressions;
                            (2) discussion of student requests to have less or
                            no core next session; (3) discussion of actual
                            course schedule—flexibility/changes/retrospective
                            changes; (4) reactions to ideas about themes and
                            recurring elements

Data Triangulation        • students
(different sources of     • classroom teachers
data are used)            • observer/researcher
                          • program administrator

Theory Triangulation      • project-based learning
(various theories are     • pedagogical scaffolding
brought to bear in a      • contingent language use
study)

Researcher                • none—only one researcher contributed to the study
Triangulation

A SAMPLE STUDY

We have chosen Duff's ethnography of dual immersion programs in secondary
schools in Hungary as the sample study for this chapter for three main reasons.
First, it illustrates many of the principles and practices of ethnography described
above. Secondly, it provides a fine example of a classroom ethnography that is
embedded in a wider sociolinguistic and sociopolitical context. Third, it is
accessible to interested readers in various publications.
One of the published papers is entitled "Different languages, different
practices: Socialization of discourse competence in dual-language school classrooms
in Hungary" (Duff, 1996). It is that publication from which we will draw the bulk
of this summary. The study investigates
the socialization of discourse competence in these two types of
instructional environment—the traditional and the non-traditional, the
monolingual and the bilingual—and explores the impact of educational
reform on school life and the ways in which students are inducted into
academic discourse practices in the two contexts through different kinds
of performance tasks. (p. 407)
This study evolved from Duff's participation in an earlier project that
investigated the efficacy of having dual-language education in Hungary. (See Duff,
1991a; 1991b.)

The Context of the Study


The context Duff (1996) investigated consisted of dual-language programs—late
immersion programs in Hungarian secondary schools—and comparison schools
with more traditional curricula. In the dual-language schools, English was the
medium of instruction in the majority of courses, including mathematics,
history, biology, physics, and geography. However, chemistry courses and
Hungarian language and literature courses were taught in Hungarian.
The research was conducted from 1989 to 1990, and the effects of
Hungary's transformation from a highly controlled country to an independent
republic greatly influenced the social and political climate in which the study was
done. Of the historical context, Duff (1995) writes,

In the mid-1980s, educational reforms in Hungary granted schools


more autonomy from the stateandwithinjusta few years, up to 30dual
language programs with an assortment of Western languages had been
established with support from the Ministry of Education.... In the
same period, EFL programs mushroomed in schools, universities, and
other institutions, (p. 509)

Duff (1996) chose history lessons for her data collection because history "is a
compulsory subject in the Hungarian curriculum, is enjoyed by most students,
and is rich in both interaction and in linguistic and cognitive structures (e.g.,
narrative, description, cause-effect)" (p. 413). She observed history lessons in
both dual-language and traditional schools to allow for comparison across the
two types of programs.

Focus of the Investigation


Duff provides a fascinating account of how the wider sociopolitical context
played out in secondary school classrooms. In particular, she focuses on a
speech event that is called the feleles in Hungarian. This is a form of daily oral
recitation in which the teacher calls on a student to come to the front of the
room and report on the previous lesson, responding to questions from the
teacher in the process. The students do not know in advance whether or not
they will be called upon to recite. Their performance is graded on a scale of
one to five (with one being the low mark), and the teacher announces the
student's grade aloud to the class immediately following the recitation. Duff
(1996) says, "In my fieldwork, it was apparent that feleles evoked a strong and
often visible emotional response from those familiar with it—which was
practically every Hungarian I met" (p. 411). She writes:

Students may be quite critical of the chronic stress associated with the
anticipation of numerous possible recitations every day as well as the
formality and rigidity of the structured interaction. Once exposed to
instruction in which the traditional feleles is no longer practiced,
students were especially vocal about its perceived abuses. (p. 421)

The feleles was a sort of educational ritual in traditional Hungarian
classrooms. This recitation was standard practice for assessing students' knowledge
and motivating them to study in traditional classes. Duff (1996) connects this
activity to traditional Prussian educational practices. She notes, "The system
creates the ritual, but the ritual also creates the system; hence, change in one is
bound to effect change in the other" (p. 409). As the broader sociopolitical
context in Hungary began to change, so did educational practices. In the newly
established dual-language programs, this change extended to the types of
opportunities for students to speak in class.
Duff (1995) noticed that while the feleles was being used in the traditional
programs, in the dual-language classes it was being replaced with other forms of
in-class oral communication, such as open-ended discussions and brief lectures
by students. She posed two research questions about this situation:

1. Was instructional discourse in English-medium history classes different
from Hungarian medium non-dual-language classrooms and, if so, what
might account for this?
2. What parallels exist between microlevel discursive changes (or differences)
in these lessons and changes taking place countrywide? (p. 505)

Data Collection
In order to answer these questions, Duff videotaped approximately fifty history
lessons, which were taught by three teachers working at four different secondary
schools located in different parts of Hungary. She usually observed a given
teacher's class for a period of two weeks in order to see the development and
review of themes across a unit of instruction.
The videotaped data were transcribed. Any data that originally occurred in
Hungarian were translated into English. In addition to making her observational
field notes, Duff also interviewed students, teachers, and administrators. She
included discussions with consultants and copies of students' essays in her
database. The reports of her research are replete with excerpts from the transcripts
so readers can get a strong sense of the interaction among students and teachers,
both in dual-language and traditional classrooms.
The processes of transcription and translation were facilitated by the
students themselves. Duff (1995) says, "Nearly a dozen ... students volunteered to
be my part-time research assistants, helping with transcription, translation,
verification, and interpretation, especially during their summer vacation; in this
way, we became quite familiar over time" (p. 513). Involving the students in this
way helped her to counteract the observer's paradox to some extent.

Key Findings
One of the differences between the traditional programs and the dual-language
programs was the less frequent use of the feleles in the dual-language programs.
In fact, Duff (1995) observed that, to a large extent, "the daily recitation period
had been rejected by most of the history teachers in the dual-language
programs" (p. 517).
The speech events that replaced the feleles consisted of brief prepared lectures
by students, often delivered from written notes, as well as question-and-answer
sessions, pair work, and group work. Duff contrasts the traditional feleles with the
students' prepared presentations:

Despite some similarities with the recitation (i.e., extended oral
discourse produced by a student on an academic topic), the English
lectures did not appear to be as taxing or as frequent an activity for
students. . . . But as a relatively new activity, the procedures, students'
roles, and responsibilities were still being negotiated, and this revealed
some of the ambiguities of the changing discourse practices at schools.
(ibid., p. 519)

In addition, unlike in the feleles, the students were able to select their speech
topics themselves and prepare in advance. Duff provides a case study of one
particular teacher and class from the broader study that illustrates this change in
daily classroom practice.
The changes in these speech events in the dual-language schools are
connected to the broader social and political changes within Hungary:

Schools, generally thought to be conservative institutions, have not
been immune from many of the restructurings and ambiguities that
have resulted from decentralization. . . . Dual-language schools in
Hungary represent an especially vital system in which many of these
changes have occurred at an accelerated and magnified rate, perhaps
because they had a head start before 1989 and perhaps also because the
interface of Western (or non-Hungarian) and Hungarian systems and
the languages they promote takes place day by day in classrooms,
corridors, and teachers' offices. In those spaces, teachers, visitors and
students, often with different ideological, linguistic, and sociocultural
backgrounds, come together in the name of a progressive new model of
education. (ibid., p. 529)

In making this rather positive comment about a "progressive new model of
education," Duff is not imposing her own point of view on the data. Rather, these
ideas and opinions represent the emic perspective of the teachers, students, and
administrators in the dual-language schools.
Duff's report is richly textured and highly contextualized. We have provided
only a very brief summary here. Her research provides a clear example of a
language classroom ethnography as well as guidance for other researchers (e.g., in
terms of the transcription conventions used; see Duff, 1996).

PAYOFFS AND PITFALLS

Ethnography is by definition longitudinal (with the exception perhaps of
micro-ethnography). Therefore, conducting ethnographic research and
writing about such investigations can be very time-consuming processes. Many
ethnographers have spent years in a site, learning about the culture, gaining the
trust of the informants, and often even learning the language of that culture in
order to communicate with the people being studied. The depth required by
good ethnographic analyses means they do not fit easily into what van Lier
(1990a) calls "the cycle of conference presentations" (p. 45)—that is, "a brief
treatment period, a testing session, a twenty-page write up, all probably wrapped
up in three to six months from start to finish." He counsels that "ethnography is
not conducted that quickly" (ibid.).
In fact, Richards (2003) identifies the longitudinal nature of ethnography as
its main drawback because

it requires extended exposure to the field, which makes it very difficult
for the researcher to stay in work during the period of investigation. It
is methodologically unacceptable to settle for quick forays into the field
in order to scoop up data and retreat. (p. 16)

Richards notes that even though it may be "legitimate to use methods
characteristic of ethnography, these do not in themselves mean that you are working
within this tradition" (ibid.). For this reason, it is important to distinguish
between true ethnographies and research that might more properly be called
qualitative.
Because of the prolonged immersion of ethnographers in the field, the
sheer quantity of data can be overwhelming. For novice researchers in
particular, it is important to take care in organizing, labeling, dating, and storing all
field notes, audio tapes, videotapes, archival documents, photographs, and
maps. We will return to these issues in Chapter 14 when we discuss the analysis
of qualitative data.
Another concern is that ethnography as a research method and grounded
theory as an orientation may not suit some researchers' personalities and
cognitive styles. As van Lier (1990a) points out, "the heuristic quality of ethnography
makes it an inherently insecure pursuit, since there are no firm external rules and
guidelines for proper scientific conduct" (p. 41). For this reason, it seems
essential to have a high tolerance for ambiguity before undertaking ethnographic
fieldwork. In addition, "the worker in the field is essentially alone, and inevitably
learns as much from opportunities missed, false leads too strenuously pursued,
and insights by-passed in inexplicable ways, as from routine description and
categorization" (ibid.). For these reasons, doing ethnography requires a substantial
commitment of time and personal fortitude.
Finally, researchers interested in learning more about ethnography may
encounter difficulty in finding training courses. As van Lier (1990a) notes, without
such training it is difficult to have a clear idea as to what constitutes 'good' or
'bad' ethnography. He adds that

although this problem also exists in normative types of research,
workers in the latter tradition have the advantage that most graduate
degree programs have substantial components of quantitative research
design and statistics training, whereas training in ethnography is
rarely available. (p. 44)

However, there is some progress being made in this regard as more and more
university faculties recognize the value of qualitative research and begin to offer
courses in its procedures. (See Levine, Gallimore, Weisner, and Turner, 1980,
for an early report on this effort.)
In spite of these numerous pitfalls, there are also several important (but
perhaps less tangible) benefits of ethnography. We have already alluded to
one that is specific to classroom research—involving teachers and learners
while documenting their (emic) point of view. As Canagarajah's (1993) study in
Sri Lanka illustrates, teachers themselves are sometimes in an ideal position
to do classroom ethnography because they are naturally placed participant
observers. They must, of course, strive to develop the etic point of view and to
be aware of their own biases if they do conduct ethnographic research in their
own classrooms, but we see no a priori reason why this concern cannot be
addressed.
Another profound benefit of ethnographies is that the information they
provide is different from that provided with the experimental method. This
information, says van Lier (1990a), "may either be compatible or contradictory.
Whichever way things turn out, a diversity of research programmes is
essential to promote an enrichment of theoretical and professional knowledge"
(p. 50).
Likewise, ethnographies—like case studies—are often more accessible to
reading audiences without statistical training. As a result, they may have more
impact or reach a wider public than will purely quantitative experimental reports.
We began this chapter with a quote from Christiane Amanpour of CNN.
She said, "Stories told well and compellingly can have an unparalleled power to
create and make a difference to the world. And that is what our job is, to make
a difference." This journalist's credo can also serve well as guidance for
ethnographers. One of the most crucial payoffs of a really good ethnography is the
compelling story that it brings to readers. To illustrate this point, we close this
chapter with an extended quote from van Lier (1996a), who conducted research
in rural schools in Peru. His brilliant description of "the daily grind" provides
readers with a clear and vivid understanding of students' and teachers'
classroom lives.
The project in Peru was the initiation and monitoring of a bilingual
Spanish-Quechua program. Spanish is the dominant national language and
Quechua is the local language and, therefore, the home language of the
children. His portrait of a typical day begins as he leaves his base in the town
of Puno and spends weeks at a time in the rural, isolated altiplano, where
subsistence farming is the main industry. As you read the following paragraphs,
imagine yourself as an ethnographer in this context. The story begins with
van Lier telling us that "the trip takes about four hours, depending on road
conditions" (p. 368).

We arrive in Tiyaña just after eight in the morning. The school,
consisting of three classrooms (adobe walls and corrugated tin roof) and a
storeroom (almacen), which the director (the principal) has converted
into his living quarters, is still locked. There is no community center as
such (apart from the school itself), just a scattering of low adobe houses
strewn across the pampa, with tilled fields in between the houses and
further out, against the low foothills of the Cordillera. Gradually the
children and the teachers (three in total) begin to arrive from various
directions. Meanwhile, the awicha, or grandmother, a tiny old woman
living in a small house next to the school, comes over for a chat in
Quechua, from which I cannot gather much more than that the
teachers are good-for-nothing layabouts and drunks, the children are a
nuisance, and the weather continues to be awful.
When the teachers arrive, one by one, we greet, exchange news, and
discuss plans for the week. Around nine o'clock the profesor de turno
(teacher-on-duty; this week it's Gerardo's turn) whistles for the
formation. The children line up, grade by grade, facing the school. The
students receive a pep talk, general announcements about meetings or
various community activities, sing a couple of songs (including the
national anthem), perhaps do a bit of marching in place. All this is in
Spanish, with a few comments repeated in Quechua for the benefit of
the smallest ones (los mas chiquitos).
Students continue to arrive in dribs and drabs through the formation
(many of them having to walk up to an hour from their widely dispersed
homes across the pampa). Around 9:30 the students are marched into
their classrooms. In Tiyaña there are three teachers, which means that
every teacher has two grades. The director, Luis, has the first and sixth
grades, which is unusual (adjacent grades being the rule in combination
classes). I begin by observing a couple of classes and plan to start the oral
entry test for the first grade after lunch.
The classroom is fairly spacious, with barely adequate light from
two rows of small windows in opposite walls. About half of the
windowpanes are broken. There is no electricity (or running water or
other amenities) in the school or in Tiyaña as a whole, for that
matter, and the room is quite cold early in the morning. The sixth grade
students (ten children) sit at one end of the room, facing the back
wall, the first grade students (seventeen children) at the other, facing
the front.
At 10:30 there is recess, and Luis and I stand outside against the
wall which is now warmed by the sun to warm ourselves for a while.
Then the teachers, several of the biggest kids, and I play volleyball after
putting up the net. It is almost 11:30 before the director declares that it
is time to go in again. Another lesson follows and at 12:15 it is
lunchtime.
The children take their plate or mug and go to the back of the
school where the kitchen is. Several sixth graders and a few madres de
familia (mothers who do this voluntarily) have prepared el quaker
(oatmeal), which they dole out to the children.

At about 1:30 Gerardo blows the whistle for formation. At that
moment the Teniente Gobernador (the elected leader of Tiyaña), Don
Anselmo, arrives on his bicycle. We talk a bit, and he makes some
announcements to the assembled children about vaccinations for
cattle, the importance of hygiene, and the need to study hard in school.
He speaks in a mixture of Spanish and Quechua, with Quechua
predominating.
Around two o'clock the children march into their classrooms again.
Assisted by an older student, I start the oral Quechua test (about five to
ten minutes per child), using a structured interview format. At three
o'clock there is another recess which lasts 'til four, and then the children
are sent home. One teacher, Ignacio, rides home on his little motorbike.
The other teacher, Gerardo, walks across the pampa to a house where
he rents a room.
Later on Luis fetches a bucket of water from an open water hole
behind the school, and I help him cook dinner (the inevitable sopita, or
soup, made with potato, tomato, noodles, and a few other bits and
pieces that happen to be available) on a primus stove, sharing some
ingredients. Gerardo comes over to join us.
The other days proceed very much like the first one. On Wednesday
I go to Qotokancha (Ignacio gives me a ride over on his motorbike),
where there are only two teachers, each one taking care of three
grades—but otherwise a similar routine is followed. On Friday
afternoon I wait for the car to take me back to Puno. (pp. 368-369)

This level of detailed description continues throughout the rest of the report, in
which van Lier documents the instructional practices and learning challenges
facing the teachers and pupils.
The paper includes a fascinating micro-ethnographic analysis of the
children's efforts to follow instructions and copy what the teacher has written on the
chalkboard. But because the text is apparently meaningless to them, the children
copy individual letters vertically down the page, instead of writing words and
sentences horizontally. A particularly poignant observation has to do with their
copying of the Spanish word campo (field), when van Lier (1996a) notes:

Many students have written canipu or canipo instead of campo, every time
they have copied the word. This is a very puzzling error, until I take a
good look at the blackboard and notice a small spot on it, just above the
third leg of the letter m in campo. Since canipu/-o is not a word in
Spanish, this is clear evidence of how mechanical the students' copying
activities are. (p. 372)

Given the depth and detail of van Lier's writings, the reader is left with a vivid
picture of the educational context as well as a sense of compassion and clarity
about his understanding of the teachers and children in the altiplano of Peru.

CONCLUSION

In this chapter, we have considered the background and some key
characteristics of ethnography, which came into applied linguistics from the parent
discipline of anthropology. We summarized four principles of ethnography and the
typical stages of ethnographic research, exemplifying each point with excerpts
from classroom ethnographies. We looked at two of the important roles and
responsibilities ethnographers take on, including participant and nonparticipant
observation, and the goal of developing both emic and etic perspectives. Several
important issues were raised regarding quality control in ethnography, but we
found that, rather than focusing narrowly on the traditional criteria of
reliability and validity, ethnography can be more appropriately evaluated by its own
criteria.
Ethnography represents a very different research culture from experiments
and surveys. A rigorously researched, well-written ethnography can be a
powerful tool for informing readers about a particular cultural group (whether it is a
remote tribal culture, a neighborhood street gang, the weekend bowling league,
or a language class), and for relating the daily life of that group to the broader
cultural context within which it is situated.

QUESTIONS AND TASKS

1. Look at the following topics from language classroom research. First, list
two or three behaviors associated with each topic. Then categorize the
nature of each behavior as being either low-inference (clearly definable in
operational terms; easy to verify) or high-inference (less easily defined or
verified).
Error treatment
Fluent speech
Teaching the past tense
Self-initiated turns
Social climate
Teacher praise for students
Teaching style
Use of the first language
2. From the list above, choose one of the high-inference topics and/or
behaviors that interests you. Try to write a clear, operational definition for the
term. Check some key references on the topic to see how other researchers
have defined the construct. Do you see any ways that some elements could
be more low-inference in nature?
3. Read an ethnographic study you have located or one cited in this chapter
or listed below in "Suggestions for Further Reading." Does the study
utilize high-inference or low-inference categories or both in its data
collection and/or data analysis procedures? Were the categories clearly defined
and explicitly stated so that you could replicate the study if you wished?

4. In the ethnography you read, did the researcher function as a participant
observer or a nonparticipant observer? Were the data collection
procedures overt or covert?
5. What kinds of triangulation are utilized in the ethnography you read—
data triangulation, researcher triangulation, methods triangulation, and/or
theory triangulation? If one or more of these strategies was not used, can
you think of a way that the researcher(s) could have improved the study
through additional triangulation?
6. LeCompte and Goetz (1982) say ethnographic researchers should be
explicit about the following five key aspects of the research:
A. the status of the researcher(s) in the research site,
B. who the informants are and how they were chosen,
C. the social situations and conditions when the research was carried out,
D. the analytic constructs and premises underlying the research, and
E. the data collection methods and analytic methods.
Analyze the ethnography you have read. Did the author(s) report on all five
of these key points? If not, which are missing from the report?
7. LeCompte and Goetz (1982) also claim that ethnographies are strong
on internal validity on the basis of four characteristics of ethnographic
research:
A. The ethnographer's common practice of living among participants
and collecting data from long periods provides opportunities for
continual data analysis and comparison to refine constructs and to ensure
the match between scientific categories and participant reality.
B. Informant interviewing, a major ethnographic data source, necessarily
is phrased more closely to the empirical categories of participants and
is formed less abstractly than instruments used in other research
designs.
C. Third, participant observation, the ethnographer's second key source
of data, is conducted in natural settings that reflect the reality of the life
experiences of participants more accurately than do contrived settings.
D. Fourth, ethnographic analysis incorporates a process of researcher
self-monitoring ... that exposes all phases of the research activity to
continual questioning and reevaluation. (p. 43)
Which of these characteristics were present (and convincing) in the
ethnography you read? Which were weak or absent? What could the
researcher(s) have done to improve upon any gap(s) you have noted?
8. A link between teaching and learning, on the one hand, and classroom
ethnographies, on the other, is that if ethnography is "done right, it
actively encourages the participation of teachers and learners" (van
Lier, 1990a, p. 49). How does Duff's ethnography measure up in this
regard?

9. What quality control issues were addressed in the ethnography you read?
Did the study address reliability and validity or did it talk in more
qualitative terms about trustworthiness? If the latter, were credibility,
transferability, and dependability discussed?
10. Four quality control mechanisms specific to ethnography were described
by van Lier (1990a):
A. Every study needs to be scrutinized for its adherence to the emic and
holistic principles.
B. The notion of context needs to be examined in great detail and the role
of context in interpretation must be made explicit.
C. Ethnographic research must be open, that is, it must examine and report
its own processes of inferencing and reasoning, so that its procedures
can be profitably discussed.
D. Ethnography must be either broad (longitudinal) or deep
(micro-ethnographic). ... Ethnography requires intensive immersion in the data,
whether this is the daily language use of an entire culture, or one small
story told by a child. (p. 45)
Did any of these criteria emerge in the ethnography you read? If not, can
you see ways that one or more of these might have helped to improve that
study?
11. Think of three to five research questions that you might be able to address
by conducting an ethnography. What group would you wish to study?
What kind of data would you want to collect? Would you be a participant
observer, a nonparticipant observer, or both at different times? Would the
data be collected overtly or covertly?

SUGGESTIONS FOR FURTHER READING

We recommend all the ethnographies cited in this chapter. If you would like to
read more about the bilingual Spanish-Quechua program in Peru, please see van
Lier's (1996a) article. There is also a book-length account of this context by
Hornberger (1988). For further information about the classroom ethnography
in Sri Lanka, you can read Canagarajah's (1993) original article and an
interesting response to it by Braine (1994), who is from Sri Lanka himself.
We have cited Watson-Gegeo's (1988) paper extensively. A slightly different
treatment of ethnographic principles is provided by van Lier (1990a), who
stresses that two main principles of ethnography are (1) an emic point of view,
and (2) a holistic concern for context. See also van Lier's (1988) book.
Freeman (1992) investigated the construction of shared understandings in
a high school French class. This report provides an excellent example of a
researcher developing the emic perspective, and of the way research questions
emerge and evolve as an ethnographic study progresses.

If you wish to develop a research proposal in the general culture of
qualitative research, we recommend Marshall and Rossman's (2006) Designing
Qualitative Research. It provides clear guidance in reader-friendly prose.
Methodological guidance for people who want to do qualitative work in our
field is gradually becoming more available. We recommend Richards (2003),
the special issue of TESOL Quarterly edited by Davis and Lazaraton (1995), and
a review by Henze (1995).

8
Action Research

For teachers "to fully embrace the principles and philosophy of action
research, they need to begin by reinventing themselves. ... We can only
create alternatives to the existing method and structures after we have
restructured ourselves" (Mingucci, 1999, p. 16).

INTRODUCTION AND OVERVIEW


In this chapter, we look at action research—an approach that is particularly
well-suited for teachers conducting classroom research. First, we will define the
approach and then provide an example of the action research cycle. We will talk
about getting started with a plan for an action research study and about
collecting data. Then we will consider quality control issues in action research before
summarizing a sample study. We will close with the usual "payoffs and pitfalls,"
but we will also make a case for the use of action research as a powerful tool that
can empower teachers to take control of their own professional development.

Defining Action Research
Action research is becoming increasingly prominent in the research
methodology literature in our field. (See, for example, Burns, 1999; 2004; Edge, 2001;
Nunan, 1990; 1993; Wallace, 1998; van Lier, 1994a.) As an approach to research,
it has been around since the 1940s, when it first appeared in the social science
literature (Lewin, 1946; 1948). In the 1980s, it was adapted by educators such as
Carr and Kemmis (1986), who described it as follows:

Action research is simply a form of self-reflective enquiry undertaken by
participants in order to improve the rationality and justice of their own
practices, their understanding of those practices and the situations in
which the practices are carried out. (p. 162)

This description is widely cited and it highlights the practitioner-driven nature
of action research as well as the social justice bias bequeathed to the concept by
Lewin, a left-wing sociologist. However, it is rather too broad to work as a
definition for a form of research, being basically a statement about reflective
teaching. (See Richards and Lockhart, 1994.)
For us, there are key differences between reflective teaching and action
research. One is that reflective teaching can be a solitary and private practice, but
in action research, the results of the process—the outcomes or products—should
be published. We are using publish here in its original sense: to make publicly
available to others for critical scrutiny. Another difference is that reflective
teaching could conceivably occur at one point in time, after a particular lesson,
whereas action research is cyclic and iterative.
We define action research as a systematic, iterative process of (1) identifying
an issue, problem, or puzzle we wish to investigate in our own context;
(2) thinking and planning an appropriate action to address that concern;
(3) carrying out the action; (4) observing the apparent outcomes of the action;
(5) reflecting on the outcomes and on other possibilities; and (6) repeating
these steps again. To our minds, the cycle described above must be carried out
at least twice (and typically more often) for the investigation to qualify as
action research.
A more philosophical definition of action research is provided by Kemmis
and McTaggart (1982), who suggest that

[t]he linking of the terms 'action' and 'research' highlights the essential
feature of the method: trying out ideas in practice as a means of
improvement and as a means of increasing knowledge about the
curriculum, teaching and learning. The result is improvement in what happens
in the classroom and school, and better articulation and justification of
the educational rationale of what goes on. Action research provides a
way of working which links theory and practice into the one whole:
ideas-in-action. (p. 5)

In this quote, the authors highlight connections between theory and practice.
They also point out that action research entails more than simply providing
descriptive and interpretive accounts of the classroom, no matter how rich these
might be. Action research is meant to lead to change and improvement in what
happens in the classroom. But, in contrast to experimental research, as Kemmis
and McTaggart (1988) note,
[a] distinctive feature of action research is that, those affected by planned
changes have the primary responsibility for deciding on courses of
critically informed action which seems likely to lead to improvement, and
for evaluating the results of strategies tried out in practice. Action
research is a group activity. (p. 6)

Chapter 8 Action Research 227


Thus, action research is not simply some form of investigation grafted onto
classroom practice. Rather, it represents a particular stance on the part of the
practitioner—a stance in which the practitioner is engaged in critical reflection
on ideas, the informed application and experimentation of ideas in practice, and
the critical evaluation of the outcomes.

REFLECTION

Based on your previous knowledge and what you have read so far, what
would you say are the key characteristics of action research that distinguish
it from both naturalistic inquiry and experimental research in the
psychometric paradigm?

Characteristics of Action Research


To characterize action research in language classrooms, Nunan (1992)
emphasizes the centrality of the teacher. He notes that this approach will have
components similar to other types of research—that is, posing questions, collecting
data, and then analyzing and/or interpreting those data. However, it is
differentiated by the fact that it will be carried out by practitioners investigating their
own professional context.
Similarly, Kemmis and McTaggart identify three defining characteristics of the
approach. Educational action research, according to these authors, (1) is carried out
by classroom practitioners; (2) is collaborative in nature; and (3) is aimed at
bringing about change. Given this view, a teacher's descriptive observational research
that was aimed at increasing understanding rather than bringing about change
would not be considered action research by Kemmis and McTaggart—particularly
if the study was conducted without the involvement of others. For these authors,
collaboration and change are defining characteristics of the approach.
Cohen and Manion (1985) offer similar characteristics. They argue that
action research is first and foremost situational, being concerned with the
identification and solution of problems in a specific context. They also identify
collaboration as an important feature of this type of research, and they state that
the aim of action research is to improve the current state of affairs within the
educational context in which the research is being carried out. If this educational
context involves an entire school or program (as opposed to simply a single
class), certainly teachers collaborating are in a better position to achieve this goal
than are individuals working alone.

REFLECTION

Do you believe that collaboration should be a defining characteristic of
action research? Why or why not?

228 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


We believe that action research has all of the characteristics of 'regular'
research—that is, it requires research questions, data that are relevant to those
questions, analysis and interpretation of the data, and some form of publication.
We agree that it is the centrality of the classroom practitioner as a prime mover
in the action research process that defines the approach and differentiates it from
other forms of research. We also agree that action research should be aimed at
bringing about change rather than simply documenting what is going on.
However, we feel that Kemmis and McTaggart go too far in their assertion that in
order to qualify as action research, the process must be collaborative. Certainly,
collaboration is highly desirable. But to assert that such a process without
collaboration cannot be called action research is unrealistic. Many practitioners would
dearly love to collaborate, but they are simply not in a position to do so.

REFLECTION

Based on what you have read so far, how would you explain action research
to a colleague who wanted to know more about it?

THE ACTION RESEARCH CYCLE

Most writers on action research agree that it is an iterative, cyclical process
rather than a onetime event. In other words, unlike the "one-shot case study"
design in experimental research, at least two action research cycles are required in
order to resolve the problem or puzzle that initiated the research.
There are many visual frameworks that depict the action research cycle. We
like the following image from van Lier (1994a), which is simple and clear. (See
Figure 8.1.) It also depicts the fact that a researcher's goals may change over the
course of an investigation. In addition, van Lier's model explicitly includes the
step of reporting on the outcomes of the study.
In action research, the practitioner first identifies a problem or puzzle and
conducts a preliminary investigation to gather baseline data. She then forms a
hypothesis (though not necessarily in formal hypothesis testing language) and plans
the intervention. Next, she takes action and observes the outcomes (a step
analogous to data collection). Then the researcher reflects on the outcomes (analyzes
and interprets the data) and identifies a follow-up issue (or continues with the
same issue), which informs a new cycle. Table 8.1 provides an example (p. 231).
The sequence depicted in Table 8.1 would not necessarily stop at the end of
the second cycle. The process could continue indefinitely, as long as new puzzles
or problems suggest new goals. In the sample study at the end of this chapter, we
will summarize an action research project that continued through two phases
and included several cycles.

What Action Research Is Not


FIGURE 8.1 Cycles of action research (from van Lier, 1994a, p. 34)

Sometimes it is helpful in trying to understand the characteristics of something
as abstract as a research approach to determine what that entity is not, or what it
does not do. Kemmis and Henry (1989) made these statements about what
action research is not:

1. It is NOT the usual thing teachers do when they think about their
teaching. It is systematic and involves collecting evidence on which to base
rigorous reflection.
2. It is NOT (just) problem solving: it involves problem posing, too. ... It is
motivated by a quest to improve and understand the world by changing it
and learning how to improve it from the effects of the changes being made.
3. It is NOT research on other people. Action research is research by
particular people on their own work, to help them improve what they do,
including how they work with and for others.
4. It is NOT the "scientific method" applied to teaching.... It adopts a view of
social science which is distinct from a view based on the natural sciences (in
which the objects of research may legitimately be treated as "things"); action
research also concerns the "subject" (the action researcher) him- or herself.
Given these statements, we can see that action research is unlike both
experimental research and naturalistic inquiry. Action research differs from
experimental research in that the former—like naturalistic inquiry—works with
naturally occurring groups and does not impose artificial control over variables.
However, unlike researchers in the naturalistic inquiry tradition, action
researchers do seek to intervene and bring about change (as do psychometric
researchers conducting experiments). In fact, the "action" in action research is
parallel to the treatment in experimental studies, but external researchers are not
applying the treatment to subjects. Instead, in action research, the participants
themselves decide what to do to bring about positive change.

TABLE 8.1 The action research cycle (adapted from Nunan, 1992, p. 19)

Cycle and Step                              Example

Cycle 1
Step 1: Problem/puzzle identification       "Student motivation is declining over the course of
                                            the semester."
Step 2: Preliminary investigation           "Interviews with students confirm my suspicion."
Step 3: Hypothesis formation                "Students do not feel they are making progress from
                                            their efforts. Learning logs will provide evidence of
                                            their progress."
Step 4: Plan intervention                   "Have students complete weekly learning logs."
Step 5: Take action and observe outcomes    "Reading the learning logs suggests that learners are
                                            not really aware of their own progress."
Step 6: Reflect on outcomes                 "Motivation is improving, but not as rapidly as
                                            desired."

Cycle 2
Step 7: Identify follow-up puzzle           "How can I ensure more involvement and
                                            commitment by learners to their own learning
                                            process?"
Step 8: Second hypothesis                   "Developing a reflective learning attitude on the part
                                            of learners will enhance involvement and motivation
                                            to learn."
Step 9: Take action and observe outcomes    "At the end of each unit of work, learners complete a
                                            self-evaluation of learning progress and attainment of
                                            goals."
Step 10: Reflect on outcomes                "Self-evaluations show not all learners feel they are
                                            improving, even though I think they are."

Like naturalistic inquiry but unlike experimental research, the research
questions may evolve as action research proceeds. And, because it is concerned
with a particular situation, action research "tends to be rather messy and
unpredictable" (van Lier, 1994b, p. 7), so the data collection and analysis procedures
may also change. We turn now to discussion of the action research cycle with an
example of how this evolution may come about.

Involving Our Students in Action Research


As teachers, to solve problems in our classrooms, we often need to understand the
students' perspectives. There are several simple ways to collect data from students
in an action research project, including many that are used in other approaches.
Depending on the students' level of proficiency, you can collect data either in
their home language or in the language they are studying. What matters most in
this context may be their ideas (rather than or in addition to their target language
proficiency). For this reason, you should consider the research question(s) you are
addressing and choose the language that will allow the learners to understand
the data collection processes and express themselves best.
Here are two examples of tools for collecting data from language learners.
These are from a research report by Quirke (2001), who wanted to
systematically elicit feedback from his students. He wrote, "The purpose of the
investigation was to give students a voice in my teaching and thus help me adapt it to
accommodate those students" (p. 82). Quirke carried out his action research for
eight months in his reading classes at a women's college in Abu Dhabi. At the end
of the lesson, he gave students time to complete the form in Figure 8.2.
At the beginning of the term, Quirke's students often left the why questions
unanswered, but they began to provide responses around the third week of the
term. By the end of the course, the students were providing answers to every
question on the form.
Nine approaches to teaching reading were used over the next nine weeks.
(See Figure 8.3.) At the end of the term, Quirke gave his students a form that
listed the approaches they had used and asked for their ratings on the criteria of
usefulness, difficulty, interest, and enjoyment. (His article does not explain the
individual teaching approaches.)

Please complete the following honestly.
I will use your responses to plan our approach to reading classes next
semester.

Week                    Title of text

Approach

What was the aim of this class?

What did you learn during this class?

What did you enjoy most about the class?

Why?

What did you find most difficult?

Why?

Other comments:

FIGURE 8.2 A form for eliciting feedback on a reading lesson
(from Quirke, 2001, p. 86)



Please complete the following table with numbers
1 = the most (e.g., the most enjoyable, the most useful)
9 = the least (e.g., the least enjoyable, the least useful)

Week  Approach            Useful  Difficult  Interesting  Enjoyable
1     Jigsaw reading
2     Question cards
3     Silent reading
4     Group reading
5     Analysis
6     Clap
7     Prediction
8     Question writing
9     Paragraph matching

FIGURE 8.3 A form for eliciting students' feedback on reading
approaches (adapted from Quirke, 2001, p. 86)
Another technique Quirke used is called a graffiti board—the use of a large
sheet of paper or a chalkboard to gather students' feedback. The teacher would
then leave the room during the regularly scheduled breaks and invite the
students to add their comments on the graffiti board. After experimenting with
various strategies for getting students' critical input, Quirke found that the graffiti
board idea worked well if he started the process by
inviting negative feedback, as shown in Figure 8.4. (The spelling and vocabulary
reproduced here are the students' own work.)

What I DIDN'T like about the lesson        What I found MOST difficult

I don't like your lines.                    The different with the 3 consepts
You give me sentence but I don't give       The draws for the lines because they
you one                                     are not clear.
I cant see the connection to now which      The sentences and examples are not
you always say.                             clear
I get confused wit the past and perfect
together—why do you do this???
Why your line so confessing?
What concept?

FIGURE 8.4 A sample of a graffiti board (from Quirke, 2001, p. 88)



Quirke found that the students provided useful feedback that helped him to
rethink and revise his lesson plans. He notes, "As far as my students are
concerned, I am satisfied that these experiences of having their opinions sought and
seeing them respected, did indeed help students developing their critical
thinking skills" (ibid., p. 90).

REFLECTION

If you are currently teaching, think about an action research study you
could conduct to address an issue in your own classes. What data
collection tools and procedures could you use to elicit the students' input in that
context?

QUALITY CONTROL ISSUES IN ACTION RESEARCH


Quality control in action research is largely a matter of being systematic and
committed to engaging in the process over a period of time. Careful records
must be kept regularly (e.g., by dating all diary entries, photocopying students'
assignments, etc.), and the process of data collection must be religiously
conducted. For instance, it is not acceptable for a teacher to skip writing in his
teaching journal for a week because he is too busy. If the journal is a data collection
procedure, it must be kept regularly and conscientiously over time. Creating and
protecting the time to do so is part of the research process.
The discussion of triangulation in Chapters 6 and 7 is pertinent to action
research as well. You will recall that Denzin (1978) described four types of
triangulation. These are (1) data triangulation, which draws on different sources of data
(teachers, students, parents, etc.); (2) theory triangulation, in which various
theories are brought to bear; (3) researcher triangulation, when more than one
researcher contributes to the investigation; and (4) methods triangulation, which
involves the use of multiple methods (e.g., interviews, questionnaires,
observation schedules, test scores, journal entries, etc.) to collect data.
As with case studies and ethnographies, all these types of triangulation can
be usefully employed in action research. In particular, because of the
participatory nature of action research, data triangulation can be used in most such
studies. In our field, this goal usually involves gathering data from teachers and
students. Administrators, young learners' parents, adult students' employers,
and teachers' aides may also contribute data, depending on the focus of the study.
The procedures discussed above (Quirke, 2001) provide some options for
eliciting data from students.
Methods triangulation is also a natural fit with action research, which uses
any data that can address the research questions posed when problems and/or
puzzles are identified. Researcher triangulation can be developed by
collaborating with colleagues or by inviting an external researcher to help with an action
research project. Finally, to the extent that an action research project draws on
theory, it is also possible to implement theory triangulation.

SAMPLE STUDY

The sample study presented here incorporates many of these quality control
issues and illustrates a process for eliciting students' ideas. The teacher, John
Thorpe, conducted this action research project in his own ESL class, but he
collaborated with a colleague who was teaching the same group of students in a
different class. The colleague focused on vocabulary learning in his action research
project while Thorpe wanted to explore options in teaching listening
comprehension through the use of television news broadcasts.
This focus came about because Thorpe was working with three Korean adult
learners of English who were enrolled in a two-semester program sponsored by
their government. Upon their return to Korea, the students would be facing a
battery of English proficiency examinations. These students were worried
about a listening comprehension test which consisted almost entirely of news
broadcasts, so Thorpe decided to include television news broadcasts as part of
their regular biweekly two-hour lessons. Thorpe (2004) described his study as a
"two-phase, eight-cycle action research project" (p. 1). He gathered data by
videotaping lessons, keeping a teaching journal, and having his students
complete a questionnaire each week. He also discussed the videotaped lessons with
his colleague.

REFLECTION

Based on the information given so far, what type(s) of triangulation did
Thorpe use in his action research project?

Thorpe began his action research project with a literature review that
informed his teaching as well as his research focus. Thorpe was an experienced
teacher, but he was open to learning new things. He wrote,
Although I have over twelve years of classroom experience, I have only
used TV news broadcasts sparingly. A review of the relevant literature...
uncovered only scant information pertaining specifically to the
pedagogical use of TV news broadcasts.... Although more recent resource
books (e.g., Larimer & Schleicher, 1999) do include some useful ideas,
in light of the popularity of TV news, I feel additional research using
TV news broadcasts in the second or foreign language classroom is
needed. This project aimed, then, to seek possible answers... to the
following research question: How can I best teach listening comprehension
using TV news broadcasts? (pp. 1-2)



The study also addressed another question: Will participants and the teacher
agree or disagree on the effectiveness of instruction? This question stemmed from
Kumaravadivelu's (1994; 2003) thinking on perceptual mismatches between
teachers and students. Thorpe was curious about whether his interpretation of lesson
activities matched those of his students.
In his graduate courses, Thorpe had read about action research, but this was
the first time he actually carried out an action research study of his own. He
knew this was the right approach to use, however, since he wanted to investigate
and improve his own teaching while simultaneously helping his students to
improve their listening comprehension. He notes that action projects typically
do not follow the experimental tradition:
My participants, for example, were not randomly selected nor were
they necessarily typical of other language learners. I did not administer
a pre- or post-test specifically designed to quantitatively measure their
competence or achievement before or after a specified treatment.... In
contrast, action research breaks down "the dichotomy between
researcher and researched" (Auerbach, 1994, p. 695) and recognizes the
value of intangibles. As van Lier (1996b) has keenly observed,
"Intangibles are often more influential than tangibles. If you can't see it, that
doesn't mean it isn't there. If you can't count it, that doesn't mean it
doesn't count" (p. 2). (Thorpe, 2004, p. 7)

REFLECTION

Imagine yourself in Thorpe's position. How could you collect baseline data
about your students' listening comprehension? What actions might you
want to implement to address the questions you would pose?

Given this grounding, Thorpe initially planned the project as five research
cycles to be conducted over five weeks during one term of instruction, but the
project continued into a second phase the following semester. The interventions
that he tried as the course progressed included giving the students their choice
of which news story to listen to, providing transcripts of the story, letting the
students choose when to view the transcript, providing written comprehension
questions about the broadcast, having the students themselves write the
questions, and varying whether the topic of the news story was familiar or unfamiliar
to the learners. The two phases consisted of eight cycles, which are summarized
in Table 8.2.
The data collection processes Thorpe used were varied and systematic.
They gave him a clear picture of the students' responses to these interventions.
To collect data, he videotaped all the listening activities in his class for a month.
These included brief broadcasts from CNN Student News, which had not been
simplified for language learners. Thorpe also used the transcripts of these
broadcasts, which were available on CNN's Web site.



TABLE 8.2 Two phases of action research cycles (adapted from
Thorpe, 2004, p. 40)

Cycle  Story        Topic        Control      Transcript        Interventions
       Selection    Familiarity  of Task

PHASE 1

1      Teacher      Unfamiliar   Teacher      Not used          Participants choose story. Assumed they
                                                                had background knowledge.
2      Participant  Familiar     Participant  Not used
3      Participant  Unfamiliar   Teacher      Used before       Use of transcript prior to listening.
                                              listening
4      CNN          Unfamiliar   Teacher      Not used          CNN "chooses" story. Teacher prepares
                                                                comprehension questions. Participants
                                                                choose when to view questions.
5      Participant  Familiar     Participant  Used after        Participants choose story, participants
                                              listening         write questions. Transcript used at
                                                                the end.

PHASE 2

1      Teacher      Familiar     Teacher      Not used          Parody instead of "straight" news.
2      Teacher      Familiar     Teacher      Optional use      The News Hour replaced CNN Student
                                              after initial     News. Longer news story; grid format.
                                              listening
3      Teacher      Familiar     Teacher      Optional use      Task given after initial listening; no
                                              after listening   pre-task schema activation.

Thorpe had two lessons per week with these students, so he planned each
week to comprise one action research cycle. During the Tuesday class, he
would implement the actions he had planned. Then each Thursday, after they
had completed the listening activity, the students filled out a short
questionnaire eliciting their ideas about both the news story and the classroom
procedures used with it. The teacher himself also completed this questionnaire. The
questionnaire format was based on an idea from Christison and Bassano (1995)
in which the learners simply placed a vertical mark on a nine-centimeter line
to indicate their opinion. This procedure allowed Thorpe to compare the
participants' responses—both to each other's ideas and to their own subsequent
responses over time. The student questionnaire is reprinted as Figure 8.5.

Directions: Please draw a single, straight line (|) on the horizontal continuum
(—) to indicate your opinion. For example:

Compared to the United States, Japan is
very big |---------------|--| very small

The news story was
very easy |------------------| very difficult

I found the teaching activity
not at all helpful |------------------| very helpful

I would recommend that John use the activity again.
strongly disagree |------------------| strongly agree

Compared to the activity we did last week, I liked this one
a lot less |------------------| a lot more

In general, I feel my aural understanding of TV news has
not increased at all |------------------| increased significantly

Other comments (optional):

FIGURE 8.5 Student questionnaire about lessons used (adapted from
Thorpe, 2004, p. 39)

Before having the students complete the questionnaire, Thorpe explained to
them how he was changing the teaching procedure. He also showed them their
earlier questionnaires so they could compare their current reactions to those of the
previous week. It was simple and quick for the students to fill out the form, and the
data it yielded gave Thorpe information he would not have otherwise had.
Thorpe also recorded his own impressions of the lessons in his teaching
journal, noting any particular problems or insights. He gave his colleague a
videotape of the lesson, and they discussed it before he planned the next cycle.
This process continued for five weeks—the five cycles of the first phase. Then
Thorpe compiled all the students' responses, as shown in Figure 8.6. The
numbers refer to the five research cycles.

The news story was
very easy |  3 5   2   4  | very difficult

I found the teaching activity
not at all helpful |  1 2   5   4  | very helpful

FIGURE 8.6 Students' cumulative responses to five cycles of the action
research project

FIGURE 8.7 Individual students' ratings of the story difficulty

Thorpe (2004) measured how far from the left end each mark was and
created line graphs using these data. This procedure allowed Thorpe to compare
the students' various responses to the questions and to find patterns occurring
over the five-week investigation. He could also compare his reactions to those of
the participants.
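The measuring step described above can be sketched in a few lines of code. This is a minimal illustration, not Thorpe's actual procedure: it assumes each mark is measured in millimeters from the left end of the nine-centimeter line, treats the distance in centimeters as the score, and uses invented measurements. The function names mark_to_score and student_teacher_gaps are hypothetical.

```python
from statistics import mean

LINE_LENGTH_MM = 90  # the nine-centimeter response line


def mark_to_score(distance_mm: float) -> float:
    """Convert a mark's distance from the left end (mm) to a 0-9 score."""
    if not 0 <= distance_mm <= LINE_LENGTH_MM:
        raise ValueError(f"mark at {distance_mm} mm is off the line")
    return distance_mm / 10


def student_teacher_gaps(scores: dict[str, list[float]],
                         teacher: str = "J") -> list[float]:
    """For each cycle: mean student score minus the teacher's score."""
    students = [name for name in scores if name != teacher]
    n_cycles = len(scores[teacher])
    return [
        mean(scores[s][i] for s in students) - scores[teacher][i]
        for i in range(n_cycles)
    ]


# Hypothetical measurements in mm for two cycles (NOT Thorpe's data).
marks_mm = {
    "M": [60, 30],
    "K": [50, 25],
    "C": [70, 35],
    "J": [40, 30],  # the teacher
}
scores = {rater: [mark_to_score(mm) for mm in marks]
          for rater, marks in marks_mm.items()}
gaps = student_teacher_gaps(scores)
print(gaps)  # [2.0, 0.0]
```

A positive gap means the students placed their marks further to the right than the teacher did in that cycle. The per-rater score lists could then be passed to any plotting tool to draw line graphs of ratings by cycle.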
Figures 8.7 and 8.8, respectively, show the data from the teacher (J) and the
students regarding the difficulty of the news stories and the helpfulness of the
teaching activities. The nine-point scale derives from the fact that the students
marked their impressions on the nine-centimeter line. This process gave Thorpe
a clear visual way to represent and compare his students' opinions.



[Line graph: ratings from 1 to 9 on the vertical axis, cycles 1 to 5 on the
horizontal axis, with one line each for M, K, C, and J]

FIGURE 8.8 Individual students' ratings of the teaching activity

REFLECTION

What patterns do you notice in Figures 8.7 and 8.8? If you were the
teacher, how would you interpret these data?

These data were compiled after the fifth cycle (at the end of the first course
Thorpe [2004] had with these students and at the end of the first phase of the
action research project). Here's part of what the teacher had to say about his
interpretation of these data:

The line charts depicting the participants' collective responses are
both enlightening and puzzling. On the one hand, a clear pattern can
be seen between how participants rated the difficulty of a story and
the teaching activity used: as perception of difficulty goes down,
approval of the activity goes up. . . . The participants also appeared to
link ease of understanding to the activity itself. In other words, if they
found the story easy, it was in part due to the teaching procedure; if
they found it difficult, this was at least partially due to the procedure
as well. (p. 20)

Throughout the report, Thorpe uses initials to refer to the students, thereby
protecting their confidentiality. They are known to readers as C, K, and M. In
one case where the students had trouble understanding a news broadcast,
Thorpe wrote,

I was unsure if their difficulties were due to language, rate of speech,
context, or subject matter. The situation was clarified upon reading the
questionnaires. M wrote, "I found a big gap of understanding between
international news and domestic news. It could be helpful to balance
both of them." (p. 12)

Thorpe wondered if interest in the news stories might also be a factor in the
students' listening comprehension. He and his colleague discussed this issue and
decided he should let the students themselves choose the particular news story
based on the brief "teasers" that CNN uses at the start of the broadcast to tell the
listeners what stories will be covered. Of this decision he wrote,

The primary drawback—that it would be more difficult for me to plan
because I would have to prepare all the stories, not just one—led to a
personal insight that had eluded me over my 12-year teaching career:
I like to be in complete control. The realization brought to mind a
series of nightmares I had had several years ago in which I entered my
classroom only to realize that I had completely forgotten to prepare
for the lesson! Despite being a very vocal advocate of student-centered
classes, my actions showed that I prefer to control the classroom
situation, exemplified by my use of carefully created and planned
classroom material. What is particularly odd about this revelation is that I
encourage students to take risks . . . and to be fearless in testing their
hypotheses about language. Yet I did not appear willing to take my
own risks by sharing control with the students. Thus the action
research process of collaborative investigation transcended my original
intentions. . . . I became aware of something previously unknown to
me. As I would now consciously cede this absolute control I wondered:
How would it go? (p. 12)

On Tuesday, the first day of the new cycle, Thorpe played the preview of the
news broadcast and let the students choose the story they would listen to. They
chose one about household germs. The teacher wrote,

Although I had viewed the story before, my knowledge was not as
thorough or complete as last week's news stories because my preparation
time had been evenly spread across all four stories. However, since the
participants would play a more active role in the process, I did not feel
my knowledge of the story was insufficient. I felt the power distribution
was more in balance because I, too, was seeking information from the
news story. (p. 13)

At the Thursday class, continuing with this particular intervention, Thorpe had
the students select a story from the news broadcast. They chose a report about a
Burger King restaurant in Baghdad. Of this experience, Thorpe said,
Although there were parts I thought might give them problems I
suppressed my desire to control the activity. My assumption was that,
unless they specifically asked me, I would assume they understood the
story. Of course, such reliance on the students is not always advisable.
I could think of past situations where it would not be appropriate, but
I believe these particular learners were capable of taking responsibility
for their own learning. (p. 15)
Because the students took a more active role, Thorpe and his colleague felt that
this cycle was better than the previous week's cycle, but only one student (K)
thought so. The teachers wondered about the students' evaluations as the two
of them were trying to promote a more learner-centered approach. They decided
that culturally based expectations might be influencing the students' evaluations.
Perhaps these learners were accustomed to a more traditional view of teachers'
and students' roles (i.e., a knowledge-transmission model of education). To
address this issue, Thorpe decided on a different action step. He would assign a
particular task, but the students would still choose the news story they wanted to
listen to. They selected a report on a serious wildfire in California. Before the
class, Thorpe had cut the transcript of the story into strips that represented the
turns of the various speakers in the broadcast.

Each participant was given a packet of the strips and asked to arrange
them in order. In reviewing their results, I was pleased not only with
their ability to accomplish the task but in their reasoning as well. For example,
C explained his arrangement by saying that he realized that the
newscaster would probably speak first and last and that the reporter's
speech would be interspersed with eyewitness testimony or some other
corroborative statement. (p. 16)

After the students predicted the structure of the news story and possible alternatives,
the teacher played the videotape for them. When he played it the second
time, he stopped at the end of each segment so the students could ask questions.
At the next class meeting, on Thursday, he repeated this process. He wrote,

This time, the participants chose a story about a mine rescue in Russia.
One additional insight participants gained from this activity was the
use—very common in TV news—of word plays and puns. In this case,
the newscaster referred to the rescue as a "miner miracle." What the participants
realized was that the outcome of the story dictated the use of
these plays on words. If, for example, all or most of the miners had died,
the pun would not have been appropriate. This insight would be helpful
in anticipating the outcomes of subsequent TV news stories and potential
reasons why, for example, the newscaster smiled without apparent
reason. I have found that these word plays and puns are often very
difficult for learners to understand, even at advanced levels. (p. 16)

Thorpe read the questionnaires and found that the students viewed the story
as relatively easy. He thought this perception was due to the fact that they had
read the transcript before listening to the story. He wrote,

K and M both marked the recommendation of activity very highly as well.
C, however, marked the activity as the lowest to date. What was most
surprising about the results was that Stuart and I both thought that C
was clearly the most involved in the task as he asked more questions and
volunteered more comments than M or K. What could account for our
misinterpretation of his classroom behavior? (pp. 16-17)

[Teachers'] subjective interpretations of participants' cues—e.g., involvement,
participation, and success in completing tasks—cannot be
assumed to be signs of pedagogical approval. These are components of
our self-defined criteria of a successful lesson. Students, however, have
different goals and different ways of evaluating the effectiveness of pedagogical
procedures. I could not tell whether it was the task itself—the
unscrambling of the transcript—that he did not like or doing it prior to
viewing the story. Perhaps the procedure conflicted with his own preferred
metacognitive strategy, which Oxford (2002) notes includes an
individual's learning style preferences and how they plan a learning task.
It could also be the result of a pedagogic mismatch (Kumaravadivelu,
2003), or a gap between what C felt were the learning objectives of the
task and what I was trying to accomplish. (p. 17)

When Thorpe read C's comment to see what he had not liked, he found that the
student had written, "We can listen first, and then answer about the story, then
take a look at the transcript" (p. 17). Thorpe decided to use this student's suggestion
during the next action research cycle. He also decided to address another
student request—namely, that the class review the previous lesson's news story.
After a discussion with his colleague, Thorpe decided to use CNN's lead story
and write six questions in advance that required specific information about the
story. He would give the students their choice of either hearing the questions before
or after he played the tape.
For the fifth and final cycle, Thorpe decided to have the students themselves
write comprehension questions about the news story. He reasoned that this
process would

require them to construct a base meaning... from the scant input provided
by the news tease. To put the control of the activity in their
hands even more, I would allow them to again choose the news story. I
would also give them a copy of the transcript, this time after playing
the clip. The final change would be the elimination of the previous lesson's
replay as my journal entries for this week expressed displeasure
over the inordinate amount of time devoted to listening. (p. 19)

In my journal I noted that these lessons were among the best. The questions
the participants generated were logical and well-planned and they
controlled more of the lesson. Their reactions, however, were mixed.
None liked the activity more than when I wrote the questions. (p. 20)

REFLECTION

At this point, Thorpe had finished the five-cycle action research project he
had originally planned to conduct. What might he do next, if he decided to
continue the project through further action research? Brainstorm a list of
possible options for further investigation.

These procedures continued for five weeks. After completing these five cycles,
Thorpe decided to extend the project for an additional three cycles. During
this second phase, the data collection procedures were similar but not identical
to those of the first phase. In the data collection for the second phase, Thorpe
continued to videotape his classes, write in his teaching journal, and have his students
complete the questionnaire about each lesson's activities. However, a
schedule change meant that Thorpe and his colleague could now observe each
other's teaching instead of being limited to viewing videotapes of one another's
classes. Thorpe also made a change in his source materials from the minute-long
CNN Student News reports:

In an effort to accommodate the participants' request and also to collect
data on the use of longer video news segments, I decided to seek alternative
news sources. I subsequently selected The News Hour with Jim
Lehrer for several reasons. First, the stories are much longer—some as
long as 20 minutes—and have a mix of straight news reporting and
panel-like discussions often consisting of two people with opposing
views. (pp. 22-23)

Thorpe posted a link to The News Hour's Web site on his course Web site since it
contained complete transcripts of the broadcasts as well as links to the audio and
video clips. This resource enabled the students to review the news stories if they
wished.

Prior to beginning the second phase, I showed the participants a story
from The News Hour to ensure their approval of the change. They told
me that the style of reporting was similar to a Korean news broadcast and
that, although the story they viewed was much longer, it was fairly easy
to understand because both Jim Lehrer and The News Hour reporters
spoke slowly and pronounced their words carefully. All three also indicated
that they liked the change from CNN Student News because of the
longer, more in-depth news stories the program offered. (p. 23)

As he had done after the first phase, Thorpe plotted the students' and his own
data from phase two on line graphs. Even though there were only three action
research cycles in this phase, the data yielded information that was helpful to the
teacher. About the second phase, Thorpe wrote,

An interesting and noteworthy pattern involved the similarities between
my questionnaire responses and those of the participants. Unlike in
phase one, where there were noticeable differences, in this phase our
evaluations were more closely aligned. In particular, M's scores and
mine were far more similar in phase two than in phase one. I think one
possible explanation could be that by including the participants in the
data collection process I was able to "align" my perceptions with theirs.
(p. 30)

It seemed that the regular feedback Thorpe got from the students helped him
to predict their needs, to select materials and tasks they would find useful, and
to judge the kinds of news broadcasts that would be appropriate. He
wrote,

The project had the added benefit of fostering awareness-raising skills,
which Freeman (1989:28) believes trigger a "process of decision making
based on the constituents of knowledge, skills, attitude, and awareness."
In other words, it is through awareness that one's professional development—his
or her knowledge, skills, and attitudes about teaching—will
grow. My realization that I am, to be blunt, a control freak, is one example
of my heightened awareness. The results of phase one also showed
me that it is a good tactic to give participants the power to choose
teaching material and to give them a say in deciding pedagogical
procedures. However, I also found that these participants do not want
complete control and prefer that I ask specific "answerable" questions
derived from the story to gage their comprehension. (pp. 30-31)

Thorpe found that the perceptual mismatches between his view and the students'
were less pronounced in phase two than in phase one. He attributed this difference
to the systematic practice of collecting their feedback regularly in each
cycle. He wrote,

I think that including the participants in the project also increased the
likelihood they would get involved in their own learning as well. Involvement
in action research—both in this study and my collaborative
role in [my colleague's] project as well—can "make our work more purposeful,
interesting, and valuable, and as such it tends to have an energizing
and revitalizing effect" (van Lier, 1994a, p. 33). Although action
research does not aim at generalizability, I have discovered both personal
and professional insights that definitely can be applied to many
other teaching situations. (pp. 31-32)

This summary shows how Thorpe used the action research cycles to systematically
vary the changes he made in his teaching. He collaborated with a colleague
and linked his own thinking to some related professional literature. After he finished
this project, Thorpe wrote,

One of the things I like best about doing action research is it motivates
me to try something new and different and not to fret if the lesson tanks.
In short, I think that the process of recording data in a systematic way
by videotaping lessons, creating and then analyzing the results of student
questionnaires, being observed by colleagues, and writing a journal
can, in van Lier's (1994a) words, "make our work more purposeful, interesting,
and valuable, and as such it tends to have an energizing and
revitalizing effect." (p. 33)

PAYOFFS AND PITFALLS

There are both advantages and disadvantages of conducting action research. In
this section, we will address likely pitfalls first and comment on possible solutions.
We will then discuss the payoffs of doing action research.

Challenges in Doing Action Research


Nunan once worked as an advisor to a network of high school language teachers
who were conducting action research. These teachers taught a variety of languages
(e.g., Spanish, Italian, Vietnamese, Indonesian, Polish, and Greek) in
Australia. The idea of setting up a support network with Nunan as the facilitator
was to assist teachers in developing the basic skills of research design. These
included (1) identifying a problem and turning it into a researchable question;
(2) deciding on appropriate data and data collection methods; (3) determining
the best way of collecting and analyzing the data; and (4) evaluating the research
plan and reducing it to manageable proportions.

This group kept journals of their experiences during the semester-long
project. Nunan asked them to document the challenges and difficulties they
encountered. A content analysis of their records at the end of the semester
revealed five major areas of concern as follows: (1) lack of time, (2) lack of expertise,
(3) lack of ongoing support, (4) fear of being revealed as an incompetent
teacher, and (5) fear of producing a public account of their research for a wider
(unknown) audience.

Lack of time was the single biggest impediment for these teachers to carry
out their action research. It was mentioned by every teacher in the network, and
some teachers mentioned it virtually every time they made comments in their
journals. Teachers are busy people, and involvement in the action research network,
without release time from any of their other duties, added considerably to
the burden of their daily professional life.

Not surprisingly, the second most frequently nominated roadblock on the
road to success was lack of expertise. The word research raises all sorts of fears
and uncertainties in the minds of teachers. It conjures up images of scientists in
white coats with measuring instruments and mysterious methods of carrying out
statistical analyses. In fact, one of the benefits of engaging in action research is to
demystify the notion of research and the idea that one needs a license to practice.
We believe most teachers have the potential to do classroom research, and they
should be encouraged to add a reflective teaching and/or action research dimension
to their professional armory.

The third most frequently nominated challenge was lack of support 'on the
ground.' This lack of support most often came from the individual to whom
the teacher reported (most typically the department chair or panel head) or the
school principal. In some cases, the principal refused to sign the release allowing
the research to go ahead. In other instances, permission was granted reluctantly—the
attitude being, "Well, this is a lot of nonsense, but if you want to go
ahead and waste your time, feel free. However, don't let it interfere with your
proper job—which is to teach."

Interestingly, resistance sometimes came from colleagues. This negativity
took the form of an attitude that to do research indicated that one had ideas
above one's station. Lurking behind these negative attitudes was the notion that
the appropriate job for a teacher is to teach, not to do research, and that this
'make-believe' role as researcher was not a legitimate thing for a teacher to be
doing. To be fair, the opposite reaction was also encountered. A number of
teachers reported that their status and esteem had risen among their peers as a
result of having taken part in the action research network.

Some participants worried that by conducting action research, they might be
revealed as incompetent teachers. Indeed, any form of research carries within it the
possibility of a negative result—or no result at all. This view is reinforced to a certain
extent by mainstream published research, which rarely reports that research
outcomes were inconclusive. These teachers were investigating aspects of their
own practice. Some felt that an inconclusive or negative outcome could be interpreted
as a sign of failure, an indication that the person was an incompetent teacher.
The fact that the results would be made public added to this particular anxiety.

Indeed, the fear of having to produce a public report on their research—
particularly for a wide, unknown audience—was the final most frequently nominated
problem area. Teachers who had no trouble developing a sensible and
coherent plan and putting it into action baulked when it came to writing up and
making their research public. A number wanted to stop at this point, asking
"Why do we have to make it public?" and "I find writing so difficult." The answer,
of course, is that without a public account, other teachers cannot benefit
from your insights, and you cannot benefit from their feedback. The publication
need not be an article in an international refereed journal, or even a formal
written report. It might take the form of a presentation at a local teachers'
conference or a discussion among colleagues at a brown-bag lunch.

Possible Solutions
We have experimented with a number of solutions to these problems. Chances
of success for any given project will be maximized if there is someone to 'own'
the project. Likewise, it is advisable that one or more advisors with training in
research methods and experience in doing research are available as needed to
provide assistance and support to teachers. In addition, teachers should be given
some release time from face-to-face teaching during the course of their action
research. Collaborative teams can be formed, desirably across schools or teaching
sites, so that teachers involved in similar areas of inquiry can support one another.
Finally, it is important that teachers are given adequate training in
methods and techniques for identifying issues, collecting data, analyzing and interpreting
data, and presenting the outcomes of their research. We will address
each of these issues in turn.

First, completing an action research project is a little like completing a
marathon at the same time as you carry out a wide range of other tasks. In order
to succeed, teachers have to be in it for the long haul. After an initial burst of enthusiasm,
some teachers 'hit the wall' (as marathon runners say). Energy and enthusiasm
can begin to wane, and many teachers are tempted to put off essential
data collection or analysis tasks, or even abandon the project completely.

Having an enthusiastic partner or team member can go a long way towards
maintaining enthusiasm about the project. In the two action research networks
Nunan advised, local facilitators filled this role. Both were senior teachers who
had considerable experience as educational administrators. One of the local facilitators
was completing a doctorate and was able to answer many teachers'
queries directly. The other facilitator had recently completed a master's degree
and was able to get help from his former professors. Importantly, these two facilitators
had also successfully carried out action research projects of their own.
This experience gave them credibility among the teachers and enabled them to
act as a bridge between teachers and administrators.

These local facilitators were proactive as well as reactive. They maintained
frequent contact with the teachers involved in the network through telephone
conversations, e-mail correspondence, and occasional face-to-face meetings, and
they were able to identify those teachers who were at risk of dropping out. When
teachers contacted them with practical problems and blockages, the facilitators
were able to offer advice from their own perspectives.

Secondly, even with the support of a collaborative network of fellow teachers,
doing action research can be lonely and isolating. The chances of long-term
success will be enhanced if someone is available at reasonably short notice to
provide technical advice. This support is important at all stages of the action research
cycle.

Third, in order for teachers to succeed at conducting action research, it is
important that they get some relief from their normal duties while they are
doing action research. As an alternative, participating in an action research project
could be counted as meeting professional development hours required by
many school systems.

Fourth, establishing collaborative teams of action researchers, preferably
across schools or teaching sites, can be very helpful. Such teams provide a structure
whereby teachers involved in similar areas of inquiry can support one
another. In addition, it is harder to abandon a project if your colleagues are
counting on you to support them in their action research endeavors.

Another challenge is to convince teachers that qualitative data collection and
analysis are, in fact, legitimate approaches to research. Many who have had minimal
contact with research hold the mistaken idea that research must necessarily
always involve statistical analyses. Ironically, it is this notion that lies behind
much of the trepidation that teachers feel about doing research.

Finally, teachers should receive adequate training in procedures for identifying
issues, for collecting, analyzing and interpreting data, and for presenting the
outcomes of their research. Like any other form of classroom research, successful
action research demands adequate planning and preparation. Providing teachers
with training in research methods and adequate planning time before they embark
on the project will enhance the chances of success. At the beginning of the
process, once teachers have identified an issue, problem, or puzzle, it is important
for them to think small. Many teachers, in the first enthusiastic flush of the
project, begin sketching out a proposal that would require substantial doctoral-level
research. (Remember the story about Japanese flower arranging!)

Benefits of Action Research


There are several significant payoffs for teachers who carry out action research
investigations in their classrooms. In the first place, the research is centered on
real problems, puzzles, or challenges teachers face in their daily work. It can
therefore carry immediate benefits and tangible improvements to practice. Secondly,
it can lead teachers to see connections between 'mainstream' theory and
research and their own practice. Third, by increasing the teacher's control over
and active involvement in his or her immediate professional context, it can empower
the teacher.

One of the strong claims of proponents of action research is that it leads to
improvements in practice. In our experience, teachers have been able to change
their lesson plans and curricula, build collegial relations, and increase their confidence
by conducting action research. Nunan (1993) documented changes made
to classroom practice by the teachers as a result of being involved in the action
research network described above. Although several teachers collaboratively investigated
a particular issue (for example, implementing task-based teaching in
their classrooms), most teachers worked on individual projects. However, once a
month, they all met together for a half-day workshop to exchange ideas, share
problems, and generally support each other. These teachers reported on the
changes they had made in their classroom practice as a result of being involved
in action research. (See Table 8.3.)

Table 8.3 illustrates a positive 'ripple effect.' These teachers felt that engagement
in action research not only helped them solve specific problems in their
classrooms, but it also led them to improve their classroom management and
interaction skills.
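
For readers who would like to explore the counts in Table 8.3 themselves, the following is a minimal sketch in Python rather than part of Nunan's (1993) analysis. The three numbers for each action are the More, About the Same, and Less counts transcribed from the table. The function names and the "net change" figure (More minus Less) are assumptions introduced here purely for illustration. Because not every teacher responded to every item, the share reporting "more" is computed within each row.

```python
# Counts transcribed from Table 8.3 as (more, about the same, less).
TABLE_8_3 = {
    "tend to be directive": (1, 14, 10),
    "try to use a greater variety of behaviors": (16, 6, 0),
    "praise students": (15, 10, 0),
    "criticize students": (0, 11, 13),
    "am aware of students' feelings": (18, 6, 0),
    "give directions": (4, 16, 5),
    "am conscious of my nonverbal behavior": (11, 14, 0),
    "use the target language in class": (19, 6, 0),
    "am conscious of nonverbal cues of students": (12, 12, 0),
    "try to incorporate student ideas into my teaching": (20, 5, 0),
    "spend more class time talking myself": (1, 9, 15),
    "try to get my students working in groups": (15, 8, 0),
    "try to get divergent open-ended student responses": (14, 10, 0),
    "distinguish between enthusiasm and lack of order": (9, 15, 0),
    "try to get students to participate": (18, 7, 0),
}


def summarize(counts):
    """Return (respondents, share reporting 'more', net change).

    'Net change' (more minus less) is an illustrative measure assumed
    for this sketch; it does not appear in the original table.
    """
    more, same, less = counts
    total = more + same + less
    return total, more / total, more - less


def rank_by_net_change(table):
    """List the actions from largest reported increase to largest decrease."""
    return sorted(table, key=lambda action: summarize(table[action])[2], reverse=True)


if __name__ == "__main__":
    for action in rank_by_net_change(TABLE_8_3):
        total, share_more, net = summarize(TABLE_8_3[action])
        print(f"{action:<50} n={total:>2}  more={share_more:5.0%}  net={net:+d}")
```

Ranking the rows this way simply makes the 'ripple effect' easier to see at a glance. It describes what these teachers reported and says nothing about what caused the changes.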
TABLE 8.3 Changes made to classroom practice as a result of taking
part in action research (adapted from Nunan, 1993, p. 47)

Action                                               More   About the Same   Less
tend to be directive                                   1         14           10
try to use a greater variety of behaviors             16          6            0
praise students                                       15         10            0
criticize students                                     0         11           13
am aware of students' feelings                        18          6            0
give directions                                        4         16            5
am conscious of my nonverbal behavior                 11         14            0
use the target language in class                      19          6            0
am conscious of nonverbal cues of students            12         12            0
try to incorporate student ideas into my teaching     20          5            0
spend more class time talking myself                   1          9           15
try to get my students working in groups              15          8            0
try to get divergent open-ended student responses     14         10            0
distinguish between enthusiasm and lack of order       9         15            0
try to get students to participate                    18          7            0

If we think back to the issues of control over variables in experimental
research leading to internal validity, we can see that the lack of control in action
research means there can be no strong internal validity in the classical psychometric
sense. As action researchers, we do not try to exert control over variables
that might influence the outcomes, so we cannot unequivocally say that the
planned interventions caused the observed results. Instead, in conducting action
research, teachers seek out options that seem to them to be convincing solutions
to problems or classroom puzzles.
Likewise, there can be no external validity in action research because the
conditions in any given setting could never be duplicated in another. Indeed,
generalizability is not a goal of action research. Rather, action researchers seek
local understanding and wish to improve their own practice.

However, like case studies and ethnographies, action research reports are
often thought-provoking and comparable. We will cite just one example here. In
Chapter 3, we reported on Sato's (1982) observational research on turn taking by
Asian students in English classes. This concern is one that many teachers have
faced. In fact, Tsui (1996) wrote a report on action research projects conducted
by thirty-eight teachers in Hong Kong about this very issue.

In Tsui's report, the teachers first collected baseline data on the interaction
in their classrooms. The teachers themselves identified five possible reasons to
account for the students' apparent reluctance to speak in class. These were
(1) the students' low English proficiency, (2) their fear of making mistakes and
being derided for doing so, (3) the teachers' intolerance of silence, (4) the teachers'
uneven allocation of turns, and (5) the incomprehensible input the students
experienced in class—often in the teachers' speech.

As a result of their analyses, the teachers came up with several strategies for
encouraging their students to speak during their English classes. The teachers
chose various ways to document these strategies, which included lengthening
their wait time, improving their questioning techniques, and accepting a variety
of answers. Some teachers created more opportunities for peer support and group
work and for establishing good relationships with the learners. Some chose to
focus more on the content of students' contributions rather than the form.

Tsui (1996) did not make any claims about the generalizability of these findings,
but the fact of the matter is that getting students to speak the target language
in language classes is a concern for teachers in many parts of the world. It
may be that some of the strategies these teachers used will be directly helpful to
others, or indirectly helpful in suggesting alternative strategies.

CONCLUSION

We end this chapter where it began, with a quote from Mingucci (1999), who
wrote about the role of action research in professional development. She said
that for teachers

to fully embrace the principles and philosophy of action research, they
need to begin by reinventing themselves. Practitioners must look at
themselves and their practices, as if for the first time, and try to see themselves
as the central object of their research if true change is to occur. We
can only address the outside world after we have addressed our individual
internal ones. We can only create alternatives to the existing method
and structures after we have restructured ourselves. (p. 16)

Thorpe's experience of learning to yield some control in his lessons illustrates
this point.

In our view, action research can contribute to both the knowledge base of
the field and the ongoing professional development of the teachers who use it to
investigate important issues in their own classrooms. As van Lier (1994a) has
noted, "Action research is hardly ever short-term, but... a way of working in
which every answer raises new questions, and one can thus never quite say, 'I've
finished.'" (p. 34)

QUESTIONS AND TASKS


1. Look back at the summary of Thorpe's (2004) action research project.
Which of the "payoffs" described above were present in his case?
2. There are now several Web sites devoted to educational action research.
Here are five that we believe are useful:
   1. A Web site of the Madison Metropolitan School District, Research
   Abstracts, is geared specifically to elementary and secondary school
   contexts. The address is
   http://www.madison.k12.wi.us
   2. The Action Learning Project Web site provides reports and a description
   of a large project in Hong Kong. The address is
   http://www.acad.polyu.edu.hk/~etwalp/
   3. A Web site at Southern Cross University, Australia, while not specifically
   about language teaching research, is very helpful. It can be found at
   http://www.scu.edu.au/schools/gcm/ar/arhome.html
   4. A good Web site for a useful list of action research URLs that demonstrate
   lessons learned from other researchers can be found at
   http://www.cudenver.edu/~mryder/itc/act_res.htm
   5. An extensive database of articles and links to other resources can be
   found at
   http://carbon.cudenver.edu/~mryder/itc/act_res.html
Visit one or more of these Web sites and see what sort of information and
support are provided that could be helpful to you.
3. Read one or more of the action research reports described in the "Suggestions
for Further Reading" below. What problems or puzzles were investigated?
What data collection procedures were used?
4. If you are currently teaching, identify a problem or puzzle in your work
that could be investigated by using the action research cycle. The first step,
as illustrated by the teachers in Tsui's (1996) report, is to gather baseline
data to help you understand the situation better. How could you gather
such data? What sorts of data collection procedures are appropriate, given
your context and your focus?
5. After you have collected some baseline data, determine your first action
step. What exactly do you want to do in hopes of improving your situation?
6. Determine how you will gather data while and after you implement the
first action step. Would videotaping or tape-recording be appropriate? Or
could you invite a colleague to observe your lessons? Is there some sort of
information that your students will produce which you could legitimately
use in your study?
7. Turn back to Chapter 5 and review the various types of questionnaire
items. What sort of questionnaire might be helpful in the action research
project you are envisioning?
8. Think about how you could analyze the data that you do collect. What
sorts of data analyses might be involved in your action research project?
9. Imagine a debate between a committed action researcher and a committed
experimental researcher. What would they disagree about in terms of
(1) permissible research questions, (2) ideal data collection procedures,
and (3) appropriate data analysis procedures? What would they probably
agree about?



SUGGESTIONS FOR FURTHER READING

Two book-length treatments about action research in language education are
Collaborative Action Research for English Language Teachers by Burns (1999) and
Action Research for Language Teachers by Wallace (1998). A collection of action
research studies by language teachers was edited by Edge (2001). Michonska-
Stadnik and Szulc-Kurpaska (1997) produced a volume of action research
reports done by teachers in Poland.

Some early influential writing about action research in general education
was done at Deakin University in Australia. Resulting publications include Carr
and Kemmis (1986), Kemmis and Henry (1989), and Kemmis and McTaggart
(1982; 1988).
A number of publications from the National Center for English Language
Teaching and Research (NCELTR) in Australia report teachers' investigations
using action research. See, for example, the volumes edited by Burns, de Silva
Joyce, and Hood (1995; 1997; 1999a; 1999b; 1999^; and 2000).
Other publications on action research in contexts of second or foreign
language education include Burns (1995; 1997; 2000), Chamot (1995), Chan
(1996), Crookes (1993; 1998; 2005), Curtis (1998; 1999), Duterte (2000), Kebir
(1994), Knezedvic (2001), Knowles (1990), Kwan (1993), McPherson (1997),
Mingucci (1999), Mok (1997), Nunan (1990; 1993), Ruiz de Gauna, Diaz,
Gonzales, & Garaizer (1995), Szostek (1994), Tinker Sachs (2000), and van Lier
(1992).
D. Allwright (1997; 2003; 2005) describes an alternative to action research
that he calls exploratory practice (Allwright & Lenzuen, 1997). It emphasizes
understanding over bringing about improvement. See also Fanselow and Barnard
(2006).
If you would like to know more about perceptual mismatches between
students' and teachers' views of language lessons, see Block (1996) and
Kumaravadivelu (1994; 2003).



PART III

Data Collection Issues


Getting the Information
You Need

In this section, we look at three families of procedures for obtaining language
classroom data. The first of these is classroom observation, the second is
the use of introspective methods, and the third is elicitation procedures.
Although they have some natural alignments with some traditions rather than
others, all of these procedures can be useful in any of the research traditions
discussed in earlier chapters.

Chapter 9: Classroom Observation


By the end of this chapter, readers will
• describe five options for collecting data during classroom observations;
• evaluate, discuss, and critique two widely used observation schemes for
coding classroom data;
• discuss the advantages and disadvantages of observation schemes as
opposed to ethnographic narratives;
• articulate the issues involved in transcribing classroom interaction;
• use diagrammatic schemes such as maps and seating charts for capturing
classroom interaction.

Chapter 10: Introspective Methods of Data Collection


By the end of this chapter, readers will
• define introspective data collection and differentiate between introspection
and retrospection;
• define and exemplify the following introspective methods: think-aloud
protocols, stimulated recall, diary studies, and [auto]biographical research;
• discuss quality control issues in introspective data collection.

Chapter 11: Elicitation Procedures
By the end of this chapter, readers will
• describe and exemplify a range of elicitation procedures, including
interviews, production tasks, role plays, questionnaires, and tests;
• identify five different types of interviews and explain the differences among
them;
• describe the relationship between production tasks and learner discourse;
• discuss the advantages and disadvantages of using elicitation procedures to
gather data.



CHAPTER 9

Classroom Observation

Observation is always selective. It needs a chosen object, a definite task,
an interest, a point of view, a problem.... It presupposes interests, points
of view and problems (as cited in Cohen and Cohen, 1980, p. 266).

INTRODUCTION AND OVERVIEW


In this chapter, we look at methods for directly documenting life inside the
classroom. As language classrooms are specifically constituted in order to facilitate
learning, it makes eminent sense to observe what goes on there.
In early classroom research, when product studies were dominant, there were
relatively few investigations that involved a solid observational component. Since
process studies have become more widely accepted in classroom research, the use of
observation procedures in data collection has increased. In fact, with the recent
emphasis on process studies and process-product studies, observational data collection
procedures have become widely accepted and, in some cases, are seen as essential.

The Contexts of Classroom Observation


In Chapter 1, we quoted van Lier's (1988) definition of a classroom: "The L2
classroom can be defined as the gathering, for a given period of time, of two or
more persons (one of whom generally assumes the role of instructor) for the
purposes of language learning" (p. 47). We also noted that now, with the advent of
online learning, the concept of a classroom has been expanded to include
"virtual" classrooms, in which lesson participants may be separated by miles, even
by continents! Nevertheless, D. Allwright's (1988) characterization of classroom
research still holds true, whether teachers and learners are interacting in a
physical space or an electronic lesson:
Classroom-centered research is just that—research centered on the
classroom, as distinct from . . . research that concentrates on the inputs
to the classroom (the syllabus, the teaching materials) or the outputs
from the classroom (learner achievement scores). It does not ignore in
any way or try to devalue the importance of such inputs and outputs. It
simply tries to investigate what happens inside the classroom when
learners and teachers come together. (p. 191)
That is, indeed, a key purpose of observation: "to investigate what happens
inside the classroom when learners and teachers come together" (ibid.).

REFLECTION

Given the information above about classrooms, how would you define
classroom observation?

Defining Classroom Observation


For our purposes in discussing language classroom research, we will define
classroom observation as a family of related procedures for gathering data during actual
language lessons or tutorial sessions, primarily by watching, listening, and
recording (rather than by asking). These procedures are both electronic and
manual in nature, as shown in Figure 9.1.

Collecting Information During Observations
    Manually
        Field Notes
        Observation Schedules
    Electronically
        Video-Recordings
        Audio-Recordings
        Synchronous and asynchronous chat records

FIGURE 9.1 Options for collecting data during classroom observations
(adapted from Bailey, 2006, p. 95)



Manual data collection can be either open-ended, as in the case of field
notes that lead to ethnographic narratives, or constrained by an
observation schedule. This term may sound as if it refers to a calendar of
appointments for visiting classes, but an observation schedule is actually a codified
system of observation categories. (People also say observation scheme or
observation system.)
Historically, electronic data collection has been done with audiotape and
videotape recorders, but recent technological developments are influencing our
electronic data collection options. For example, online classes involve
synchronous chats and asynchronous discussions, both of which yield records of the
interaction.

APPROACHES TO CLASSROOM OBSERVATION


There are many ways of documenting classroom life, whether you are working
as a participant or a nonparticipant observer. (See Chapter 7.) For example,
you can use a range of manual data collection procedures, including field notes,
observation systems, maps, and seating charts. If you have made electronic
recordings of a lesson, you can code the data, but you can also transcribe the
actual student and teacher utterances that occurred during the lesson. For
many forms of online learning, the actual synchronous or asynchronous
records of written turn taking provide natural transcripts that can be analyzed
for some purposes. The various tools that researchers use to document what
goes on in the classroom will have an impact on what they see and how they
interpret what they see.
With strong records (transcripts, audiotape or videotape recordings, or
high-quality field notes) you can also use stimulated recall (Gass and Mackey, 2000).
Stimulated recall, as the name suggests, is a procedure by which a researcher
stimulates the recollection of a participant in an event by having that person
review data collected during the event. The data used in stimulated recall usually
consist of videotape or audiotape recordings, or transcripts made from such
recordings, though some researchers have also used field notes. The benefit of
using stimulated recall in classroom research is that you can document the
perspective of lesson participants without interrupting them while the lesson is in
progress. Also, by prompting their memories with data from the event, you can
get better information than simply by asking them to remember the lesson
without supporting data. We will return to stimulated recall in Chapter 10, when we
discuss introspective methods.
Three basic approaches to documenting classroom interaction are
(1) through the use of observation systems to code data (either in real time or
using recorded data), (2) by recording and transcribing classroom interactions,
and, less commonly, (3) by producing ethnographic narratives. In this section, we
will look briefly at each of these approaches to show you how they provide very
different perspectives on the lessons they document.



Using Observation Systems to Code Classroom Data
Some of the earliest classroom research in general education involved training
observers to use category systems to document students' and teachers' behaviors
and speech during lessons. Most observation systems can also be used to code
interaction captured with videotape or audiotape recordings. Observation systems
can be useful in focusing the observers' attention and in addressing some
research questions.

REFLECTION

Study the data in Table 9.1. This record was produced by a classroom
observer who used an observation scheme and made a tally mark every time a
particular behavior occurred. How much can you tell about the lesson in
which the observation took place? Can you make inferences about the
following, for example?
• what the size of the class is
• whether the students are children or adults
• whether it is an EFL or ESL class
• what the focus or objective of the lesson is
• how long the interaction lasts
• when the interaction took place (at the beginning, middle, or end of the
lesson)

REFLECTION

Given the tallies in Table 9.1, try to imagine the discussion that produced
these data. As a hint, we will tell you that the interaction takes place in a
classroom at the beginning of a lesson after lunch.

A coding system like the one depicted in Table 9.1 can either be used while
the lesson is proceeding (in "real time") or later with videotapes or audiotapes of
the lesson. If the system is used by an observer during the actual lesson, the tally
marks are either made at regular time intervals (e.g., every three seconds) or
every time there is a category change. Long (1980) describes and names these
two approaches:

When each event is coded each time it occurs, we are dealing with a true
category system. When each event is recorded only once during a fixed
period of time, regardless of how frequently it occurs during that
period, we have a sign system. (p. 6)



For instance, with some sign systems, observers code the activities every three
seconds. Over time, the cumulative data provide a picture of predominant
behaviors.
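
The contrast that Long (1980) draws can be sketched in a few lines of code. What follows is only an illustration under stated assumptions, not part of any published observation instrument: the event log, the category labels, and the function names are hypothetical, and the three-second interval is borrowed from the example above. The sketch assumes each observed event has been logged with a time stamp in seconds.

```python
from collections import Counter

# Hypothetical log of (time_in_seconds, category) pairs for a short stretch of a lesson.
events = [
    (0.5, "teacher question"),
    (1.2, "learner answer"),
    (2.0, "teacher question"),
    (4.1, "learner answer"),
    (7.5, "teacher praise"),
    (8.0, "teacher question"),
]


def category_system(events):
    """Tally every occurrence of every category."""
    return Counter(category for _, category in events)


def sign_system(events, interval=3.0):
    """Tally each category at most once per fixed time interval."""
    seen = set()  # (interval index, category) pairs already recorded
    for time, category in events:
        seen.add((int(time // interval), category))
    return Counter(category for _, category in seen)


print("category system:", dict(category_system(events)))
print("sign system:    ", dict(sign_system(events)))
```

With these invented data, the two teacher questions that fall inside the first three seconds count twice under the category system but only once under the sign system, which is why the two approaches can give different pictures of the same lesson.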

REFLECTION

What would be the advantages and disadvantages of using a category
system or a sign system to record classroom interaction?

Producing Ethnographic Narratives


In Chapter 7, we discussed ethnography. Many of the procedures of
ethnographic research have been adopted by classroom researchers. Figure 9.2
presents an ethnographic narrative of the same interaction that was analyzed in
Table 9.1. You will get a very different perspective on the interaction from this
data set. Here the actual language used by the teacher and students is brought
out, along with the interpersonal dynamics and the affective climate of the
classroom.

TABLE 9.1 Use of an observation scheme for analyzing classroom interaction

Category                                                      Tallies   Total
 1. Teacher asks a display question (i.e., a question         III       3
    to which he or she knows the answer)
 2. Teacher asks a referential question (i.e., a question     IIII      4
    to which he or she does not know the answer)
 3. Teacher explains a grammatical point                                0
 4. Teacher explains the meaning of a vocabulary item                   0
 5. Teacher explains a functional point                                 0
 6. Teacher explains point relating to the content            I         1
    (theme/topic) of the lesson
 7. Teacher gives instructions/directions                     IIIIII    6
 8. Teacher praises                                           I         1
 9. Teacher criticizes                                                  0
10. Learner asks a question                                   III       3
11. Learner answers a question                                IIII      4
12. Learner talks to another learner                                    0
13. Period of silence or confusion                                      0



The teacher enters the classroom in conversation with one of the students.
"Of course I had lunch," he says. "Not enough. Why? Why?"
The student gives an inaudible response and joins the rest of the students
who are sitting in a semicircle. There are eighteen students in all. They are a
mixed group in both age and ethnicity.
The teacher deposits three portable cassette players on his table and slumps
in his chair. "Well, like I say, I want to give you something to read—so what you
do is, you have to imagine what comes in between, that's all..." He breaks off
rather abruptly and beckons with his hand, "... Bring, er, bring your chairs a little
closer, you're too far away."
There is some shuffling as most of the students bring their chairs closer. The
teacher halts them by putting his hand up, policemanlike, "Er, ha, not that close."
There is some muffled laughter. The teacher is about to speak again when a
young male student breaks in with a single utterance, "Quiss?" The teacher gives
him a puzzled look. "Pardon?"
The student mutters inaudibly to himself and then says, "It will be quiss? It
will be quiss? Quiss?" Several other students echo, "Quiss. Quiss."
The teacher grins and shakes his head, "Ahm, sorry. Try again." The student
frowns in concentration and says, "I ask you..." "Yes?" interjects the teacher.
"... You give us another quiss?"
Slowly the light dawns on the teacher's face. "Oh quiz, oh! No, no, not today.
It's not going to be a quiss today. Sorry... But, um, what's today, Tuesday, is it?"
"Yes," says the student.
The teacher frowns and flicks through a notebook on his desk. "I think on
Thursday, if you like. Same one as before. Only I'll think up some new questions—
the other ones were too easy."
The students laugh, then the teacher, holding up the daily newspaper,
continues, "Um, Okay, er I'll take some questions from, er, from the newspaper over the
last few weeks, right? So—means you've got to watch the news and read the
newspaper and remember what's going on. If you do, you'll win. If not, well, that's life."
One of the students, a Polish woman in her early thirties, says, "Will be
better from TV." There is laughter from several of the students.
"From the TV?" echoes the teacher. "What, er, what programs?"
"News, news," interject several of the students.
There is an audible comment from one of the students. The teacher turns
sharply and begins, "Did you say... ?" He breaks off abruptly. "Oh, okay. We'll
have, er, it'll be the s ..., it'll be the same." He pauses and then adopts an
instructional tone as he attempts to elicit a response from the students. "There'll
be different... ? Er, there'll be different... ? Different? Different? The
questions will be on different... what? Different?"
"Talks," ventures one of the students near the front.
"Tasks? What?" says the teacher, giving a slight frown.
"Subject?" suggests another student rather tentatively.
The teacher gives her an encouraging look and says, "Different sub..." He
extends his hand and narrows his fingers as if to say, "You've nearly got it."
"Subjects," says the student, beaming.
The teacher beams back, "Subjects, subjects, thank you. Right, yes."

FIGURE 9.2 Ethnographic narrative of a segment of classroom interaction



As you can see, the ethnographic narrative provides a more complete
picture of the interaction than do the coded tally data. The actual language used by
the participants is documented, along with comments about the social climate
of the classroom. The major disadvantages of this particular method of carrying
out classroom observations are the extremely time-consuming nature of the task
and the biases inherent in the authorial comments that are woven into the
narrative itself.

ACTION

Compare the tally data in Table 9.1 with the narrative data in Figure 9.2.
List three to five differences in the sorts of information these data provide.

REFLECTION

What do you see as the advantages and disadvantages of these two methods
of recording classroom interaction (tallies and ethnographic narratives)?

In a workshop on classroom interaction and research, a group of teacher
researchers came up with the points in Table 9.2 in response to the prompt above.

Transcribing Classroom Interaction


The third main way of presenting the interaction is through transcription. Making
a transcript of classroom interaction can be very time-consuming. In fact, based on
their experience, Allwright and Bailey (1991) estimate that it takes up to twenty
hours to produce a high-quality transcript of one hour of classroom interaction,
depending on how detailed the transcript needs to be. (Good recordings of pair
work or small group work take less time to transcribe, since the audiotape recorder
can be closer to the individual speakers, and there are typically fewer overlapping
turns or extraneous noises to complicate the transcription process.)
Transcripts can be analyzed through varied means, including coding. This
procedure involves identifying selected bits of data as belonging to a certain class
or category of behaviors. Here is a transcript that has been coded using a scheme
devised by Bowers (1980), whose interaction categories are listed in Table 9.3.

REFLECTION

Before reading further, use the categories in Table 9.3 to analyze the verbal
data reported in Figure 9.2. Which categories are clear and easy to apply?
Which categories, if any, seem unclear or difficult to apply to the data?



TABLE 9.2 Advantages and disadvantages of observation schemes
and ethnographic narratives for recording classroom interaction

               Advantages                             Disadvantages
Observation    May seem objective                     Likely to distort actuality
System         Good for observer to use while         Does not show the human element
               watching class                         Very abstract
               Good for self-analysis by teacher      Focuses on quantity not quality
               Easy to compare different              Does not indicate success or failure
               interactional categories               Does not indicate sequences of
               Easy to focus on specific elements     interaction
               Orients one's mind set as observer     Open to misinterpretation
               Visual presentation: easy to           Categories create straitjacket
               overview                               Categories may be biased toward
                                                      teacher
                                                      Does not indicate length of
                                                      interaction

Ethnographic   Displays significant paralanguage      Difficult to use for clinical
Narrative      Reflects rapport between teacher       purposes
               and students                           Time-consuming to write
               Gives overall effect of interaction    Allows distraction by focusing on
               Can be used to carry out subsequent    unimportant details
               tally-sheet analysis                   Open to emotive reporting
               Shows real nature of questions         Inadequate on its own
               asked                                  Cannot be done in real time
               Context given to support the           Requires high-quality recording
               language                               equipment and/or note-taking skills

The data reported in the ethnographic narrative in Figure 9.2 are presented
below as a transcript (adapted from Nunan, 1989, pp. 80-81). It has been
analyzed using Bowers' (1980) categories.
T Of course I had lunch ... not enough ... why? Why? (sociating)
Well, like I say, I want to give you something to read (organizing)
—so what you do is, you have to imagine what comes in between, that's all
... (organizing)
... Bring, er, bring your chairs a little closer, you're too far away er, ha, not
that close (organizing)
S Quiss? (eliciting)



TABLE 9.3 Bowers' (1980) categories for analyzing classroom interaction

Category     Description
Responding   Any act directly sought by the utterance of another speaker, such as
             answering a question.
Sociating    Any act not contributing directly to the teaching/learning task, but
             rather to the establishment or maintenance of interpersonal
             relationships.
Organizing   Any act that serves to structure the learning task or environment
             without contributing to the teaching/learning task itself.
Directing    Any act encouraging nonverbal activity as an integral part of the
             teaching/learning process.
Presenting   Any act presenting information of direct relevance to the learning task.
Evaluating   Any act that rates another verbal act positively or negatively.
Eliciting    Any act designed to produce a verbal response from another person.

T Pardon? (responding)
S It will be quiss? It will be quiss? Quiss? (eliciting)
Ss Quiss ... quiss (eliciting)
T Ahm, sorry... try again (eliciting)
S I ask you ... (eliciting)
T Yes?
S You give us another quiss? (eliciting)
T Oh, quiz, oh! No, no, not today... It's not going to be a quiss today...
sorry ... but, um, what's today, Tuesday, is it? (eliciting)
S Yes (responding)
T I think on Thursday, if you like . . . same as before . . . only I'll think up
some new questions—the other ones were too easy . . . um, okay, er I'll
take some questions from, er, from newspapers over the last few weeks,
right? so—means you've got to watch the news and read the newspaper
and remember what's going on ... if you do, you'll win ... if not, well,
that's life (organizing)
S Will be better from TV (sociating)
[laughter]
T From the TV? ... What, er, what programmes ... (eliciting)
Ss News, news (responding)
T Did you say... ? Oh, oh, we'll have, er, it'll be the s ..., it'll be the same ...
there'll be different... ? er, there'll be different... ? Different? Different?
The questions will be on different... what? Different? (eliciting)
S Talks (responding)



T Tasks? (evaluating)
What? (eliciting)
S Subject? (responding)
T Different sub . . . (eliciting)
S Subjects (responding)
T Subjects, subjects, thank you . . . right, yes (evaluating)

ACTION

Check your own analysis against the categorizations above.
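
Once a transcript has been coded, the codes can be tallied to see which acts predominate and who produces them. The following is a minimal sketch of that step, not a procedure described by Bowers (1980): the list simply re-enters, in order, the speaker and code attached to each act in the transcript above, and the variable names and the "teacher"/"learners" grouping are our own assumptions for illustration.

```python
from collections import Counter

# (speaker, code) for each act in the coded transcript above, in order.
# "T" = teacher, "S" = one student, "Ss" = several students.
# None marks the one teacher turn ("Yes?") that carries no code in the transcript.
coded_acts = [
    ("T", "sociating"), ("T", "organizing"), ("T", "organizing"),
    ("T", "organizing"), ("S", "eliciting"), ("T", "responding"),
    ("S", "eliciting"), ("Ss", "eliciting"), ("T", "eliciting"),
    ("S", "eliciting"), ("T", None), ("S", "eliciting"),
    ("T", "eliciting"), ("S", "responding"), ("T", "organizing"),
    ("S", "sociating"), ("T", "eliciting"), ("Ss", "responding"),
    ("T", "eliciting"), ("S", "responding"), ("T", "evaluating"),
    ("T", "eliciting"), ("S", "responding"), ("T", "eliciting"),
    ("S", "responding"), ("T", "evaluating"),
]

# Totals per category, ignoring the uncoded turn.
by_code = Counter(code for _, code in coded_acts if code is not None)

# Totals per (speaker group, category); "S" and "Ss" are both treated as learners.
by_speaker = Counter(
    ("teacher" if speaker == "T" else "learners", code)
    for speaker, code in coded_acts
    if code is not None
)

for code, total in by_code.most_common():
    teacher = by_speaker[("teacher", code)]
    learners = by_speaker[("learners", code)]
    print(f"{code:<11} total={total:>2}  teacher={teacher}  learners={learners}")
```

Under this coding, eliciting accounts for 11 of the 25 coded acts, and two of the seven categories in Table 9.3 (directing and presenting) are never used. Tallies of this kind summarize the segment quickly, but they say nothing about the order in which the acts occurred.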

SYSTEMS FOR OBSERVING LANGUAGE CLASSROOM INTERACTION

In the history of language classroom studies, researchers have developed
numerous observation schemes for documenting the life of language classrooms. (Such
systems are also called observation schedules, but in this book we will use the terms
observation schemes or observation systems to avoid the possible confusion with a
time schedule.) This proliferation has from time to time been criticized on the
grounds that studies employing different schemes are difficult to compare. The
reason for the proliferation is that different research questions and issues
demand different data collection tools.
The earliest systems for observing language lessons were heavily influenced
by research in general education. For instance, Moskowitz (1971; 1976) adapted
Flanders' (1970) Interaction Analysis system to produce an instrument called
Foreign Language Interaction Analysis, or "FLint." This instrument used the
basic categories for coding teacher talk and student talk that Flanders had
developed, but it also allowed for coding the interaction in two languages as needed
for L2 educational research.
Observation systems can focus on many different facets of interaction:
verbal, paralinguistic, nonlinguistic, cognitive, affective, pedagogical content,
aspects of discourse, etc. (For an elaboration on these and other aspects of
observation schemes, see Chaudron, 1988.) Copies of several different observation
systems can be found in the appendices of Allwright and Bailey (1991).
In evaluating and selecting an observation scheme, the following questions
need to be taken into consideration.

1. Does the scheme require the observer to check a behavior (such as the
teacher asking a question) every time the behavior occurs, or is it necessary
to make a check at regular intervals?
2. Does the scheme deploy high- or low-inference categories? (A high-inference
category requires the observers to interpret the behavior they observe, for
example: "Students are on task," or "Students are interested in the lesson.")



3. Does the scheme allow a particular event or utterance to be assigned to
more than one category?
4. Is the instrument intended to be used in real time or with videotape or
audiotape recordings?
5. Is the scheme intended principally for research or teacher education?
6. What is the focus of the instrument?

ACTION

Evaluate the observation system in Table 9.3 against these six questions.

Keep these questions in mind as you read the next sections of this chapter.
We will discuss two observation systems that have been influential in second
language classroom research and teacher education.

Foci for Observing Communications Used in Settings (FOCUS)


Some of the observation systems that have been used in language classroom
research have been influential in teacher education and teacher supervision as well.
An example is Fanselow's FOCUS system. FOCUS stands for Foci for Observing
Communications Used in Settings. According to Fanselow (1977),
In this system, communications both inside and outside of the classroom
are seen as a series of patterned events in which two or more people use
mediums such as speech, gestures, noise or writing, to evaluate, interpret
and in other ways communicate separate areas of content, such as the
meaning of words, personal feelings, or classroom procedure. (p. 19)
Each of these areas of content could be categorized as serving "one of four
pedagogical purposes: structuring, soliciting, responding and reacting" (ibid.).

ACTION

Compare these four pedagogical purposes with the seven categories in
Bowers' (1980) system shown in Table 9.3 above. Summarize the differences
and overlaps with a classmate or colleague.

Fanselow's FOCUS system is visually depicted as categories in five columns,
each of which answers a question. The five column headings and the possible
subcategories are listed below:
1. Who communicates? (teacher, individual student, groups, whole class)
2. What is the pedagogical purpose of the communication? (to structure, to
solicit, to respond, to react)



3. What mediums are used to communicate content? (linguistic, non-linguistic,
paralinguistic)
4. How are the mediums used to communicate areas of content? (attend,
characterize, present, relate, re-present)
5. What areas of content are communicated? (language, life, procedure,
subject matter). (Fanselow, 1977, p. 29)
The FOCUS system can be used to code live lessons in real time or videotaped
lessons. Learning to use FOCUS involves studying the category descriptors and
working with videotaped excerpts of classroom data to gain speed and
confidence in coding. The value of the system is that it provides a way of
characterizing teaching and learning activities that is meant to be descriptive rather than
prescriptive.

Communicative Orientation to Language Teaching (COLT)


Over the years, observation schemes have become more elaborate. One of the
most sophisticated is the Communicative Orientation to Language Teaching
(COLT) Scheme, which we discussed briefly in Chapter 1. COLT was developed
to enable researchers to evaluate intact classes (that is, existing classes that have
been constituted for teaching and learning, not for research purposes). Like all
observation systems, the COLT is ideologically loaded, and the various
observation categories reflect assumptions about what makes an effective classroom.
The ideology behind COLT is explained as follows by its designers:
[the aim of the scheme is to describe] some of the features of
communication which occur in second language classrooms. Our concept of
communicative features has been derived from current theories of
communicative competence, from the literature on communicative
language teaching, and from a review of recent research into first and
second language acquisition. The observational categories are designed
(a) to capture significant features of verbal interaction in L2 classrooms,
and (b) to provide a means of comparing some aspects of classroom
discourse with natural language as it is used outside the classroom. (Allen,
Frohlich, and Spada, 1984, p. 233)

Ten years later, Spada and Frohlich (1995), in their book-length manual on the
COLT system, stated that three major 'themes' in the L2 teaching and
learning literature influenced the design of the COLT scheme. These were (1) the
widespread introduction and acceptance of communicative approaches to L2
learning; (2) the need for more and better research on the relationship between
teaching and learning; and (3) the need to develop psycholinguistically valid
categories for classroom observation schemes.
The COLT consists of two parts. Part A focuses on the description of
classroom activities and contains five subsections: the activity type, the participant
organization, the content, the student modality, and the materials. Part B is
intended to capture the communicative features of the classroom. Seven features
are identified: the extent to which the target language is used, whether there is
an information gap, the extent to which sustained speech is encouraged, whether
the teacher responds to code or message (that is, to form or meaning),
incorporation of preceding utterance, discourse initiation, and relative restriction to
linguistic form. Some of the key questions relating to each of these aspects and
features are set out in Table 9.4.

TABLE 9.4 Questions relating to the principal features of the COLT
scheme (after Nunan, 1992, p. 99)

Feature                        Questions

Part A: Classroom Activities
1a. Activity type              What is the activity type (e.g., drill, role play, dictation)?
2a. Participant organization   Is the teacher working with the whole class or not?
                               Are students working in groups or individually?
                               If there is group work, how is it organized?
3a. Content                    Is the focus on classroom management, language (form,
                               function, discourse, sociolinguistics), or other?
                               Is the range of topics broad or narrow?
                               Who selects the topic: teacher, students, or both?
4a. Student modality           Are students involved in listening, speaking, reading,
                               writing, or a combination of these?
5a. Material                   What types of materials are used?
                               How long is the text?
                               What is the source/purpose of the materials?

Part B: Classroom Language
1b. Use of target language     To what extent is the target language used?
2b. Information gap            To what extent is requested information predictable in
                               advance?
3b. Sustained speech           Is the discourse extended or restricted to a single
                               sentence, clause, or word?
4b. Reaction to code or        Does the interlocutor react to code or message?
    message
5b. Incorporation of           Does the speaker incorporate the preceding utterance
    preceding utterance        into his or her contribution?
6b. Discourse initiation       Do learners have opportunities to initiate discourse?
7b. Relative restriction       Does the teacher expect a specific form or is there no
    of linguistic form         expectation of a particular linguistic form?

In Table 9.4, we have presented the key categories of COLT in simplified
form, to give you some idea of the scope and orientation of the system. It is
worth noting that, while the authors of COLT have tried to devise a set of



procedures that enable trained users to obtain reliable data, the categories and
communicative features are subjective, and, as we noted above, ideological. The
COLT scheme, for example, is predicated on the assumption that the existence
of an information gap, the deployment of sustained speech, the opportunity for
learners to initiate discourse and so on, will facilitate language development. At
the time the scheme was devised, research lent support to these features.
However, if such a scheme were to be developed today by a different group of
researchers, then the chances are that the categories would be somewhat different.
This comment underlines the point that there is no such thing as totally
'objective' observation: what we see will be conditioned by what we expect to
see. Our vision will also be colored by the instruments we adopt, adapt, or
develop to assist us in our observations. While the use of observation schemes can
provide a sharper focus for our research than the use of unstructured
observation, it can also blind us to aspects of interaction and discourse that are not
captured by the scheme but that are important to an understanding of the lesson we
are observing.

OBSERVATIONAL DATA AND DISCOURSE ANALYSIS

We will address discourse analysis more fully in Chapter 12. Here we just wish
to raise some issues related to the use of observation to collect data that can be
subjected to some discourse analytic process. Some forms of observation result
in records of classroom data that can be analyzed using the varied procedures of
discourse analysis, while others do not.
The advantage of observation schemes is that they serve to condense data
and facilitate the process of identifying patterns. However, unless such schemes
include the collection of actual direct quotes, they typically obscure or lose the
very thing that is of most interest to language teachers and researchers: the
language itself. Observation systems in which the observer codes (i.e., interprets
and assigns to categories) interaction as it occurs result in data for which readers
only see the tallies, not the interaction that led to those tallies. For this reason,
transcripts that can be subjected to discourse analysis are extremely valuable
sources of information about classroom interaction.
Two of the first linguists to develop a system for the analysis of classroom
discourse were Sinclair and Coulthard (1975). They showed that much classroom
interaction consisted of a recurring pattern of teacher initiation, student
response, and teacher feedback evaluating the response (the so-called "IRF
pattern"). Here is an example of such a pattern:

Teacher: The questions will be on different subjects, so, er, well, one will
be about, er, well, some of the questions will be about politics, and some of
them will be about, er ... what? (Initiation)
Student: History. (Response)
Teacher: History. Yes, politics and history. (Feedback/evaluation)

Teacher: And, um, and ... ? (Initiation)
Student: Grammar. (Response)
Teacher: Grammar's good, yes. (Feedback/evaluation)

[Figure: a hierarchy in which each transaction is made up of exchanges, each
exchange of moves, and each move of acts]
FIGURE 9.3 Acts, moves, exchanges and transactions (after Sinclair
and Coulthard, 1975)

The three-part IRF pattern was called an exchange. A series of exchanges
made up a transaction, and transactions made up lessons. Exchanges consisted of
finer-grained moves and acts. A transaction could consist of several different
exchanges, which could include many moves and acts, as shown in Figure 9.3.
Each of these categories was further defined and exemplified in Sinclair and
Coulthard's (1975) original description. For instance, moves included opening
moves, answering moves, follow-up moves, framing moves, and focusing moves.
This system was meant to be used for coding transcripts based on videotape or
audiotape recordings rather than as a real-time data collection procedure.
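
As a rough illustration of how a coded transcript might be organized under this
kind of scheme, the Python sketch below groups moves labeled I, R, and F into
exchanges. The Move class, the single-letter labels, and the group_exchanges
function are hypothetical names chosen for this illustration; they are not part
of Sinclair and Coulthard's (1975) system, which defines many more categories.

```python
from dataclasses import dataclass


@dataclass
class Move:
    speaker: str  # "Teacher" or "Student"
    text: str     # the transcribed utterance
    label: str    # "I" (initiation), "R" (response), or "F" (feedback)


def group_exchanges(moves):
    """Group coded moves into exchanges; each initiation opens a new exchange."""
    exchanges = []
    for move in moves:
        if move.label == "I" or not exchanges:
            exchanges.append([])
        exchanges[-1].append(move)
    return exchanges


# The two exchanges from the sample transcript, already coded by hand.
transcript = [
    Move("Teacher", "... some of them will be about, er ... what?", "I"),
    Move("Student", "History.", "R"),
    Move("Teacher", "History. Yes, politics and history.", "F"),
    Move("Teacher", "And, um, and ... ?", "I"),
    Move("Student", "Grammar.", "R"),
    Move("Teacher", "Grammar's good, yes.", "F"),
]

exchanges = group_exchanges(transcript)
patterns = ["".join(move.label for move in exchange) for exchange in exchanges]
print(patterns)
```

Because the sketch keeps the utterances alongside the codes, the language itself
remains available for later analysis, which real-time tallies do not allow.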

REFLECTION

Based on your experience and what you have read so far, as well as your
particular research interests, do you have a preference for real-time coding,
or would you prefer to categorize behavior using videotapes, audiotapes, or
transcriptions?

OBSERVING SOCIAL ASPECTS OF CLASSROOM INTERACTION
In addition to the actual utterances of learners and teachers, broader social
aspects of classroom interaction are important. With the emergence of
sociocultural theory (see, e.g., Lantolf, 2000), the social life of the classroom
has come to be viewed as an important dimension of classroom research. By using
the term social life in this context, we are referring to the interpersonal
relationships, friendships, personality clashes, etc., that develop as students and
teachers get to know each other and begin to relate to each other on a personal
level. In traditional second language acquisition research, these aspects of
classroom interaction are typically not included in the analysis. They can,
however, have an important influence on the learning that goes on in the
classroom and the kind of language that is generated.

[Figure: seating chart showing the teacher's zone and the students' desks,
annotated with SCORE symbols]
FIGURE 9.4 Sample SCORE data for a twenty-minute grammar review
lesson with six students (from Bailey, 2006, p. 107)

SCORE Data

Capturing this aspect of the classroom can be challenging because it often
requires us to make inferences about the mental and emotional states of teachers
and learners. The process can be aided by seating charts (often called Seating
Chart Observation Records, or SCORE data). One advantage of SCORE data is
that teachers are very familiar with seating charts so they can easily interpret
such records. SCORE data can also provide information about individual
students, small groups, or an entire class (depending on the number of students
present and the observer's vantage point). Acheson and Gall (1997) discuss using
SCORE procedures for studying students' time on task, verbal flow, and
movement patterns within the classroom. Figure 9.4 shows a simplified example
based on the data for a teacher and just six students.

REFLECTION

Using the key in Figure 9.5, write a description of the interaction shown in the
SCORE chart in Figure 9.4. Or, with a colleague or classmate, describe the
interaction orally.

The key to interpreting these SCORE data is shown in Figure 9.5:

SYMBOL KEY

M or F   = Male or female
[symbol] = Teacher's general solicit to the class
[symbol] = Teacher's personal solicit to an individual student
[symbol] = Student speaks to another student
[symbol] = Individual student responds to a personal solicit from the teacher
[symbol] = No response from the student to a personal solicit from the teacher
[symbol] = Student response to a single general solicit from the teacher
[symbol] = Student has responded to five general solicits

FIGURE 9.5 Key to SCORE Data Symbols (from Bailey, 2006, p. 108)

The SCORE data depicted in Figure 9.4 give information about who initiates
verbal turns and who responds to them. This simple system provides fast records of
who speaks to whom and how often. Of course, these data tell us nothing about the
length or content of the turns. Nor do these data convey any information about the
participants' target language use, their accuracy, or their fluency. A SCORE chart
like this one simply provides a frequency count of turns taken. Nevertheless, for
some purposes and contexts, SCORE data are quite helpful and informative.
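
Because a SCORE chart reduces to a frequency count of turns, its tallies are
easy to summarize once they have been typed up. The Python sketch below assumes
a hypothetical record format in which each observed event is entered as a
(speaker, addressee, kind) tuple. The seat labels and the sample events are
invented for illustration and are not the data shown in Figure 9.4.

```python
from collections import Counter

# Each tuple: (participant, addressee, kind of event). Invented sample data.
events = [
    ("T", "class", "general solicit"),
    ("F1", "T", "response"),
    ("T", "M2", "personal solicit"),
    ("M2", "T", "response"),
    ("T", "F4", "personal solicit"),
    ("F4", "T", "no response"),
    ("F1", "M2", "student to student"),
    ("T", "class", "general solicit"),
    ("F1", "T", "response"),
]

# Keep only events in which someone actually spoke.
spoken = [event for event in events if event[2] != "no response"]

# How many turns each participant took.
turns_by_speaker = Counter(speaker for speaker, _, _ in spoken)

# Who addressed whom, and how often.
who_to_whom = Counter((speaker, addressee) for speaker, addressee, _ in spoken)

for speaker, count in turns_by_speaker.most_common():
    print(speaker, count)
```

As with the paper chart, these counts say nothing about the length, content, or
accuracy of the turns; they only show who spoke to whom and how often.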

Classroom Maps
While visual representations like seating charts and maps do not preserve the
actual discourse that occurred in the classroom, they do enable researchers to
record the extent to which interactions occur. They also often allow patterns to
emerge that are not immediately apparent in transcripts or other forms of
documentation. For example, Bailey (2006) drew a map of a class taught by a
non-native speaking teaching assistant in mathematics, whose class she observed on
three different occasions. The map shows that there are forty-two desks in the
classroom, but at the three different observations, eighteen, three, and seven
students were present. At the scheduled time for one observation, no students
attended the class. Bailey was puzzled as to why such a small class had been
scheduled in such a large room, but when she checked with the mathematics
department, she learned that thirty-five students were actually enrolled in the
class. The number of empty desks might not have jumped out at her if she had
only recorded the number of students present without drawing the map.
This particular teaching assistant had a very passive teaching style. His classes
were characterized by long periods of silence as he wrote complex math problems
on the board with his back to the students. When he finished writing on the
board, he would turn to the class and ask them, "Any questions?" Later, when the
observational field notes were analyzed, this particular teaching assistant was
categorized in a type of TAs called the mechanical problem solvers. It seemed his
English proficiency was not sufficient to allow him to write the problems on the
board and talk about them at the same time. The way this issue connects to the
observational maps of this classroom is that in revisiting the maps in the three sets
of field notes, Bailey found that there had been an overhead projector in the
classroom at all three lessons, but the TA had not used it. One possible implication
to help him communicate better with his students would be that he could write the
problems on a transparency before the lesson so he could discuss them more
easily with the class. This realization was possible because every set of field notes
documented the presence of the unused overhead projector.
Visual representations of movements through space can also "track and
trace" students and teachers as they move about the classroom (Freeman, 1998,
p. 203). Records such as these are excellent for revealing 'action zones' in the
classroom (Shamim, 1996), and for determining whether particular students are
singled out for the teacher's attention. They not only show the physical
arrangement of the classroom, but they also allow the observer to record a range of
behaviors, including the amount and type of interaction between participants.

ACTION

If you can observe a class, compile a SCORE chart such as the one in
Figure 9.4 for a lesson or part of a lesson. As an alternative, you could
complete the following procedure adapted from Freeman (1998, pp. 203-204).
1. Outline a bird's eye view of the classroom space that shows the walls and
other structures as if you were looking at them from above.
2. Identify everything you can see; be as detailed as you reasonably can.
Include yourself in the picture. Scale is less important than accurately
including as much as possible.
3. Use the same symbol for a category (e.g., circles for students and squares
for desks). You can create ways of showing differences within a category
(e.g., green circle is a bilingual student; red square is a teacher's desk).
4. To record students' movements, draw a line from where the student
starts to where he or she ends the movement. If the student makes the
same movement more than once—perhaps he or she goes back and
forth to the teacher's desk for help—you can put a check on the basic
trace line to keep track of the number of steps.

Advice: The focus is on what you see as opposed to what you hear or think.
To get students' views of the classroom, the teacher can ask students to
create their own maps. This procedure also allows people to compare their
own special perceptions. Now review what you have and brainstorm a list
of research questions that are stimulated by the data.

THE CENTRAL ROLE OF OBSERVATION IN LANGUAGE CLASSROOM RESEARCH
As noted in Chapter 1, some classroom research carried out in the 1960s through
to the 1980s involved experimental comparisons of different methods, materials,
and teaching techniques. For example, Scherer and Wertheimer (1964)
compared the use of grammar-translation and the audiolingual method with two
groups of college students learning German as a foreign language in the United
States. The grammar-translation group focused on reading and writing, while
the other group received instruction principally in listening and speaking. At the
end of the extensive and expensive two-year study, subjects were tested to see
whether one group was superior to the other. The results were disappointing
since the study was unable to demonstrate the superiority of one group over the
other. The grammar-translation students did slightly better on reading and
writing than the audiolingual students, while the latter did somewhat better on
listening and speaking. However, as there had been no observation of actual lessons
during the investigation, it was difficult to say what was really going on "inside
the 'black box'" (Long, 1980, p. 1).
In his overview of observation in language research, D. Allwright (1988)
suggested that perhaps the wrong question was being asked:
[This research was conducted] on the assumption that it made sense to
ask "Which is the best method for modern language teaching?", and
that presumably on the additional assumption that once the answer was
determined, it would then make sense to simply prescribe the 'winning'
method for general adoption. (p. 10)
Since then, the value of observation has been demonstrated in process studies
(those based primarily on observational data) and process-product studies (those
based on classroom observational data and the examination of outcome
measures). As a result, the product-only focus has largely been replaced in language
classroom research.
Ellis (1990b) suggests that there are three broad approaches to classroom
research. These approaches have different goals and methods, as shown in Table 9.5,
but all can involve observation as an important approach to data collection.
TABLE 9.5 Empirical research on L2 classrooms (from Nunan, 1992,
p. 93, after Ellis, 1990b)

1. Category: Classroom process research
   Goal: The understanding of how the 'social events' of the language
   classroom are enacted.
   Principal research methods: The detailed, ethnographic observation of
   classroom behaviors
2. Category: The study of classroom interaction and L2 acquisition
   Goal: To test a number of hypotheses relating to how interaction in the
   classroom contributes to L2 acquisition and to explore which types of
   interaction best facilitate acquisition
   Principal research methods: Controlled experimental studies; ethnographic
   studies of interaction
3. Category: The study of instruction and L2 acquisition
   Goal: To discover whether formal instruction results in the acquisition of
   new L2 knowledge and the constraints that govern whether formal
   instruction is successful
   Principal research methods: Linguistic comparisons of L2 acquisition by
   classroom and naturalistic learners; experimental studies of the effects of
   formal instruction

In an important and influential review, Ellis (1988) synthesized the results of
a large number of classroom observation studies and identified eight key factors
that were associated with successful second language classroom acquisition. The
following key factors are adapted and summarized from Ellis:
1. Quantity of 'intake': The amount of the target language that learners
attend to is significant; quantity alone is insufficient (i.e., the quantity of
language produced by the teacher as input).
2. A need to communicate: This need can be provided if the target
language serves as the medium as well as the target of instruction.
3. Independent control of the propositional content: Learners have a
choice over what is said, and part of this should be content known to the
learner but not the teacher.
4. Adherence to the 'here and now' principle: In the early stages at least,
encoding and decoding are facilitated if the things being talked about are
present in the learning environment.
5. The performance of a range of speech acts: The learner should be
encouraged to use a range of language functions and to perform a variety
of roles with the classroom discourse (for example, initiating as well as
responding to discourse).
6. An input rich in directives: Particularly in the early stages of learning,
directives occur in familiar and frequently occurring contexts, they refer to
the 'here and now,' they are morphosyntactically simple, and, as they
require a nonverbal result, they are more likely to count as successful
communication than interactions requiring a verbal response.
7. An input rich in 'extending' utterances: These are teacher
utterances which pick up, elaborate, or in other ways extend the learner's
contribution.
8. Uninhibited 'practice': This concept refers to the right of the learners
to practice the target language without intending to communicate and to
repeat utterances that are meaningful to the learners themselves.

ACTION

Using these eight categories from Ellis (1988), analyze the transcript
above, in which the students ask about the possible "quiss."

QUALITY CONTROL ISSUES


As in other forms of data collection, reliability and validity are concerns in
observational studies. You will recall that reliability embodies the concept of
consistency. Likewise, in studies that collect data through observational
procedures, a validity concern is whether the data collected represent the reality of the
classroom.
In studies where data are recorded electronically, the issue is often one of
coverage: How adequately does the audiotape or videotape recording capture
what is going on? Audiotape recordings, of course, lack the visual mode, but they
are often easier to make since small recorders can be placed on students' desks or
in the middle of a table during group work. Video recordings capture the visual
channel, but cameras can be more intrusive than audiotape recorders and may
influence teachers' and students' behavior, especially if a camera person is
present. However, in our experience, if you take time to acclimate the participants to
the presence of the camera, and if they trust you, you can overcome much of the
possible interference.
When a researcher takes field notes or codes data in real time, the human
observer is the data collection instrument. In these cases, it is important to
establish the viability of the observations. One procedure for doing so is to calculate
inter-observer (or inter-coder) agreement. This process involves having two
observers in the classroom at the same time or two observers working with
the same video recording or audio recordings. Normally, observers are trained
together with video recordings before they begin gathering "live" data in
classrooms. By convention, an agreement level of at least 85% is required for the
observers to be considered reliable. (See Chapter 14 for more information.)
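
Simple percent agreement can be computed as the number of coding decisions on
which two observers agree, divided by the total number of decisions. The Python
sketch below is a minimal illustration, assuming both observers coded the same
sequence of events in the same order. The function name, the category labels,
and the sample codes are hypothetical, and other indices of agreement exist.

```python
def percent_agreement(codes_a, codes_b):
    """Return the percentage of events that two observers coded identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both observers must code the same number of events")
    if not codes_a:
        raise ValueError("No coded events to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Invented codes for ten classroom events
# (I = initiation, R = response, F = feedback).
observer_1 = ["I", "R", "F", "I", "R", "F", "I", "R", "R", "F"]
observer_2 = ["I", "R", "F", "I", "R", "I", "I", "R", "R", "F"]

agreement = percent_agreement(observer_1, observer_2)
meets_threshold = agreement >= 85
print(f"{agreement:.0f}% agreement; meets 85% threshold: {meets_threshold}")
```

In this invented example the observers disagree on one event out of ten, so
their agreement is above the conventional 85% level.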
Based on her experience with things that went wrong in her own
observations, Bailey (1983a) wrote a brief article about Murphy's Law in language
classroom research. (Murphy's Law is the belief that if anything can go wrong,
it will.) The result was ten "lessons" for classroom observers:

1. If you are going to take notes, always carry paper and pens or pencils and
something firm to write on, like a clipboard. (p. 4)
2. If you are going to tape record, make sure you have access to a good tape
recorder and that you know how to operate it correctly. (p. 4)
3. Always investigate the classroom where you will be observing before the
actual observation begins. (p. 4)
4. If you are observing regularly scheduled classes, always leave room in your
plan to reschedule an observation as needed, in case a class is cancelled, or
the teacher gets sick, or you miss your bus. (p. 5)
5. Always plan free time immediately after an observation so you can write
your field notes if you aren't able to record them during the actual
observation for any reason. (p. 22)
6. Always arrange a big enough subject pool that you can re-sample from
among the possible subjects if your presence seems to affect someone's
behavior noticeably, and always allow time for your subjects to become
comfortable with your presence before you try to collect data on their behavior.
(p. 22)
7. Always carry extra batteries (if you are using a battery-powered recording
device). (p. 22)
8. Never allow yourself to be entrapped in an unwanted discourse act with a
subject during an observation. (p. 22)
9. Always use, or consider using, multiple data collection procedures. (p. 22)
10. Always do a pilot study. (p. 22)

Some of the points made above may seem obvious, but we know of many
projects, including our own and those of our graduate students, that have been
negatively affected by breakdowns caused by very simple problems that had huge
consequences.
Our advice, based on years of both successes and frustrations, is that if
possible, you should collect data that are detailed and in-depth rather than relying
entirely on data coded live in "real time," that is, during the observations. Data that
are audio- or videotaped can later be coded or transcribed, as needed. But data
that are simply coded cannot be reconverted to direct quotes or descriptions.
We also strongly encourage you to carry out a pilot study to try out and
refine your observational procedures before you attempt to collect the data you
wish to analyze. Doing a brief pilot study can reveal problems in coding
categories, thoroughly familiarize observers with data collection procedures, and
acquaint observers with the local conditions if they are not insiders to the school or
program.
Observation can be used with other data collection procedures to provide
methods triangulation. You will recall from Chapter 7 that methods triangulation
involves the use of multiple data collection procedures (e.g., interviews, test
scores, questionnaires, observation schedules, field notes, etc.) to collect data
(Denzin, 1978; van Lier, 1988). For example, the team that worked on R. L.
Allwright's (1980) study of turns, topics, and tasks in ESL classrooms included a
participant observer who took notes during the lessons, which were also
tape-recorded. The recordings were later transcribed and the observer's notes were
invaluable in interpreting ambiguous utterances and understanding the context
of various comments. Block's (1996) investigation of an English course in Spain
incorporated the teacher's and students' diaries and audio-cassette recordings, in
addition to Block's own observational notes.

A SAMPLE STUDY

As the sample study in this chapter, we will discuss just a brief segment from an
interesting article by Lynch and Maclean (2000), two teachers in a course entitled
English for Medical Conferences, which "caters for health professionals who
want to improve their ability to present papers in English at international
meetings and conferences" (p. 226). These authors wanted to study the outcomes of
building repetition into a task in which the learners explained a poster based on a
research article to people who visited the poster exhibit (their classmates).
By repetition Lynch and Maclean do not mean the sorts of pattern drills
where students repeat after a teacher. Rather they are referring to what they call
recycling or retrials, "where the basic communication goal remains the same, but
with variations of content and emphasis depending on the visitor's questions"
(p. 227). The procedure that was used, both for teaching and for data collection,
was this:

1. Participants are paired up and each pair is given a different research article.
They have one hour to make a poster based on that article.
2. The posters are displayed around a large room. From each pair, one
participant (A), the 'host', stands beside their poster, waiting to receive
'visitors' asking questions. The B participants visit the posters, one by one,
clockwise. Their task is to ask questions about each poster. The host is
instructed not to present, but to respond to questions. They are allowed only
limited time (approximately 3 minutes) at each poster.
3. When the B participants arrive back at base, they stay by their poster and
the A participants go visiting.
4. Once the second round is completed, there is plenary discussion of the
merits of the posters (by the participants) and the teachers provide
feedback on general language points. (p. 227)
So the "poster carousel," as these authors call this teaching-learning activity,
provides built-in opportunities for reiterating and rephrasing core content
repeatedly in a brief period of time to a series of interlocutors.

Two research questions were posed in this study: (1) "Do learners gain from
repetition in the poster carousel—and do they think they gain?" And (2) "In
what ways do they gain from repetition—and in what ways do they think they
gain?" (p. 228). To address these questions, the researchers tape-recorded the
poster session participation by fourteen students (radiologists and oncologists)
and had the learners complete a self-report questionnaire. In terms of the focus
of this chapter, their audio-recording process was a particularly interesting data
collection strategy:

We recorded all six interactions between each host and visitor by
placing an audiocassette recorder near each of the seven posters. This sort
of recording is a routine part of the course and so the participants were
used to being recorded by the time they did the poster carousel. All 14
sets of six interactions were transcribed. (p. 229)

Thus the participants were familiar with tape-recording as an instructional tool,
so we may assume that the observer's paradox (Labov, 1972) was minimized.
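
The recording arithmetic in the passage quoted above can be checked with a short
Python sketch. It assumes, as the quoted figures imply, that the fourteen
learners formed seven pairs and that each visitor stopped at the six posters
other than his or her own. The rotation rule, the poster numbering, and all
names in the code are illustrative assumptions rather than details reported by
Lynch and Maclean (2000).

```python
N_POSTERS = 7  # seven pairs, one poster per pair


def carousel_round(host_role, visitor_role):
    """List (host, visitor) encounters for one round of the carousel.

    Posters are numbered 0-6 around the room. At each step every visitor
    moves on one poster, so over six steps a visitor sees every poster
    except his or her own.
    """
    encounters = []
    for step in range(1, N_POSTERS):       # six steps per round
        for home in range(N_POSTERS):      # the visitor's own poster
            poster = (home + step) % N_POSTERS
            encounters.append((f"{host_role}{poster}", f"{visitor_role}{home}"))
    return encounters


# Round 1: A participants host and B participants visit; round 2: roles swap.
all_encounters = carousel_round("A", "B") + carousel_round("B", "A")

# Count how many interactions each host took part in.
per_host = {}
for host, _ in all_encounters:
    per_host[host] = per_host.get(host, 0) + 1

print(len(per_host), "hosts,", len(all_encounters), "recorded interactions")
```

Under these assumptions, each of the fourteen learners hosts six interactions,
which matches the "14 sets of six interactions" that were transcribed.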
For the preliminary analysis, the authors focused on two particular learners,
whose pseudonyms are Alicia and Daniela. These learners were chosen because
of the extreme differences in their language proficiency. Alicia was from Spain
and her English was the weakest in the class. She had a 4.0 on the IELTS (the
International English Language Testing System) and a 400 on the TOEFL
(the Test of English as a Foreign Language). In addition, she had scored only 5%
on a dictation administered at the start of the course. Daniela was from Germany
and had a score of 7.0 on the IELTS and a 600 on the TOEFL. Her dictation
score had been 97% at the beginning of the course.
The authors provide a close analysis of the transcripts and show that both
learners improved, though in different ways. However, their self-report data
differed. Alicia said that she had not consciously changed her language during the
carousel activity and that she hadn't been aware of any unplanned changes. In
contrast, Daniela reported that she had intentionally changed the way she
expressed her ideas as she talked with the various visitors to her poster. Specifically,
she said,

I wanted to use phrases I have learned during the course and I worked
at it. I tried to find out if different explanations were accepted [by the
visitors]. I felt I was quite relaxed all the time. I got to know the
vocabulary better during the time. (p. 231)

The article provides further data from the two learners' self-report statements as
well as numerous illustrations from their transcribed conversations with the
visitors to their posters. These data provide a good example of methods
triangulation. What we find particularly compelling about this study is how the two
teachers conducted practical research about a teaching activity and incorporated
viable observations that did not detract from the students' normal focus on
English language learning and use.

PAYOFFS AND PITFALLS

There are certainly some pitfalls to be avoided when you collect data through
observational processes, and several of these have been alluded to above. To
recap the main points, having an observer in the classroom, perhaps especially
one with a video camera, can be disruptive. Students and teachers may act in
ways that are not typical of their usual classroom behavior—an example of the
Observer's Paradox (Labov, 1972). To counteract this problem, it is important
that observers spend enough time in a site so that the participants in that context
become familiar with the visitors and accept their presence in the classroom. It
also helps if the students are familiar with the data collection devices, as they
were in the study by Lynch and Maclean (2000).
Another major pitfall is the worry that what is observed can be influenced
very strongly by the observer's own experiences and preconceptions. If a
database consists solely of observational field notes or real-time coding done by a
single observer, it is difficult to demonstrate the reliability of the coding or the
validity of the observations. For these reasons, we recommend methods
triangulation, particularly combining observational field notes with electronic
records. Video- or audio-recordings permit the researcher to transcribe
interactions, and the resulting transcripts provide opportunities for more precise
analyses than does real-time coding.
In addition, having electronically recorded data, such as audiotapes,
video-recordings, or chat transcripts, is very useful in studies employing stimulated
recall to get the participants' interpretations of events. Having been an observer
in the lessons where such data were collected can give the researcher a
firsthand vantage point for asking key questions about the interactions that
occurred.
Transcription itself, while valuable and informative, can be a daunting
process. Unless you have good quality recordings, transcribing language
learners' speech can be terribly time-consuming and frustrating. This problem is
substantially reduced in studies involving typed chats or online forum
discussions, in which a transcript of the interaction is automatically produced by the
program.
In spite of these problems, the benefits of using an observation component
in classroom research are too numerous to overlook. Without observational
data, we are limited almost entirely to product studies, and even if their
outcomes are interesting and provocative, we cannot say with much confidence
what elements of classroom interaction and instruction led to any significant
gains or differences that may emerge. Some form of observation is essential in
any process study or process-product study.

CONCLUSION

This chapter has focused on observation as an essential tool in conducting
language classroom research. We defined language classroom observation as a
family of related procedures for gathering data during actual language lessons or
tutorial sessions, primarily by watching, listening, and recording (rather than by
asking). Such procedures include audio- and video-recording, coding, making
maps and seating charts, and writing ethnographic records.
The observational procedure(s) you choose depend on the research
question(s) you address in your study. In general, we recommend collecting more
detailed data (rather than just doing live coding during a lesson) and using methods
triangulation so you can verify your data and establish the reliability of your
observations.

QUESTIONS AND TASKS

1. If you are currently teaching a class, tape-record or videotape a lesson.
After class, make a detailed journal entry for yourself about things that
interested you or puzzled you during class. Then use the recording to
investigate your questions.
2. If you have a colleague who can observe your teaching, carry out the task
above with an observer in the room taking field notes or gathering SCORE
data. Compare your journal entry with the observer's data and with the
electronic recording.
3. If you are not currently teaching, ask a language teacher if you could
observe a class for nonevaluative purposes. Use a procedure described in this
chapter to gather data. Discuss those data with the teacher.
4. Think of a research question that interests you. What role might be played
in answering that question by any of the approaches to observation
described in this chapter?
5. Look back at van Lier's (1996a) description of a typical day in the bilingual
Quechua-Spanish school in Peru, which was reprinted at the end of
Chapter 7. Based on just this description, draw a bird's eye view of the classroom
depicted in that account.
6. Look back at the description of the child called Lupita, which appeared
near the beginning of Chapter 6. Draw a map or make a SCORE record of
her movement through the action zone and the people she interacted with.
7. Here is a SCORE chart (Nunan, 1989, p. 93) which records the number of
times the teacher asked display questions (D) and referential questions (R)
to various students, as well as the students' responses and the times they
initiated turns. Student-student interaction is also recorded. Write a
summary of the interactions depicted in these SCORE data. (Note: It may
help if you number the students in this chart.) Compare your
interpretation with that of a classmate or colleague.



8. Transcribe a brief recording of some classroom interaction—five or ten
minutes should be sufficient. What problems do you encounter as you
transcribe? What benefits are gained by the transcription process?
9. Think of an activity that you use (or would like to use) in your language
classes. What research question(s) could you ask about its efficacy? How
could you address your research question(s)? What role would
observational procedures play in your study?

SUGGESTIONS FOR FURTHER READING

For a history of the role of observation in language classrooms, see D. Allwright's
(1988) book. If you would like to learn more about the FOCUS system for
observation, see Fanselow (1977; 1987).
There are several sources available in case you would like to learn more
about the COLT system for observing teaching. These include Allen, Frohlich,
and Spada (1984) and Spada and Frohlich (1995).
If you would like to get started learning about observational processes, or
training teachers about observational processes, we recommend Wajnryb's
(1992) book Classroom Observation Tasks: A Resource Book for Language Teachers
and Trainers.
K. E. Johnson's (1995) book on understanding communication in second
language classrooms looks at how patterns of communication are established and
maintained and how these affect learning.



CHAPTER 10

Introspective Methods
of Data Collection

"A moment's insight is sometimes worth a life's experience"
(Oliver Wendell Holmes, 1981, p. 225)

"The universe is transformation: Our life is what our thoughts make it."
(Marcus Aurelius Antoninus, Meditations, IV, as cited in Cohen and Cohen, 1980, p. 3)

INTRODUCTION AND OVERVIEW

One of the challenges confronting language researchers is that a great deal of
the hard work involved in language learning is invisible. It goes on in the heads
of language learners. During the days when behaviorist psychology dominated
language research, it was considered both futile and irrelevant to investigate this
invisible work since researchers were only interested in the observable
characteristics of human behavior. Those days are long gone, and the focus is more
and more on the cognitive processes underlying human performance and
ability (see, e.g., Freeman, 1996a). It is widely accepted that if we want to
understand what people do, we need to know what they think. Researchers often go
to considerable lengths to derive insights into the mental processes underlying
observable behavior.
This chapter discusses procedures for collecting or generating data through
introspective means. First, we will consider two important data collection
procedures during which research subjects talk about their thought processes
and/or feelings. These are (1) the creation of think-aloud protocols, and (2) the
use of stimulated recall. Then we will look at diary studies of language teaching
and learning, which typically involve written data derived largely from
introspection and retrospection. We will also briefly consider biographical and
autobiographical research.

Defining Introspective Data Collection
As a research procedure, introspection is the process of observing and reporting on
one's own thoughts, feelings, motives, reasoning processes, and mental states,
often with a view to determining the ways in which these processes and states
shape behavior. The tradition has been imported into applied linguistics from
cognitive psychology, where it has aroused considerable controversy (Ericsson
and Simon, 1987). Particularly contentious is the assumption that the verbal
reports resulting from introspection accurately reflect the underlying cognitive
processes giving rise to behavior. Critics of the approach argue that there might
be a discontinuity between what the subjects believed they were doing and could
articulate, and what they were actually doing.
In this chapter, we use the term introspection to cover techniques in which
data collection happens at the same time as or very shortly after the events being
investigated. We will also use it as a general rubric to cover research contexts in
which the data are collected retrospectively, that is, some time after the events
themselves have taken place. One challenge with this approach is the fact that
the length of time elapsing between the mental event and the reporting of that
event may distort what is actually reported. (In a sense, all the techniques
reported here are retrospective because there will always be a gap, however
fleeting, between the event and the report.)
Cohen and Hosenfeld (1981) distinguished three types of introspective data
collection. The types are defined by the timing of the introspecting relative to
the timing of the event being investigated. First, concurrent introspection (during
the event) represents a particular point in time. Concurrent introspection occurs
simultaneously with the event being investigated. In contrast, immediate
retrospection occurs right after the event, and delayed retrospection occurs hours or more
following the event. Thus, immediate retrospection and delayed retrospection
represent spans of time instead of particular moments. The more general cover
term, introspection, entails all three zones, as depicted in Figure 10.1 (adapted
from Bailey, 1991, p. 64).

REFLECTION

In which of the following situations would concurrent introspection be
advisable or acceptable? In which situations would immediate or delayed
retrospection be preferable?
• While a student is taking a high-stakes entrance examination
• During a class discussion of vocabulary
• While a student is revising a composition using the teacher's and a peer's
feedback
• During a group work activity in which three students are reaching a
consensus
• While a student is reading a story in his second language as homework
Explain your choices to a classmate or colleague.

[Figure: a timeline labeled "Distance in Time from the Event/Process," beginning
at the "Event or Process Being Investigated" and divided into three zones:
Concurrent Introspection, Immediate Retrospection, and Delayed Retrospection.
The label "Introspection" spans all three zones.]

FIGURE 10.1 Introspection immediacy continuum
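
The timing distinctions described here can be applied as a simple rule when labeling verbal reports in a data set. The Python sketch below is only an illustration. The function name `classify_report` and the one-hour boundary between immediate and delayed retrospection are assumptions made for this example, since the text says only that delayed retrospection occurs "hours or more" after the event.

```python
from datetime import timedelta

# Assumed cut-off (not from the source): the text says only that delayed
# retrospection occurs "hours or more" after the event.
DELAYED_THRESHOLD = timedelta(hours=1)


def classify_report(gap: timedelta) -> str:
    """Label a verbal report by the time elapsed between event and report."""
    if gap < timedelta(0):
        raise ValueError("a report cannot precede the event it describes")
    if gap == timedelta(0):
        # Report produced while the event is still under way.
        return "concurrent introspection"
    if gap < DELAYED_THRESHOLD:
        return "immediate retrospection"
    return "delayed retrospection"


if __name__ == "__main__":
    for gap in (timedelta(0), timedelta(minutes=5), timedelta(days=7)):
        print(gap, "->", classify_report(gap))
```

A researcher who adopts a rule like this would need to state and justify the chosen boundary, because the reliability concerns discussed in this chapter grow with the length of the gap.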

ACTION

Try engaging in all three of these introspective processes.

1. For concurrent introspection, say aloud what you are doing and thinking
as you're reading this sentence.
2. For immediate retrospection, try to verbalize what you were doing just
before you began reading this chapter.
3. For delayed retrospection, verbalize what you were doing exactly one
week ago.

What are the internal mental differences you experience as you try each of
these tasks?

Think-Aloud Protocols
In think-aloud techniques, subjects complete a task or solve a problem and
verbalize their thought processes as they do so. The researcher records the
verbalization and then analyzes the thought processes the subjects report. In
this procedure, the gap between the mental process and the reporting is smaller
than with other techniques, such as stimulated recall and diaries. However, we
may still question whether the verbalization accurately reflects the mental
processes that accompany problem solving. It may be that the act of verbalizing
the thought processes alters them in some way. (This concern is not peculiar
to introspection. All researchers need to be aware of the possibility that
their results may in some ways be artifacts of the data collection procedures
they have used.)
Think-aloud protocols are the results of a data collection technique that
involves verbal concurrent introspection. That is, a research subject talks about the
process under investigation while he or she is engaged in doing that process—
whether it is taking a language test, revising a composition, or reading a text in a
foreign language. As the person thinks aloud (i.e., talks about his current thought
processes), his self-report is audiotape- or videotape-recorded. It is then
transcribed and the written result is the "protocol."
In verbalizing their thoughts while performing a task, according to Ericsson
and Simon (1993), research subjects "do not describe or explain what they are
doing—they simply verbalize the information they attend to while generating
the answer" (p. xiii). These authors maintain that if subjects verbalize only the
thoughts they have as part of performing the research task, their thought
sequence will not be changed by the process of thinking aloud. However, if people
are told they must explain their thoughts, "additional thoughts and information
have to be accessed to produce the auxiliary descriptions and explanations. As a
result, the sequence of thoughts is changed, because the subjects must attend to
information not normally needed to perform the task" (ibid., xiii).
Ericsson and Simon (1993) describe what they call "three levels of
verbalization" elicited from research subjects:

1. The first level is simply reporting—that is, "the vocalization of covert
articulatory or oral encodings, as required in the tasks" [being done]. "At this
level, there are no intermediate processes and the subject needs expend no
special effort to communicate his thoughts" (p. 79).
2. The second level involves some description and "explication of the thought
content" (ibid.).
3. The third level "requires the subject to explain his thought processes or
thoughts" (ibid.).
Ericsson and Simon caution that at the third level "an explanation of thoughts,
ideas, or hypotheses or their motives is not simply a recoding of information
already present in short-term memory, but requires linking this information to
earlier thoughts and information attended to previously" (ibid.). They stress that
verbalization at the second level "does not encompass such additional
interpretative processes" (ibid.).
It is important to be aware of the sorts of mental processing we are asking
participants in a research project to do for two reasons. First, we want to be sure
to obtain the type of data that will address our research questions or hypotheses.
Secondly, we don't want to impose additional mental tasks on the participants
that would unduly influence the actual mental processes or emotional states that
we are investigating.



ACTION

Here is an example of instructions to experimental subjects (reprinted from
Ericsson and Simon, 1993):

"In this experiment we are interested in what you think about when
you find answers to some questions that I am going to ask you to
answer. In order to do this I am going to ask you to THINK ALOUD
as you work on the problem given. What I mean by think aloud is that
I want you to tell me EVERYTHING you are thinking from the time
you first see the question until you give an answer. I would like you to
talk aloud CONSTANTLY from the time I present each problem until
you have given your final answer to the questions. I don't want you to
try to plan out what you say or try to explain to me what you are
saying. Just act as if you are alone in the room speaking to yourself. It is
most important that you keep talking. If you are silent for any long
period of time, I will ask you to talk. Do you understand what I want
you to do?" (p. 378)

How would you change this text to make it more easily understood by
someone for whom English is a second language? Think about an intermediate
ESL/EFL learner as the person who would be getting the instructions. You
can either translate some version of this text into the learner's mother tongue
or modify the English instructions above.

In addition to getting clear instructions, it is important for research subjects
to understand that the think-aloud session is not a social event. In fact, Ericsson
and Simon recommend that the researcher actually sit behind the subject who is
introspecting and stay completely out of his or her field of vision (ibid., p. xiv).

Stimulated Recall in Second Language Research


The greatest limitation of the think-aloud technique in classroom research is
that it cannot be used to collect data directly from real classes, as this would
seriously disrupt the flow of the ongoing lessons. In such situations, researchers
can use techniques such as retrospection and stimulated recall.
Retrospective data are collected some time after the events being investigated
have taken place. Retrospection is quite controversial. For example, Nisbett and
Wilson (1977) argue that the gap between the event and the reporting of the
event will lead to unreliable data. It has also been claimed that if subjects know
they will be required to provide a retrospective account, this knowledge will
influence their performance on the task. Ericsson and Simon (1984) argue that the
reliability of the data can be enhanced by ensuring that the data are collected as
soon as possible after the task or event has taken place. In the case of collecting
retrospective data on a lesson, ideally it should happen immediately after the
lesson.
Like think-aloud protocols, stimulated recall is a procedure for generating
introspective data, but it is used after the event under investigation instead of
concurrently. The researcher uses data that were collected during the event (e.g., a
videotape, audiotape, field notes, etc.) to stimulate the recollection of the people
who participated in the event. In this way, the participants will not be distracted
by having to introspect during a task, but it is hoped that the record of the
original event will stimulate their memories sufficiently to produce good
introspective data after the fact.
In our own research, for example, we have used stimulated recall to
investigate teachers' decision making during language lessons. Nunan (1996) collected
lesson plans and then tape-recorded language lessons taught by nine ESL
teachers in Australia. He made notes while observing the lessons. Immediately after
the lesson, he asked the teachers to focus on those points where they had
departed from their lesson plans. Then he transcribed the recordings and asked the
teachers to annotate the transcripts.
In a parallel study, Bailey (1996) observed and tape-recorded ESL lessons in
California and made running field notes as the lessons proceeded. The
audiotapes were transcribed and the teachers reviewed the transcripts and read the
field notes. In the process, Bailey asked them to explain their decision making at
the points where they had chosen to depart from their lesson plans. Those
conversations were also tape-recorded. In both these studies, the teachers'
recollections of their thought processes as they taught were stimulated by the data that
had been collected during the actual lessons.
In classroom research using stimulated recall, the researcher records a lesson
and then gets the teacher and, when feasible, the students to comment on what
was happening at the time that the teaching and learning took place. The
informants can look at a video or listen to a tape of the lesson, pausing at
particular points of interest. Alternatively, field notes or transcripts of the lesson can be
used as a memory aid.
Stimulated recall can yield insights into teaching and learning processes that
would be difficult to obtain by other means. It is particularly useful in
collaborative research because it enables teachers and students as well as the researcher
to present their various interpretations of what happened in the moment-by-
moment interactions that define a given lesson or classroom event. The
interpretations can be directly linked to the classroom events that gave rise to them.
One of the most comprehensive investigations to use stimulated recall is
reported by Woods (1989). The focus of the investigation was teachers' decision
making. Woods used three data collection techniques—ethnographic
interviews, ethnographic observation over time, and stimulated recall—to collect
data on the decision making of eight ESL teachers. He describes the procedure
as follows:

[Stimulated] recall elicited teachers' comments about the options
considered, decisions made, and actions taken in the classroom. A lesson
was videotaped and subsequently viewed and commented on by the
teacher. By pressing a remote pause button to freeze the video and then
making a comment [captured on a composite videotape as a voice-over],
the teacher provided a commentary about the lesson, the students or
about what s/he was trying to do as the lesson transpired. The
composite videotape containing the lesson and the superimposed comments
was analyzed to determine the processes and bases of decisions made
during the lesson. (p. 110)

The stimulated recall technique, along with follow-up interviews, enabled
Woods to draw some interesting conclusions about processes of classroom
decision making, including the following:

1. The overall process of decision making within the classroom context is
incredibly complex, not only in terms of the number and types of decisions
to be made, but also because of the multiplicity of factors impinging on
them.
2. In terms of procedures for course planning, the most surprising finding
was the tentativeness of teachers' advance planning. "Lessons were
sketched out only in very vague terms and detailed planning occurred at
most a couple of lessons in advance even by the most organized of the
teachers" (ibid., p. 116).
3. Based on an analysis of the teacher interviews, Woods concluded that each
teacher's course was internally coherent. He cites as an example one of the
teachers whose decision making was driven by her desire to develop the
independence of the students as learners.
4. The final major point to emerge was the fact that different teachers had
quite different approaches, criteria for success, and so on, and that
different teachers could take identical materials and use them in class in very
different ways.

The aim of this study was to investigate teachers' interactive decision making,
that is, the decisions they made 'online' as they taught. (In the context of
research on teacher cognition, online means during a lesson—a decision in real
time.) As it was clearly not feasible for the researcher to interrupt the teachers in
the middle of their lessons, he recorded the lessons and replayed the recordings
for the teachers immediately afterwards.

REFLECTION

Given what you know so far, think of a research question that you'd like to
address that could incorporate the stimulated recall procedure. What
sort(s) of data would you use to prompt your participants' memories of the
events being investigated?



Here are some data from Nunan's (1996) investigation into language
teachers' interactive decision making. The researcher observed and taped a series of
lessons and then used the tapes as a stimulated recall device. The following
transcript is from a lesson based on a piece of authentic listening in which an
interviewer questions people about their lifestyles. After the students listened to
the tape, the following classroom interaction occurred:
T: What question does the interviewer ask? The interviewer? What
question does the interviewer ask? What's the question in here?
S: You smoke?
T: You smoke? You smoke? That's not a proper question, is it really?
Proper question is do you smoke? So he says "you smoke?" We know
it's a question, because ... why? You smoke? ...
S: The tone.
T: The tone ... the ... the ... what did we call it before? You smoke?
What do we call this?
S: Intonation.
T: Intonation. You know by his intonation—it's a question. (p. 51)

REFLECTION

What would you have asked the teacher about what was going on in this
piece of interaction if you were using this transcript to stimulate the
teacher's recall?

The researcher was interested in why the teacher had in effect
'deauthenticated' the piece of authentic data by asserting that the interviewer was not asking
'proper' questions. The teacher reflected on the interaction and said that the
issue of questions had not been part of the lesson plan.
T: ... and also the on-the-spot decision like when it said "drink"
I: So you hadn't actually planned to teach that?
T: No I hadn't. I mean, really, that would be an excellent thing to do in a
follow-up lesson—you know, focus on questions.
I: In fact, what you're asking them to do in their work is focus on the full
question forms, and yet in the tape they're using a ...
T: ... wasn't, yeah. So, I suppose it's recognizing one question form by
the intonation, then being able to transfer it into the proper question
"Do you drink?" rather than "Drink?" I mean, that would be good
to spend a lot more time on at another point. But it seemed like it
was good to bring up there. Just to transfer the information. (ibid.,
pp. 50-51)



Introspecting on the lesson segment led the teacher to make the following
comment:

When I first looked at the material, I thought it was quite a
straightforward listening so therefore if I give them a split listening it'll make it
more challenging for them. I took the decision to do that [switch the
focus to question forms] and I don't regret that. I mean, question forms
are always difficult to do, they're always difficult to slot in unless you do
a whole lesson on question forms, so to throw them in now and again
like that is quite valid. (ibid.)
The stimulated recall session thus gave the teacher an opportunity to reflect on
an online decision she had made. In this case, it reaffirmed the decision she had
made in the heat of the teaching moment.

Diary Studies
Since the late 1970s, entries recorded in teachers' and learners' journals, or
diaries, have been used as data in studies of second language acquisition and
teaching. Language learning diaries have been kept in both formal and informal
learning contexts, in foreign language and second language situations. Teaching
journals have been kept by both novice and experienced teachers.
What is a diary study and how does it differ from a diary or journal? (We will
use these two terms interchangeably, as they have been used in the existing
literature for the past three decades.) According to Bailey and Ochsner (1983),
A diary study in second language learning, acquisition, or teaching is an
account of a second language experience as recorded in a first-person
journal. The diarist may be a language teacher or a language learner—
but the central characteristic of the diary studies is that they are
introspective: The diarist studies his own teaching or learning. Thus he
can report on affective factors, language learning strategies, and his
own perceptions—facets of the language learning experience which are
normally hidden or largely inaccessible to an external observer. (p. 189)
The learner's or teacher's experiences are "documented through regular, candid
entries in a personal journal and then analyzed for recurring patterns or salient
events" (Bailey, 1990, p. 215).
You will recall from Chapter 1 that Grotjahn (1987) characterized research
paradigms in terms of (1) research design (nonexperimental, pre-experimental,
quasi-experimental, and experimental designs); (2) the type of data collected
(qualitative or quantitative); and (3) the type of analysis conducted (interpretive
or statistical—i.e., qualitative or quantitative). In Grotjahn's terms, diary studies
are typically pre-experimental or nonexperimental. They are based primarily on
qualitative data (the written or tape-recorded diary entries), and they are usually
analyzed interpretively (though some have been analyzed quantitatively as well).
Diaries can be kept by teachers or by learners. However, undertaking a diary
study requires discipline and application because if the entries are not made
consistently over time, patterns are unlikely to emerge. When keeping a diary
for research purposes, a five-step procedure is recommended:
Step 1: Provide a context for the study by giving an account of your
personal language teaching and/or language learning history (depending on
whether the focus of the diary is on teaching or learning).
Step 2: Keep regular, uncensored accounts of the teaching or learning
experience, trying to be as candid as possible.
Step 3: Analyze the account for patterns and significant events. (See
Chapter 14 for a discussion of techniques for qualitative data analysis.)
Step 4: Revise the 'raw' account for public consumption. For instance,
students' names may be changed to pseudonyms, local abbreviations will
be spelled out, and so on.
Step 5: Document and discuss the factors that appear to be important in
language teaching/learning.
When keeping a diary, it is a good idea not to embark on the analysis
prematurely. If Steps 3, 4, and 5 are delayed until a substantial amount of data have
been collected, then you will avoid coming to premature conclusions. Also, in
the early stages, the accounts may appear to be rather inchoate and random.
In our experience, many patterns only emerge in the longer term.
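
Once the entries have been coded by hand, the search for recurring patterns in Step 3 can be supported by a simple tally. The Python sketch below is a minimal illustration and not a procedure taken from the diary study literature. The entry labels, the code names, and the function name `recurring_codes` are all hypothetical.

```python
from collections import Counter

# Hypothetical diary entries: each carries the codes an analyst assigned by hand.
entries = [
    {"date": "week 1", "codes": ["anxiety", "competitiveness"]},
    {"date": "week 2", "codes": ["anxiety", "teacher feedback"]},
    {"date": "week 3", "codes": ["competitiveness", "anxiety"]},
    {"date": "week 4", "codes": ["vocabulary strategies"]},
]


def recurring_codes(entries, min_entries=2):
    """Return codes that appear in at least `min_entries` separate entries."""
    counts = Counter()
    for entry in entries:
        # Count each code once per entry, even if it was assigned twice.
        counts.update(set(entry["codes"]))
    return {code: n for code, n in counts.items() if n >= min_entries}


if __name__ == "__main__":
    print(recurring_codes(entries))
```

A tally of this kind only points to candidate patterns. Interpreting them still depends on rereading the entries themselves, and the counts become more informative as the diary grows.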
Since their appearance in the late 1970s, diary studies have gained viability
as a data collection procedure in studies on language learning and teaching.
Originally, the journal entries were analyzed by the diarists themselves, but later
studies involved analyses by someone other than the diarist. The same pattern
occurred in teachers' diary studies: At first, the data were analyzed by the diarists
themselves, but later, researchers other than the diarists did the analyses. When
the diarists themselves were also the analysts, the analytic processes were
referred to as "primary" or "direct" or "introspective." When the analyses were
done by someone other than the diarists, the process was called "secondary" or
"indirect" or "non-introspective" (Curtis and Bailey, in press).

REFLECTION

In your view, what are the advantages and disadvantages of having
language learners analyze their diary entries as opposed to an independent
researcher? What about language teachers who keep journals about their
own teaching? Should they analyze the data, or is that task better left to a
researcher who was not teaching the class under investigation?

Table 10.1 lists a number of published diary studies on language learning.
In some cases, the data were analyzed by the diarists, while, in some studies,
other researchers analyzed the data. Some studies involved both primary and
secondary analyses. Some of the diarists were teachers, but the diary studies
listed in Table 10.1 are about language learning rather than teaching.



TABLE 10.1 Selected published research based on language learners'
journals

Research Using Language               Language(s)                 Agent(s) of
Learners' Journals as Data            Involved                    Analysis
Allison (1998)                        English                     Other
Bailey (1981)                         French                      Diarist
Bailey (1983b)                        Various Languages           Diarist & Other
Birch (1992)                          Thai & Mandarin             Other
Brown, C. (1985a; 1985b)              Spanish                     Other
Campbell (1996)                       Spanish                     Diarist
Carroll (1994)                        English                     Other
Carson & Longhini (2002)              Spanish                     Diarist & Other
Danielson (1981)                      Italian                     Diarist
Ellis (1989)                          German                      Other
Grandcolas & Soule-Susbielles (1986)  French                      Others
Halbach (2000)                        English                     Other
Hilleson (1996)                       English                     Other
Huang (2005)                          English                     Other
Jones (1994; 1995)                    Hungarian                   Diarist
Krishnan & Hoon (2002)                English                     Others
Lowe (1987)                           Mandarin                    Other
Matsumoto (1989)                      English                     Other
Moore (1977)                          Danish                      Diarist
Parkinson & Howell-Richardson (1999)  French, Spanish, & English  Others
Peck (1996)                           Spanish                     Other
Porto (2007)                          English                     Other
Richards (1992)                       Various Languages           Other
Rowsell & Libben (1994)               Various Languages           Others
Rubin & Henze (1981)                  Arabic                      Diarist & Other
Ruso (2007)                           English                     Other
Schmidt & Frota (1986)                Portuguese                  Diarist & Other
Schumann (1980)                       Arabic & Farsi              Diarists
Schumann & Schumann (1977)            Arabic & Farsi              Diarists
Simard (2004)                         English                     Other
Tinker Sachs (2002)                   Cantonese                   Diarist
Tyacke & Mendelsohn (1986)            English                     Others
Warden, Lapkin, Swain, & Hart (1995)  French                      Others
Woodfield & Lazarus (1998)            Swedish                     Diarists & Others



A paper by Rivers (1979; 1983) is not listed in Table 10.1 because we don't
consider it to be a diary study since no explicit analysis was provided. The
publication consists solely of the author's diary entries about learning Spanish.
Both pre-service and in-service language teachers have kept journals about
their teaching practice (e.g., Appel, 1995; Bailey, 2001b; Verity, 2000), or about
their experiences in training programs (C. H. Palmer, 1992; G.M. Palmer, 1992;
Polio and Wilson-Duffy, 1998). Like the language learning diary studies, in
some cases, the teaching journals were analyzed by the diarists themselves, but,
in others, the analysis was done by someone else—researchers or instructors in
the training programs. Several of these studies are listed in Table 10.2.
The division of these two tables into diary studies about language learning and
language teaching is not as neat as it appears. Many of the language learning
diarists have been language teachers and/or applied linguists (see, e.g., Bailey, 1980,
1983b; Birch, 1992; Campbell, 1996; Carson and Longhini, 2002; Danielson,
1981; Grandcolas and Soule-Susbielles, 1986; Jones, 1994; 1995; Lowe, 1987;
Rubin and Henze, 1981; Tinker Sachs, 2002; Schmidt and Frota, 1986;
Schumann, 1980; Schumann and Schumann, 1977). And, as noted above, some of
the diaries kept by teachers have focused on their language learning (see, e.g.,
Lowe, 1987; Porto, 2007; Richards, 1992).

ACTION

If you are taking or teaching a language class, or can put yourself in a context
to use your second language for a period of time, try making daily journal
entries for at least a week. What challenges arise? What insights do you gain?

REFLECTION

After you have tried making entries in a learning or teaching diary,
consider the following issues:
1. Should a diarist read other language learning or teaching diary studies
while keeping a diary, or does this lead to "contamination" of the
reported experience?
2. Should a diarist read about and comment on language learning theories
while keeping a diary, or does this mold the diarist's recollections and
insights to fit the theories?
3. Should a diarist try to take notes during the actual language learning or
teaching experience (e.g., during an ongoing language lesson), so the
impressions are more concurrent with the event? Or would that process be
so distracting that it would interfere with language learning or teaching?
4. To what extent does the process of keeping a diary (for instance, of
examining one's own language learning experience) influence the experience?
Compare your responses to those of your classmates or colleagues.



TABLE 10.2 Selected published research based on language teachers'
journals

                                                Language           Pre-/In-service
Author(s) and Date                              (to Be) Taught     Teacher(s)                Analysis
Appel (1995)                                    English            In-service                Diarist
Bailey (1990)                                   Various Languages  Pre-service               Others
Bailey (2001b)                                  English            In-service                Diarist
Block (1996)                                    English            In-service                Other
Brinton & Holten (1989)                         English            Pre-service
Brock, Yu, & Wong (1992)                        English            In-service                Diarists
Cole, Raffier, Rogan, & Schleicher (1998)       English            Pre-service               Diarists
Delaney & Bailey (2000)                         English            Pre-service               Diarist & Other
Grandcolas & Soule-Susbielles (1986)            French             Pre-service               Diarists
Ho & Richards (1993)                            English            In-service                Others
Jarvis (1992)                                   English            In-service                Other
Lee & Lew (2001)                                English            Pre-service               Others
Matsuda & Matsuda (2001)                        English            In-service                Diarists
McDonough (1994)                                English            In-service                Diarist & Other
Numrich (1996)                                  English            Pre-service               Other
Palmer, C.H. (1992)                             English            In-service                Other
Palmer, G.M. (1992)                             English            In-service                Other
Polio & Wilson-Duffy (1998)                     English            Pre-service               Others
Pennington & Richards (1997)                    English            In-service                Others
Porter, Goldstein, Leatherman, & Conrad (1990)  Various Languages  Pre-service & In-service  Diarists & Others
Ruso (2007)                                     English            In-service                Diarist
Santana-Williamson (2001)                       English            Pre-service & In-service  Other
Tsang (2003)                                    English            Pre-service               Other
Verity (2000)                                   English            In-service                Diarist
Winer (1992)                                    English            Pre-service               Other
Yahya (2000)                                    English            In-service                Diarist

296 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


(Auto)biographical Research
(Auto)biographical research has been a long-standing tradition in anthropology
and sociology (see, e.g., Lieblich, Tuval-Mashiach, and Zilber, 1998; Plummer,
1983). However, it is a relatively recent arrival on the language research scene.
The term is used to describe a broad narrative approach that "focuses on the
analysis and description of social phenomena as they are experienced within the
context of individual lives" (Benson and Nunan, 2005, p. 4). In autobiographical
research, the researchers draw on their own experiences in generating data for the
study. In biographical research, they draw on the introspective and retrospective
accounts of others.
The studies reported in Benson and Nunan (2005) are wide ranging in terms
of their research foci, the educational and geographical contexts, and the stories
the language learners document. Substantively, the theme that holds the collection
together is 'difference and diversity.' In contrast with experimental research
and survey research, which attempt to connect learner differences to differences
in proficiency and/or attitude, autobiographical studies examine a life story more
holistically.
Methodologically, this body of research consists of retrospective case studies
of individual learning experiences. Central to these case studies are the individ
ual learners' narrative accounts of their experiences. Benson and Nunan (2005)
make the following observation about this approach to research on language
learning:
One of the distinguishing features of (auto)biographical research is
that it offers a longitudinal portrait of the phenomena under investi
gation. This enables us to generate insights that are beyond the reach
of 'snapshot' research which captures a single reality, or a limited
number of realities, at a single point in time. A common thread run
ning through most of the accounts in this volume is that language
learning practices and attitudes are unstable and change over time. In
other words, difference and diversity exist not just between learners,
but within learners at different stages of their language learning
experience. (pp. 155-156)
In other words, autobiographical and biographical research provide in-depth
case studies using retrospective narration as a primary method to generate data.
Benson and Nunan note that mainstream second language acquisition research
often tries to "isolate psychological and social variables such as motivation,
affect, age, beliefs and strategies, identity and setting from each other" (ibid.).
However, for learners it seems that often "these factors are intimately entwined,
not just with each other, but with the learners' larger life circumstances and
goals" (ibid.).
One of our graduate students, Julie Choi (2006), kept a diary account of
her language learning experiences over a period of years. She used these
records as the basis for a graduate dissertation documenting how she
developed native-speaker competence in four languages: English, Korean,
Chinese, and Japanese. Here is a description of the process she used:
I began keeping a diary when I was 10. It was a way of expressing difficulties
I experienced moving to different countries and learning new
languages while constantly trying to fit in. Understandably, my diary
became my best friend and a haven where my truest feelings could run
free. I wrote daily—sometimes even four or five short entries during
lunch, in class (half doodling), on the bus or at home. Predictably, most
entries during my adolescent years dealt with friendships, relationships
and feelings of "not belonging." I habitually transcribed daily events.
They usually involved friends, teachers, family members or myself. I
also noted things I saw on TV, overheard on the bus or in other con
texts. In retrospect, I realize these observations also contained the seeds
of many potential research questions. For example, "What does Silvia
mean when she says, "We're not 'real Koreans' because we're Korean-
American?" or why did the clerk speak to me slowly using his hands
when it was clear I spoke fluent Japanese? Questions wandering through
my diary somehow always found their way back to issues of language
and culture.
In my teaching career and graduate studies, I worked closely with
language learners and intensified my curiosity with language, culture
and identity. I became increasingly curious about certain aspects of language
and how it affected my personal identity as I moved across geographical
and cultural borders. Then I began to see a distinct pattern
emerge in my writings. When it came time to choose a topic for my dissertation,
my diary came to the rescue. With a bit of direction and research,
I was able to open my eyes to the possibility of using the data
from my diary for my thesis on language and identity. Embarking on
this journey—traveling back to my past to figure out who I was in the
present motivated me deeply and fervently (which helps when one is
writing a thesis!).
From hundreds of diary entries, letters, emails from friends, pic
tures and other data from my past, I carefully chose items on the
themes of language learning, acceptance and how my identity trans
formed in the course of my life. Each segment of my narrative was
based on one of the major stages in my life—childhood, adolescent and
adult years. Once the relevant entries were found, I entered an interactive
process with my own narrative and analyzed the data as honestly as
possible. I read the collected material and generated a heading for each
section. I reached a general hypothesis and changed and revised my
interpretative conclusion as necessary by reading current theories
within the literature on identity and language learning. (Choi, personal
communication)
Choi's diary entries encompass many years of experience and her dissertation
provides a detailed portrayal of a multilingual, multicultural person. She
comments about possible connections between narrative data and second
language acquisition theories:
What attracts me about research using diary studies is when one looks
carefully into the entries, patterns start to emerge and the stories shape
and construct the narrator's personality and reality. By reading, analyz
ing and linking the stories to theories within SLA, I am able to find ex
planations to questions I have been asking throughout my life, the most
noteworthy question being, "Who am I?"—a question I could not have
answered without looking at my history. Even the most minor details
can offer insights into important questions. By writing about my expe
riences, I have offered myself some sense of closure and assuaged my
feelings of anger and sadness. My diary was a storehouse of authentic
data. The insights I gathered from analyzing it make me realize how
grateful I must be to the languages I own because they have given me
prestige, self-confidence, a deeper understanding of myself, jobs and
friends. Besides these gains, they taught me that I am more than just a
Korean-American. I can be who I want to be depending on the language
I speak and make my home from my own will and my imagination.
These realizations empower me. They also humble me to a small, worn
journal—my diary—that I wrote my life into and that then shone it back
for me to understand and appreciate. (Choi, personal communication)

REFLECTION

Write three insights or observations that you get from Choi's account
(e.g., learning a language can have a powerful impact on personal identity). If
possible, share these insights with other people. Did you and your colleagues
find similar or different issues in her comments?

An autobiographical approach to examining teachers' language learning
histories was taken by Bailey, Bergthold, Braunstein, Fleischman, Holbrook,
Tuman, Waissbluth, and Zambo (1996). As a seminar assignment, seven pre-service
teachers wrote retrospective accounts of their language learning experiences
addressing the following questions:
1. What language learning experiences have you had and how successful have
they been? What are your criteria for judging success?
2. If you were clearly representative of all language learners, what would we
have learned about language learning from reading your autobiography?
What can be learned about effective (and ineffective) language teaching by
reading your autobiography?
3. How has your experience as a language learner influenced you as a language
teacher?

By reading, discussing, and comparing one another's language learning histories,
these teachers found some common themes running through the documents.
These included the importance of their own teachers' personalities and teaching
styles (as opposed to methods and materials), the authors' concepts of good and
bad teaching, their teachers' attitudes and expectations for students' success, and
reciprocal respect between students and teachers. The authors also discussed
students' and teachers' respective responsibilities for maintaining and supporting
motivation to learn, as well as comparing the learning atmosphere in naturalistic
versus formal instructional settings.
The motivation for this collaborative autobiographical research was the
idea that "teachers acquire seeming indelible imprints from their own experi
ences as students and these imprints are tremendously difficult to shake"
(Kennedy, 1990, p. 4). Writing their language learning histories enabled these
pre-service teachers to bring those past experiences to the level of awareness in
order to examine the possible influence of those experiences on their own
teaching philosophies and practices.

QUALITY CONTROL ISSUES IN INTROSPECTIVE DATA COLLECTION

One difficulty with having language learners report on their learning is that not
all learning processes may be available for introspection. Some processes may
happen outside awareness. There are also conscious learning processes of which
we are aware, and those are available for introspection. Some subset of those
learning processes will be written about by the second language learner, as
shown in Figure 10.2 below:

[Figure: nested sets, from outermost to innermost]
  1. All language learning processes, both conscious and unconscious, which occur
  2. All conscious language learning processes available for introspection
  3. DATA: all language learning processes written about or explained by the
     second language learner

FIGURE 10.2 Subsets of language learning processes reported in introspective
data collection (adapted from Bailey, 1991, p. 80)

Giving Clear Instructions
In cases where the diarist is not the data analyst, it is important to give the
teachers or language learners clear instructions about making journal entries.
The following instructions for keeping a diary are from Matsumoto (1989). The
language learner was a native speaker of Japanese studying English:
Please make daily entries in Japanese describing your classroom learn
ing experience in the ESL program you are participating in this sum
mer. You are asked to write about the content of your class or learning
activities, and what you thought or felt about the class and any other
things which are involved in your language learning experience. Please
write your comments and feelings in as much detail as possible, honestly
and openly, as if you were keeping your own personal, confidential
diary. Try to write your entry before you have forgotten about the class
content—as soon as possible after the class. (p. 170)
Here is another example of instructions to language learners keeping jour
nals. The researcher (C. Brown, 1985b) compared the requests for language
input of older (fifty-five to seventy-five years old) and younger (eighteen to
twenty-five years old) adult learners of Spanish:
This journal has two purposes. The first is to help you with your lan
guage learning. As you write about what you think and feel as a language
learner, you will understand yourself and your experience better.
The second purpose is to increase the overall knowledge about lan
guage learning so that learning can be increased. You will be asked to
leave your language learning journal when you leave the [training pro
gram]. However, your journal will not be read by the teachers at the
[training program]. It will be read by researchers interested in language
learning.
Your identity and the identity of others you may write about will be
unknown (unless you wish it otherwise) to anyone except the re
searchers.
You will be given 15 minutes a day to write. Please write as if this
were your personal journal about your language learning experience.
(pp. 283-284)
The learners in Brown's study also made their journal entries in English, their
native language.

REFLECTION

What are the advantages and disadvantages of having language learners
make diary entries in their first language as opposed to the target language?

Introspective Data Collection Methods and Triangulation
Introspective procedures are sometimes the only data collection processes involved
in a study. For example, some diary studies have been based entirely on
journal entries (see, e.g., Bailey, 1980, 1983b; Campbell, 1996; F. Schumann,
1980; Schumann and Schumann, 1977). In other cases, they have been used in
connection with other types of data, which have permitted the researcher(s) to
compare the outcomes of various kinds of data collection. (See, for instance,
Block, 1996; Hilleson, 1996; and Schmidt and Frota, 1986.)
A strong example of data triangulation comes from a study by Ellis (1989),
who used questionnaires, a cognitive style test, a language aptitude test, atten
dance and participation records, a word-order acquisition score, measures of
speech rate, and the results of two proficiency tests, in combination with two
learners' diaries. The learners—called Monique and Simon in the report—kept
journals of "their reactions to the course, their teachers, their fellow students,
and any other factors which they considered were having an effect on their language
learning" (pp. 252-253). The diaries were collected regularly, photocopied,
returned immediately, and treated confidentially.
The diary data show that Monique was "obsessively concerned with lin
guistic accuracy" (p. 254) and that she worried about making mistakes. The
diary data contrast markedly with Monique's self-report on a questionnaire in
which she characterized herself as confident and adventurous. In contrast,
Simon showed "little concern for making errors" (p. 255) and his diary had no
references to wanting the teacher to correct his mistakes. Thus in this report,
the diary data not only helped to highlight the differences between these two
learners but also to verify or challenge other forms of data collected in the
research.
Some language classroom research has involved learners keeping spoken di
aries to lessen the burden of having to write the entries. For instance, Block
(1996) had learners of English in Barcelona, Spain, make oral entries in their
journals, using their first language. The difficulty with learners or teachers making
spoken entries as research data is that you will probably have to transcribe
the audio-recordings, or portions thereof, at some point in your analysis. For
this reason, we recommend written journal entries, and word-processed entries
in particular, whenever possible.

Tips for Keeping a Language Learning Diary


Over the years, we have worked with many teachers and language learners who
have kept diaries about their experiences. One of the most difficult things about
undertaking a diary study is sticking with it over time. Based on our experience,
we offer the suggestions below to help make writing your diary entries a systematic
but easy experience. Try to set up the conditions for writing so that the diary
keeping does not require a great deal of effort. The actual process of writing
should be (or should become) almost effortless.

Here are some tips for language learners keeping diaries, but these ideas
are applicable to teachers keeping journals as well. (See also Curtis and Bailey,
in press).
1. If you are taking (or teaching) a regularly scheduled language class, set
aside time each day immediately following the class to write in your diary.
If you are not taking a class but are immersed in a target language situation,
set aside a regular time and place each day in which to write your diary
entries. Write daily and as soon as possible after class or after your attempts
to use the target language.
2. Write your diary entries in a place you like (your favorite desk, outside with
a pleasant view, in a sunny kitchen) where you won't be disturbed by
friends or ringing telephones. If you are writing your diary by hand, use
paper or a notebook that you like and a pen that is easy and comfortable to
use. If you are word processing your diary, make sure you are familiar with
the program and that you save and back up your entries regularly and religiously,
in order to avoid the frustration of losing data.
3. Carry a small pocket notebook or personal digital assistant (PDA) with you
so that you can make notes about your language experiences as they occur
even if you don't have your actual diary with you. Some diarists have suggested
keeping the notebook or PDA near your bedside so you can record
any late-night or early morning thoughts.
4. If you are conducting a diary study about learning a foreign language, we
believe the time devoted to writing about your language learning experience
should at least be equal to the time spent in class. If you are immersed
in the target culture, you will find you probably cannot record everything
that happens in a day, so you may want to focus your diary on some particular
aspect of your experience that interests you.
5. Keep your diary in a safe, secure place—a locked drawer, file cabinet, or briefcase.
If you are word processing your data, make sure your files are password
protected. The idea is for you to be able to write anything you want without
feeling uneasy about other people reading and reacting to your ideas.
6. When you record entries in the original uncensored version of your diary,
don't worry about style, grammar, and organization—especially if you are
writing in your second language. The idea is to get complete and accurate
data at a time when the information is still fresh in your mind. You can polish
your presentation of the data at a later time when you edit the journal
for public consumption. Thus, the original diary entries sometimes read
like "stream of consciousness" writing.
7. Each time you write an assertion, ask yourself, "Why?" Why did you write
that? What evidence do you have for the statement you just made? Some
of the language learning journals that have been kept to date are full of
fascinating but unsubstantiated insights. Try to support your insights and
interpretations with examples from your class sessions, your daily interactions
in the target culture, or actual language data.
8. At the end of each diary entry, note thoughts or questions that have
occurred to you to consider later. Many anthropologists conducting field
research keep an ideas file—brief notes on topics to explore further.
This procedure is one way to refine your focus somewhat during the diary-keeping
process and to guide future entries.
We strongly recommend word processing your diary entries rather than recording
them by hand since having an electronic record greatly facilitates data
analysis.

A SAMPLE STUDY

The sample study we have selected for this chapter is based on an EFL teacher's
journal that was kept for two academic semesters (Bailey, 2001b). The author
had many years of experience teaching ESL and working as a teacher educator in
the United States, but then she taught at a university in Hong Kong for a year.
(See Verity, 2000, for an account of a similar situation in Japan.) Upon returning
to her regular graduate teacher training position a year later, Bailey noticed that
her teaching (e.g., in statistics courses, language assessment seminars, and so on)
had changed. She then decided to read through her Hong Kong diary entries to
see if she could determine how those changes had come about.
The database for this report consisted of fifty-two single-spaced pages for
fall semester and fifty-eight single-spaced pages for spring semester. These entries
were word processed and saved electronically. Paper printouts of the diary
entries were filed along with copies of the class handouts and lesson plans. The
data were qualitatively analyzed and in the process, a number of teaching
strategies related to scaffolding emerged. Scaffolding was defined by Bruner
(1983, p. 60) as "a process of 'setting up' the situation to make the child's entry
easy and successful, and then gradually pulling back and handing the role to the
child as he becomes skilled enough to manage it." (Here the learners were young
adults rather than children, but the concept is still useful.) The following scaffolding
procedures were identified in the teaching diary (Bailey, 2001b):
1. the teacher's use of multiple channels for presenting information or
instructions to the learners;
2. having the students "feed me back the task" (i.e., paraphrasing the instructions
to check their understanding of the assignment);
3. having the students compare their ideas with a classmate privately before
giving a public response to a solicit;
4. the teacher building in a recognition step (so the learners could identify the
linguistic item of focus in the input prior to having to produce it themselves);
and
5. the teacher using schema activators to prepare the learners for reading and
listening tasks. (pp. 15-25)

These five techniques emerged from the diary entries, but the journal was not
kept in order to document scaffolding concepts. Bailey kept the diary to document
adjustments to her new teaching context in Hong Kong, but it later provided
a rich database for examining how working with those EFL students had
influenced her subsequent graduate-level teaching upon her return to the United
States. The five patterns listed above are illustrated in the report with dated
excerpts from the diary that show teacher learning over time.

PAYOFFS AND PITFALLS


As noted above, the use of think-aloud protocols came into applied linguistics
from psychology. Ericsson and Simon (1993) explain the psychology research
context as follows:

From the beginnings of psychology as a science, investigators, impelled
by the difficulties of relying wholly on external observation in studying
mental processes, have questioned subjects about their experiences,
thought processes and strategies. Claims for the validity of such verbal
accounts were based primarily on the notion that individuals had privileged
access to their experiences; as long as they were truthful, their
reports could be trusted. (p. xii)
The legitimacy of introspective and retrospective data hinges on the concept of
memory. As Ericsson and Simon explain, a subset of
thoughts occurring during performance of a task is stored in long-term
memory. Immediately after the task is completed, there remain retrieval
cues in short-term memory that allow effective retrieval of the
sequence of thoughts. Hence for tasks that can be completed in 0.5 to
10 seconds we would expect subjects to be able to recall the actual sequence
of their thoughts with high accuracy and completeness. With
long durations, recall will be increasingly difficult and incomplete.
(ibid., p. xvi)

These authors acknowledge that self-report may be flawed: "Even where
subjects are asked to report on their cognitive processes used during many trials
of an experiment, we cannot rule out the possibility that the information they retrieve
at the time of the verbal report is different from the information they retrieve
while actually performing the experimental task" (ibid., p. xii). For this
reason, they recommend that "whenever possible, concurrent verbal reports
should be collected so that processing and verbal report would coincide in time"
(ibid., p. xiii).
The reliability of the data will be enhanced if informants are given ade
quate contextual information about the study. According to Ericsson and
Simon, steps should also be taken to ensure that informants do not make inferences
that go beyond the task, although there may well be aspects of classroom
interaction where the informant's inferences and interpretations are of central
interest. Regardless of the focus, steps should be taken to eliminate researcher
bias. (In retrospective data collection sessions, the researcher may fall into the
trap of 'leading the witness.') Finally, where possible, subjects should not be informed
that they will be required to retrospect until after they have completed
the task.
The major pitfall of introspection is its potentially questionable reliability
and internal validity. If we get people to engage in an introspective task on two
different occasions, we may get different responses. If we do, then questions of
reliability arise. More serious is the issue of internal validity. When subjects report
on some inner state or thought process, how do we know they are telling the
truth? They might not be lying. Rather, they may simply be unaware of the true
nature of those processes.
The same goes for the interpretation of introspective diary entries. In her
study of herself as a learner of French, Bailey (1980) analyzed the diary data and
identified three key themes. However, she was completely unaware of the theme
of competitiveness and anxiety that ran through the data until these issues were
pointed out to her by her professor. She then reanalyzed her own data as well as
the journals or reports of ten other learners and found a connection between
competitiveness and anxiety (Bailey, 1983b).
There are many potential pitfalls in collecting introspective data, and these
problems vary depending on whether you are using think-aloud protocols, stimulated
recall, or diary studies. The main concern with the think-aloud protocols
is that verbalizing one's thoughts may interfere with the process under investigation.
In other words, collecting think-aloud data may trigger the observer's
paradox (Labov, 1972).
Stimulated recall doesn't overburden the participants' thought processes or
distract them during the event or process under investigation, and, in this sense,
it is less intrusive than gathering the data for think-aloud protocols. If the stim
ulated recall procedure is used, however, it should be done as soon as possible
after the event so the person's memories don't degrade too much. Think-aloud
protocols, on the other hand, have the advantage of being generated concur
rently with the event under investigation, but there is a worry that the data col
lection process itself might in some way influence or impede the normal mental
processing about which the subjects are introspecting.
Stimulated recall helps overcome the danger of the possible interference
caused by concurrent introspection but also distances the subject from the ac
tual event. The possibility also exists that using stimulated recall may intro
duce thoughts or perceptions that were not present during the original event.
If electronic recordings are used as the stimulating data, research subjects may
be "put off" by the sound of their own voices on audiotapes or their images on
videotape.
If transcripts are used as the data in stimulated recall, there is the added bur
den of transcribing data, which is very time-consuming. The transcription
process thus inserts even more time between the event under investigation and
the description of that event by the person doing the stimulated recall.

Finally, in cases where learners or teachers are making diary entries that will
be analyzed by someone else, they may come up with responses that they think
the researcher wants to hear. This problem can also occur with think-aloud protocols
and stimulated recall.
Given the problems associated with retrospection, one may question
whether the technique should be used at all. However, there are occasions when
it is neither feasible nor desirable to collect data from informants as they are
performing a task. This is particularly true of research conducted in actual classrooms.

The single greatest advantage of the introspective methods reviewed in this
chapter is that they take us to a place that no other data collection method can
reach—into the mind of the learner or the teacher. The inner cognitive and affective
states of learners and teachers can only be indirectly plumbed by other
methods such as elicitation and interviews. Thoughts, feelings, motives, reasoning
processes, and mental states can really only be gotten at in any direct sense
through introspection.

CONCLUSION
Introspective data collection procedures can be characterized in terms of when
the data are generated relative to the event being investigated—concurrently
with the event, immediately afterwards, some time afterwards, or long after
wards. In addition, the data collection can be very brief, as with think-aloud pro
tocols and some stimulated recall procedures, or much longer as in the diary
studies. Some autobiographical and biographical research spans years of data
collection—or recollection.
While there are potential problems with these data collection procedures,
introspective procedures help us identify and understand issues that are not
readily accessible through other means. When used with other data collection
procedures over time in studies that employ careful triangulation, they can provide
helpful insights in research on language teaching and learning.

QUESTIONS AND TASKS


1. Think of a research question that interests you and that could be ad
dressed with think-aloud protocols from language learners. First, deter
mine what task(s) the learners will do as they think aloud. Then write the
instructions to the learners about how they are to verbalize as they do the
task(s).
2. Think of a research question you'd like to investigate using stimulated
recall. What is the research question? What sorts of data would you use
to stimulate your informants' recollection of the event(s) you wish to
study?
3. Think of a research question that interests you and that could be ad
dressed through data from language learners' or teachers' diaries. Write
the instructions to the diarists about the journal entries you'd like them
to make.
4. Complete the following table. Identify the pros and cons of the different
techniques discussed in the chapter, and then suggest a possible research
question that could be investigated using each technique.

Technique                    Pros          Cons          Question

Think aloud
Stimulated recall
Diary entries
Biographical research
Autobiographical research

5. In recent years, language classroom research has produced reports in
which diary entries served as part of the data, along with data from other
elicitation procedures. Think of a research question you would like to
address in which learners' or teachers' diary entries could form a part of the
database. What additional types of data would you use to address your
question?
6. The diary extracts that follow have been taken from an unpublished diary
kept by David Nunan. Study the extracts and complete the tasks that
follow.

December 8
I've been living in Hong Kong now, and speak very little Cantonese.
This is something I'm embarrassed and ashamed of, being a language
teacher myself. Not that it's unusual. I have colleagues who have lived
here twice as long as I have who speak even less Cantonese than I do.
So, what are the reasons? Firstly, it's a very difficult language. The
commitment of time to make a decent fist of learning the language is
enormous. Like most expats in Hong Kong, I lead a very busy life. In fact, I
spend approximately half of my life traveling out of Hong Kong.
Outside Hong Kong, the language is of little utility. If I'm going to learn a
Chinese language it might as well be Putonghua.
Secondly, despite constant laments about the poor standard of
English, most of the local Chinese that I interact with have a reasonable
level of English. I would have to study Cantonese for many years for
my Cantonese to be comparable to or superior to their English. From
a practical, communicative perspective, there is therefore no need to
learn the language. Shortly after I moved to Hong Kong, and had
learned a few phrases, I tried to use it in public. The person I was
attempting to communicate with said to me, in perfect English, "If
you're going to speak Cantonese, speak Cantonese!" That kind of put-down
was not encouraging.
Finally, I suffered foreign language interference. I speak Thai to a
lower-intermediate level of proficiency, and there is a surprising number
of cognates and false-cognates between Cantonese and Thai. Often
a word will have the same pronunciation but a different tone.

December 9
I've set myself the goal of learning 1,000 words and phrases by June—
only around five a day, but I'm having trouble remembering even that.
The CD that L. got me is much better than tapes I have, but there's not
enough repetition, and the phrases are presented out of context.
'Maai' rhymes with 'buy' so that should be easy enough to remember.
I'm confused by the particles 'ma' and 'ah'. As far as I can figure out,
'ma' is a question particle, and 'ah' functions more like a question tag in
English. So 'Mohng ma?' = 'Are you busy?' and 'Mohng ah?' = 'You're
busy, are you?' The 'ah' form seems much more prevalent than the 'ma'
form.
I tried creating little dialogues from the phrases I was trying to learn in
order to give them some context, but didn't get very far. I don't even
know how to say 'yes'. From what I know of other Asian languages, I
guess there won't be a single word as there is in English.
Really frustrating! I have no idea how to give affirmative responses. The
book I'm working with teaches 'Gei hou ma?' (How are you?) but not
how to respond. To respond in the affirmative, I did what I'd do if
speaking Thai—repeat the phrase. I have no idea if it's right or wrong.
In response to 'mohng ma?' (Are you busy?) I made up 'Haila' because
it's what I hear people in the office saying all the time. I'm sure it's
wrong, but I have no other resources to use.
I checked with L. who suggested 'Gei hou' (quite good) or simply 'hou'
and 'hou mohng ah!' (very busy) or simply 'hou mohng' for answers.
At lunch, I was really pleased when I called for the check, "Maaihdaan,
mgoi," and the woman understood instantly.

December 15
Opportunities to get out and actually practice the language are close to
zero. My Chinese friends and acquaintances are all totally bilingual,
and even most of the cab drivers on Hongkongside are much better at
English than I'll ever be at Cantonese. This is MUCH more like learning
as a foreign than a second language. It is very demotivating.

December 22
Only one or two of the words that I worked on yesterday seem to have
stuck. The 'organic' principle seems to be working here. I'm not
learning one word perfectly one at a time. Rather, I seem to partially learn
words and expressions, and then suddenly, several of them will seem to
'come together'. The more vocabulary you learn, the easier new
vocabulary is to learn through a kind of 'lexical synergy'. For example,
knowing 'jousahn' (good morning) made it relatively easy to learn 'good
night' ('joutau').

December 28
Conversation practice with L. (40 minutes). We started practicing more
freely today, and I tried to use whatever resources I had to
communicate. It was hard work, but fun, and motivating when L. understood
what I was trying to say. It's so motivating to have a sympathetic native
speaker to reassure me that I AM making progress.
At one point I wanted to ask L. if she liked tea, so I said, "Nei jungyi
yum cha." She corrected me to the like/not like form "Nei jung-mh-jungyi
yum cha a?" I asked her why she didn't use the full form of the
verb to like 'jungyi', but just part of it 'jung' in the first part of the
negative/positive, i.e., why she didn't say "Nei jungyi-mh-jungyi yum
cha a?" She looked mystified for a minute, not understanding what I was
trying to say, then laughed, because she'd never noticed that the full
form of the verb isn't used with this question form.

January 7
It's time to push ahead. I am constantly tempted to 'consolidate' rather
than to work on new language.
This morning, L. asked me, "Nei yau mou tung nei go leiu gong dinwa
a?" I knew instantly what she had asked, "Did you talk to your daughter
(who just went back to England) on the phone?" I did this by recognizing
the phrase "leiu gong dinwa" ("daughter speak phone"),
although I wasn't able to respond appropriately. Eventually I came up
with 'yau' (yes). L. then asked "Geisi a?" Using the context, I guessed
that this must mean 'when'—also drew on the fact that 'Geido' means
'how much/how many' and made the assumption that 'Gei' combines
with other particles to form 'wh-' questions. L. confirmed this. 'Si'
means 'hour'.

Use the diary entries above to answer the following questions:


• What are the factors that facilitated language learning?
• What factors inhibited the process?
• What research questions suggest themselves?
• What learning strategies does the diarist use?

7. Write a brief account of your own experiences as a language learner using
the extract from Choi's retrospections above as a model.
• Did you find the introspective process easy or difficult?
• Did ideas and incidents that you had forgotten about come back to you
during the writing process?
• How would you go about analyzing the data?
• If possible, share your account with someone who has completed the
same task. What similarities and differences are there in the accounts?
• What insights did you gain about introspection as a data collection
technique from doing this task?

SUGGESTED READINGS
Ericsson and Simon (1993) provide a thorough overview of the use of verbal
reports and protocol analysis in psychology research. If you plan to work with
think-aloud protocols, this book may be useful. For a chapter on the use of the
think-aloud procedure in second language research, see Jourdenais (2001).
Færch and Kasper (1987) edited a collection of articles about introspection
in second language research. This volume is the source of Grotjahn's (1987)
category system, which we have cited throughout this book, as well as Ericsson
and Simon (1987).
Gass and Mackey's (2000) book is an excellent source of information about
stimulated recall in second language acquisition research.
Any of the diary studies listed in Tables 10.1 and 10.2 above will give you
samples of how this introspective method has been used. Many are illustrations
of classroom research, while others document language acquisition in
naturalistic contexts.
For interesting critiques of the diary studies, see Fry (1988), Matsumoto
(1987), and Seliger (1983b).

CHAPTER 11

Elicitation Procedures

Field researchers would do well to study the efforts of their peers and
predecessors not because they expect to find ready guidelines and
recipes—the knowledge involved in craft production is inherently tacit
and noncodifiable—but to immerse themselves in the cultures and
customs of their communities; that is, to understand a complicated
ethos rather than to find a simple formula. (Schrank, 2006, pp. 217-218)

INTRODUCTION AND OVERVIEW

The focus of this chapter is on elicitation procedures. These include interviews,
production tasks, role plays, questionnaires, tests, and so on. Elicitation
procedures vary enormously in terms of their scope and purpose as you can see from
the preceding list. In the field of second language acquisition, studies using
elicitation are extremely common. They are not quite so common in classroom
research although they are widely used in classroom-oriented (as opposed to
classroom-based) research.
In second language research, elicitation refers to all of the ways in which the
researcher tries to obtain data directly from informants (rather than, for
example, simply by observing them). In second language classroom research, the focus
of interest could be the teacher, the students, or some aspect of teacher-student
interaction. An unstructured interview with a teacher about his or her decisions
made during a lesson would be an example of a teacher-focused elicitation study.
A questionnaire administered to students to obtain their attitudes and feelings
about studying a foreign language would be an example of a student-focused
elicitation study.

REFLECTION

Think of a study that you would be interested in doing. Based on what you
know so far, would some sort of elicitation procedure be useful in your
research?

INTERVIEWS

In this section, we will consider the interview as a family of elicitation
procedures. We will look at structured, semi-structured, and unstructured interviews,
as well as ethnographic interviews and focus group interviews. While most
interviews are conducted face-to-face, they can also be carried out electronically—by
telephone, via e-mail, or even through a chat room. Like other elicitation
devices, interviews can be used to collect samples of learner language for analysis,
the views and attitudes of informants, or their language learning histories.
Interview types can be placed on a continuum in terms of their formality.
This continuum can range from unstructured through semi-structured to
structured. The data that we obtain by interviewing must be recorded—either in
writing, electronically, or both. Typically interview responses, or a subset of such
responses, must be transcribed. However, an advantage of interviewing is that,
unlike written questionnaires, interviews can be used with non-literate research
respondents. In addition, even though interviewing is very time-consuming,
it does not suffer from the same problems of low return rates that plague survey
research.

Structured Interviews
The structured interview is like a questionnaire that is administered orally rather
than in writing. The researcher normally works with one person at a time,
asking him or her questions and recording the person's answers. The interview
follows a pre-set list of questions, and the researcher is careful to elicit answers to
the same questions from all of the respondents. Conducting a structured
interview demands training and discipline on the part of the researcher to stick
closely to the predetermined agenda. The advantage of using structured
interviews is that they provide detailed data that is comparable across informants.

Semi-Structured Interviews
In a semi-structured interview, the researcher will have a general idea of how he
or she wants the interview to unfold and may even have a set of prepared
questions. However, he or she will use these questions as a point of departure for the
interview and will not be constrained by them. As the interview unfolds, topics
and issues rather than pre-set questions will determine the direction that the
interview takes. The main difference between a semi-structured interview and an
unstructured interview is that the former will adhere more closely to the
researcher's agenda than the latter. Because of its flexibility, the semi-structured
interview is preferred by many field researchers. Dowsett (1986), for example,
writes that the semi-structured interview

is quite extraordinary—the interactions are incredibly rich and the data
indicate that you can produce extraordinary evidence about life that you
don't get in structured interviews or questionnaire methodology—no
matter how open ended and qualitative you think your questionnaires
are attempting to be. It's not the only qualitative research technique that
will produce rich information about social relationships but it does give
you access to social relationships in a quite profound way. (p. 53)

Unstructured Interviews
An unstructured interview will develop according to the agenda of the
interviewee rather than the agenda of the interviewer. While there will be a general
theme underpinning the interview, it can take off in unexpected directions,
which the interviewer will follow, picking up on issues and themes suggested by
the interviewee. For example, the interviewer might begin by asking a teacher
about how he or she modifies and adapts course books and other commercial
materials into his or her lessons but then segue into the role of technology in language
teaching.

Ethnographic Interviews
As we saw in Chapter 7, ethnography seeks to document both the emic (insider's)
and the etic (outsider's) point of view. Ethnographers doing field research often
use interviews to discover and develop the emic perspective. According to
Spradley (1979), these interviews are like "a series of friendly conversations"
between the researcher and the members of a culture (p. 58). Ethnographic
interviews occur in the natural course of the longitudinal, ongoing relationships
the ethnographer builds with the cohort in the study. They are characterized by
(1) "a specific request to hold the interview (resulting from the research
question)" (Flick, 1998, p. 93); (2) ethnographic explanations given in everyday
language, in which the ethnographer tells the informant explicitly what he is
seeking and why; and (3) specific question types that elicit information about
how the participants construct meaning and organize their society.

Focus Group Interviews


We tend to think of interviews as one-on-one events involving an interviewer
and an interviewee. However, focus group interviews involving several
informants and a moderator are becoming increasingly popular (Stewart and
Shamdasani, 1990). Focus groups are defined as "a research technique that collects
data through group interaction on a topic determined by the researcher"
(Morgan, 1997, p. 6). Getting a group of students to view a video of one of their
lessons and then eliciting their reactions to the lesson would be an example of a
focus group interview. The term focus highlights the fact that the researcher
guides and focuses the discussion rather than letting informants take the
interview in any direction that they wish. The advantage of a focus group rather than
an individual interview is that the informants can stimulate and be stimulated by
each other. The researcher may thereby elicit a richer data set than if he or she
is conducting individual interviews.

REFLECTION

Can you think of any disadvantages of using focus group interviews when
second language learners are the interviewees?

The choice to interview (and of the type of interview you might use) is
directly related to your research question(s). Clearly, if you wish to investigate
the perspectives of participants in a course or members of a culture, interviews
are one way to gather in-depth data. The types of interviews described above are
summarized in Table 11.1.

REFLECTION

Based on your previous reading and experience, as well as what you have
read so far in this book, what do you see as the advantages of interviews?
Do you have a preference for any of the types of interviews described
above? If so, why?

As an example of classroom-oriented research using interviews, we have
chosen an excerpt from a study by Galda (in press), an ESL teacher in a U.S.
community college. She studied the English needs and the English language use
of three elderly refugees who had been her students at one time. Part of her data
collection involved interviewing them. The interviews were conducted in
English. Here is Galda's description of her work with these three learners:
I was fortunate to work with three very special students, Falina, Luda,
and Misha, in compiling the research data. Although I had worked with
each of these students prior to the study, they were not enrolled in my
classes during the research period. All three participants were from the
former Soviet Union and had emigrated to the United States due to
religious persecution. (All are heritage Jews.) They vary in their level of
English proficiency, with Falina performing at a high-intermediate
level, Luda at a basic level, and Misha at a beginning level of proficiency
(no previous knowledge of English). These students were enrolled in a
computer skills course for ESL students when they volunteered to help
with this project.

TABLE 11.1 Types of Interviews

Type              Description

Structured        Structured interviews are similar to questionnaires that are
                  completed orally rather than in writing. The researcher has
                  a well-planned and clearly defined agenda and a set of
                  questions that are adhered to fairly closely.

Semi-structured   Semi-structured interviews are less rigid than structured
                  interviews but more systematic than unstructured
                  interviews. The interviewer uses predetermined questions
                  to elicit comparable data across interviewees, but also
                  allows for expansion and elaboration in the responses.

Unstructured      In an unstructured interview, the interviewer knows, in a
                  general way, where he or she wants to go with the interview
                  but is happy to let the interviewee take the lead and follow
                  themes and issues as they emerge in the course of an
                  interview.

Ethnographic      Ethnographic interviews occur in the researcher's
                  longitudinal, ongoing relationships with the cohort.
                  They include (1) a specific request for the interview;
                  (2) understandable explanations about what information is
                  being sought and why; and (3) specific question types to get
                  information about how the participants construct meaning
                  and organize their society.

Focus group       Focus group interviews involve a group of individuals
                  rather than a single informant. Focus group interviews are
                  popular with market researchers who want to obtain
                  collective insights into a new product that they are placing
                  on the market. In English language teaching, commercial
                  publishers often use focus groups to get data on how new
                  course materials and textbooks are working in the
                  classroom.

Galda explains how the data collection changed slightly as she got to know the
three students better and they became more familiar with the interview process:
Over the course of the two semesters, I interacted with the student
participants individually several times a week, chatting informally, helping
each with their computer class assignments, and conducting interviews.
Each of the participants took part in open-ended interviews for two
hours each week for 20 weeks. These interviews focused on three or four
major questions or themes. Many of these questions were "orientation"

316 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


types of inquiries, ice-breakers chosen to help us all relax and to get to
know each other better. Some centered on the participants' life experi
ences before coming to the U.S., in particular, their early memories of
their ownliteracy development duringchildhood.
To maintain a natural, conversational interaction in the interview
sessions, I initially hand-recorded field notes during these
conversations. As the participants grew increasingly comfortable with me and
the research being conducted, the conversations were audio-recorded
and subsequently transcribed within hours of the interviews. In
addition, I maintained a research journal that was updated after each
interview session to help in the identification and recording of insights and
impressions. I utilized the journal to process the conversations and to
reflect upon my own thoughts, ideas, and further questions as the
project progressed.
Galda used the interview data to build a cumulative picture of the three
respondents' lives, prior to their emigration to the United States, and in particular to
learn about their education and experiences of literacy:
Through the open-ended interview sessions, I learned a great deal
about the life and literacy experiences that the participants had amassed
before coming to the U.S. In addition, I gained insights into their
perceptions of themselves as learners, as well as some of the personal
learning goals identified by each individual. Finally, I managed to uncover
self-perceptions held by each and detected some of the social identities
constructed by these students as they negotiated their new social,
linguistic, and academic environments.
Although the interviews were Galda's main data elicitation procedure, she
also collected data by other means. This process allowed her to triangulate her
findings.
In addition to the interview sessions, I spent one hour per week working
with participants in tutoring sessions, helping them with their emerging
language skills in computer, writing, and reading applications. The
participants eventually consented to being recorded during the tutoring
sessions, and these interactions were transcribed within hours of each
session. Through the tutoring sessions, I gained insights into the
learning strategies, learning styles, literacies, and learning experiences
exhibited by the participants. I also gathered data concerning the
construction of personal identity by each individual in this semi-social,
semi-academic context.
In order to obtain a more rounded impression of each participant, I
attended two full ESL class sessions with the participants each semester.
During these visits, I recorded field notes using a divided page
technique, which allowed for the recording of observed interactions on one
side of the field note page. The other side of the field note page was
then available to add additional comments and insights collected from
course instructors discussing the observed interactions following the
class visits.
Finally, I collected copies of student work and assignments, then
examined these materials, mining them for additional data during the
course of the semester. Through the examination of student-generated
documents, I learned a great deal about the life and learning experiences
of the participants. In addition, I was afforded some very personal
insights into the perceptions these student participants held of themselves
as both learners and as social beings. Finally, through their own
carefully considered, self-generated words, I was able to identify additional
personal learning goals articulated by the participants.

REFLECTION

Think of a research question that interests you. How might Galda's
experience and ideas about interviewing be useful to you if you were to
interview second language learners to collect (some of) your data?

ACTION

Galda's primary data collection procedure was the ongoing interview
process, but she also achieved methods triangulation. Reread the
paragraphs above and identify all the methods of data collection she used.

QUESTIONNAIRES

In Chapter 5, we saw that questionnaires are a popular elicitation device. They
can be widely disseminated, particularly in this electronic age, and quantitative or
categorical responses to closed questions can also be readily collated and analyzed
with the aid of technology. Questionnaires bear a close family resemblance to
interviews, particularly structured interviews, and many of the comments that we
made about interviews in the preceding section are pertinent to questionnaires as
well. In particular, the closed-ended items in a structured interview resemble a
questionnaire administered to informants orally rather than in writing.
Questionnaires are very popular data collection devices with graduate
students, who often have the idea that the ease of disseminating and collating
data means that administering a questionnaire is an 'easy' data collection option.
However, constructing a sound questionnaire, one that is unambiguous and
yields the data the researcher wants to collect, is notoriously difficult, as we saw
in Chapter 5. This is especially so when the questionnaire is administered in the
second language of the respondents.

REFLECTION

Now that you are further along in this book, what kinds of classroom issues
do you think might be (partly) investigated through questionnaires?

COMBINING QUESTIONNAIRE AND INTERVIEW DATA

Interviews and questionnaires work well together. With questionnaires, which
are typically completed in a noninteractive fashion, it is possible to get a range of
responses from many people on a limited number of items. So, once they are
developed, piloted, and validated, questionnaires are practical and convenient. In
contrast, interactive interviews are less practical to administer, but they permit
researchers to delve into people's ideas and ask them to expand upon their
comments. In this sense, questionnaires permit us to sample broadly while interviews
permit us to explore more deeply.
You may sometimes wish to employ a research design in which you first use
questionnaires to get a broad cross section of information or opinions. You could
subsequently use interviews to get more detailed data from a subgroup of your
sample. Or you might use questionnaire data from a large number of teachers
and students and then observe a smaller cohort selected from that larger group.
Here's an example. Imagine you are interested in the interactive decision
making of language teachers—and, in particular, you want to know why and
when teachers depart from their lesson plans. You design, pilot, and revise a
questionnaire addressing this issue and distribute it to 100 language teachers.
You send questionnaires to both native speakers (NSs) and non-native speakers
(NNSs) of the languages they teach because you wish to compare the responses of
these two groups. Thus, native speaking and non-native speaking teachers
become the two levels of the independent variable.
Let's say that forty of the teachers return the completed questionnaire. You
could first divide the responses into those of the NS and NNS teachers of their
target languages. If you had included appropriate questions about the teachers'
experience in the background section of the questionnaire, you could also
identify them as more experienced and less experienced teachers, treating experience
as a moderator variable. You now have a factorial criterion groups design.
The questionnaire responses—the dependent variable in this study—would
certainly be informative, but you might wish to gather more detailed data from
some of the teachers, so you could resample from among those who had
returned the questionnaire. Suppose the original return rate had looked something
like Figure 11.1.
So, perhaps you decide to interview two of the teachers from each cell in
the design. We call this process the "sample-resample procedure." It allows
you to obtain more detailed information from a subset of your respondents by
interviewing them. It results in a "raised design" or "two-phase design," as
depicted in Figure 11.2.

                         Native Speaking       Non-Native Speaking
                         Teachers (NS Ts)      Teachers (NNS Ts)
                         (N = 19)              (N = 21)

More Experienced
Teachers (N = 20)        N = 12                N = 8

Less Experienced
Teachers (N = 20)        N = 7                 N = 13

FIGURE 11.1 Questionnaire return in a hypothetical study of teachers' decisions to depart from their lesson plans

                         Native Speaking       Non-Native Speaking
                         Teachers (NS Ts)      Teachers (NNS Ts)
                         (N = 19)              (N = 21)

More Experienced         First phase:  N = 12  First phase:  N = 8
Teachers (N = 20)        Second phase: N = 2   Second phase: N = 2

Less Experienced         First phase:  N = 7   First phase:  N = 13
Teachers (N = 20)        Second phase: N = 2   Second phase: N = 2

FIGURE 11.2 The "two-phase" or "raised" design

Carrying out this sort of intensified data elicitation process in the second
phase allows you to get more detailed information from a few people who
represent the larger sample. You need to be careful to define your selection criteria
clearly, but eliciting detailed information from a subset of your respondents
provides more in-depth data in a practical way.
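
For readers who manage their questionnaire returns electronically, the sample-resample procedure can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the cell sizes are copied from Figure 11.1, but the respondent ID numbers, the function names (build_respondents, crosstab, resample), and the use of simple random selection with a fixed seed are hypothetical choices made for the example. In an actual study, the second-phase selection should follow whatever criteria you have defined.

```python
import random
from collections import defaultdict

# Cell sizes copied from Figure 11.1 (speaker status x teaching experience).
CELL_SIZES = {
    ("NS", "more"): 12,
    ("NNS", "more"): 8,
    ("NS", "less"): 7,
    ("NNS", "less"): 13,
}

def build_respondents(cell_sizes):
    """Create one invented record per returned questionnaire."""
    respondents = []
    next_id = 1
    for (speaker, experience), n in cell_sizes.items():
        for _ in range(n):
            respondents.append(
                {"id": next_id, "speaker": speaker, "experience": experience}
            )
            next_id += 1
    return respondents

def crosstab(respondents):
    """Phase one: group respondent ids into the cells of the factorial design."""
    cells = defaultdict(list)
    for r in respondents:
        cells[(r["speaker"], r["experience"])].append(r["id"])
    return dict(cells)

def resample(cells, per_cell=2, seed=0):
    """Phase two: draw up to `per_cell` interviewees at random from each cell."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    return {
        cell: sorted(rng.sample(ids, min(per_cell, len(ids))))
        for cell, ids in cells.items()
    }

respondents = build_respondents(CELL_SIZES)
cells = crosstab(respondents)
interviewees = resample(cells, per_cell=2, seed=0)

for cell, ids in cells.items():
    print(cell, "returned:", len(ids), "to interview:", interviewees[cell])
```

Random selection within each cell is only one option. You might instead choose interviewees purposively (for example, the teachers whose questionnaire answers were most unusual), in which case the resample step would be replaced by your own selection rule.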
This sort of two-phase design can be used in various forms of methods
triangulation. For instance, you could get questionnaire data first from teachers
and then observe classes taught by some of those teachers. Or you could use
archival data to locate certain groups of learners (highly successful versus less
successful learners, as determined by an achievement test) and interview subsets
of those groups.

PRODUCTION TASKS
Production tasks are techniques used to obtain samples of learner language,
typically in order to study processes and stages of development that learners
pass through as they develop their second language proficiency. Production
tasks are therefore quite popular with second language acquisition specialists.
The alternative—observation and recording of learner language in naturalistic
situations—has a number of drawbacks. In the first place, it is very
time-consuming, and learners may not produce sufficient quantities of the language
structures, lexical items, or speech acts in question in order for you to detect
patterns or come to conclusions about language development. Secondly,
learners may simply not produce a particular language structure, lexical item, or
speech act at all. It is impossible to conclude that a learner has not acquired an
item simply because he or she has not used it in your presence.

Discourse Completion Tasks


One way that researchers try to elicit language samples from learners is through
a procedure called discourse completion. In this situation, the researcher sets up a
context and provides part of the discourse. The learner must then complete the
interaction by expressing what he or she would say if he or she were actually in
such a context.
For example, suppose you wish to gather data on how language learners
complain in their target language. It may be very difficult to obtain complaint
samples naturalistically unless you can find a situation where complaints are
normally voiced, such as an ombudsman's office or the returns section of a
department store. Even if you can find such a place, you would need to get permission
to record people's speech, and you would still have to wait for non-native speakers
of the target language to appear and voice their complaints.
Discourse completion tasks have been used to deal with this sort of
impractical situation. An example would be something like this:
Instructions: You receive your test back from your professor. You see
that he has added the points incorrectly and that you should actually
have ten more points than he gave you. What do you say to your
professor?
The learner then writes (in a questionnaire) or says (in an interview context)
what he or she thinks his or her response would be in this situation.

REFLECTION

What problems might occur in using this kind of prompt as a discourse
completion task to collect data from language learners?



Sometimes a discourse completion task is framed as a brief conversation and
the students are asked to complete the conversation. For instance, if you were
trying to find out how language learners deal with denials of requests, you might
use a discourse completion task like this one:
You discover your library book must be returned today and you will
have to pay a fine if you don't return it immediately. You say to your
friend, "May I please borrow your bicycle to go to the library?"
Your friend says, "No, I need it myself."
You say, " ____"

REFLECTION

What problems might occur in using this kind of prompt as a discourse
completion task to collect data from language learners?

Role Plays
Some researchers have used role play scenarios to elicit language learners' speech
samples and ideas. For instance, as early as 1980, Fraser, Rintell, and Walters
used two role plays about awkward situations. In one, they asked respondents to
imagine themselves at a parking meter with no change, having to borrow some
coins from an older stranger as the meter maid draws closer and closer. In the
second situation, they asked the respondents to put themselves in a situation
where they were late to a lunch appointment with an older business acquaintance
whom they do not know well. Here are the instructions that they gave:

We are asking you to participate with us in a series of role playing exercises
in which I will describe in some detail to you a situation and then ask you
to tell me exactly what you would say or do. You should talk to me as if I
were actually the person with whom you are speaking in the situation I will
describe, even though I will not usually resemble that person. Sometimes
you will indicate that an answer is unnecessary, or that you would be
unwilling to say anything at all. Other times, you might want to go on at
some length in expressing your views. Some people like to move around
during these exercises; please feel free to do so.
It is important for you to understand that we are in no way giving you
a "mark" on your responses. There are no right or wrong answers.
Sometimes more than one answer might be appropriate; if you think this is the
case, please feel free to offer the alternatives. Also, if you want to talk about
the situation after you respond or raise other topics you might feel to be
relevant, please do so. (pp. 81-82)



The respondents were university students living in the United States. They were
native speakers of Spanish and did these tasks in both Spanish and English.

ACTION

Write instructions for a role play to elicit data in a study that would interest
you, but imagine that your respondents are intermediate learners or
false beginners of the target language. Make sure the instructions are both
clear and positive in tone.

Role plays have been used in language assessment as well as in data collection,
but some researchers have voiced concerns about whether personality or acting
ability may influence the outcomes (see, e.g., van Lier, 1989). Others have noted
that students' abilities to play a role may be related to their experience. Bailey
(1998b) gives an example of a native speaker of English doing two different role
plays, one in English and one in Spanish, her second language. The speaker
reported that the English role play was much more difficult because she couldn't
imagine the situation, while the Spanish role play was fun and plausible.

Tests Used as Elicitation Procedures


In the instructions quoted above from Fraser et al. (1980), the authors are
careful to point out that the role play responses will not be marked or graded at
all. They do this because research subjects, like students, often feel that they
are being tested and get anxious about performing a task correctly. Indeed, tests
have often been used to elicit language samples from learners.
What do we mean by a test in this context? According to Wesche (1983), all
tests consist of four components:

1. the stimulus material (whether this is an essay prompt, a listening passage,
a reading text, etc.);
2. the task posed to the learner (the mental operations the learners must do);
3. the learner's response (e.g., choosing A, B, C, or D; writing an essay); and
4. the scoring criteria (whether they are objective or subjective).
Some familiar forms of tests are dictations, cloze passages, multiple-choice
items, matching items, essays, and so on. All of these can be used to elicit
samples of learner language, but the test chosen must be appropriate for the research
and for the age and proficiency level of the people involved.
There are many kinds of tests, and it is beyond the scope of this chapter to
review them all thoroughly. (Fortunately, there are many textbooks available to
introduce language teachers to testing procedures.) Here we will briefly discuss
two forms of tests that are commonly used in language classroom research for
one purpose or another.



First, oral proficiency interviews are often used to elicit speech samples from
learners. In this situation, it is not the learners' ideas per se that interest the
researchers. Rather, it is the language used to express those ideas. Some familiar
interview formats are the ILR (Interagency Language Roundtable) and the
ACTFL (American Council on the Teaching of Foreign Languages) Oral Proficiency
Interview procedures. These are both conducted by trained interviewers to elicit
speech samples from language learners (in a wide range of languages). The
speech samples are typically recorded and rated by trained raters (who are
usually the interviewers). The ILR system uses ratings of zero to five, with plus
factors (e.g., 0, 0+, 1, 1+, 2, 2+, and so on) that indicate that a speaker has
nearly reached the next level but cannot sustain performance there. The ACTFL
rating system uses category labels (novice low, novice mid, novice high,
intermediate low, intermediate mid, intermediate high, and so on) to characterize
students' oral proficiency.
There are also numerous standardized language tests, particularly of English
proficiency. A standardized test is one that is administered under uniform
conditions, no matter when or where it is given. The scores are also reported on a
standardized scale that does not vary from one administration and reporting
period to another. Common standardized tests in language teaching include
the IELTS, the TOEFL, the TOEIC, the SLEP, etc.
Tests are used in several ways in language classroom research. Sometimes
they constitute the dependent variable in a study. For example, in a methods
comparison, you may wish to see which group of beginning language learners
scored highest on a test after a particular course of study. If the learners are not
true beginners, a form of the test might be given before the course of study, so
you could use a pre-test post-test control groups design and calculate the
students' gains to see which group improved the most.
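The gain comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the group names and the pre-test and post-test scores are invented for the example, and a real study would follow this descriptive comparison with an appropriate inferential test.

```python
from statistics import mean

# Hypothetical (pre-test, post-test) score pairs for two teaching methods.
# These numbers are invented for illustration; they come from no real study.
scores = {
    "method_a": [(40, 62), (35, 60), (50, 65)],
    "method_b": [(42, 55), (38, 49), (47, 58)],
}


def mean_gain(pairs):
    """Return the average post-test minus pre-test difference for one group."""
    return mean(post - pre for pre, post in pairs)


# Mean gain per group, then the group that improved the most.
gains = {group: mean_gain(pairs) for group, pairs in scores.items()}
best_group = max(gains, key=gains.get)

for group, gain in gains.items():
    print(f"{group}: mean gain = {gain:.1f}")
print("Largest mean gain:", best_group)
```

Working with gains rather than raw post-test scores matters when the groups did not start at the same level, which is why the pre-test is given in the first place.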
In other situations, tests determine what groups of learners are involved in a
study. For instance, you might wish to compare learners who are very fluent,
moderately fluent, and not fluent speakers of the target language as they
undertake some curriculum in order to see if the curriculum works better for one
sort of learner than another. In this case, the test of fluency would not be the
dependent variable in the study; rather, it would be the instrument by which you
define the types of learners that comprise the levels of the independent variable
in a
criterion groups design. Test results can also be used to classify people into
different levels of the moderator variable.
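As a rough sketch of how test scores can define the levels of an independent variable, the following Python fragment sorts learners into three fluency groups. The cut-off scores, the 0-100 scale, and the learner identifiers are all assumptions made up for this illustration; an actual study would need to justify its own cut-offs.

```python
from bisect import bisect_right

# Hypothetical cut-off scores on a fluency test (assumed 0-100 scale).
CUTOFFS = [40, 70]
LEVELS = ["not fluent", "moderately fluent", "very fluent"]


def fluency_level(score):
    """Map a test score to one level of the independent variable."""
    # bisect_right counts how many cut-offs the score has reached or passed.
    return LEVELS[bisect_right(CUTOFFS, score)]


# Invented learner scores, used only to show the grouping step.
learners = {"L01": 35, "L02": 58, "L03": 82, "L04": 70}

groups = {level: [] for level in LEVELS}
for learner_id, score in learners.items():
    groups[fluency_level(score)].append(learner_id)

print(groups)
```

Note that a score exactly on a cut-off (70 here) is placed in the higher group; whichever convention is chosen should be stated in the research report.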
Tests have a long and important history as tools in research, including
language classroom research. However, there are many problems associated with
tests, and it is not sufficient to simply construct a quick-and-dirty measure of
student learning to use in a study. Nor should we select a commercially available
test if it is not appropriate for the study.
The traditional criteria for evaluating tests include reliability, validity,
practicality, and washback. Reliability is the idea that a test must be consistent
across administrations. This concern is especially important when ratings are
involved. Validity is largely a matter of whether a test is actually assessing
what it was designed to measure. Practicality is the question of how many
resources are used in developing, administering, and scoring a test in order to
get the needed information. And, finally, washback is the effect of a test on
teaching and learning. (Washback can be positive or negative, as well as
intentional or unintentional.)
To these traditional criteria, Bachman and Palmer (1996) have added two
more: authenticity and interactiveness. They define authenticity as "the degree of
correspondence of the characteristics of a given language test task to the features
of a target language use task" (p. 23). And interactiveness is defined as "the
extent and type of involvement of the test taker's individual characteristics in
accomplishing the task" (p. 25). Such characteristics include the students'
language ability and background knowledge.
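The reliability concern raised above for rated performances can be made concrete with a small Python sketch. It computes simple percent agreement and Cohen's kappa for two raters. Cohen's kappa is one common agreement index chosen here for illustration; it is not a procedure named in this chapter, and the ratings below are invented.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of samples that both raters placed at the same level."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented ratings of eight speech samples
# (NH = novice high; IL, IM, IH = intermediate low, mid, high).
rater_1 = ["NH", "IL", "IL", "IM", "IM", "IH", "NH", "IL"]
rater_2 = ["NH", "IL", "IM", "IM", "IM", "IH", "IL", "IL"]

print(f"Percent agreement: {percent_agreement(rater_1, rater_2):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(rater_1, rater_2):.2f}")
```

This version treats the rating labels as unordered categories, so a disagreement of one level counts the same as a disagreement of three. For an ordered scale, a weighted index may be more suitable.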

Picture Description Tasks


As noted above, researchers often attempt to overcome the shortcomings of
naturalistic observation by setting up situations that are designed to force
production of target language items. Some of the earliest studies of this type were
the so-called morpheme acquisition studies (e.g., Dulay and Burt, 1973). These
studies were designed to investigate the order in which certain grammatical
morphemes (such as the copula verb to be, third person -s, past tense markers, and
articles) were acquired by speakers of different first languages. The researchers
wanted to know whether first language speakers of Spanish, for instance, would
acquire the morphemes in an order that differed from that of first language
speakers of Chinese. The classroom-oriented dimension to the study came in
when researchers, having established that the acquisition orders were virtually
identical regardless of the learners' first language, tried to change the 'natural
order' of acquisition through a series of classroom interventions.
One common way of stimulating production was to ask the informant
questions about a series of pictures. The morpheme order studies used a test known
as the Bilingual Syntax Measure (BSM). The BSM consisted of a series of
simple, colorful, cartoon-like drawings. The informant was shown the pictures one
at a time and asked questions meant to elicit the target language items being
investigated. For example, a picture designed to elicit the -ing form of the verb
might show someone eating a meal while a little dog looks on hungrily. The BSM's
scoring system determined how advanced a speaker's syntactic development was,
based on the oral picture descriptions.

REFLECTION

Can you see any problems with this kind of picture elicitation device?

One challenge in using picture description tasks to elicit data is to come up
with a prompt (what the researcher asks the informant in order to elicit a language
sample) that does not include the target structure. For example, if the researcher
asks, "What's the boy doing?", the -ing form of the verb in the question may act
as a cue to the informant, which would have implications for the internal validity
of the study. The researcher would have to say something like, "Tell me about the
boy," instead.
Another problem is that the nonappearance of a target item does not
necessarily mean that the informant has not acquired that item. In the example just
cited, in response to the question, "Tell me about the boy," the informant might
say, "He is happy." We cannot infer from this response that the informant has
not acquired the -ing form of the verb simply because he or she has not used it.
The researcher would need to pose a follow-up question, such as, "Why do you
think he is happy?"
Another problem is that the language that is stimulated by the elicitation
instrument might be an artifact of the instrument itself. This was a criticism
leveled at the morpheme acquisition studies that used the BSM. Apparently, the
present progressive (verb + -ing) form appeared with great regularity, partly
because the BSM cartoons showed people doing things. As a result, the picture
description task itself was partly responsible for the high number of present
progressive instances in the data.

Using Tasks to Investigate Negotiation of Meaning


A considerable body of research has used production tasks to investigate
interactional modifications, otherwise known as the negotiation of meaning. Such
modifications include comprehension checks, clarification requests, incorporations,
and so on. (If you turn back to Figure 2.3, you can find operational definitions and
examples of these concepts as they were used in Jepson's [2005] study comparing
language learners' interactions in voiced chats and text chats.) Such interactional
modifications have been hypothesized to facilitate second language acquisition
because they are triggered by signals of incomprehension from the learner's
interlocutor that force the learner to restructure his or her utterance.
Drawing on the work of Long (1985), Nunan (2004) explains the relationship
between interactional modifications and acquisition in the following way.
Long argues that linguistic conversational adjustments . . . promote
comprehensible input because such adjustments are usually triggered by
an indication of non-comprehension, requiring the speaker to reformulate
his or her utterance to make it more comprehensible. If comprehensible
input promotes acquisition, then it follows that linguistic/conversational
adjustments promote acquisition. (p. 80)

REFLECTION

Which is the trigger and which the modified utterance in the following
piece of interaction (Martyn, 2001, p. 33)?
A: She's a loner.
B: Sorry?
A: She stay away from others.



Much of the research stimulated by theoretical claims about relationships
between the negotiation of meaning and acquisition sought to identify the
characteristics of production tasks that maximize opportunities for students to
negotiate meaning. In his own research, Long found that two-way tasks
generated more interactional modification than did one-way tasks. (In a two-way task,
all the students involved have unique information that has to be shared for the
task to be successfully completed. In a one-way task, one participant holds all
of the information that must be shared.) Similarly, Doughty and Pica (1986)
found that there was more interactional modification when the information was
required by the task rather than when it was optional. (Also see the studies by
Pica and Doughty, 1985a; 1985b.)
Early studies were carried out in non-classroom 'laboratory' settings, and
are therefore what we have called classroom-oriented rather than classroom-
based research. Martyn (1996; 2001), however, conducted her research in intact
classrooms.

REFLECTION

Why do you think that Martyn chose to carry out her investigations in
intact classrooms?

Martyn used five production tasks: (1) jigsaw, (2) information exchange,
(3) problem solving, (4) decision making, and (5) opinion exchange. From her
literature review, she also isolated the following four cognitive demand features
of tasks:

1. contextual support: whether embedded, reduced, or remote within the task
(e.g., in the form of visuals)
2. reasoning demand: whether high or low
3. degree of task structure: whether high or low
4. availability of knowledge schema: provided or assumed through prior
knowledge

Martyn then mapped these cognitive demand features onto the five production
task types. This procedure resulted in the following matrix (see Table 11.2),
which she used to investigate interactional modifications. Martyn found that
tasks with the highest cognitive demand, such as the opinion exchange task,
generated the most interactional modifications, while jigsaw tasks, with relatively
low cognitive demand, generated the fewest modifications.
Martyn's study is valuable, not only because she worked in actual classrooms,
but also because the kinds of tasks she chose to investigate are those which
language teachers often use.



TABLE 11.2 Cognitive demand features used in Martyn's study of
interactional modifications (from Nunan, 2004, p. 89,
adapted from Martyn, 2001).

Task Type              Contextual Support   Reasoning Required   Degree of Task Structure   Available Knowledge
Jigsaw                 embedded             not required         high                       given
Information Exchange   embedded             not required         high                       given
  (for one learner)
Problem Solving        some embedded        required             varies                     given
Decision Making        context-reduced      required             low                        given or available
Opinion Exchange       remote               required             low                        variable/not required

A SAMPLE STUDY

For the sample study in this chapter, we decided to focus on some research by
Snow, Hyland, Kamhi-Stein, and Yu (1996). They investigated the ideas of
language minority junior high school students in Los Angeles, using oral interviews
in both Spanish and English. Their research questions were the following:

1. What instructional practices do language minority students view as
effective?
2. What is the role that they see themselves playing in the transmission of
school values?
3. How can the school environment help language minority students to
socialize into the new school culture?
4. In what ways can language minority students help their peers become
successful in the new school system? (p. 306)

To address these questions, Snow et al. (1996) interviewed sixty-six students
drawn from six different junior high schools about "student role efficacy—how
they viewed their roles as students" (p. 307). Individual students were
interviewed and the interviews were tape-recorded. Here is how the authors
described the interviews:

Each interview commenced with a warm-up activity in which the
students were asked to fill in the captions of two cartoons. The interview
consisted of two different activities. The first was a card sort activity,
conducted in English, in which the students were asked to create a
"recipe" for the ideal class by selecting among a set of "ingredients."
The following pairs of descriptors were printed on the opposite sides of
index cards.

1. A class where I write journals in English or in Spanish.
A class where I do not write in journals.
2. A class where I can speak English if I want to.
A class where only English is spoken.
3. A class where the teacher is the center of attention.
A class where I participate a lot.
4. A class where the teacher uses cooperative learning.
A class where I work by myself.
5. A class where I learn only from the teacher.
A class where I learn from my classmates.
6. A class where I have to take notes.
A class where I am not expected to take notes.
7. A class where I help my classmates edit what they write before they
write a final version of an assignment.
A class where I am the only person who reads what I write before
having my teacher read my assignment.

As students chose between the two descriptors on each card, they
explained why they had selected these ingredients. By having students
discuss their choices, we could see whether or not they understood the
instructional technique in question, and we could gain more insight into
the students' needs and preferences. Students then added two
ingredients of their own, again explaining why they added these items while the
interviewer wrote the student-generated items on new cards. In the final
step, the students arranged all the cards (including their own two
ingredients) in order, from the most to the least important ingredient for the
ideal class. (pp. 307-308)

Thus, the forced-choice activity provided comparable data across all the students
while the original ideas they contributed were open-ended and creative. The
ranking of both the selected and the constructed "ingredients" provided the
researchers with information they could not have gotten from either the provided
categories or the students' own ideas alone.
The second part of the interview was conducted in Spanish. This was a
problem-posing task involving a role play in which students were asked to
explain how they would orient a new student in their school. The students were
supposed to say how they would tell the new arrival what he or she would need
to do to be successful. The instructions were given in Spanish and are reprinted
below, along with the English translation:

En tu clase hay un alumno nuevo que habla muy poco inglés. Tú eres el
consejero de ese alumno y debes ayudarlo. ¿Qué es lo que ese alumno tiene
que hacer para convertirse en un buen alumno? ¿Cómo tiene que estudiar
para un examen? Recuerda que tú debes aconsejar a tu nuevo compañero.
¡Su éxito depende de ti!

[In your class there is a new student who speaks very little English. You are
his advisor and you have to help him. What does the student have to do to
become a good student? How does the student have to prepare for an
exam? Remember that you have to help your new classmate. His success
depends on you!] (Snow et al., 1996, p. 308)

Snow et al. (1996) provide the following commentary about this data
elicitation procedure:
The students, in general, responded readily to the task. They quickly
assumed the role of the experienced student, offering advice to the
newcomer. In some cases, the interviewers had to repeat parts of the
question and prompt the students to respond to all parts of the situation.
Five of the sixty-six students said that they could not perform the task in
Spanish. They responded in English; however, their responses were not
included in the analysis, since one objective of the task was to see if the
students could communicate their meta-notions of student role efficacy
in Spanish. (p. 308)
In addition to the qualitative analysis, the students' preferences in the card
sort activity were analyzed quantitatively using a statistical procedure called
chi-square analysis (see Chapter 13). Statistically significant preferences for the
students' views of the ideal class emerged for the following descriptors in the
card sort activity:
A class where the teacher uses cooperative learning.
A class where I write journals in English or Spanish.
A class where I participate a lot.
A class where I am expected to take notes.
A class where I help my classmates edit what they write before they write
the final version of the assignment.
A class where I learn from my classmates. (pp. 308-309)
There were no statistically significant differences in the students' choices for the
paired statements "A class where I can speak English if I want to," and "A class
where only English is spoken." Their views about speaking English in class were
varied:

Some students indicated that they could learn English better if they had
to use it. Others said that it would be rude or unfair to speak Spanish
since some of their classmates did not understand Spanish. Some also
said that it would be impolite to speak Spanish in a class where the
teacher did not know what they were saying. One student indicated that
he would speak Spanish only if he wanted to say something he did not
want the teacher to hear. (p. 309)
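To show what a chi-square analysis of a forced-choice card might look like, here is a minimal Python sketch. It assumes a goodness-of-fit test against an even 50/50 split, which is only one plausible reading of the analysis; the source does not specify the exact procedure. The counts and card labels are invented and are NOT the figures reported by Snow et al. (1996).

```python
# Critical value of chi-square for 1 degree of freedom at the .05 level.
CRITICAL_VALUE_DF1 = 3.841


def chi_square_two_choices(count_a, count_b):
    """Goodness-of-fit chi-square for a forced choice, expecting a 50/50 split."""
    expected = (count_a + count_b) / 2
    return sum((observed - expected) ** 2 / expected for observed in (count_a, count_b))


# Invented counts of students choosing each side of two hypothetical cards.
cards = {
    "card_a": (48, 18),
    "card_b": (35, 31),
}

for card, (count_a, count_b) in cards.items():
    statistic = chi_square_two_choices(count_a, count_b)
    verdict = "significant" if statistic > CRITICAL_VALUE_DF1 else "not significant"
    print(f"{card}: chi-square = {statistic:.2f} ({verdict} at .05)")
```

With a lopsided split such as 48 to 18, the statistic exceeds the critical value, so the preference is unlikely to be due to chance. With a near-even split such as 35 to 31 it does not, which parallels the non-significant result for the paired statements about speaking English in class.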

The students' views about what constitutes an ideal class were also diverse.
The greatest number of student-generated responses had to do with the
teacher's role. Students offered the following ideas about their ideal class:
A class where the teacher could help you more often -- the teacher can give
you more quality time and work with you until you get it.
A class where the teacher can be your friend.
A class where the teacher is nice.
A class where teachers have positive attitudes.
A class where the teacher is patient.
A class where the teacher cares -- where the teacher is on your back
whenever you fool around.
A class where the teachers are more open and more fun.
A class where the teacher makes the class fun.
A class where the teacher helps students when they need help.
A class where I can get extra help on something I may find difficult.
A class where the shy people are encouraged to participate.
A class where everyone is encouraged to try.
A class where teachers explain the assignments well.
A class where the teacher is clear about assignments and deadlines.
A class where the teacher doesn't give too much work but gives it correctly
with enough information.
A class where the teachers don't have to refer to the textbook so much.
A disciplined class where the teacher teaches well and understands.
A class where things work well; teachers help students improve their
education.
A class where the teacher assigns a lot of work.
A class where teachers show you tests ahead of time to help you get a good
grade.
A class where the teacher asks questions before a test so that everybody is
forced to study.
A class where the teacher gives homework three times a week.
A class where teachers want you to help them when they make errors.
(pp. 309-310)

In their conclusion, these authors comment about their data elicitation
procedures. They say the interviews reveal the students' "insights into effective
instruction and their perceptions of strategies for successful academic behavior"
(p. 316). The problem-posing role play task about helping the new student showed
the researchers that the students were "developing metacognitive awareness of
appropriate learning strategies ... which contribute to academic success" (ibid.).

PAYOFFS AND PITFALLS

This chapter has focused on a range of elicitation procedures that have been used
in classroom-based and classroom-oriented research. As always, there are both
advantages and disadvantages in using these procedures.

REFLECTION

Brainstorm and list the possible payoffs that researchers might expect from
using one or more of the elicitation devices discussed in this chapter.

The data collection techniques we have grouped together under the rubric
of 'elicitation' have some obvious advantages. Because the techniques are so
diverse, they can result in data that are incredibly rich, as Dowsett (1986), among
others, has pointed out. Most can also be used in combination. For example,
Benson and Nunan (2005) used both questionnaires and interviews in their
investigations into language learning histories. This mixing and matching helps in
methods triangulation. (See Chapter 7 for a detailed discussion of triangulation.)
Another advantage of eliciting data, and one that has already been touched
on, is that elicitation can be a great time-saver, providing the researcher with
large amounts of data in a much shorter time than would be required to collect
such data through naturalistic observation. In fact, desired data may never be
forthcoming if we simply sit and wait for it.
A third advantage of elicitation (which we will see when we discuss pitfalls
below) is that elicitation enables the researcher to collect data that could simply
not be obtained in any other way. For instance, the classroom researcher who
wants to obtain insights into why the teacher made certain spontaneous
decisions to depart from his or her lesson plan while the class was in progress could
make certain inferences by sitting in on the teacher's class, or by viewing a video,
but will never really know for sure without interviewing the teacher.



REFLECTION

Brainstorm a list of the possible pitfalls that might beset research using one
or more of the elicitation devices discussed in this chapter.

There are, of course, also pitfalls involved in using the various elicitation
devices described in this chapter. Use of elicitation devices rather than naturalistic
observation has been criticized on a number of grounds. In the first place, the
researcher determines in advance what is to be investigated. There are at least two
possible threats to the validity of such investigations:
The first is that by determining in advance what is going to be
considered relevant, other potentially relevant phenomena might be
overlooked. The other danger, and one which needs to be considered
when evaluating research utilizing such [elicitation] devices, is the
extent to which the results obtained are an artifact of the elicitation
devices employed (see, e.g., Nunan [1987] for a discussion on the dangers
of deriving implications for second language acquisition from
standardized test data). One needs to be particularly cautious in making claims
about acquisition orders based on elicited data, as Ellis (1985) has
pointed out. [In at least one study] it seems clear that the so-called order
of acquisition is the creation of the elicitation device and the statistical
procedures used to analyze the data. (Nunan, 1992, pp. 138-139)
Regardless of these kinds of problems, however, elicitation procedures provide
effective ways of gathering data that might otherwise be unobtainable.

CONCLUSION

This chapter has introduced a wide range of elicitation procedures, including
five types of interviews. We also briefly revisited questionnaires and talked about
how questionnaire data and interview data (or other kinds of data) can be
combined in a two-phase design. We considered several production tasks, that is,
activities designed to get the participants in a study to provide language samples,
their viewpoints, their language learning histories, and so on. The sample study
summarized here used interviews with a card sort activity and a role play task to
elicit ideas from junior high school language minority students.

QUESTIONS AND TASKS

1. Think of a research question that interests you in which the use of
discourse completion tasks would be an appropriate form of data collection.
Write three to five discourse completion tasks designed to elicit the target
language structures, lexical items, or speech acts you wish to investigate.
Pilot them with two or more native speakers or highly proficient non-native
speakers of the target language to see if the tasks actually elicit the forms
you wish to investigate.
2. Think of a research question that interests you in which the use of a picture
description task would be an appropriate form of data collection. Find
three to five pictures that you could use. Write the instructions carefully,
either in the native language of your intended subjects or in a clear and
simple form of the target language. Try out the picture description task
with two or more native speakers or highly proficient non-native speakers
of the target language to see if the tasks actually elicit the forms you wish
to investigate.
3. What do you think about using role play to elicit speech samples from
language learners? Try writing a role play prompt that would elicit some data
you would like to collect. Have one or more language learners try to do the
role play you create.
4. Read a study that uses interview procedures to elicit data from informants.
What type of interview was used? Were the researchers seeking language
samples, the learners' ideas and views, information about their history, or
some combination of these issues?
5. Think of a classroom-based research project in which you could first
sample widely (e.g., through questionnaires) and then collect more detailed
data from a subgroup of your sample using the raised design concept.
A. What would be the research question(s) you'd want to address?
B. How would you collect data from the larger original sample?
C. How would you collect data from the small subset of the sample (e.g.,
interviews, classroom observations)?
D. By what criteria would you select participants from the larger original
sample to be re-sampled into the smaller group?

SUGGESTIONS FOR FURTHER READING

Elicited imitation is a research procedure which has been used widely in second
language acquisition studies. For an interesting treatment of this topic, see
Bley-Vroman and Chaudron (1994).
The volume edited by Perecman and Curran (2006), while not directed to
language educators, offers a great deal of practical wisdom and personal advice on
techniques and procedures for using elicitation techniques in field research.
The naturalistic inquiry tradition makes use of interviews extensively to get
informants' perspectives. See, e.g., Flick (1998), Fetterman (1989), and Mason
(1996). The classic treatment of the ethnographic interview is by Spradley
(1979).

If you do not have a background in language assessment but would like
to read some user-friendly books for teachers on this topic, we recommend
Bachman and Palmer (1996), Bailey (1998b), H.D. Brown (2004), J.D. Brown
(2005), and Hughes (1989). If you work with primary and secondary school
children, a good resource about language assessment is Law and Eckes (1995),
Assessment and ESL: A Handbook for K-12 Teachers.
Data Elicitation for Second and Foreign Language Research by Gass and Mackey
(2007) is a fine resource about the elicitation procedures discussed here, as well
as many others. It includes a chapter on classroom-based research.



Data Analysis and
Interpretation Issues
Figuring Out What the Information Means

In this final section, we do two things. First, we revisit and extend into
the realm of data analysis several discussions that were initiated earlier in
the book. Second, we draw together the themes that have emerged in the
course of the book thus far. Because this volume contextualized the research
process in terms of the classroom, the initial chapter in this section takes a
somewhat detailed look at the analysis of classroom interaction. The chapters
that follow deal, respectively, with methods for quantitative and qualitative data
analysis. The section ends with the final chapter in the book, which pulls
together themes and issues and revisits practical suggestions for getting started on
designing and conducting your own studies as well as ways to publish them.

Chapter 12: Analyzing Classroom Interaction


By the end of this chapter, readers will
• be able to interpret transcripts of language classroom interaction;
• analyze transcripts of classroom interaction both quantitatively and
qualitatively;
• demonstrate familiarity with a range of approaches to analyzing classroom
interaction;
• state the main substantive issues that have been investigated in terms of
teacher talk and learner talk;
• discuss the advantages and disadvantages of this approach to analyzing
classroom data.

Chapter 13: Quantitative Data Analysis
By the end of this chapter, readers will
• be able to explain the measures of central tendency and measures of
dispersion;
• understand the concept of degrees of freedom;
• understand the concept of statistical significance;
• know how to calculate and interpret the chi-square test;
• know how to interpret correlation coefficients, t-tests, and analysis of
variance;
• identify several concerns about statistical analyses.

Chapter 14: Qualitative Data Analysis


By the end of this chapter, readers will
• describe qualitative data and give examples;
• identify different sources of data and suggest a range of methods for
analyzing those data;
• discuss techniques for identifying patterns in qualitative data;
• describe the process of meaning condensation;
• define a grounded approach to data analysis;
• discuss technology-supported tools and techniques for qualitative data
analysis.

Chapter 15: Putting it All Together


By the end of this chapter, readers will
• be able to articulate the steps in the research process;
• be able to combine quantitative and qualitative approaches to data
collection and analysis;
• create an original research plan based on the procedures described here;
• discuss ethical concerns in conducting language classroom research;
• evaluate research plans and prepare an effective research proposal;
• critique completed studies;
• be familiar with procedures for submitting abstracts for conferences and
manuscripts for publication.



CHAPTER

12

Analyzing Classroom Interaction

"The classroom is the crucible." (Gaies, 1980)

INTRODUCTION AND OVERVIEW


In this chapter, we examine ways of analyzing classroom interaction. In the first
section of the chapter, we look at the nature of classroom discourse. We then
consider transcribing and coding classroom interaction, followed by a section on
analyzing learner language. Following a brief introduction to conversational
analysis, we review some of the substantive issues that classroom researchers
have investigated, including teacher talk and student-student interaction. The
chapter ends with the usual discussion of quality control issues and a sample
study.
Influenced by first-language classroom research, the earliest studies of
language classroom interaction involved using live coding as the interaction
occurred. These studies were said to employ "real-time coding" (as opposed to
coding from audio- or video-recordings, which could be stopped and replayed as
many times as needed). Real-time coding could produce the sorts of SCORE data
we saw in Chapter 9 or yield tallies of specific behaviors or utterance types.
For instance, Moskowitz (1971) adapted Flanders' (1970) interaction analysis
system from first-language education research. She added categories to
accommodate the use of the target language and the learners' native language to
study effective teaching in foreign language classes. The resulting instrument,
called FLint (for "foreign language interaction"), was used largely for teacher
education purposes (see, e.g., Moskowitz, 1968; 1971). Moskowitz (1968) found
that "teachers felt studying observational systems had influenced them to make
numerous desirable changes in their teaching" (p. 218).

In another study that involved FLint for real-time coding, trained observers
watched eleven foreign language teachers who had been identified as outstanding
by their former students, as well as eleven typical foreign language teachers.
The observers did not know that some teachers had been identified as outstanding
and others as typical. Each teacher was observed teaching four different
lessons: (1) grammar, (2) reading skills, (3) some sort of new material, and (4) a
lesson based entirely on the teacher's choice.
When Moskowitz (1976) compared the results, she found eighty-five statistically
significant differences in the coded behavior of the outstanding and typical
teachers. Several contrasts were observed in three out of the four lessons coded.
These included the fact that the outstanding teachers and their students used more
of the target language than did the typical teachers and their students. There was
also less off-task talk in the lessons taught by the outstanding foreign language
teachers. There were more personalized questions as well as more praise and
joking in the outstanding teachers' classes. In addition, the coding of nonverbal
behaviors showed that the teachers identified as outstanding walked around more
and looked at most of their students more often than did the typical teachers.
These findings were intriguing and the research methods Moskowitz used
were appropriate at the time. But while these sorts of investigations were
informative, they were problematic in the sense that the results of real-time coding
could never be checked against the original classroom interaction data, nor could
the actual utterances be analyzed. And unless there were two observers present
during the lessons, it was not possible to compute inter-coder agreement.
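When two observers have coded the same stretch of interaction independently, their agreement can be expressed as a number. The following is a minimal Python sketch rather than a procedure taken from Moskowitz's studies. It computes simple percent agreement and Cohen's kappa, a common chance-corrected index that we have chosen here purely for illustration. The category labels and both lists of codes are invented.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Proportion of coded units on which the two coders chose the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: product of each coder's marginal proportions per category.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two observers to the same six teacher turns.
observer_1 = ["praise", "question", "question", "off-task", "praise", "question"]
observer_2 = ["praise", "question", "praise", "off-task", "praise", "question"]

print(round(percent_agreement(observer_1, observer_2), 2))
print(round(cohens_kappa(observer_1, observer_2), 2))
```

In this invented example the observers agree on five of the six turns. Kappa comes out lower than the raw proportion because some of that agreement would be expected by chance.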
Recently, with the advent of accessible, transportable, and affordable recording
devices, as well as advances in discourse analysis, we have focused more on
the analysis of tape- and video-recorded data. Such electronic recordings have
many advantages over real-time coding since they can be replayed as often as
necessary for transcription and/or coding. As a result of advances in both
technology and research methodology in the past two decades, we have come to
understand a great deal about classroom discourse and how it shapes students'
opportunities to learn.

CLASSROOM DISCOURSE

Classroom discourse is the distinctive type of interaction that occurs between
teachers and students, and also among students during lessons. It has a privileged
place in the discourse analysis literature because it was one of the first genres to
be exhaustively researched by linguists. You will recall from Chapter 9 that
Sinclair and Coulthard (1975) found a consistently recurring pattern of teacher
initiation, student response, and teacher feedback evaluating the response (the
so-called IRF pattern) in classroom discourse. (In some reports, you will see this
pattern referred to as IRE for initiation, response, and evaluation [Johnson,
1995], or QAC for question, answer, and comment [Markee, 2005].) It is our
familiarity with the IRF pattern as a characteristic of classroom discourse that
lets us recognize Extract 2 below as the teacher-student exchange.



ACTION

Compare the following extracts. It is not difficult to identify which is an
example of classroom discourse.
Extract 1:

A: What's the last day of the month?


B: Friday.
A: Friday. We'll invoice you on Friday.
B: That would be great.
A: And fax it over to you.
B: Er, well, I'll come and get it.
A: Okay. (McCarthy and Walsh, 2003, p. 176)
Extract 2:

A: What's the last day of the month?


B: Friday.
A: Friday. Very good.
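
Once the turns in a transcript have been labeled by hand, looking for the IRF pattern becomes a mechanical search. The sketch below is only an illustration under stated assumptions: the move labels ("I", "R", "F") are our own hypothetical coding of the first three turns of Extracts 1 and 2, and the function name is invented. It reports where one speaker initiates, a different speaker responds, and the first speaker then gives feedback.

```python
def find_irf_exchanges(turns):
    """Return the index of the first turn of every I-R-F triple.

    Each turn is a (speaker, move, utterance) tuple. A triple counts only
    when the same speaker initiates and gives feedback, and a different
    speaker responds.
    """
    hits = []
    for i in range(len(turns) - 2):
        (s1, m1, _), (s2, m2, _), (s3, m3, _) = turns[i:i + 3]
        if (m1, m2, m3) == ("I", "R", "F") and s1 == s3 and s1 != s2:
            hits.append(i)
    return hits


# Hypothetical hand coding of the opening turns of each extract.
extract_1 = [
    ("A", "I", "What's the last day of the month?"),
    ("B", "R", "Friday."),
    ("A", "I", "Friday. We'll invoice you on Friday."),  # new business, not an evaluation
]
extract_2 = [
    ("A", "I", "What's the last day of the month?"),
    ("B", "R", "Friday."),
    ("A", "F", "Friday. Very good."),  # evaluates B's answer
]

print(find_irf_exchanges(extract_1))
print(find_irf_exchanges(extract_2))
```

The search itself is trivial. The analytic work lies in deciding how each move should be labeled, which is why the third turn of each extract is the one that matters.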

Other classroom researchers have been influenced by Sinclair and Coulthard's
work. For instance, Bowers (1980) expanded on their categories and developed
a seven-category system for analyzing classroom discourse. It was given in
Chapter 9 as Table 9.3 but is reprinted here as Table 12.1:

TABLE 12.1 Bowers' (1980) categories for analyzing classroom interaction

Category      Description
Responding    Any act directly sought by the utterance of another speaker, such as answering a question.
Sociating     Any act not contributing directly to the teaching/learning task, but rather to the establishment or maintenance of interpersonal relationships.
Organizing    Any act that serves to structure the learning task or environment without contributing to the teaching/learning task itself.
Directing     Any act encouraging nonverbal activity as an integral part of the teaching/learning process.
Presenting    Any act presenting information of direct relevance to the learning task.
Evaluating    Any act that rates another verbal act positively or negatively.
Eliciting     Any act designed to produce a verbal response from another person.



ACTION

Analyze these two extracts using Bowers' categories. What patterns or
regularities can you see?

Extract 3:
T: (Holding up a picture.) What's the name of this? What's the name?
Not in Chinese.
S: Van. Van.

T: Van. What's in the back of the van?


Ss: Milk, milk.
T: A milk van.
S: Milk van.
T: What's this man?

S: Drier?
T: The driver.

S: The driver.
T: The milkman.
S: Millman.
T: Milkman.

Ss: Milkman.
T: Where are they?
Ss: Where are they?
T: Where are they? Inside? Outside?
S: Department.
T: Department?
S: Department store.
T: Mmm. Supermarket. (Nunan, 1988, pp. 84-85)

Extract 4:

T: The questions will be on different subjects, so, er, well, one will be
about, er, well, some of the questions will be about politics, and some
of them will be about, er . . . what?
S: History.
T: History. Yes, politics and history, and, um, and . . . ?
S: Grammar.

T: Grammar's good, yes, . . but the grammar questions were too easy.

Ss: No!
S: Yes, ha, like before.
S: You can use . . . (inaudible)
T: Why? The hardest grammar questions I could think up—the hardest
one, I wasn't even sure about the answer, and you got it.
S: Yes.

T: Really. I'm going to have to go to a professor and ask him to make
questions for this class. Grammar questions that Azzam can't answer.
[laughter] Anyway, that's um, Thursday . . . yeah, Thursday. Ah, but
today we're going to do something different...
Ss: Yes.

T: ... today, er, we're going to do something where we, er, listen to a
conversation—er, in fact, we're not going to listen to one conversa
tion. How many conversations're we going to listen to?
S: Three. (Nunan, 1989, pp. 41-42)

Thus, Bowers added the categories of sociating, organizing, directing,
and presenting to Sinclair and Coulthard's work. His other categories roughly
paralleled theirs.
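
One way to look for patterns or regularities after coding a transcript with Bowers' categories is to tally how often each speaker uses each category. The following Python sketch assumes the coding has already been done by hand. The six coded acts are our own tentative reading of the opening of Extract 3, offered as one possible coding and not as the correct answer, and the function name is hypothetical.

```python
from collections import Counter

# The seven categories from Table 12.1.
BOWERS_CATEGORIES = {
    "responding", "sociating", "organizing", "directing",
    "presenting", "evaluating", "eliciting",
}


def tally_by_speaker(coded_acts):
    """Count how often each speaker produces each category."""
    tallies = {}
    for speaker, category in coded_acts:
        if category not in BOWERS_CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        tallies.setdefault(speaker, Counter())[category] += 1
    return tallies


# Tentative, illustrative coding of the first few acts in Extract 3.
coded_acts = [
    ("T", "eliciting"),    # What's the name of this?
    ("S", "responding"),   # Van. Van.
    ("T", "evaluating"),   # Van.
    ("T", "eliciting"),    # What's in the back of the van?
    ("S", "responding"),   # Milk, milk.
    ("T", "evaluating"),   # A milk van.
]

for speaker, counts in tally_by_speaker(coded_acts).items():
    print(speaker, dict(counts))
```

Even on this small sample the tallies make one regularity visible: on this coding, the teacher does all of the eliciting and evaluating, while the students only respond.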
Drawing on Sinclair and Coulthard's (1975) exchange structure analysis (see
Chapter 9), McCarthy and Walsh (2003) developed their own approach to
analyzing classroom discourse. These authors say there are three key questions for
researchers interested in studying classroom discourse:
1. What is the relationship between the speakers and how is this reflected in
their language?
2. What are the goals of the communication (e.g., to tell a story, to teach
something, to buy something)?
3. How do speakers manage topics and signal to one another their
perception of the way the interaction is developing? How do they open and
close conversations? How do they make sure they get a turn to speak?
(p. 174)

According to McCarthy and Walsh, all these questions are relevant to language
teaching. From their research, they have identified four basic discourse
patterns, or modes as they call them, in classroom interaction. These are
managerial mode, materials mode, skills and systems mode, and classroom context
mode.
Managerial mode occurs when the teacher is setting up a lesson or lesson
phase, or transitioning from one phase to another. Not surprisingly, it occurs
most often at the beginning of a lesson. In materials mode, the discourse is driven
by and flows from the materials being used. In skills and systems mode, the focus is
on either one or more of the four skills (listening, speaking, reading, or writing)
or on one of the three systems of language (phonology, lexis, or grammar).
Finally, classroom context mode involves more conversational, real-world discourse
opportunities.

REFLECTION

Look at the two extracts below and decide if they represent managerial,
materials, skills and systems, or classroom context mode. In many transcripts,
XXX stands for unintelligible speech, but here it means that the teacher
is reading out a blank-filling exercise, so the XXXs represent the blank
spaces that the students have to fill in the text. The equal signs (=) indicate
latching, when one turn follows another without a pause, and brackets ([ ])
indicate overlapping turns.
Extract 5:

(The teacher is doing a blank-filling exercise with the students.)


Teacher: ok . . . now . . . see if you can find the words that are suitable
in these phrases (reading) in the World Cup final of 1994
Brazil XXX Italy 2 3 2 and in a XXX shoot-out... what words
would you put in there?
Student A: beat
Teacher: what beat Italy 3 2 yeah in?
Student A: in a penalty shoot-out
Teacher: a what?
Student A: in a penalty shoot-out
Teacher: in a penalty shoot-out, very good, in a penalty shoot-out . . .
(reading) after 90 minutes THE?
Students: the goals goals goals (mispronounced)
Teacher: the match was . . . what?
Student B: match
Students: nil nil

Teacher: nil nil (reading) and it remained the same after 30 minutes OF?
Student C: extra time

Teacher: extra time, very good, Emerson. (McCarthy and Walsh, 2003,
p. 180)

Extract 6:
Teacher: he went to what do we call these things the shoes with wheels=
Student 1: = ah skates =

Student 2: = roller skates=
Teacher: =ROLler skates roller skates so [he went]
Student 1: [he went] to=
Student 3: =roller SKATing=
Teacher: =SKATing=
Student 1: =he went to=
Teacher: =not to just he went [roller skating he went roller skating]
Student 1: [roller skating he went roller skating] = (ibid., p. 181)

Extract 5 is an example of materials mode. Here the learners are completing
a blank-filling exercise on sports vocabulary and the teacher directs the students'
contributions; the talk is almost entirely determined by the materials. The
sequence is classic IRF, the most economical way to manage the interaction,
where each turn by the teacher is an evaluation of a learner's contribution and an
initiation of another exchange. The discourse evolves from the material, which
determines turn taking and topic (McCarthy and Walsh, 2003).
In Extract 6, an example of skills and systems mode, the teacher's goal is to
get the learners to use irregular simple past forms (e.g., went). In this mode, the
focus is the language system or a skill. Learning is typically achieved through
controlled turn taking and topic selection, which are usually determined by the
teacher. Learners respond to teacher prompts in an endeavor to produce
accurate utterances (McCarthy and Walsh, 2003).

REFLECTION

Let's try categorizing two more extracts. Which modes are exemplified by
Extracts 7 and 8?

Extract 7:
Teacher: Ok, we're going to look today at ways to improve your writing
and at ways which can be more effective for you and if you
look at the writing which I gave you back you will see that I've
marked any little mistakes and eh I've also marked places
where I think the writing is good and I haven't corrected your
mistakes because the best way in writing is for you to correct
your mistakes so what I have done I have put little circles and
inside the circles there is something which tells you what kind
of mistake it is so Miguel would you like to tell me one of the
mistakes that you made? (McCarthy and Walsh, 2003, p. 179)

Extract 8:

Student 1: =ahh nah the one thing that happens when a person dies my
mother used to work with old people and when they died . . .
the last thing that went out was the hearing about this person=
Teacher: =aha
Student 1: so I mean even if you are unconscious or on drugs or something
I mean it's probably still perhaps can hear what's happened
Student 2: but it gets =
Students: but it gets/there are=
Student 1: =I mean you have seen so many operation and so you can
imagine and when you are hearing the sounds of what happens
I think you can get a pretty clear picture of what's really
going on there =
Student 3: =yeah= (ibid., pp. 181-182)

In Extract 7, an example of managerial mode, we see an extended teacher turn
and no learner turns. Here the focus is on the business side of the lesson (how
mistakes are marked on the students' papers). There is repetition and the teacher hands
over the exchange to a learner at the end of the monologue. The teacher uses the
discourse markers okay and so as signals that help the learners to follow the talk.
Extract 8 illustrates classroom context mode, in which opportunities for
genuine, real-world-type discourse are frequent and the teacher plays a less
prominent role, allowing learners all the space they need. The teacher listens
and supports the interaction, which often takes on the appearance of a casual
conversation outside the classroom. In this bit of data, the teacher was working
with six advanced adult learners. The teacher's aim was to generate discussion
before doing a cloze exercise on the subject of poltergeists. Here the turn taking
is almost entirely managed by the learners, with competition for the floor and
turn gaining, holding, and passing, all typical features of natural conversation.
Topic shifts are also managed by the learners, with the teacher responding as an
equal partner. Teacher feedback shifts from form-focused to content-focused
and error treatment is minimal. This transcript seems to show genuine
communication rather than a display or test of knowledge.
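
The contrast between the extended teacher turn in Extract 7 and the learner-managed talk in Extract 8 can also be examined quantitatively by counting turns and words per speaker. The sketch below is a rough illustration only. It assumes each line of the transcript has the form "Speaker: utterance", it treats whitespace-separated strings as words, and the sample lines are an abridged version of Extract 8 with the latching marks removed. The function name is our own.

```python
from collections import defaultdict


def talk_shares(lines):
    """Return {speaker: (turns, words, share_of_all_words)} for a transcript."""
    turns = defaultdict(int)
    words = defaultdict(int)
    for line in lines:
        speaker, sep, utterance = line.partition(":")
        if not sep:
            continue  # skip lines that carry no speaker label
        speaker = speaker.strip()
        turns[speaker] += 1
        words[speaker] += len(utterance.split())
    total = sum(words.values())
    return {
        s: (turns[s], words[s], words[s] / total if total else 0.0)
        for s in turns
    }


# Abridged from Extract 8; latching marks (=) removed for simplicity.
sample = [
    "Student 1: so I mean even if you are unconscious or on drugs or something",
    "Teacher: aha",
    "Student 2: but it gets",
    "Student 3: yeah",
]

for speaker, (n_turns, n_words, share) in talk_shares(sample).items():
    print(f"{speaker}: {n_turns} turn(s), {n_words} word(s), {share:.0%} of talk")
```

Word counts are a crude measure, since they ignore pauses, overlaps, and the function of each turn, but they offer a quick check on impressions such as "the teacher plays a less prominent role."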
As you can see, interpreting and categorizing transcripts like these yields more
specific information than does real-time coding. We turn now to a discussion
focusing on transcribing and the use of transcripts in language classroom research.

GENERATING AND CODING TRANSCRIPTS

Once you have tape- or video-recorded some classroom interaction, you must
decide whether to transcribe some or all of the interaction. This decision should
be guided by your research question. In some cases, it may not be necessary to
do any transcribing at all. Sometimes it is possible to count instances of a
particular structure or speech act by simply listening to your recordings. Since it can
take up to thirty minutes to transcribe one minute of interaction, you do not
want to transcribe data unless it is strictly necessary. However, in our experience,
transcription is extremely valuable and well worth the time involved to
generate good transcripts.
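
To see what the thirty-to-one figure means in practice, it helps to work through the arithmetic for a whole lesson. The short Python sketch below does this. The fifty-minute lesson is an invented example, and the ratio is the upper bound mentioned above, not a fixed rate. The function name is our own.

```python
def max_transcription_hours(recorded_minutes, minutes_per_recorded_minute=30):
    """Upper-bound estimate of transcription time, in hours."""
    if recorded_minutes < 0 or minutes_per_recorded_minute < 0:
        raise ValueError("durations cannot be negative")
    return recorded_minutes * minutes_per_recorded_minute / 60


# A hypothetical 50-minute lesson at up to 30 minutes of work per recorded minute.
print(max_transcription_hours(50))  # 50 * 30 = 1500 minutes, i.e. 25.0 hours
```

An estimate like this, made before data collection begins, can help you decide whether to transcribe whole lessons or only selected episodes.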
To be useful as research tools, transcripts must be accurate and detailed
enough to represent the speech event under investigation:
An authentic representation of the oral data, as close to the original as
possible, is of crucial importance for linguistic analysis. Oral data in
themselves are evanescent. Preserving them is a first and primary
objective for all further steps of analysis. (Ehlich, 1993, p. 124)
The more detailed the transcription process, the longer it takes to produce an
accurate transcript. There are different schools of thought about the ideal level
of detail, with some researchers (e.g., those that work in a tradition called
conversational analysis) saying that an extremely fine-grained approach is necessary.
Others feel that standard orthography (simply writing down what you hear) is
acceptable. Our position varies, depending on what we are investigating and why.
Assuming that you decide to transcribe, the next decision is what transcription
conventions to follow. Again, this decision will be tempered by the kind of
research you are conducting and just what it is that you want to capture. If the
research is looking at learners' pronunciation, it will probably be necessary to use
phonetic symbols, such as those of the International Phonetic Alphabet (IPA).
This system requires considerable training and skill, and its use substantially
increases the time taken to render speech into visual form. In IPA and other
phonetic systems, each symbol represents one and only one sound of the language,
so the IPA symbols convey details about pronunciation that cannot be recorded
with simple orthography.
We generally favor using regular orthography, unless there is a compelling
reason for using the IPA or some other form of phonetic transcription. For
instance, if learners' pronunciation is part of your research focus, you will
need to transcribe learners' speech phonetically. Extract 9 exemplifies regular
orthography.

Extract 9:
Teacher: Let's check exercise four. How do you feel about exercise four,
was that strange? Number four, ah, page one hundred fifty-four.
Was it difficult? How did you feel?
Student 1: It was easy! We did that!
Teacher: Ah, no, people, Pat turned off the tape recorder by pushing
the stop button. We didn't do that. No, we didn't.

Student 1: No?
Teacher: No.
Student 1: Ah!

Teacher: How about that kind of pattern, how do you feel about that?
Have you used that one before?
Students: No.
Teacher: Does it look easy?
Student 2: I use the wrong way. I. . .
Teacher: You used the wrong way? How was it?
Student 2: I—how is it? I got the meaning. With reading, I get the
meaning of this word with reading the dictionary, I got the
meaning of this word. (Nunan & Lamb, 1996, p. 277)

Ellis (1984) provides the following classroom transcription system:

1. The teacher's or researcher's utterances are given on the left-hand side of
the page.
2. The pupils' utterances are given on the right-hand side.
3. T = teacher; R = researcher; pupils are designated by their initials.
4. Each utterance is numbered for ease of reference. An 'utterance' consists
of a single tone unit except where two tone units are syntactically joined
by means of a subordinator or other linking word or contrastive stress has
been used to make what would 'normally' be a single tone unit into more
than one. (A tone unit is part of an utterance, usually consisting of more
than one syllable, in which there is one major change in tone. The following
utterance consists of two tone units. "It was Kathi, who decided not
to go.")
5. Pauses are indicated in parentheses with one or more periods. For
instance, (.) indicates a pause of a second or shorter, while a numeral with
periods indicates the length of a pause beyond a second in duration. For
example, (.3.) indicates a pause of three seconds.
6. XXX is often used to indicate speech that could not be deciphered.
7. Phonetic transcription (IPA) is used when the pupil's pronunciation is
markedly different from the teacher's pronunciation and also when it was
not possible to identify the English word the pupils were using.
8. . . . indicates an incomplete utterance.
9. Words are underlined to show overlapping speech between two speakers
or a very heavily stressed word.
10. A limited amount of contextual information is given in brackets. (p. 230)
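
Because conventions such as the pause notation in item 5 are applied consistently, they can later be searched for automatically in a typed transcript. The following is a minimal Python sketch, assuming that pauses are typed exactly as "(.)" for a pause of a second or shorter and "(.3.)" for a three-second pause. The sample utterance is invented, and the function name is our own.

```python
import re

# "(.)" marks a pause of a second or shorter; "(.3.)" marks a 3-second pause.
PAUSE = re.compile(r"\(\.(?:(\d+)\.)?\)")


def find_pauses(utterance):
    """Return one entry per pause: None for a short pause, else whole seconds."""
    pauses = []
    for match in PAUSE.finditer(utterance):
        seconds = match.group(1)
        pauses.append(int(seconds) if seconds else None)
    return pauses


# Invented utterance, typed with the pause notation described in item 5.
print(find_pauses("I go (.) to the (.3.) market"))
```

A list of this kind could feed a simple count of timed pauses per learner, although what counts as a meaningful pause remains a matter for the researcher's judgment.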

This transcription system is simple enough to use with minimal training, and yet
comprehensive enough to pick up most interactional features of interest.
There are many variations on the transcription conventions classroom
researchers use. Here is the key to the transcription conventions from Duff
(1996, pp. 431-432) in the sample study at the end of Chapter 7 on ethnography.

Transcription Conventions
1. Participants: T = teacher; S = student; Ss = two students; SSS =
many students. Initials are used for students identifiable by name (e.g.,
M, SZ, J) rather than S.
2. Left bracket [: Indicates the beginning of overlapping speech, shown
for both speakers; second speaker's bracket occurs at the beginning of
the line of the next turn rather than in alignment with previous
speaker's bracket.
3. Equal sign =: Indicates speech which comes immediately after
another person's, shown for both speakers (i.e., latched utterances).
4. (#): Marks the length of a pause in seconds.
5. (Words): The words in parentheses ( ) were not clearly heard; (x) = unclear
word; (xx) = two unclear words; (xxx) = three or more unclear words.
6. Underlined words: Words spoken with emphasis.
7. CAPITAL LETTERS: Loud speech.
8. ((Double parenthesis)): Comments and relevant details pertaining to
interaction.
9. Colon: Sound or syllable is unusually lengthened (e.g., rea::lly lo:ng)
10. Period: Terminal falling intonation.
11. Comma: Rising, continuing intonation.
12. Question mark: High rising intonation, not necessarily at the end of
a sentence.
13. Unattached dash: A short, untimed pause.
14. One-sided attached dash-: A cutoff often accompanied by a glottal
stop (e.g., a self-correction); a dash attached on both sides reflects
spelling conventions.
15. Italics: Used to distinguish L1 and L2 utterances.
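
Several of these conventions lend themselves to simple automated checks once a transcript has been typed. The sketch below is an illustration under stated assumptions and is not part of Duff's study. It assumes each turn is typed on one line as "SPEAKER: speech" and it looks only at latching (=), overlap ([), unclear words, and timed pauses. The sample turn is invented, and the function name is our own.

```python
import re

# (x), (xx), (xxx): one, two, or three-or-more unclear words.
UNCLEAR = re.compile(r"\((x{1,3})\)")
# (#): pause length in seconds, typed here as (2) or (1.5).
TIMED_PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")


def describe_turn(line):
    """Summarize the marked features of one 'SPEAKER: speech' line."""
    speaker, sep, speech = line.partition(":")
    if not sep:
        raise ValueError("expected a line of the form 'SPEAKER: speech'")
    speech = speech.strip()
    return {
        "speaker": speaker.strip(),
        "latched": speech.startswith("=") or speech.endswith("="),
        "overlap": "[" in speech,
        # A value of 3 means "three or more" unclear words.
        "unclear": [len(xs) for xs in UNCLEAR.findall(speech)],
        "pauses": [float(p) for p in TIMED_PAUSE.findall(speech)],
    }


# Invented turn typed with the conventions listed above.
print(describe_turn("SZ: =I (x) go (2) [to school"))
```

Only the first colon is treated as the boundary of the speaker label, so colons that mark lengthened sounds later in the line (as in rea::lly) are left in the speech.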

ANALYZING LEARNER LANGUAGE

In their book on analyzing learner language, Ellis and Barkhuizen (2005)
identify three research paradigms in second language acquisition. These are the
normative, the interpretive, and the critical. These paradigms approach the analysis of
learner language in very different ways that reflect the assumptions and purposes
behind the paradigms. The purpose of normative approaches is to test a
theoretically motivated hypothesis. The purpose of the interpretive paradigm is to
describe and understand L2 acquisition through the intensive, and usually
longitudinal, study of a limited number of cases. The critical paradigm investigates
language acquisition in its sociocultural context. Table 12.2 sets out the kinds of
analysis of learner language carried out within the three paradigms.

TABLE 12.2 Three research paradigms in SLA (adapted from Ellis
and Barkhuizen, 2005, pp. 10-11)

Paradigm       Methods
Normative      Quantitative methods supported by inferential statistics to test the strength of relationships between variables and differences between social groups.
Interpretive   Qualitative methods involving 'thick description' of the aspect of L2 learning under investigation from multiple perspectives (triangulation).
Critical       Qualitative methods identifying the 'discourses' learners participate in and how these position them socially.

Classroom data come in many shapes and forms. Learner data can be
classified into three different types: nonlinguistic performance data, samples of
learner language, and verbal reports from learners about their own learning
(Ellis and Barkhuizen, 2005). Nonlinguistic performance data include reaction
times to linguistic stimuli, nonverbal measures of comprehension, and
grammaticality judgments. Learner language samples can be elicited or naturally
occurring. Verbal reports include self-reports, interview data, questionnaire
responses, stimulated recall, think-aloud protocols, and self-assessments.

REFLECTION

Study the following extract. To what extent does it follow the IRF pattern
identified by Sinclair and Coulthard? What variations are evident? What
other comments would you make on the teacher talk?
Extract 10:
T: OK, let's try number one. Bin, why don't you start and Tomo will
follow. Go ahead try it... Number one.
Bin: It warm this evening.
Tomo: Yes, the evenings are getting warmer.
Bin: I think it get warmer this evening.
T: OK... How could we change that a little bit?
Bin: Getting warmest?

Tomo: The warmest?
T: Let's take out "getting." Let's not use the verb "get," all right?
Bin: It get warmest.
T: Let's just say... let's take out "getting," let's say, "It's the warmest
it's ever been." ... "It's the warmest spring." How does that sound?
Bin: OK.
T: Let me write that down . . . (T writes on board: The evenings are
getting warmer.) Well, let's see what you get.
Bin: Getting warmer.
T: Now, maybe we can look at this sentence and see how we can
change it to make it better . . . OK, because remember because in
the second sentence of our dialogue we use "getting" with the
comparative, so for example . . .
Bin: Getting warmest.
T: The evenings are getting warmer.
Bin: Getting warmer.
T: OK, warmer ... Is that the comparative or the superlative?
Ss: Comparative.
T: Comparative. OK? All right. "Getting" is the verb, what does that
"getting" mean?
Bin: Getting warmer.
T: What does getting mean?
Tomo: Starting.
T: Starting, maybe becoming.
Vinny: Becoming to.
T: Beginning to, all right, so they are in the process of becoming,
they're changing to the point of being warmer . . . they are becoming
warmer.
Bin: Evenings getting warmer.
T: OK, evenings, all right... now, let's put in the superlative here . . .
We'll go from this one "The evenings are getting warmer" ... to
this one, "It's the warmest it's ever been."
Bin: Evenings getting warmer.
Tomo: It's the warmest it's ever been.
Bin: It the warmest.
T: Right, here we get the superlative, right? So what is the meaning
of this sentence with the superlative?
Bin: Warm.

T: What does "it's" stand for? Or, "it has"?
Bin: Warmest.
Tomo: Evening.
T: OK the evening, this particular evening, that we are talking about
is the warmest ever, it's ever been before, it's one hundred ten
degrees, so it's the warmest it has ever been.
Tomo: It's the warmest it's ever been.
Bin: Wow! One hundred ten? No, here, one hundred ten! Too hot!
T: Bin?
Bin: It the warmest, warmest it ever been.
T: All right, good . . . let's go on . . . look at number two. (adapted
from Johnson, 1995, pp. 19-20)

Extract 10 is taken from a book by Johnson (1995) on understanding
communication in second language classrooms. The researcher used a form of
discursive and interpretive analysis for making sense of the classroom interactions
that she recorded. Here is what she had to say about the IRE
(initiation-response-evaluation) pattern in this particular extract.
Clearly the teacher-student exchanges . . . follow the IRE sequence. In
almost every exchange the teacher provides an initiation, a student
responds, and the teacher evaluates the response. However, embedded
in these interactional sequences are options, usually signaled by the
teacher, for altering the IRE pattern. For example, in several instances
Bin's responses, which were incorrect, were not evaluated but were
instead followed by a second initiation. Ignoring Bin's incorrect
response, . . . the teacher repeats his question. This behavior, on the part
of the teacher, seems to indicate to the other students that they are free
to respond. . . . Thus this appears to be one acceptable alteration of the
IRE sequence. (p. 21)
This extract and other bits of transcribed data above clearly display the amount
of information we can get by transcribing classroom interaction. We turn now
to a brief discussion of conversational analysis—an approach to research that
requires extremely detailed transcripts.

CONVERSATIONAL ANALYSIS

Conversational analysis is a method for investigating talk, including classroom
interaction data. It is an important ethnomethodological approach to analyzing
spoken data beyond the classroom as well (see, e.g., Markee, 2000; 2003).
However, here we will focus only on its use with classroom data, in which it allows
researchers "to explicate how learning activity is organized on a
moment-by-moment basis" (Markee, 2005, p. 355). Conversational analysis is defined as "a
methodology for analyzing talk-in-interaction that seeks to develop empirically
based accounts of the observable conversational behaviors of participants that
are both minutely detailed and unmotivated by a priori, etic theories of social
action" (ibid.).
Conversational analysis examines two main types of talk-in-interaction:
ordinary mundane conversation and institutional talk. Ordinary conversation
"may be thought of as the kind of everyday chitchat that occurs between friends
and acquaintances" (ibid., p. 356). Within cultural subgroups, such talk has
identifiable patterns and signals of turn taking, repair, leave taking, and so on. In
ordinary conversation, "talk is locally managed, meaning that turn size, content
and type are all free to vary, as is turn taking" (ibid.).
Institutional talk, in contrast, involves "various structural modifications to
the sequential, turn-taking and repair practices of ordinary conversation" (ibid.).
Markee notes that examples of institutional talk include debates, job interviews,
press conferences, courtroom interactions, emergency calls, and so on.
Classroom interaction, especially teacher-fronted classroom speech, can be
considered to be a form of institutional talk because it "is characterized by the
preallocation of turns and turn types in favor of teachers . . . who also typically
initiate repairs" (ibid.). In fact, "teachers prototypically do being teachers by
asserting in and through their talk the right to select the next speaker, to nominate
topics, to ask questions, and to evaluate learners" (ibid.).
Markee (ibid.) lists five key characteristics of conversational analysis in
second language acquisition research, including classroom research. He says it
should be

1. Based on empirically motivated emic accounts of members' interactional
competence in different speech exchange systems.
2. Based on collections of relevant data that are themselves excerpted from
complete transcriptions of communicative events.
3. Capable of exploiting the analytical potential of fine-grained transcripts.
4. Capable of identifying both successful and unsuccessful learning behaviors,
at least in the short term.
5. Capable of showing how meaning is constructed as a socially distributed
phenomenon, thereby critiquing and recasting cognitive notions of
comprehension and learning. (pp. 357-358; see also Markee, 2000)
These points are nicely illustrated in two transcripts of the same classroom
event, one of which is much more detailed than the other. Three English
students are discussing an article they have read about the greenhouse effect, and
one learner does not understand the word coral. Here, for illustrative purposes,
we will reproduce only a small bit of the two transcripts to show you the
difference between a detailed regular transcript and a fine-grained transcript prepared
for conversational analysis. Extracts 11 and 12 are based on a videotape of the
group work. L stands for learner.

REFLECTION

Given what you have read so far, what do you think some of the differences
might be between a transcript prepared for conversational analysis and one
for a less detailed form of interpretation?
Extract 11:
001 (L10 is reading her article to herself.)
002 L10: coral, what is corals
003 (4.0)
004 L9: hh do you know the under the sea, under the sea
005 L10: un-
006 L9: there's uh::
007 (2.0)
008 L9: [how do we call it]
009 L10: [have uh some coral]
010 L9: ah yeah (0.2) coral sometimes
011 (0.2)
012 L10: eh include lei s (0.2) un includes some uh: somethings uh-
013 (1.0)
014 L10: [the corals], is means uh: (0.2) s somethings at bottom of
015 L9: [((unintelligible))]
016 L10: [the] sea
017 L9: [yeah,]
018 L9: at the bottom of the sea,
019 L10: ok, uh:m also is a food for is a food for fish uh and uh
020 (4.0)
021 L9: food?
022 (0.3)
023 L10: foo-
024 L9: no it is not a food it is (.) like a stone you know?
025 L10: oh I see I see I see I see I see I know I know hh I see h a
whit- (0.4) a
026 kind of a (0.2) white stone h [very beautiful]
027 L9: [yeah yeah] very big yeah
028 [sometimes very beautiful and] sometimes when the ship moves
029 L10: [I see I see ok] (Markee, 2005, pp. 359-360)

After this exchange, L10 confirms her understanding of coral by offering the
Mandarin translation for that word to her group mates.
These same twenty-nine lines of transcript take up 107 lines in the parallel
transcript prepared for conversational analysis, as shown in Extract 12. Here
the actual speech is given in boldface while nonverbal behaviors are depicted in
italics:

Extract 12:
((L9, L10, and L11 are all looking down at their class materials, reading an
article on global warming. L9, who is facing the camera, is leaning her
head on her left hand. L10 has her back turned to the camera and is facing
L9. L11 is in profile but her hair hides her face))
001 L10: coral, what is corals
002 (1.3)
003 L9: ((L9 moves her head slightly to her right to
004 look at the right-hand page of her materials.))
005 (1.3)
006 L?: hshhh
007 (1.3)
008 L9: [X_ ((L9 looks up at L10, holding
009 [hh her chin in her left hand in
010 a thinking pose))
011 (1.3)
012 L9:
013 L9: do you know the under the sea::, ((L9 leans
014 forward and
015 drops her left
016 hand to her lap))
017 L9: ((L9 looks down at L10's article))
018 L9: under the sea::,
(Markee, 2005, p. 362)

This fine-grained transcription is a hallmark of conversational analysis. As
you can see, the eighteen lines reproduced as Extract 12 account for only four
lines of Extract 11. According to Markee (2005), inherent to conversational
analysis is the idea that

no detail of interaction, however small or seemingly insignificant, may
be discounted a priori by analysts as not pertinent or meaningful to the
participants who produce this interaction. . . . A natural consequence of
this position is that . . . transcripts do not just set down the words that
are said during a speech event. Rather, they are extremely detailed
qualitative records of how talk is co-constructed by members on a
moment-by-moment basis. (p. 358)
So, to prepare a transcript for conversational analysis, it is important to
document fillers, hesitations, in-breaths, pauses, silences, volume, lengthening of
sounds, speakers speeding up or slowing down, and how turns overlap, as well as
gestures, eye gaze, and facial expressions. This approach to transcription and
interpretation of speech is sometimes referred to as microanalysis (see, e.g.,
Lazaraton, 2004).
It is important to state that for conversational analysts, "transcription cannot
be viewed merely as a laborious chore that has to be completed before the real
business of analysis can begin" (Markee, 2005, p. 358). Instead, "it is a crucially
important, substantive first attempt at describing talk and co-occurring gestures,
embodied actions and eye gaze phenomena as a unified, socially constituted
context for second language learning activity" (ibid.).
An example of conversational analysis used in language classroom research
is found in Ulichny's (1996) study of an interaction that took place in an ESL
conversation class. The students were adult learners at the intermediate level.
Two students and the teacher are having a conversation in which one learner,
Katherine, talks about a decision to discontinue her volunteer work. (She had
originally started volunteering for a chance to use English more often.) Ulichny's
(1996) detailed analysis shows how the teacher "exits from the conversation to
work on specific elements of the language" (p. 756). The discourse shifts from
conversation to instruction, "which involves the whole class in language work"
(ibid., p. 754), and to a correction move or a conversational replay (ibid., p. 756) for
Katherine. The analysis clearly shows how the student is soon made "silent in
the telling of her story" (ibid.).

SUBSTANTIVE CLASSROOM INTERACTION ISSUES

In this section, we look at substantive issues in classroom interaction. By
substantive issues, we mean the 'what' rather than the 'how' of classroom interaction.
The first subsection focuses on teacher talk, followed by a subsection on learner
talk. In the final subsection, we look at teacher-learner interaction.

Teacher Talk
Teacher talk is a crucial element in the classroom. In second language contexts,
researchers have investigated how teachers speak to minority language children
(Strong, 1986). In many EFL contexts, teacher talk represents the only 'live'
target language input that learners receive. Some of the questions that have been
investigated in classroom research include the following: How much talking do
teachers do, and how much of this is in the target language? What is the nature
of teacher explanations? How and when do teachers correct learner errors?
What kinds of errors receive attention?
A great deal of classroom research, in content classes as well as language
classrooms, shows that when it comes to talking, teachers dominate. On average,
it seems that teachers tend to talk around two-thirds of the time. Whether or not
this is a good thing will depend on what one believes about the role of
comprehensible input for acquisition. In the early stages of acquisition, extensive teacher
input may be a good thing. It goes without saying, however, that this talk must
be in the target language if it is to function as input for language acquisition. In
some foreign language classrooms that we have observed, virtually all of the
teacher talk has been in the students' first language. Use of the first language is
usually ascribed to the fact that low proficiency students simply do not
understand the target language. Another factor is the teachers' lack of confidence
about their own language competence.
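
If you want to check the proportion of teacher talk in your own data, one rough approach is to count the words attributed to each speaker role in a transcript. The sketch below is only illustrative: the function name talk_share is our own, the four turns are a simplified sample loosely based on Extract 10, and word counts are used as a stand-in for talking time, which is an assumption rather than an established measure.

```python
def talk_share(turns):
    """Return each speaker role's share of the total words in a transcript.

    `turns` is a list of (speaker_role, utterance) pairs. Word count is used
    here as a rough proxy for talking time (an assumption of this sketch).
    """
    totals = {}
    for speaker, utterance in turns:
        totals[speaker] = totals.get(speaker, 0) + len(utterance.split())
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("transcript contains no words")
    return {speaker: count / grand_total for speaker, count in totals.items()}


# Simplified sample turns (T = teacher, S = student), for illustration only.
turns = [
    ("T", "OK let's try number one"),
    ("S", "It warm this evening"),
    ("T", "How could we change that a little bit"),
    ("S", "Getting warmest"),
]

shares = talk_share(turns)
for speaker, share in shares.items():
    print(f"{speaker}: {share:.0%} of words")
```

A count like this says nothing about which language the talk is in or what it is doing pedagogically, so it would normally be reported alongside a qualitative reading of the transcript.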
A major pedagogical function of teacher talk is to provide explanations. In
fact, the naive layperson probably believes that this is the central function of the
teacher. There is some evidence, however, that teacher explanations may not be
particularly effective, as you will see in the following reflection task.

REFLECTION

In the course of the following interaction, a student interrupts the teacher
to ask why, in English, we say a three-bedroom house, not a three bedrooms
house. What do you think of the following teacher explanation? Is it
adequate and/or appropriate?
Extract 13:
T: OK, he's looking for a house with three bedrooms and what do we
say? We don't in English we don't usually say house with three
bedrooms ... What do we usually say?
S: Three-bedroom house.
T: A three-bedroom house, a three-bedroom house, a three-bedroom
house. So, what's he looking for?
S: For three-bedroom house.
T: What's he looking for?
S: House.
T: What's he looking for?
S: Three-bedroom house?
T: All right.
S: Why three bed, or three-bedroom? Why we don't say three
bedrooms?
T: Ahh... oh ... I don't know, um.
S: Is not right.
T: We don't say it. We don't say it. There's no explanation. But we often
do that in English. Three-bedroom house.
S: Don't ask for it.
S: Yes.
T: Well, do ask why. Ask why, and ninety-nine percent of the time I know
the answer. One percent of the time nobody knows the answer. If I
don't know the answer, nobody knows. [laughter] Ah, no, I don't know
the answer, sorry.

In Extract 13, the teacher claims that there is no explanation for the
grammatical issue that the student raises. This statement, of course, is incorrect, and the
teacher quickly (and wisely) admits that he does not know the answer.

REFLECTION

How would you answer the question above? Why don't we say "three
bedrooms house"? Have you ever told your students (or been told by a
teacher), "There is no reason; that's just the way we say it"?

Explanations can focus on any aspect of language, be it grammatical, lexical,
phonological, or pragmatic. These kinds of explanations can be identified and
tabulated in various ways according to the focus of the research. For example, a
study of the explanations given by novice and experienced teachers might look at
the issue of whether there is any difference in the types of explanations given by
teachers with varying degrees of experience. (Of course, you would need to
operationally define what you mean by both "experienced" and "inexperienced"
teachers, as well as define the categories for the focus of the explanation.) In
addition, you would need an equal number of hours of instruction for the two types
of teachers, and you should probably control for the foci of the lessons as well.
You could tabulate and exemplify the instances of such explanations in a table
like this one:

                         Phonology      Grammar        Lexical        Pragmatics
                         Explanations   Explanations   Explanations   Explanations
Experienced Teachers
Inexperienced Teachers
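
Once each explanation in your transcripts has been coded, filling in a table of this kind is a matter of counting. The following is a minimal sketch under our own assumptions: the category labels, the tabulate function, and the five coded instances are all hypothetical and are included only to show the mechanics, not to report any findings.

```python
from collections import Counter

# Hypothetical category labels; your operational definitions may differ.
FOCI = ["phonology", "grammar", "lexis", "pragmatics"]
GROUPS = ["experienced", "inexperienced"]


def tabulate(coded_explanations):
    """Count coded explanations by teacher group and focus.

    `coded_explanations` is a list of (group, focus) pairs, one pair per
    explanation identified in the transcripts.
    """
    counts = Counter(coded_explanations)
    # Counter returns 0 for any (group, focus) cell that never occurred.
    return {g: {f: counts[(g, f)] for f in FOCI} for g in GROUPS}


# Invented codes, for illustration only.
coded = [
    ("experienced", "grammar"),
    ("experienced", "grammar"),
    ("experienced", "lexis"),
    ("inexperienced", "grammar"),
    ("inexperienced", "pragmatics"),
]

table = tabulate(coded)
print(f"{'':<15}" + "".join(f"{f:<12}" for f in FOCI))
for group in GROUPS:
    print(f"{group:<15}" + "".join(f"{table[group][f]:<12}" for f in FOCI))
```

Raw counts are only comparable across the two groups if the hours of instruction are equal, as noted above; otherwise the counts would need to be converted to rates.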

REFLECTION

What makes an explanation effective? Think about times you have been
teaching or times you have taken a language course. How can you tell if
your students have understood and benefited from the explanation? When
you have been a student, how could your teachers have known if their
explanations were successful?

Teacher questions have also received substantial attention from classroom
researchers. Research has focused on the types of questions that teachers ask and
the kinds of learner responses that they elicit.
Some years ago, researchers paid a great deal of attention to the use of display
versus referential questions. A display question is one to which the questioner
(often a teacher or a parent) knows the answer. Its purpose is to get learners to
display their knowledge. In contrast, a referential question is one to which the
person asking the question does not know the answer. In classrooms, in contrast with
the outside world, there are very few referential questions. In one widely cited
study, Brock (1986) investigated the effects of referential questions on ESL
classroom discourse. The study was carried out with four experienced ESL teachers
and twenty-four non-native speakers. Two of the teachers were trained to
incorporate referential questions into their teaching while two were not. Each of
the teachers taught the same lesson to six of the non-native speakers and the
lessons were recorded, transcribed, and analyzed. Brock found that the two teachers
who had not been trained to ask referential questions asked a total of 141
questions, only twenty-four of which were referential. The teachers who were trained
to ask referential questions asked a total of 194 questions, 173 of which were
referential. Learners to whom referential questions were directed gave significantly
longer and more syntactically complex responses to those questions.
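
The size of the difference Brock reports is easier to appreciate when the raw counts are converted to proportions. The short sketch below does this arithmetic using only the four counts given above; the function name referential_share is our own label for the calculation.

```python
def referential_share(referential_count, total_count):
    """Proportion of all questions asked that were referential."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return referential_count / total_count


# Counts reported for Brock (1986) in the paragraph above.
untrained = referential_share(24, 141)
trained = referential_share(173, 194)

print(f"Untrained teachers: {untrained:.1%} referential")  # 17.0%
print(f"Trained teachers:   {trained:.1%} referential")    # 89.2%
```

In other words, roughly one question in six was referential for the untrained teachers, compared with about nine in ten for the trained teachers.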
In addition to quantification, researchers have also used qualitative and
interpretative analyses of question-answer sequences. Extract 14 provides an
example of a brief classroom extract along with the researcher's interpretive gloss.
(A more detailed example of this type of analysis is given in the sample study that
ends this chapter.)
Responding to student errors is considered to be a fundamental aspect of a
teacher's work. You will notice that we said "responding to" instead of "correcting
errors." This choice is deliberate because as teachers we often talk about error
correction, but, in fact, only the learners can correct their errors by
reconstructing their internalized interlanguage grammar systems and lexicons, so researchers
tend to talk about error treatment, that is, attempts to get learners to correct their errors.
In conversational analysis, these attempts are often referred to as repairs:
Conversational repair ... has been found to consist of three components:
the trouble source or repairable; the repair initiation, which is the indication
that there is trouble to be repaired; and the outcome, which is either the
success or the failure of the repair attempt. (Liebscher & Dailey-O'Cain,
2003, p. 376)
Extract 14:

T: Do you think that, um, was it exciting that night? Mm? Do you think
that it was very exciting? Right. Chi-ming. What do you think? It was,
it was—
S: It was very exciting.
T: It was very exciting. Right. Sit down.
The teacher appears to be asking for the student's opinion of the story,
hence a 'referential question.' However, when we look at the clue that the
teacher provides—It was, it was—and her evaluation of the student
response as correct, we know that in fact the teacher expected the student
to say It was exciting, which is the concluding sentence in the story. It is
therefore a 'display' question. (Tsui, 1995, p. 29)

Liebscher and Dailey-O'Cain underscore the fact that the overall category is
called "repair rather than correction, the latter being the particular subtype of
repair that occurs when a notable language error is corrected" (ibid.).
Some of the questions that researchers have addressed in relation to error
treatment include the following: (1) When should errors be treated? (2) How
should they be treated? (3) Who should treat errors? (4) How effective is self-
and peer-correction? (5) What errors should be treated? (6) To what extent do
learners take up teachers' responses to their errors (i.e., how effective is teacher
treatment of learners' errors)?

REFLECTION

Choose one of the questions listed above. What sort of study could you
design to address the question of your choice?

Despite the importance attributed to error correction, the effectiveness of
such feedback is by no means clear. Although it is somewhat dated now, a very
comprehensive review of research on language teacher questions is found in
Chaudron's (1988) chapter on teacher and student interaction. He states that
from the learners' point of view, the use of feedback in repairing their
utterances, and involvement in repairing their interlocutors' utterances
may constitute the most potent source of improvement in both target
language development and other subject matter knowledge. Yet the
degree to which this information in fact aids learners' progress in target
language development (or in subject matter control) is still unknown.
(p. 133)

How teachers, other learners, and the learners themselves respond to students'
errors is a fascinating topic with many practical implications. Even though it was
one of the earliest foci of language classroom research, it is still a fruitful topic
for investigation.

Student-Student Interaction
The other major area of interest, naturally enough, is that of student talk. As you
might imagine, the range of research issues and questions is enormous, although
most seek to establish some type of relationship between input or environmental
factors and learner acquisition. For example, researchers have asked, "What is
the relationship between input and uptake?" That is, do items made available for
learning subsequently appear in learner output? (This is one of the questions
addressed by the sample study in the next section.) Other key questions include
(1) What is the relationship between participation structures (group size and
composition, etc.) and learner output? (2) What is the relationship between task
type and learner language? and (3) What patterns of interaction typify student
talk in language classrooms?

REFLECTION

The following datum is taken from a piece of classroom-based research.
Two students are trying to classify vocabulary cards. Which of the above
questions do you think was being investigated?
Extract 15:
A: Statistic and diagram—they go together. You know diagram?
B: Yeah.
A: Diagram and statistic ... but maybe, I think, statistic and diagram—you
think we can put in science? Or maybe ...
B: Science, astronomy, [yeah] and, er, can be agriculture.
A: Agriculture's not a science.
B: Yes, it's similar.
A: No. And, er, maybe Darwin and science ...
B: What's the Darwin?
A: Darwin is a man.
B: No [doesn't fit], it's one of place in Australia.
A: Yes, but it's a man who discover something, yes, I'm sure.
B: Okay.
A: And maybe, look, yes, picture, newspaper, magazine, cartoon, book,
illustration [yeah]. Maybe we can put lazy and English together. (Nunan,
1991c, p. 53)

In early classroom research, real-time coding systems tended to focus on
teachers' speech. Learners' speech was sometimes categorized as being in the
target language or the first language, on-task or off-task, and so on. These gross
categorizations have been overshadowed by the advent of portable tape- and
video-recording devices that allow researchers to record learners' speech and
nonverbal behavior with some degree of accuracy.
Much has been written about the role of teacher-student and
student-student interaction, particularly since communicative language teaching and
task-based learning have become widespread. Much classroom research has
analyzed the way learners negotiate for meaning when faced with a communicative
task because, in the negotiation for meaning, the input to the learners gets
simplified, or pitched, appropriately to the learners' level. (See, e.g., work by Gass,
1997; Long, 1996; and Pica, 1994.) To summarize (and perhaps oversimplify) the
findings of a great deal of interactionist research, people learn the elements of
languages by interacting, not by learning the components of language and then
putting them together in interaction.
In recent years the interactionist perspective has been supplemented (some
might say supplanted) by viewpoints derived largely from sociocultural theory
(Vygotsky, 1978; 1986). In this approach to understanding human development,
"interaction in context is examined to find out how proficiency is collaboratively
constructed or appropriated within and through practical activity" (van Lier,
2001, p. 163). Much has been written about sociocultural theory and language
learning (see, e.g., the collection edited by Lantolf, 2000). Here we will focus
only on a small portion of this literature that is directly related to analyzing
learner language.
A key concept in sociocultural theory is the zone of proximal development, or
ZPD. This concept is a metaphor (not a physical apparatus in the brain) that is
defined as "the difference between what a person can achieve when acting
alone and what the same person can accomplish when acting with support
from someone else and/or cultural artifacts" (Lantolf, 2000, p. 17). Some
researchers feel that the "someone else" referred to here must be more
knowledgeable than the learner (e.g., a teacher or parent). But recent investigations
into learner language suggest that language learners can facilitate one another's
learning.
One way that learning takes place is through scaffolding. This term
represents another metaphor, which you can understand if you picture a scaffold,
normally a wooden or metal or bamboo structure built around a building that is
under construction or being repaired or painted. It is never intended that the
scaffold will be a permanent feature of the edifice: It will be removed as soon as
its job is done.
Scaffolding in language teaching and learning is defined as "the support given
to language learners to enable them to perform tasks and construct
communications which are at the time beyond their capability" (Carter and Nunan, 2001,
p. 226). Such assistance is "gradually pulled away when the learner no longer
needs it" (Oxford, 2001, p. 167). Instances of scaffolding can be found in some of
the transcripts of learner language reproduced in this chapter:

ACTION

Look back at Extracts 11 and 15. In each one, find an example of a learner
scaffolding another learner. Compare your ideas with a classmate or
colleague.

Classroom research using a sociocultural perspective has documented and
analyzed interaction among learners of Japanese (Donato, 2000; Ohta, 2000),
French (Donato, 2000; Swain, 2000), Spanish (Donato, 2000; Roebuck, 2000),
and English (Donato, 2000; Kramsch, 2000; Sullivan, 2000; Swain, 2000; and
van Lier, 2000).

QUALITY CONTROL ISSUES


There are at least five key quality control issues to keep in mind when analyzing
classroom interaction. The first and foremost recalls earlier chapters of this
book. That is, it is essential to plan your study carefully to ensure that you
collect the data you need to answer your research question, and, as we have seen in
earlier chapters, collecting classroom interaction data is not an easy task. For
instance, if you are working in the naturalistic inquiry tradition and observing
ongoing classes (your own or another teacher's), you will observe and record
what happens, which may or may not be what you expected would happen.
There is ample evidence that lessons don't always go according to plan (see, e.g.,
Bailey, 1996; K. E. Johnson, 1992a; 1992b; Nunan, 1996).
Second, you can only analyze interactional data to answer your research
questions if you have collected it in such a way as to make the analysis possible.
If you wish to analyze turn distribution, you need to collect and transcribe
videotaped data since turn bids are often accomplished through nonverbal signals. You
may even need to produce a fine-grained transcript for conversational analysis in
which you document in-breaths and vocalized fillers signaling turn bids. If you
wish to analyze teacher feedback following learners' errors and only collect
tape-recorded data, you will miss the facial expressions and gestures that are
sometimes used to indicate the presence and place of errors.
Third, if you are coding data, whether you use categories of your own
devising or concepts borrowed from your review of the literature, it is important to
operationally define your terms and to make sure that the category system can be
used by other analysts. To demonstrate that the category system works well, you
need to calculate inter-coder agreement (e.g., by dividing the number of
instances two raters coded the data identically by the total number of instances
coded).
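
The simple agreement calculation just described can be sketched in a few lines. In the example below, the function name percent_agreement and the two lists of codes are hypothetical; the codes were invented purely to show the arithmetic and do not come from any study. Note that this index does not correct for agreement that would occur by chance.

```python
def percent_agreement(coder_a, coder_b):
    """Simple inter-coder agreement: identical codes / total instances coded.

    Each argument is a list of category labels, one per coded instance, in
    the same order for both coders.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same instances")
    if not coder_a:
        raise ValueError("no instances were coded")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Invented codes for five teacher questions, for illustration only.
coder_a = ["display", "referential", "display", "display", "referential"]
coder_b = ["display", "referential", "referential", "display", "referential"]

agreement = percent_agreement(coder_a, coder_b)
print(f"Inter-coder agreement: {agreement:.0%}")  # 4 of 5 instances match: 80%
```

Instances on which the coders disagree (the third question here) are worth revisiting, since they often point to category definitions that need tightening.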
Fourth, it is important to do a member check (Maxwell, 2005). This step is
especially valuable if you are recording data in a classroom where you do not
know all the students individually. It involves asking the members of the group
under study (e.g., the teachers and the students you are observing) to make sure
you have correctly interpreted and labeled the data. For instance, if you are
transcribing full-class interaction, you should make sure you have identified the
speakers correctly.
Fifth, if you are doing an interpretive analysis of your data, you need to
provide ample descriptions and sufficient examples to convey your ideas clearly
to your readers. This step can be quite challenging because of length restrictions
if you try to publish your report in a professional journal. The sample study
described below provides a good example of this effort.

A SAMPLE STUDY

For the sample study in this chapter, we have selected a report about some
teacher research. Storch (2002) carried out an investigation into patterns of
interaction in pair work. Her classroom-based study was conducted in a
one-semester, credit-bearing ESL course offered at a university in Australia. The
purpose of the course was "to develop learners' academic listening, reading,
speaking and writing skills" (p. 123). Storch collected her data in the writing
classes, which included a focus on grammatical accuracy. In this particular
report, she addressed the following research questions:
1. What patterns of dyadic interaction can be found in an ESL university-
level class?
2. Does task or passage of time affect the pattern of dyadic interaction?
3. Do differences in the nature of the dyadic interaction result in different
outcomes in terms of second language development? (p. 123)
Here, due to space constraints, we will only summarize her work on the first
research question above.

ACTION

Think about Storch's first research question. How would you go about
collecting data to address this question? List the specific steps you would
take.

There were thirty-three students in the study. Storch (ibid.) describes them
as follows:

The students came from a range of language backgrounds, although
they were mostly Asian. The participating students ranged in age from
19 to 42 years, with the majority (76%) being in the age range 20 to 30.
Most of the students were international students (70%), some coming
on exchange programs for the duration of one semester, others to
complete their entire degree in Australia. Their length of residence in
Australia (for both international and resident students) ranged from one
month to nine years at the time data were collected. In terms of ESL
proficiency, given that all were students accepted by the university, they
had met the required ESL threshold. However, their scores on writing
and reading on the university's in-house ESL placement tests had
demonstrated that they needed further work on their academic
language skills, particularly writing and grammatical accuracy. Thus they
were considered intermediate in this context. (p. 124)
From the complete data set, Storch (ibid.) selected the interactions of ten
student pairs for a detailed analysis because the data for these pairs were
complete and these pairs "were also fairly representative of the entire group of
participants in terms of age, language background, length of residence, and
residential status" (p. 124). She also had gender balance since four of the pairs
consisted of two male students, three pairs were mixed (a male and a female
student in each), and the remaining three pairs involved two female students.
Storch (ibid.) used three tasks to collect her data:
1. A short composition based on a given diagram showing differences in
language fluency between two groups of migrants before and after arrival to
Australia.
2. An editing task, where students were presented with a text of
approximately 160 words in length containing a number of errors typical of these
students: errors in verb tense, articles, word forms.
3. A text reconstruction task (Storch, 1998), in which students were presented
with content words and had to insert function words (e.g., prepositions,
articles) and change word forms (e.g., for tense morphology) to produce a
meaningful and grammatically correct text. (p. 125)
She points out that all these tasks "were related to the course syllabus and formed
part of the regular class work" (p. 14). The students did three versions of each
task at one-week intervals: two in pairs and one alone. The interactions of the
pairs were tape-recorded. The data that Storch analyzed in this study were
the recordings from the second week, when students had become familiar with the
tasks.
Storch (ibid.) also collected other data including a student attitude survey
about group work and pair work. She used "an editing task administered at the
beginning and end of the study to function as a pre- and post-test" (p. 125).
Storch also made observational notes, both while the students did the pair work
and right after the lessons. Of these notes and the interactions they documented,
she says,
Given the large number of pairs working simultaneously, observation
notes were fairly brief and were of the most salient behavioral features.
These features were then noted when transcribing the pair talk. The
transcription of the pair talk attempted to reflect the interactive nature

Chapter 12 Analyzing Classroom Interaction 365


of the talk and to represent the talk as it occurred. Thus special symbols
were used ... to indicate aspects such as overlapping talk and emphasis
applied by the speaker to certain words or phrases. (pp. 125-126)
The author describes two stages in her data analysis. First, she analyzed the
transcripts for "the pattern of dyadic interaction and the salient traits that
characterize these patterns" (p. 126). Secondly, the transcripts and the tasks were
analyzed "to trace the effect, if any, resolutions reached during pair interaction
had on the subsequent individual performance" (ibid.).
Storch (ibid.) found four patterns of interaction based on the roles taken by
the two members of the dyad: collaborative, dominant/dominant, dominant/
passive, and expert/novice. Her analysis was influenced by Damon and Phelps
(1989), who defined two variables: equality and mutuality. In their work, Storch
(2002) says, equality is "the degree of control or authority over the task. Equality
describes more than merely an equal distribution of turns or equal contributions
but an equal degree of control over the direction of a task (van Lier, 1996b)"
(p. 127). Interactions displayed a high degree of equality if "both participants
take directions from each other" (ibid., p. 127). The second variable, mutuality,
"refers to the levels of engagement with each other's contribution. High mutuality
describes interactions that are rich in reciprocal feedback and a sharing
of ideas (Damon & Phelps, 1989)" (ibid.). Storch's model, reprinted here as
Figure 12.1, depicts these ideas.
The following extract from Storch's data and the commentary that follows
illustrate one type of classroom analysis. It is discursive in style. In other words,
rather than assigning speaking turns to analytical categories in a coding process,
the researcher engages in a close, line-by-line description and interpretation of
the interaction. Based on the framework shown in Figure 12.1 and her
interpretation of this transcript, Storch says Extract 16 is an example of collaborative
interaction. Storch uses the pseudonyms Charley and Mai to represent a Thai

                          High Mutuality
                                |
        4                       |       1
        Expert/Novice           |       Collaborative
                                |
Low Equality -------------------+------------------- High Equality
                                |
        3                       |       2
        Dominant/Passive        |       Dominant/Dominant
                                |
                          Low Mutuality

FIGURE 12.1 A model of dyadic interaction (from Storch, 2002, p. 128)



male and a Vietnamese female, respectively. She notes that they "contribute
jointly to the composition and engage with each other's contribution" (p. 130). A
transcript from this pair is reprinted here as Extract 16:

Extract 16:
1 C: this (reads instructions) . . . what is this?
2 M: from the chart
3 C: this chart about

[
4 M: the data
5 C: with percentage and eh . . .
6 M: describe describe the percentage of
7 C: English language fluency
[
8 M: English language fluency between two countries yeah?
9 Vietnam and Laos
10 C: yes and the compare before they came here and now
11 M: yes . . .
12 C: you can separate it here
13 M: yeah . . . first we . . . mm the
14 C: perhaps you should write
15 M: yeah I write yeah from the information of the chart yeah
16 ... ((writing)) information of the chart
17 C: no from figure 3
[
18 M: ah figure . . . figure 3? From figure 3
figure 3 ah
19 C: show the information
20 M: show the information ... it it's
21 C: yeah it's ok it shows
22 M: it shows the . . . the data or the percentage?
23 C: should be the percentage (Storch, 2002, p. 131)

According to Storch (ibid.), this transcript


contains elements of cohesion as well as unpredictability. Cohesion is
created as the participants incorporate or repeat each other's utterances
and extend on them (e.g., lines 2-3, 5-6), or simply complete each



other's utterances (e.g., lines 6-8, 8-10). The participants engage with
each other's suggestions: there is negative or corrective feedback in the
form of explicit peer repair (e.g., line 17) or recasting (e.g., line 6) as well
as positive feedback in the form of confirmations (e.g., lines 10, 11).
There are many requests (e.g., lines 1, 8, 22) and provision of
information (e.g., line 23). (p. 131)

Storch notes that resolutions to problems are often attained "via a process of
pooling resources. For example, in line 20 Mai notes a problem with
subject-verb agreement and Charley suggests the appropriate correction. Thus, the talk
shows a pattern of interaction that is high on equality and mutuality" (ibid.).
Storch (ibid.) provides additional extracts that show dominant/dominant,
dominant/passive, and expert/novice patterns. The study concluded that the
collaborative interaction pattern was the most common in the pair work data. She
notes that the patterns of interaction were "fairly stable. Once they were
established early in the semester . . . they remained so regardless of the passage of
time" (pp. 144-145). Not all the students worked collaboratively, but one dyad
became more collaborative as time went by. She concluded that the learners did
indeed scaffold one another's learning, and that scaffolding was more likely in
the collaborative or expert/novice pairs.
We find this report to be intriguing for a number of reasons. First, the
research was conducted by a teacher in her own class. The data collection
procedures were based on activities normally used during lessons. The literature
review is current and wide-ranging, and the analysis of the interactions is clear
and convincing. Storch also reports that 33% of the data were analyzed by a
second researcher using the four descriptive categories, and that this process yielded
90% agreement.
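
To make the idea of inter-coder agreement concrete, here is a minimal sketch in Python of simple percent agreement between two coders. The function name and the ten codes are hypothetical and invented for illustration; they are not Storch's data, and we are only assuming that her 90% figure was a simple percentage of matching decisions.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items that two coders placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


# Hypothetical codes for ten stretches of pair talk (invented for illustration).
coder_1 = ["collaborative", "collaborative", "expert/novice", "dominant/passive",
           "collaborative", "dominant/dominant", "collaborative", "expert/novice",
           "collaborative", "dominant/passive"]
coder_2 = ["collaborative", "collaborative", "expert/novice", "dominant/dominant",
           "collaborative", "dominant/dominant", "collaborative", "expert/novice",
           "collaborative", "dominant/passive"]

agreement = percent_agreement(coder_1, coder_2)
print(agreement)  # 90.0 (the coders disagree on one item out of ten)
```

Simple percent agreement is easy to compute and report, but it does not correct for agreement that would occur by chance.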

PAYOFFS AND PITFALLS

The ways of analyzing interaction during lessons have evolved over the years—
partly because technological developments have improved our ability to record
and transcribe and partly because advances in discourse analysis have enhanced
the possible means of investigating classroom interaction. Transcribing and
analyzing classroom interaction are powerful tools in understanding how talk
during lessons promotes (or impedes) language learning.
Still, there are some pitfalls associated with these kinds of analyses. As noted
in earlier chapters, using electronic recording devices can be a bit disruptive. You
will probably need to record classroom interactions often enough that the
learners grow accustomed to having a digital recorder or a video camera operating
during lessons.
If you do make tape- or video-recordings, you must decide whether the
research question demands that you transcribe the data. In some instances, it may
be possible to code data directly from the recordings, but for many research



questions with a linguistic focus, you will need to transcribe all or part of the
data. For instance, if you are analyzing the way that teachers give learners
instructions about doing group work, you may only need to transcribe the
instruction-giving episodes. But if you wish to determine how clear (or unclear) those
instructions are, you will also need to transcribe evidence of the groups' success
in carrying them out.
A major concern in analyzing classroom discourse is the time needed to
transcribe and interpret the data. Analyzing the data for R. L. Allwright's (1980)
report on turns, topics, and tasks took up to twenty hours of transcribing for one
hour of classroom data (Allwright and Bailey, 1991). As noted above, one issue
that adds to the complexity of transcription is the level of analysis involved. If
you are doing a fine-grained conversational analysis or creating a detailed
phonetic transcript, the task will be much more time-consuming than if you are
creating an orthographic transcript.
Transcripts of classroom interaction have served as data on their own and
also in combination with other types of data. For instance, transcripts have been
used in stimulated recall studies to help teachers recall what was happening
during a lesson (see, e.g., Bailey, 1996; K. E. Johnson, 1992a; 1992b; Nunan, 1996).
The payoffs of analyzing interactional data are so numerous that classroom
researchers forge ahead in spite of the challenges. Such data have given us a great
deal of information about what various teaching activities accomplish, that is,
what sorts of talk result from the tasks that we pose to our learners.

CONCLUSION

This chapter focused on analyzing classroom interaction. We have discussed
coding of classroom data and talked about various transcription systems that
have been used to analyze teacher speech and learner language. We briefly
considered conversational analysis and some of the substantive issues that classroom
researchers have investigated. We close this chapter with some questions and
tasks you can use to assess and reinforce your learning, as well as suggestions for
further reading.

QUESTIONS AND TASKS


1. Think of a research question that interests you. What role might be played
in answering that question by any of the approaches to analyzing classroom
interaction that were described in this chapter?
2. Use the four basic discourse modes identified by McCarthy and Walsh
(2003)—managerial mode, materials mode, skills and systems mode, and
classroom context mode—to analyze the transcript in Extract 11 above.
3. Use the coding categories from Bowers (1980) in Figure 12.1 to analyze
the data in Extract 10 above.



4. Using the four categories of interaction type described by Storch (2002)
and depicted in Figure 12.1 above, analyze the data in Extract 11. That is,
what pattern of interaction do these learners use?
5. Transcribe a recording of a piece of classroom interaction. Five minutes'
worth of data should be sufficient. What problems do you encounter as
you transcribe? What benefits are gained by the transcription process?
6. Now try to transcribe that same piece of data at the fine-grained
microanalytic level demanded by conversational analysis. What additional
challenges do you encounter working at this level of detail? What additional
insights arise?
7. For any piece of classroom data involving error treatment, locate (1) the
trouble source or repairable; (2) the repair initiation; and (3) the outcome
(Liebscher & Dailey-O'Cain, 2003).
8. Using a transcript provided in this chapter or one you yourself have
developed, analyze the data with a coding system that interests you. If you
are working with a classmate or colleague, it would be worthwhile to
compute inter-coder agreement.
9. Next, try counting instances of a particular phenomenon, such as
errors made by learners and treatment moves by teachers or other learners.
What terms must you operationally define in order to carry out this
analysis?
10. Now using that same transcript, try some more interpretive analysis, such
as the one undertaken by Johnson (see the notes following Extract 10) or
Tsui (see Extract 14). Do you have sufficient data to have confidence in
your interpretation? If not, what further data would you need to collect?

SUGGESTIONS FOR FURTHER READING

Tsui (1995) provides an introduction to classroom interaction. Walsh's (2006)
book Investigating Classroom Discourse provides an overview of current approaches
to analyzing classroom discourse and presents tools and techniques for analyzing
classroom interaction.
Ellis and Barkhuizen (2005) give a comprehensive overview of methods of
analyzing learner language. If you would like to read about group dynamics in
the language classroom, see Dörnyei and Murphey (2003).
For a detailed discussion of transcription and coding in discourse analysis,
see the collection edited by Edwards and Lampert (1993).
Markee (2000; 2003; 2005) and Lazaraton (2004) provide helpful
information about conversational analysis, along with clear examples.



13

Quantitative Data Analysis

All science requires mathematics. The knowledge of mathematical things
is almost innate in us. This is the easiest of sciences, a fact which is
obvious in that no one's brain rejects it; for laymen and people who are
utterly illiterate know how to count and reckon. (Roger Bacon, as cited in
Andrews, 1993, p. 568)

INTRODUCTION AND OVERVIEW


You may dispute Bacon's claim that mathematics is "the easiest of sciences," but
there is no denying that people through the ages have counted and measured
things that mattered to them, from the stars in the sky to grains of rice. This
chapter focuses on analyzing quantitative data: those data that have been
generated through processes of counting or measuring, no matter what classroom
research tradition you may use.
If mathematics is not your strong suit, don't worry: Just take your time in
reading this material. If you can interpret decimal points and can add, subtract,
multiply, divide, square numbers, and find square roots (all using a calculator),
you can do the procedures described here. Keep in mind the analogy from
Chapter 4 that learning about research is like entering a new culture. This chapter will
teach you more of the vocabulary and values of quantitative data analysis.
In Chapter 4, we examined experimental research. We noted that it typically
employs quantified data. However, in recent years, some experimental
researchers have employed "mixed methods"; that is, qualitative data collection
and analysis methods have been used along with quantitative methods. In
addition, quantitative data have been used in classroom studies in both the action
research and naturalistic inquiry approaches to research.
Quantitative data can be analyzed and displayed in many different ways.
Some familiar ways of reporting numerical data include percentages and
proportions. Such data can also be provided in helpful graphic displays such
as bar graphs and pie charts. For example, Scales, Wennerstrom, Richard, and
Wu (2006) used a pie chart to show the percentages of English language
learners in their sample who correctly identified an American English accent.
Twenty-nine percent of the listeners correctly attributed a speaker's accent to
the United States, and the pie chart clearly shows that the remaining
listeners attributed the accent to eleven other countries or regions.
These same authors use bar graphs to good advantage to display other
comparisons in their data.
In language classroom research, authors often use statistics to express their
findings. What do we mean by statistics? The term has at least three meanings.
First, it is the label given to a group of procedures by which researchers make
decisions about accepting or rejecting hypotheses in the experimental approach.
Secondly, statistics can refer to the actual mathematical formulae by which those
procedures are carried out. Third, statistics refers to the results of those
mathematical procedures, that is, the numerical findings of a study. These ideas may sound
very esoteric, but our purposes in this chapter are twofold: We want to help you
understand both how to interpret some of the statistics that are commonly used
in language classroom research and also how to use some of them with your own
data.
To accomplish those goals, we will first discuss descriptive statistics and then
discuss the specialized meaning of significant (as in significant differences or
significant correlations) as the term relates to probability. We will also explore the
notion of degrees of freedom and examine one commonly used statistic (the
chi-squared test) in some detail before turning to other inferential statistics,
including some tests of significant differences and some correlation procedures.
Finally, we will briefly consider some technological advances for working with
quantitative data.
You will recall from Chapter 4 that when we think about representing a
group of people through quantitative data, we can use the measures of central
tendency (the mean, mode, and median) and the measures of dispersion about
the mean (range, standard deviation, and variance). These measures are called
descriptive statistics because they describe the group in terms of the variables that
have been measured or counted.
There is also an important group of statistical procedures called inferential
statistics—so named because they help us make inferences about the population
based on what is known about the sample. In general, inferential statistics are
used to reach conclusions about significant differences between or among
groups, or about significant relationships between variables. We will consider
both types in turn after we first explore descriptive statistics.



DESCRIPTIVE STATISTICS

In Chapter 4, we learned about frequency polygons. You will recall that these are
figures that depict the measurement of the trait being investigated on the
horizontal axis (also called the abscissa) and the frequency on the vertical axis
(sometimes referred to as the ordinate). (Look back at Figure 4.8 for a reminder.) In
language classroom research, the measurement on the horizontal axis often
involves a range of possible scores (for instance, on a language test). The frequency
on the vertical axis often represents the number of people who received each
particular score in the possible score distribution (see Figures 4.7, 4.8, and 4.9).
When we have a large number of scores or measurements represented in a
data set, the shape of the frequency polygon may resemble a bell. This image is
referred to as the normal distribution or the bell-shaped curve (or just the bell curve).
Keep this visual image in mind as you read about measures of central tendency
and measures of dispersion. Central tendency is the propensity of scores to group
around the middle of a data set. The effects of central tendency are visible in the
large central hump of the bell-shaped curve. Dispersion refers to the propensity of
scores to spread out away from the mean. This tendency is reflected in the
tapering tails of the bell curve. (See Figures 4.8 and 4.9 for a visual image of these
ideas.)

Measures of Central Tendency


There are three main measures of central tendency in the descriptive statistics:
the mean, the median, and the mode. As noted above, the mean is the
mathematical average. The median is the middle score in a data set. The mode is the most
frequently obtained score in a data set. These three statistics provide some
useful information about a group (or groups) of measurements.
To illustrate, let's revisit the post-test scores from the students in the control
and experimental groups in a study about innovative listening materials. These
data were presented in Table 4.1 above and are reprinted here as Table 13.1.
In Table 13.1, the means for the two groups have been provided for you.
These are the mathematical averages of the two sets of data. To find the mean,
you simply add the scores in the group and divide by the number of people who
contributed those scores. The actual formula looks like this:

$\bar{X} = \sum X / n$

The capital X with the line above it (X-bar) stands for the mean, the
mathematical average. The capital Greek letter sigma (Σ) means "sum up the
following" and the capital X represents "score(s)." The slash (/) indicates division.
So, this formula says that to get the mean, we sum the scores and divide by the
number (n).
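
If you prefer to check such calculations on a computer, the formula can be sketched in a few lines of Python. This is only an illustration under our own naming choices (the function and variable names are not part of the original study); the scores are the post-test scores for the two groups discussed in this chapter.

```python
# Post-test listening scores for the two groups (control and experimental).
control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]


def mean(scores):
    """Sum the scores (the sigma step) and divide by n, the number of scores."""
    return sum(scores) / len(scores)


control_mean = mean(control)
experimental_mean = mean(experimental)
print(control_mean, experimental_mean)  # 80.0 85.0
```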
To find the mode, you simply arrange the scores from highest to lowest and
look to see which score was obtained most often. So, for example, the control

Chapter 13 Quantitative DataAnalysis 373


TABLE 13.1 EFL students' scores on a listening comprehension
test (n = 20)

Control Group               Experimental Group

Student ID    Scores        Student ID    Scores

C-1           80            E-1           85
C-2           82            E-2           87
C-3           78            E-3           83
C-4           77            E-4           82
C-5           83            E-5           88
C-6           80            E-6           85
C-7           76            E-7           81
C-8           84            E-8           89
C-9           75            E-9           84
C-10          85            E-10          86

Mean          80            Mean          85

group obtained the following scores: 85, 84, 83, 82, 80, 80, 78, 77, 76, and 75.
We can see that the score of 80 points was obtained by two students in the
control group, and every other score was obtained just once. Therefore, in this small
data set, the mode is 80.

REFLECTION

What is the mode for the experimental group data in Table 13.1?

It is not unusual to find two modes in a data set. When that happens, it is
called a bimodal distribution. Sometimes there is no mode in a data set because no
particular score is obtained by more people than any other score. This situation
often occurs with small data sets.
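
The three situations just described (one mode, two modes, and no mode) can be sketched in Python as follows. The function name is our own, and the second and third data sets are hypothetical, invented only to show a bimodal distribution and a data set with no mode.

```python
from collections import Counter


def modes(scores):
    """Return every score tied for the highest frequency, or [] if no score repeats."""
    if not scores:
        return []
    counts = Counter(scores)
    top = max(counts.values())
    if top == 1:
        return []  # every score occurs once, so there is no mode
    return sorted(score for score, count in counts.items() if count == top)


control = [85, 84, 83, 82, 80, 80, 78, 77, 76, 75]
hypothetical_bimodal = [90, 88, 88, 85, 82, 82, 79]  # invented for illustration
hypothetical_no_mode = [90, 88, 85, 82, 79]          # invented for illustration

control_modes = modes(control)
bimodal_modes = modes(hypothetical_bimodal)
no_modes = modes(hypothetical_no_mode)
print(control_modes, bimodal_modes, no_modes)  # [80] [82, 88] []
```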
The median is "the score which is at the center of the distribution" (Hatch
and Lazaraton, 1991, p. 161). (Think of the median strip that divides a highway.)
Another way to understand the median is to say that "50% of the scores fall at
or below [the median] and 50% of the scores fall above that value" (Jaeger, 1993,
p. 37).
If you have an odd number of scores in your data set, the median will be the
score that is right in the middle. If you have an even number of scores in the data
set, the median is the "midpoint between the two middle scores" (ibid.).
Look at the control group data in Table 13.1 for a moment. There are ten
people in the control group. Therefore, the median score is found between the
fifth and sixth scores (the two middle scores): 85, 84, 83, 82, 80, 80, 78, 77, 76,



and 75. We add the two middle scores and divide by two: (80 + 80)/2 = 80.
In this case, the two middle scores happen to be identical.
If this had been a slightly bigger data set (say, 88, 86, 85, 84, 83, 82, 80, 80,
78, 77, 76, and 75), what would the median be? Find the two middle scores.
They are 82 and 80. Add them together and divide by 2 to find the midpoint. In
this case, 81 is the median.
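
The procedure for odd and even numbers of scores can be sketched in Python like this. The function and variable names are our own choices for illustration; the two data sets are the control group scores and the slightly bigger data set from the example above.

```python
def median(scores):
    """Middle score of an ordered data set (midpoint of the two middle scores if n is even)."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # odd n: the single middle score
    return (ordered[middle - 1] + ordered[middle]) / 2  # even n: the midpoint


control = [85, 84, 83, 82, 80, 80, 78, 77, 76, 75]
bigger_set = [88, 86, 85, 84, 83, 82, 80, 80, 78, 77, 76, 75]

control_median = median(control)
bigger_median = median(bigger_set)
print(control_median, bigger_median)  # 80.0 81.0
```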

REFLECTION

What is the median for the scores of the experimental group in Table 13.1?
Here are the ten scores arranged in order from the highest to the lowest
score.

89, 88, 87, 86, 85, 85, 84, 83, 82, and 81
What is the median for this slightly larger data set?
89, 88, 87, 86, 85, 85, 84, 83, 82, 81, 80, and 79

The median "is often used as the measure of central tendency when the
number of scores is small and/or when the data are obtained by ordinal
measurement" (Hatch and Farhady, 1982, p. 4). The median is sometimes used to divide
a group into two groups through a procedure called the median split. Imagine, for
instance, that you want to investigate the effects of a particular teaching method
on students who have a high or a low aptitude for language learning. You might
operationally define high and low aptitude by administering a language learning
aptitude test at the beginning of the experiment, finding the median score, and
using that point to divide the subjects into two equal (or very nearly equal)
groups. This would be an application of the median split.
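
As a rough sketch of how a median split might be carried out in Python, consider the following. The student IDs and aptitude scores are hypothetical, invented for illustration, and the decision to place any student who scores exactly at the median in the low group is our own assumption; a researcher would need to state and justify such a rule.

```python
from statistics import median

# Hypothetical aptitude scores keyed by student ID (invented for illustration).
aptitude = {"S1": 72, "S2": 55, "S3": 61, "S4": 48, "S5": 66, "S6": 59}

cut_point = median(aptitude.values())  # midpoint of the two middle scores

# Assumption: scores above the median are "high"; all others are "low".
high_group = sorted(sid for sid, score in aptitude.items() if score > cut_point)
low_group = sorted(sid for sid, score in aptitude.items() if score <= cut_point)

print(cut_point)   # 60.0
print(high_group)  # ['S1', 'S3', 'S5']
print(low_group)   # ['S2', 'S4', 'S6']
```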

REFLECTION

Can you think of any possible problems with the median split as a
technique for defining groups?
Imagine two cases where a researcher used the median split technique
to operationally define high and low aptitude students in an experiment.
On a language aptitude test with a possible score of 100 points, the mean
is 63 and the median is 60.
Scenario 1: The highest student in the low aptitude group scores 55. The
lowest student in the high aptitude group scores 65.
Scenario 2: The highest student in the low aptitude group scores 59. The
lowest student in the high aptitude group scores 61.



Measures of Dispersion
Measures of central tendency are very important, but they present only part of
the picture in descriptive statistics. The other necessary element is to know how
the measurements in a data set differ from one another. The amount of
variability in a data set gives us additional information that the measures of central
tendency do not provide. The three key measures of dispersion are the range, the
standard deviation, and variance.
The difference between the highest score and the lowest score in a data set
is called the range. It can be reported in two ways. If we look at the scores in
Table 13.1, we can see that the lowest score in the control group is 75 and the
highest is 85. We can say that the range is from 75 to 85, or that the range is 10.
Two related concepts are exclusive range and inclusive range of the scores in a
group. The exclusive range is what we get when we subtract the lowest
measurement from the highest measurement (in this case 85 - 75 = 10). The inclusive
range is found by subtracting the lowest score from the highest score and adding
one. In the control group data shown in Table 13.1, the inclusive range is
computed as follows: (85 - 75) + 1 = 11.
Here is a practical way to remember the difference between exclusive range
and inclusive range. Imagine you have borrowed a book from the library and you
wish to photocopy the chapter that appears on pages 22 to 32. How many pages
will you photocopy? If you said ten pages, you have used the exclusive range
(simple subtraction). But if you wish to have a copy of every page in the chapter,
you will need to copy eleven pages (pages 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
and 32). In other words, you will need to use the inclusive range.
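
The two kinds of range, and the photocopying example, can be sketched in Python as follows. The function names are our own, chosen for illustration.

```python
def exclusive_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


def inclusive_range(scores):
    """Highest score minus lowest score, plus one."""
    return max(scores) - min(scores) + 1


control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
control_exclusive = exclusive_range(control)
control_inclusive = inclusive_range(control)
print(control_exclusive, control_inclusive)  # 10 11

# The photocopying example: every page from 22 through 32.
pages = list(range(22, 33))
pages_to_copy = len(pages)
print(pages_to_copy)  # 11, which matches the inclusive range
```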

ACTION

Determine the inclusive range and the exclusive range for the
experimental group data in Table 13.1.

The range is useful, but it is strongly influenced by two single
measurements: the highest and the lowest score. For instance, if student C-9
had scored 70 instead of 75, the exclusive range for the control group would
be 15 instead of 10. For this reason, a more useful statistic is the standard
deviation.
Once you understand the concepts of the average and range, you are well
on your way to understanding the notion of standard deviation because it is a
measure of the average difference among the scores in a particular data set. Standard
deviation is the most commonly reported measure of dispersion because it is so
informative.
To make sure you understand this concept, we will compute the standard
deviation for the scores of the control group in Table 13.1 above. We will begin
by providing part of the formula for computing standard deviation and then



adding to it. The symbols below tell us to perform a certain set of operations in
a certain sequence. You already know about Σ, X, and X-bar. So what does this
part of the equation say? (Remember our language learning analogy: The issue
of mathematical sequencing of operations is like learning word-order rules in a
new language.)

$\dfrac{\sum (X - \bar{X})^2}{N - 1}$

Start your calculations from within the parentheses. (This starting point is
a mathematical convention.) The X-minus-X-bar component inside the
parentheses is telling us that we subtract the mean from each individual score. We do
this step to find the distance between each individual score and the mean of that
group of scores.
The superscript (2) to the right of the parentheses tells us to square each of
the differences we find by subtracting. The purpose of squaring is to get rid of
the minus signs that will appear when we subtract the mean from each
individual score. We do this step because minus signs are inconvenient to work with,
especially if you are doing calculations by hand. (Unless every score is identical,
there will always be minus signs in this step because, logically, some scores are
higher than the mean and some are lower than the mean.)
After we have done all the subtraction and the squaring, the capital sigma
(Σ) in the numerator of the standard deviation formula tells us to sum those
amounts. So the numerator in this equation says, "First, subtract the mean from
each score. Square those results and add them all up." The denominator tells us
to subtract one from the number of data points (the scores here) that contributed
to the mean.
So far, so good. But remember that we squared all the differences to get
rid of the minus signs. So now we must undo that step by finding the square
root of the results. When we include the square root sign we have the complete
formula:

$\sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}}$

To apply this formula to any data set (e.g., the data for the control group
in Table 13.1), it is convenient to create a table with column headings that
represent the steps in the equation. When we do so, we get the set-up shown in
Table 13.2.
The last step is to sum the values in the right column. When we do that
we get 108 as the value in the numerator of the equation. We then divide it by
N - 1, which is 9 in this case (10 - 1 = 9):

108/9 = 12



TABLE 13.2 Computing standard deviation for the control group

Student ID    Scores (X)    (X - X-bar)    (X - X-bar) squared

C-1           80             0              0
C-2           82             2              4
C-3           78            -2              4
C-4           77            -3              9
C-5           83             3              9
C-6           80             0              0
C-7           76            -4             16
C-8           84             4             16
C-9           75            -5             25
C-10          85             5             25

Finally we take the square root of that result:

$\sqrt{12} \approx 3.46$

This amount is our standard deviation. What does it mean? It is telling us
that the average distance from the mean of 80 in the control group's scores was
about 3.5 points. When you look at the raw data, you will see that this result
makes sense.
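
The same sequence of steps can be sketched in Python, one line per step. The variable names are our own; the final line simply compares our step-by-step result with the `stdev` function in Python's standard `statistics` module, which also divides by N - 1.

```python
import math
import statistics

control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]

mean = sum(control) / len(control)                   # X-bar: 80.0
deviations = [x - mean for x in control]             # (X - X-bar) for each score
squared_deviations = [d ** 2 for d in deviations]    # squaring removes the minus signs
sum_of_squares = sum(squared_deviations)             # the numerator: 108.0
quotient = sum_of_squares / (len(control) - 1)       # divide by N - 1: 12.0
standard_deviation = math.sqrt(quotient)             # undo the squaring

print(sum_of_squares, quotient, round(standard_deviation, 2))  # 108.0 12.0 3.46

# Cross-check against the library's sample standard deviation.
assert math.isclose(standard_deviation, statistics.stdev(control))
```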

ACTION

Compute the standard deviation for the experimental group data in


Table 13.1.

A concept related to the standard deviation is variance, which is defined as
the standard deviation squared. This idea is very important in language
assessment and also in a statistical procedure called Analysis of Variance, which we will
consider shortly. When we square 3.46 (the value of the standard deviation we
found for the control group data in Table 13.1), we find that the variance for the
control group is 11.97.

REFLECTION

If the square root of 12 is reported as 3.46, why is 3.46 squared reported


as 11.97? Share your ideas with a classmate or colleague.

378 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


ACTION

Based on your calculation of the standard deviation for the experimental
group in the action box above, compute the variance for the data from the
experimental group in Table 13.1.

Degrees of Freedom
Notice that the denominator of the standard deviation formula says N - 1. Why
is 1 subtracted from the N in this formula? The answer is based on a concept
called degrees of freedom. This is an abstract notion that is easier to illustrate than
to define. Think about a simple algebraic equation, like this:
3 + 4 + X = 12

You can determine that X in this formula represents the whole number 5. You do
this by adding 3 + 4, which gives you 7. You then subtract 7 from 12 and get 5.
There is nothing else that X can be in this formula. In other words, once we
know that the sum of the three values in the equation is 12, and that two of the
values are 3 and 4, then the other value must be 5. That one particular value is
not free to vary. It is predetermined by the other values in the formula. So, once
you know all but one of the values, the last one is accounted for. The mathematical
way of saying this is N - 1.
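
The same reasoning can be written as a tiny Python function. This is an illustrative sketch only; the function name `last_value` is our own invention.

```python
def last_value(total, known_values):
    """Return the one value that is not free to vary, given the total
    and all of the other values."""
    return total - sum(known_values)

# The equation 3 + 4 + X = 12 from the text: two values are known, so X is fixed.
print(last_value(12, [3, 4]))  # prints 5
```

However many values there are, once the total and all but one of them are known, the function can return only one answer.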
Here's another way to think about this concept. Imagine that you are teaching
a class with twenty students. There are twenty desks in the room. Seventeen
students are present and there are three empty chairs when the lesson begins. A
student enters the room and selects one of the three desks, so two remain empty.
Another student enters the room and chooses one of the remaining chairs.
When the twentieth student enters the room, how many chairs are available?
How many choices does he have?
If you said one chair, you are correct. But we hope you also said that he had
no choice! Since only one desk remained, he had no choice but to sit there
(unless he wished to sit on the floor, which takes him outside of the equation).
The point is that when all the quantities except one of the quantities are known
and the total is known, then the last quantity can have only one particular value,
and that value is not free to vary. Many statistical formulae include "N - 1" to
account for this fact.
In summary, we have seen that the three main measures of dispersion are the
range, the standard deviation, and the variance, while the three main measures of
central tendency are the mean, the mode, and the median. These descriptive
statistics are extremely important in language classroom research. They are
regularly used in experimental research, but they are frequently provided in
reports of action research and naturalistic inquiry as well. In addition, the
descriptive statistics underpin the inferential statistics and the concept of statistical
significance—the topic of our next section.


THE SIGNIFICANCE OF "SIGNIFICANCE"

In the discussion that follows, we will repeatedly use the term significant, and you
will see it used in many research reports that employ quantitative analyses. What
does this mean? Very briefly, a significant difference or a significant relationship
is one that is too substantial to have occurred by chance. By convention, in our
field the outcomes of statistical analyses are typically considered to be significant
if there are fewer than five chances out of a hundred that the results have
occurred by chance. (In some instances, the standard is set more stringently, for
instance, at one chance out of a hundred.) This standard is set at the beginning
of a study, and it is called alpha, the first letter of the Greek alphabet, which is
in fact the symbol that is used (α). You may see this symbol, which looks like a fat
little fish swimming to the left, in research reports; it is part of the "code" of
quantitative analyses.
When we do quantitative analyses in studies seeking either significant differences
or significant relationships, we check for statistical significance: the confidence
you can have that the finding is stable or trustworthy. In reading a research
report, you may come across a note that says "p < .05." Here the lower-case "p"
stands for probability. It represents the likelihood, or probability, that the findings
are erroneous or fluky or atypical. The little caret lying on its side (<) means "is
less than." (When it faces the opposite direction [>], it means "is greater than.")
So, if a value is given and the probability is represented as "p < .05," it means
that if chance alone were at work, we would expect to get a result this extreme
fewer than 5 times out of 100. In other words, if "p < .05," it is unlikely that
the result is a fluke.
Statistics books include tables called the critical values tables. These days, the
information in those tables is also contained in statistical software packages.
Over the ages, statisticians have determined what the critical values are for the
various statistical procedures. These tables help us decide whether the results of
our own quantitative analyses are likely to have occurred by chance. The outcomes
of the inferential statistics formulae that you use with your own data are
called the observed values. In general, if the observed value from your data equals
or exceeds the predetermined critical value printed in the table, then you can say
that your results are statistically significant.

INFERENTIAL STATISTICS: SIGNIFICANT DIFFERENCES

In experimental studies, the researcher frequently wants to determine the possible
effect of the treatment(s) (one or more levels of the independent variable).
The question is often whether the treatment(s) caused a difference in the performance
of the group(s) that got the treatment(s) in comparison with the control
group, which did not. This situation occurs in both of the true experimental
designs (the post-test only control group design and the pre-test post-test control
group design).

Researchers also look for significant differences in scores and measurements
in studies that employ the criterion groups design. You will recall from Chapter
4 that this is a design in the ex post facto class that compares groups on the basis
of preexisting conditions (such as gender, first language, country of origin,
handedness, and so on).
Inferential statistics are used to test for significant differences in some of the
weaker research designs as well, such as the intact groups design. In this design,
data are compared from two or more groups that were not randomly selected
from the population or randomly assigned to groups. Rather, groups that were
established for some other purpose are used in the research. Still, at least one group
gets the treatment and at least one group does not, so that we can compare the
groups' results. This situation is very common in classroom research because
learners may have been assigned to particular classes on the basis of placement
test scores, teacher recommendations, or parental requests rather than through
random sampling procedures.
Another situation in which researchers look for significant differences is the
one-group pre-test post-test design. In this context, a single group of learners is
tested or measured on some variable at the outset of the study. Then some sort
of treatment is administered to them. In classroom research, this treatment often
takes the form of an instructional unit, a particular teaching method, or a set of
materials. A statistical procedure is used that checks for significant differences
between the pre-test scores (the measurement before the treatment) and the
post-test scores of the students in the group.
Many different inferential statistics are used to determine significant differences
between or among groups. Here we will discuss just a few that are widely
used in language classroom research. In general, the choice of the statistic is
determined, first and foremost, by the type of hypothesis being tested or research
question being posed. Secondly, the choice of statistic to use is largely a function
of (1) the types of data (interval, ordinal, or nominal); (2) the number of groups
in the comparison; (3) the sample size; and (4) whether or not there are any
moderator variables included in the design.

Comparing Frequency Data: The One-Way Chi-Square Test


We will begin our exploration of these procedures by examining in some detail a
statistic called the chi-square test. (Here chi rhymes with eye rather than with key, and the
letters ch- make the /k/ sound.) We focus on the chi-square test for three reasons.
First, it is often used in language classroom research. Secondly, the way it functions
demonstrates statistical reasoning with the least technical jargon of the various
inferential statistics. And third, its procedures are intuitively plausible; the reasoning
underlying the chi-square test is less abstract than is that of some other procedures.
This statistic is represented by the Greek symbol chi (χ) with the superscript 2
to indicate the mathematical operation of squaring (χ²). The chi-square test is
often used in educational research when the dependent variable consists of
frequency counts rather than measurements.
The chi-square test is used when we want to learn whether an observed
pattern of occurrence is significantly different from what we would expect by
chance. (Another way of thinking about the issue is whether there is a relationship
among the variables being investigated, but we will save that wording for
our discussion of correlation studies below.) Here is an example. Suppose we
wonder if students in a British secondary school choose to study any particular
foreign language more often than others. Let's imagine that a certain school has
600 students and that they all must enroll in a foreign language course, but they
can choose whether to enroll in French, Spanish, or German. If there are no
systematic differences among the languages in terms of students' preferences, we
would expect something like the enrollment pattern shown in Table 13.3:

TABLE 13.3 Hypothetical enrollment data in three foreign languages


French Spanish German

200 Ss 200 Ss 200 Ss

However, this much symmetry rarely occurs in real life! We are more likely
to get different numbers in the various language groups, but at what point could
we say that there is a significant difference in the number of students who chose
French, Spanish, or German? The data set shown in Table 13.4 would probably
not suggest a remarkable difference in the students' choices:

TABLE 13.4 More hypothetical enrollment data in three foreign languages

French Spanish German

205 Ss 200 Ss 195 Ss

But what about the data shown in Table 13.5?

TABLE 13.5 Somewhat more diverse hypothetical enrollment data in three foreign languages

French Spanish German

250 Ss 200 Ss 150 Ss

And what about the data in Table 13.6? When are the differences in group
size big enough for us to say that they are statistically significant?

TABLE 13.6 Very diverse hypothetical enrollment data in three foreign languages

French Spanish German

300 Ss 200 Ss 100 Ss

At what point can we safely conclude that among these particular students,
French is significantly more popular than Spanish, which, in turn, is significantly
more popular than German? It is the chi-square test that lets us determine when
an observed pattern in nominal frequency data differs significantly from what we
would expect by chance. Here's how it works. In this case, we are comparing the
students in terms of just one variable (their choice of which language to study),
so we will use what is called a one-way chi-square test. It is a statistical procedure
that can be done by hand with a calculator or with a statistical software package.
Here is the formula, and while it may look intimidating, it is actually rather
straightforward once you understand the symbols:
$$\chi^2 = \sum (\text{Observed} - \text{Expected})^2 / \text{Expected}$$
As we saw earlier, the capital Greek letter sigma (Σ) simply means "sum up the
following," and the slash (/) is just a convenient way to indicate division. (A horizontal
line separating the numerator and the denominator in a division equation
is another way of indicating that division is needed, as shown in Table 13.7.)
Let's look at the last of the three data sets presented in Table 13.6 and set up
a table that will let us use this equation easily. Here are the steps:
1. First, create a column where you can label your variables (the three
languages).
2. Then add a column for the observed frequencies (the data that actually
occurred).
3. The third column is for the expected frequencies (what we would find by
straightforward division of the total N into three equal groups).
4. Create another column for the difference between these two values (found
by subtraction).
5. Add a fifth column where we can square the differences we find between
the observed and the expected frequencies.
6. In the sixth column the values in the fifth column are divided by the
expected frequencies.
Once again, the purpose of squaring is to get rid of the negative numbers that
have arisen in finding the difference between the observed and the expected
frequencies. The calculations are shown in Table 13.7:

TABLE 13.7 Chi-square computations table for enrollment data in three language courses

Language   Observed    Expected    Observed Minus
Chosen     Frequency   Frequency   Expected (O - E)   (O - E)²   (O - E)² / E

French     300 Ss      200 Ss       100               10,000     50
Spanish    200 Ss      200 Ss         0                    0      0
German     100 Ss      200 Ss      -100               10,000     50

After you have created these columns and entered your data, there are just a
few more steps to determine whether the observed pattern is statistically significantly
different from what we would expect by chance. First, we add the values
in the last column (50 + 0 + 50). This sum is the value of our chi-square observed,
so in this case, χ² observed = 100 (because 50 + 0 + 50 = 100).
Finally, we must compare the value of χ² observed to the value of χ² critical. If you
are doing the computations by hand, you can find that value in statistics books in
a table called the "Table of Critical Values for Chi-Square." (See, e.g., J. D.
Brown [1988] and Hatch and Lazaraton [1991].) As noted above, if you are
working with a statistics software package, the critical values are usually built
into the package and the computer will typically do the comparison for you.
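
For readers who would rather script the computation than use a calculator, here is a minimal Python sketch of the one-way chi-square calculation. It assumes, as in the worked example, that the expected frequency is the same for every group (the total divided by the number of groups); the function and variable names are our own.

```python
def one_way_chi_square(observed):
    """Chi-square observed for a one-way design, assuming equal
    expected frequencies across the groups."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Enrollment figures from Table 13.6.
enrollment = {"French": 300, "Spanish": 200, "German": 100}
chi_sq_observed = one_way_chi_square(list(enrollment.values()))
print(chi_sq_observed)  # prints 100.0
```

The generator expression inside `sum` mirrors the columns of the computation table: subtract, square, divide by the expected frequency, and add up the results.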
To use the table of critical values, you must know how to compute the
degrees of freedom. In working with a one-way chi-square test, the degrees of
freedom are defined as the number of groups being compared minus one.
Here we are comparing students' choices to enroll in each of three languages, so
3 - 1 = 2 (which is typically written as "df = 2").
When you locate the table of critical values for chi-square in a statistics book
(usually in the appendices near the end of the book), you will see that it is set up
something like Table 13.8:

TABLE 13.8 Partial table of critical values of chi-square
(adapted from Hatch and Lazaraton, 1991, p. 603)

      Probability
df    .10      .05      .025     .01      .001

1     2.706    3.841    5.024    6.635    10.828
2     4.605    5.991    7.378    9.210    13.816
3     6.251    7.815    9.348    11.345   16.266
4     7.779    9.488    11.143   13.277   18.467
5     9.236    11.070   12.832   15.086   20.515

(The actual chi-square critical values table is much longer. It usually goes all the
way to df = 100. Here we have just reproduced a small portion of the table.)
To determine whether our χ² observed value is statistically significant, we compare
it to the χ² critical value in the appropriate row (in this case, where df = 2) and
under the probability level selected at the beginning of the study (in this case,
.05). If the observed value is equal to or greater than the critical value at that
point, we can say that the results are statistically significant.

ACTION

Use Table 13.8 to determine whether the results of our chi-square analysis
are statistically significant. All the information you need is given above.

We can see that the value of χ² observed = 100 in our language enrollment
data is much greater than the critical value (5.991). As a result, we can conclude
that in this case the differences in the numbers of students who have enrolled
in the three different languages are too great to have happened by chance. In
other words, the differences in students' language preferences are statistically
significant.
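
The look-up and comparison just described can be sketched in Python as follows. The dictionary simply copies the values printed in Table 13.8; the names `CRITICAL_VALUES` and `is_significant` are hypothetical, chosen for this illustration.

```python
# Critical values copied from Table 13.8: {df: {alpha: critical value}}.
CRITICAL_VALUES = {
    1: {0.10: 2.706, 0.05: 3.841, 0.025: 5.024, 0.01: 6.635, 0.001: 10.828},
    2: {0.10: 4.605, 0.05: 5.991, 0.025: 7.378, 0.01: 9.210, 0.001: 13.816},
    3: {0.10: 6.251, 0.05: 7.815, 0.025: 9.348, 0.01: 11.345, 0.001: 16.266},
    4: {0.10: 7.779, 0.05: 9.488, 0.025: 11.143, 0.01: 13.277, 0.001: 18.467},
    5: {0.10: 9.236, 0.05: 11.070, 0.025: 12.832, 0.01: 15.086, 0.001: 20.515},
}

def is_significant(chi_sq_observed, df, alpha=0.05):
    """True if the observed value equals or exceeds the critical value."""
    return chi_sq_observed >= CRITICAL_VALUES[df][alpha]

groups = 3          # French, Spanish, German
df = groups - 1     # one-way design: number of groups minus one
print(df, CRITICAL_VALUES[df][0.05], is_significant(100, df))
# prints: 2 5.991 True
```

Only the portion of the table reproduced in this chapter is included, so the sketch works for df = 1 to 5 and the five probability levels shown.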

REFLECTION

What do you notice as you read down each column of numbers in the table
of values of chi-square critical (Table 13.8)?
For example, in the column headed by .10 (meaning the 10% probability
level), the critical values are 2.706, 4.605, 6.251, 7.779, and 9.236.
Does the same pattern hold in the remaining columns in the table?
What does this mean, in practical terms, about the results needed in your
χ² observed in order to get statistically significant results?

You will recall that probability is the likelihood that our results are due to
chance. By convention in our field, we usually set that level (called the alpha level
because it is determined at the beginning of an investigation) at .05—meaning
that we are only willing to be wrong five times out of 100.

REFLECTION

Look back at the row labeled Probability in Table 13.8.
What do you notice as you read across a row in the portion of the chi-square
critical values tables reprinted above? For example, here are the
critical values for χ² in the row for df = 1:
2.706 3.841 5.024 6.635 10.828

Skim the remaining rows. Does the same pattern hold true? What does
this mean, in practical terms, about the χ² observed you need to find in order
to get statistically significant results?

Comparing Frequency Data: The Two-Way Chi-Square Test


The chi-square test can be used whether or not there is a moderator variable in
the study. If there is a moderator variable, we use what is called a two-way
chi-square test because two variables are included in one analysis. For the sake of
illustration, let us use students' gender as a moderator variable. The question now
becomes whether there is a significant difference between the propensity of male
and female students to enroll in French, Spanish, or German. Table 13.9
presents the data from Table 13.6 with the frequency counts for male and female
students added:

TABLE 13.9 Hypothetical enrollment data of male and female students in three language courses

Gender/Language   French    Spanish   German    Totals

Male              100 Ss    100 Ss     75 Ss    275 Ss
Female            200 Ss    100 Ss     25 Ss    325 Ss
Totals            300 Ss    200 Ss    100 Ss    600 Ss

Once again, we can analyze these data to see if there are significant differences
in this enrollment pattern from what we would expect purely by chance,
that is, if there were no connection between students' gender and their tendency
to choose French, Spanish, or German as the foreign language they would study.
We can see from the data above that there are 325 female students and 275
male students. Their choices of what language to study constitute the observed
frequencies in this data set. The first step in calculating chi-square then is to
determine what the expected frequencies would be if there were no particular
difference between male and female students in their choice of what language to
study. The row and column totals in Table 13.9 help us to do this. Here's how.
The values in the row and the column labeled "Totals" are called the marginal
frequencies, or just the marginals, because they appear in the margins of the
table. (Note that the cell in the lower right-hand corner of this table should show
a number equal to the total number of students in the study. We get that value,
600 in this case, by adding up the numbers in the "Totals" column. We should
get the same number when we add the figures in the "Totals" row as well.)
We use the marginal frequencies to get the expected frequencies with the
following formula:

$$E_{ij} = \frac{n_i n_j}{N}$$

What does this mean? Well, the E just represents the expected frequency, the
thing we need to determine before we can calculate the chi-square value. The
uppercase N represents the number in the study, the one in the lower right corner
of the table (in this case, 600). The lower-case n just represents the smaller
numbers in the margins (the row and column totals), and the subscripts i and j
are mathematical symbols that refer to "whatever row and whatever column" in
the table, where i represents row data and j represents column data.
Let's look at an example. The first cell in Table 13.9 tells us that 100 males
registered for French. If we read down that column, we will see that the column
total is 300 Ss. If we read to the far right of that row, we see that the row total is
275 Ss. The numerator in the equation for calculating expected frequencies in a
two-way chi-square study tells us to multiply the value for the row total (n_i) times
the value for the column total (n_j). In this case, we would multiply 300 times 275,
which gives us 82,500.
The next step in calculating the expected frequency for this particular cell is
to divide 82,500 by the N, that large number that represents all the students in
the study and that appeared in the lower-right corner cell of Table 13.9 above. In
this case, that value was 600. When we divide 82,500 by 600 we get 137.5, a
value representing the number of male students we would expect to enroll in
French courses if there were no significant differences between the boys' and the
girls' choices of language to study.
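
The calculation for a single cell can be written as a short Python function. This is a sketch only; `expected_frequency` is a name we have made up, and the figures are the marginal totals from Table 13.9.

```python
def expected_frequency(row_total, column_total, n):
    """Expected frequency for one cell: (row total x column total) / N."""
    return row_total * column_total / n

# Males (row total 275) choosing French (column total 300), with N = 600.
print(expected_frequency(275, 300, 600))  # prints 137.5
```

Calling the function again with a different pair of marginal totals gives the expected frequency for any other cell in the table.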
We continue calculating the expected frequency for every cell in the table.
Some of those frequencies have been calculated for you in Table 13.10 below:

TABLE 13.10 Partially completed expected frequencies for enrollment of
male and female students in three language courses

Gender/Language   French    Spanish   German

Male              137.50    91.67
Female            162.50

ACTION

Using the data from Table 13.9 and the examples in Table 13.10 above,
calculate the expected frequencies for the number of male students
studying German and the number of female students studying Spanish and
German.

Given the observed and expected frequencies, we can now do the subtraction
to calculate the difference between the observed frequencies in the data
set and the expected frequencies (O - E), just as we did in the one-way chi-square
calculations above. Once again, this process will yield some negative
numbers, so for convenience, the statistic tells us to square those values so we
can get rid of the minus signs. We then divide the resulting values by the value
of E (the expected frequencies). Finally, we sum those values to get our chi-square
observed. Each of these steps is represented in the column headings of
Table 13.11.

ACTION

Complete Table 13.11 by calculating the observed minus the expected
frequencies for the empty cells, and then squaring the results, and then
dividing each one by the expected frequency for that particular row. The
last step is to add the values in the far-right column to get the value of
chi-square observed.

TABLE 13.11 Partial calculation of a two-way chi-square analysis

Language Chosen      Observed    Expected    Observed Minus
by Gender of Ss      Frequency   Frequency   Expected (O - E)   (O - E)²    (O - E)² / E

French by Males      100         137.5       -37.5              1,406.25    10.23
French by Females    200         162.5        37.5              1,406.25     8.65
Spanish by Males     100          91.67        8.33                69.39     0.76
Spanish by Females   100          75.0
German by Males       75          45.80
German by Females     25          37.5

χ² observed
In order to determine whether the value of this chi-square observed is statistically
significant, we go back to the table of critical values (Table 13.8 above). To
use the table, we need to know the degrees of freedom. In a two-way chi-square
analysis, the degrees of freedom is equal to the number of columns minus one
times the number of rows minus one. In our enrollment data, we were working
with three columns (enrollment in French or Spanish or German) and two rows
(male and female students). So we calculate degrees of freedom like this:

(3 - 1)(2 - 1) = 2 x 1 = 2.
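
As a quick check, the same rule can be expressed as a one-line Python function. The name `two_way_df` is hypothetical and used only for this illustration.

```python
def two_way_df(n_rows, n_columns):
    """Degrees of freedom for a two-way chi-square table:
    (columns minus one) times (rows minus one)."""
    return (n_columns - 1) * (n_rows - 1)

# Two rows (male, female) and three columns (French, Spanish, German).
print(two_way_df(n_rows=2, n_columns=3))  # prints 2
```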

ACTION

Compare the value of chi-square observed (which you got when you
totaled the last column in Table 13.11) with the values of chi-square critical
given in Table 13.8. Assume that alpha was set at .05. Are your findings
statistically significant? How do you interpret the result?

What does all this calculating have to do with classroom research issues that
concern teachers and students? Let's revisit the sample study that was summarized
at the end of Chapter 3. This was the investigation by Sato (1982), the
ESL teacher who wanted to investigate the perception that Asian students did
not participate as much in ESL classes as non-Asian students. Sato used the
one-way chi-square statistic to investigate this issue. Here is a summary of her
findings:

1. The 19 Asian students took 107 (36.4%) of the turns. The 12 non-Asian
students took 186 (63.5%) of the turns. Sato reported that the
chi-square observed for these data was 75.78, df = 1, p < .001. (p. 17)
2. The 19 Asian students self-selected for turns 52 times (33.99% of
the time) while the 12 non-Asian students self-selected 101 times
(66.01% of the time). The reported chi-square observed was 48.89,
df = 1, p < .001. (p. 18)
3. The two teachers (including the researcher, an Asian-American herself)
allocated 37 turns to the 19 Asian students (39.66% of the turns)
and 57 turns (60.44%) to the 12 non-Asian students. The reported
chi-square observed was 19.04, df = 1, p < .001. (p. 18)
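
One way to see where a figure like 75.78 could come from is to assume that the expected frequencies were weighted by group size, so that the 19 Asian students would be expected to take 19/31 of the turns. That weighting is our assumption, not something stated in the summary above, and the Python sketch below applies it to the first finding only.

```python
def chi_square(observed, expected):
    """Sum of (O - E) squared divided by E across the groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

group_sizes = [19, 12]     # Asian, non-Asian students
turns_taken = [107, 186]   # observed turns, from finding 1

total_students = sum(group_sizes)
total_turns = sum(turns_taken)
# Assumption: expected turns are proportional to the size of each group.
expected_turns = [total_turns * size / total_students for size in group_sizes]

print(round(chi_square(turns_taken, expected_turns), 2))  # prints 75.78
```

Because the two groups differ in size, dividing the turns equally between them would not be an appropriate "chance" baseline here, which is why the sketch weights the expected frequencies.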

REFLECTION

Given what you now know about the chi-square statistic, how do you
interpret Sato's findings? What do these results say to us as language
teachers?

Comparing Two Means: Using t-Tests


The chi-square test works with frequency counts of nominal data, but often the
dependent variable in a study is some form of measurement involving interval
data, such as test scores. When that is the case, we look for differences
between the means of two or more groups or for differences between pre-test and
post-test scores.
One widely used statistic for determining significant differences between
two means is the t-test (always written with a lowercase t). The t-test is only used
if the measurements consist of interval data (such as test scores), and it is
always used to compare only two sets of data. (You can remember this rule if you
think of the title of the old song, "Tea for Two.")
Another characteristic of the t-test is that it is designed to work well with
small data sets. It can even be applied when there are thirty or fewer subjects in
a sample. (It can also be used when there are more, but the noteworthy point is
that t-tests work well with small data sets, while some other statistical procedures
don't.) Since language classroom research often involves small data sets, the
t-test is frequently used in our field.
When there are two different groups contributing data, the independent
samples t-test is used. The word independent here is a technical term that means
that the scores or measurements from one group are not influenced by (i.e., are
independent of) the scores of the other group. Let's look at an example.

Suppose a secondary school principal is pleased to see that the ESL students
enrolled in a sheltered social studies course seemed to do just as well on a
standardized achievement test of social studies knowledge as did the native
English-speaking students who were enrolled in a regular (i.e., not sheltered) course. She
has the test results from fifteen ESL students in the sheltered class and twelve
students in the regular social studies class. She wants to determine whether the
test scores of these two groups are significantly different. (Her hypothesis is that
the test scores of the two groups will not be significantly different.)
The principal knows that there are just two groups and that the test scores
consist of interval data. She also knows that the scores of the two groups are
independent of one another. So, she enters the test scores of the two groups into
the spreadsheet program on her laptop computer and runs an independent
samples t-test. (She could also determine whether the differences are significant by
using a hand calculator and computing the mean and the standard deviation for
each group and then entering those data into the formula for the independent
samples t-test.)
Just for your information, here is the formula for the independent samples
t-test:

$$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{SD_{(\bar{X}_1 - \bar{X}_2)}}$$

In this formula, the subscripts 1 and 2 refer to the first and second group.
The subscripts e and c are also used sometimes, referring to the experimental
and control groups in an experiment.
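
To make the formula concrete, here is a minimal Python sketch of an independent samples t-test. The text does not spell out how the denominator is computed, so the sketch assumes the common pooled-variance version; the scores are hypothetical, invented purely for illustration, and the function name is our own.

```python
import math
from statistics import mean, variance

def independent_t(group_1, group_2):
    """Return (t observed, degrees of freedom) for two independent groups,
    assuming the pooled-variance form of the denominator."""
    n1, n2 = len(group_1), len(group_2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = ((n1 - 1) * variance(group_1) +
                       (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    # Standard deviation of the difference between the two means.
    sd_of_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t_observed = (mean(group_1) - mean(group_2)) / sd_of_difference
    return t_observed, n1 + n2 - 2

# Hypothetical test scores, not data from any study.
sheltered = [72, 75, 78, 80, 83, 77]
regular = [70, 76, 79, 81, 84]
t_observed, df = independent_t(sheltered, regular)
print(f"t = {t_observed:.2f}, df = {df}")
```

The numerator is simply the difference between the two means; everything else in the function serves to compute the denominator.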

ACTION

Based on what you know already, interpret the formula above. What steps
are done and in what sequence? Talk through the steps with a classmate or
colleague.

When there is just one group contributing two sets of data, as in the one-group
pre-test post-test design, the post-test scores cannot be said to be
independent of the pre-test scores since the same individuals are providing both sets
of data. In that case, we use a slightly different formula for the t-test, which is
called the dependent samples t-test. (You may also see the labels matched-pairs t-test
or correlated t-test, but we will not use those terms here as they are not so
common in our field.)
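
The text does not give the formula for the dependent samples t-test, but one standard way of computing it works with each student's difference score (post-test minus pre-test). The Python sketch below follows that approach; the scores are hypothetical and the function name is our own.

```python
import math
from statistics import mean, stdev

def dependent_t(pre_scores, post_scores):
    """Return (t observed, degrees of freedom) for paired scores,
    computed from each person's difference score."""
    differences = [post - pre for pre, post in zip(pre_scores, post_scores)]
    n = len(differences)
    # Mean difference divided by the standard error of the differences.
    t_observed = mean(differences) / (stdev(differences) / math.sqrt(n))
    return t_observed, n - 1

# Hypothetical pre-test and post-test scores for five students.
pre = [60, 65, 70, 58, 72]
post = [68, 70, 79, 66, 75]
t_observed, df = dependent_t(pre, post)
print(f"t = {t_observed:.2f}, df = {df}")
```

Note that the two lists must be in the same order, so that each pre-test score is paired with the post-test score of the same student.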
The dependent samples t-test is also used in another context. That is when
there are two groups, but the members of the two groups are intentionally
matched on some criterion. For example, adults in an experiment might be
matched in terms of their scores on a language learning aptitude test administered
at the beginning of the study. This step would allow us to check that the
experimental group and the control group were equivalent prior to the administration
of the treatment. This situation is not very common in language classroom
research since in our field we seldom have the power to set up this kind of
matched design as we form groups for a study.

REFLECTION

Which statistic would you choose in each of the following situations: the
independent samples t-test or the dependent samples t-test?
1. A Spanish teacher has two classes of beginning students. It seems that
the afternoon class members are less motivated and study less than those
in the morning class. The teacher wishes to compare the midterm examination
scores of the sixteen students in the afternoon class with the
twelve students in the morning class to see if the differences are
significant.
2. A community college composition teacher wants to determine whether
there is a significant difference in the knowledge of the native English-speaking
students and the non-native English-speaking students in her
classes. She wants to compare their scores on a 100-point test (of grammar,
usage, vocabulary, punctuation, and mechanics), which was taken
by all the students in her classes.
3. A teacher in Australia is investigating the use of authentic reading materials
with secondary school ESL students. She sets up a post-test only
control group design in which she is able to randomly select and
randomly assign students to groups. One group will read authentic
materials and the other will read prepared ESL textbook materials. The
teacher wonders if the writing system of the students' first language
may influence their outcomes on a standardized test of English reading
proficiency, so she makes sure that for every Chinese speaker in the
control group, there is a Chinese speaker in the experimental group,
and so on. The two groups are composed of matched pairs of students
whose native languages are Arabic, Spanish, Korean, Turkish, Tagalog,
Yoruba, Japanese, and Russian.

Here is an example from a study that two teachers conducted (Bailey and
Saunders, 1998). The data are from university students who were lower-intermediate
EFL learners in Hong Kong. Over the course of an academic year,
the two teachers taught six different sections of a speaking and listening course.
They wanted to know if their students had made substantial improvement in
their listening skills, so they used a dependent samples t-test to see if there were
significant differences in the students' scores on a video-based listening test
before and after the fifteen-week course. The pre-test and post-test data are
displayed in Table 13.12.

Chapter 13 Quantitative Data Analysis 391


TABLE 13.12 EFL Students' pre-test and post-test means and
standard deviations on a listening test (n = 129)

Pre-Test Post-Test

Mean 66.23 76.94

Standard Deviation 8.87 6.89

REFLECTION

What do the means in Table 13.12 tell you about the students' pre-test
scores and their post-test scores? What do the two standard deviations say?
Compare your interpretations with those of a classmate or colleague.

These teachers used a software package called SPSS, the Statistical Package
for the Social Sciences, to calculate the dependent samples t-test in order to
compare the pre-test and post-test means. The results showed that the
students' mean post-test scores were indeed statistically significantly
higher than their mean pre-test scores (p < .001).

REFLECTION

How do you interpret the finding that the difference between the pre-test
and post-test means was statistically significant (p < .001)?

There are a few things to keep in mind here. First of all, as these authors
noted in their report, this was a one-group pre-test post-test design (see
Chapter 4). There was no control group for comparison. So, the teachers
cannot claim for certain that the English course was what made the
difference in the students' listening test scores. Secondly, there was only
one form of the listening test available, so the results might have been
influenced by the practice effect. (Fortunately, the students didn't know
at the beginning of the course that they'd be retested with the same
instrument at the end of the semester.) Also, because the t-test compares
the means of two sets of scores, we cannot infer that every single student
made significant progress. We can only conclude that, overall, the
students' post-test scores were significantly higher than their pre-test
scores.
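
To make the logic of the dependent samples t-test more concrete, here is a minimal sketch in Python. It is not the SPSS procedure the teachers used, and the five pairs of scores are hypothetical (they are not the Bailey and Saunders data). The function name dependent_samples_t is our own. The sketch only computes the t value and the degrees of freedom; you would still compare the result with a table of critical values, or let a statistics package report p.

```python
import math
import statistics

def dependent_samples_t(pre_scores, post_scores):
    """Return (t, degrees_of_freedom) for paired pre-test/post-test scores."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("Each student needs both a pre-test and a post-test score.")
    # Work with each student's difference score (post-test minus pre-test).
    differences = [post - pre for pre, post in zip(pre_scores, post_scores)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd_difference = statistics.stdev(differences)
    standard_error = sd_difference / math.sqrt(n)
    return mean_difference / standard_error, n - 1

# Hypothetical scores for five students, for illustration only.
pre = [60, 65, 70, 72, 68]
post = [70, 72, 75, 80, 74]
t_value, df = dependent_samples_t(pre, post)
print(f"t = {t_value:.2f}, df = {df}")
```

Notice that the calculation works on the differences within each pair, which is why the two sets of scores must come from the same (or matched) students.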

Comparing Two or More Means: One-Way Analysis of Variance (ANOVA)

The independent and dependent samples t-tests allow us to compare two means.
However, there are many instances in which researchers wish to compare more
than two means. In such cases, the t-test is not appropriate. One statistic that is

392 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


appropriate for comparing the mean scores of two or more groups to test for
significant differences is called ANOVA—the acronym for analysis of variance.
The ANOVA procedure uses interval data.
In a nutshell, the ANOVA procedure asks if the differences between groups
are greater than the differences within those groups. Here's why. At the
beginning of experimental studies, any differences within groups can be
assumed to be the result of the sampling techniques by which people were
selected from the population and assigned to groups. Any differences
between groups at the end of the study are assumed to be the result of the
treatment(s) to which the groups are exposed (providing the researchers
have controlled for possible confounding variables). So, the statistical
family called ANOVA is one that calculates a ratio of the measured
differences between groups to the measured differences within groups. This
statistic is called the F ratio, after Ronald Fisher, the statistician who
developed the analysis of variance.
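
The idea of comparing between-group differences with within-group differences can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the function name one_way_f_ratio and the three small sets of scores are hypothetical, and the sketch stops at the F ratio and its degrees of freedom rather than reporting a p value.

```python
import statistics

def one_way_f_ratio(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: how far each score is from its own group mean.
    ss_within = sum(
        (score - statistics.mean(group)) ** 2 for group in groups for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores for three small classes, for illustration only.
class_a = [71, 72, 73]
class_b = [72, 73, 74]
class_c = [75, 76, 77]
f_value, df_b, df_w = one_way_f_ratio([class_a, class_b, class_c])
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

A large F value means the group means differ by more than we would expect from the spread of scores inside the groups.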
The ANOVA procedure can be used when there are two or more groups
whose characteristics or performances are measured on interval scales. Let's
take an example. Imagine that the school principal described above wanted to
compare the social studies achievement test scores of three classes rather
than two. The three classes she wished to compare are (1) the fifteen ESL
students in the sheltered class, (2) the twelve students in the regular
social studies class, and (3) ten additional students enrolled in the
Advanced Placement (AP) honors class. (Advanced Placement is a system
whereby secondary school students can get college credit by successfully
completing a particular advanced curriculum.) The principal wants to
determine whether the test scores of these three groups are significantly
different.
The principal knows she cannot use the t-test because she is now planning
to compare the mean scores for more than two groups, so she works with
ANOVA. In particular, since there is no moderator variable in this study,
she will use the one-way ANOVA, a name derived from the fact that the
different groups are being compared in terms of scores on one variable
(which social studies class they took). A visual image of this design is
presented in Figure 13.1:

Sheltered          Regular            Advanced
Social Studies     Social Studies     Placement
Course             Course             Course

n = 15             n = 12             n = 10

FIGURE 13.1 A research design comparing three social studies courses

In studies that use the one-way ANOVA to compare three or more groups,
the authors will sometimes report on what are called post hoc comparisons.
These are statistics that systematically calculate all the possible
comparisons of the group means in the design. For example, if the one-way
ANOVA used in this study detected one or more statistically significant
differences in the performance of these three groups on the test, we would
want to know where those statistically significant differences were
located. Was it the difference between the sheltered course students'
scores and those of the students in the regular social studies course? Or
between the students in the regular course and the AP course? Or was it the
difference between the students in the sheltered course and those in the AP
course that was significant? Different post hoc procedures can be used to
compare the means of the specific groups within the one-way ANOVA.
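
The phrase "all the possible comparisons" can be made concrete with a short Python sketch. Please note what it does and does not do: it only lists every pair of groups and the difference between their means. A real post hoc procedure also tests each difference for significance while adjusting for the number of comparisons, and that step is not shown here. The function name and the scores are hypothetical.

```python
from itertools import combinations
import statistics

def pairwise_mean_differences(groups):
    """Map each pair of group names to the difference between their means."""
    return {
        (first, second): statistics.mean(groups[first]) - statistics.mean(groups[second])
        for first, second in combinations(groups, 2)
    }

# Hypothetical scores for the three courses, for illustration only.
course_scores = {
    "sheltered": [71, 72, 73],
    "regular": [72, 73, 74],
    "AP": [75, 76, 77],
}
differences = pairwise_mean_differences(course_scores)
for (first, second), difference in differences.items():
    print(f"{first} vs. {second}: mean difference = {difference}")
```

With three groups there are three possible pairs, which matches the three questions asked in the paragraph above.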

Comparing Two or More Means in Factorial Studies: Two-Way ANOVA


Another important use of ANOVA is in factorial studies, that is, in
situations where there is an independent variable and one or more moderator
variables. To illustrate, imagine that the principal noticed when she
examined the social studies test scores that often the girls seemed to get
higher scores than the boys. This pattern seemed to hold true in the
sheltered course for non-native English-speaking students and the regular
course for native speakers, as well as the advanced placement course. So,
she decided to add the students' gender as a moderator variable. That is,
she wanted to analyze the possible effect of gender as well as the effect
of the independent variable (the sheltered course versus the regular course
versus the AP class). This design is depicted in Figure 13.2:

                   Sheltered          Regular            AP
                   Social Studies     Social Studies     Social Studies
                   Course             Course             Course
                   (n = 15)           (n = 12)           (n = 10)

Female Students    n = 6              n = 7              n = 6
Male Students      n = 9              n = 5              n = 4

FIGURE 13.2 A research design comparing three social studies courses
with gender as a moderator variable

The principal knows she cannot use a t-test because of the moderator
variable as well as the three levels of the independent variable. So, she
uses an adaptation of the ANOVA for factorial studies. This procedure is
called a two-way ANOVA because the different groups are being compared in
terms of two variables (the particular social studies course they took and
their gender). And in cases where one or more statistically significant
differences are detected by the two-way ANOVA, a post hoc comparison can be
used to pinpoint those differences.
Using the two-way ANOVA with factorial designs allows us to test for
significant differences in the levels of the independent variable (here,
the type of social studies course), as well as in the levels of the
moderator variable (here, the students' gender). It also lets us determine
if one group or another is favored by the various levels of the treatment.
In other words, we can ask whether male or female students did particularly
well (or badly) in the sheltered, the regular, or the AP social studies
course.
The possibility that the moderator variable may somehow interact with the
independent variable is called the interaction effect. Being able to detect
an interaction effect is one of the benefits of using a two-way ANOVA. The
comparisons between or among the levels of the independent variable are
called the main effects for A and the comparisons between or among the
levels of the moderator variable are called the main effects for B. These
possible comparisons are depicted in Figure 13.3:

[Figure: a three-by-two grid with the three social studies courses
(sheltered, regular, and AP) as columns and female and male students as
rows. Markings on the grid (not reproduced here) indicate the main effects
for A (significant differences between or among the levels of the
independent variable), the main effects for B (significant differences
between or among the levels of the moderator variable), and the interaction
effects (A x B).]

FIGURE 13.3 Possible comparisons in a three-by-two ANOVA study

When you read a research report that uses two-way ANOVA, there will
often be a table that shows the "main effects for A" and the "main effects
for B" and any possible interaction effects that may have been found. Being
able to test for possible interaction effects using the two-way ANOVA is
important because we may wish to know if some materials or teaching methods
or computer programs or types of curricula are advantageous (or
disadvantageous) for certain types of students.
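
One way to get a feel for an interaction effect before running any inferential statistic is to look at the cell means. The Python sketch below is descriptive only and is not a two-way ANOVA: it computes the mean for each course-by-gender cell and then the gap between female and male students within each course. If that gap changes noticeably from course to course, an interaction may be present, and the two-way ANOVA is the tool that tests whether it is larger than chance would produce. All scores and names here are hypothetical.

```python
import statistics

# Hypothetical scores keyed by (course, gender), for illustration only.
scores = {
    ("sheltered", "female"): [70, 74],
    ("sheltered", "male"): [60, 64],
    ("regular", "female"): [80, 84],
    ("regular", "male"): [78, 82],
    ("AP", "female"): [90, 94],
    ("AP", "male"): [90, 94],
}

def cell_means(data):
    """Mean score for every course-by-gender cell in the design."""
    return {cell: statistics.mean(values) for cell, values in data.items()}

def gender_gap_by_course(data):
    """Female mean minus male mean within each course."""
    means = cell_means(data)
    courses = []
    for course, _gender in data:
        if course not in courses:
            courses.append(course)
    return {c: means[(c, "female")] - means[(c, "male")] for c in courses}

gaps = gender_gap_by_course(scores)
for course, gap in gaps.items():
    print(f"{course}: female mean minus male mean = {gap}")
```

In these made-up data the gap shrinks from the sheltered course to the AP course, which is the kind of pattern an interaction effect describes.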

INFERENTIAL STATISTICS: SIGNIFICANT RELATIONSHIPS BETWEEN VARIABLES

Heretofore we have considered statistics that allow us to compare groups or
sets of scores. Another important set of statistical procedures is used to
determine whether two or more variables are related in a way that is too
strong to be due to chance. The technical term for such a relationship is a
correlation. As we saw in Chapter 4, correlation is the name of a research
design in the ex post facto class, but it is also the name of a family of
statistical procedures that are used to determine whether the relationship
between two variables is significant. There are three main correlation
statistics that are used in language classroom research and we will examine
each in turn.

Pearson's Correlation Coefficient


The most common and most important correlation statistic is called Pearson's
product-moment correlation coefficient, or Pearson's correlation coefficient,
or just Pearson's r. (We can think of r as representing "relationship.")
This procedure is used when the two variables under consideration are both
measured on interval scales.
Imagine you want to determine the relationship between your college ESL
students' knowledge of English grammar and their ability to understand
academic lectures. You administer a twenty-five-point grammar test and then
a listening comprehension task based on video-taped lectures filmed during
the first week in two of their required introductory courses: psychology
and biology. Each listening comprehension task has ten points possible, for
a total of twenty points. After you score the thirty students' performance,
you can mark their scores on a figure called a scatterplot (or scattergram)
in which the two axes represent the two measures. For each student, a dot
or an X is placed on the scatterplot at the point where that student's two
scores intersect, as shown in Figure 13.4.
As you can see from Figure 13.4, as the students' scores on the listening
task increase, so do their scores on the grammar test. This pattern is
called a positive correlation.
When the data points in the scatterplot cluster tightly along the diagonal
line, the correlation is said to be strong. As the points spread out further
and the dots are not so tightly clustered around the diagonal line, the
correlation is less strong. Also, sometimes there is a clear pattern but a
few data points that don't quite fit. For instance, look at the asterisk in
Figure 13.4 that represents the person who got eighteen points on the
grammar test but only four points on the listening task. This person is
apparently good at grammar but doesn't have strong listening skills
(according to these two tests). A person whose data fall outside the
observed correlation pattern like this is called an outlier.

[Figure: scatterplot. Horizontal axis: Grammar Test Scores, 0 to 25.
Vertical axis: listening task scores, 0 to 20. Each asterisk marks one
student's pair of scores; the plotted points are not reproduced here.]

FIGURE 13.4 A scatterplot of hypothetical grammar scores and
hypothetical listening task scores

There can also be negative correlations. In that case, as scores on one
variable increase, the scores on the other variable decrease. Suppose you
noticed that your language students who had broad vocabulary knowledge
always seemed to finish in-class readings faster than students whose
vocabulary knowledge wasn't quite as strong. You decide to investigate this
phenomenon, so you correlate your students' scores on a 100-point
vocabulary test with the speed with which they can read a passage in the
target language. The scatterplot for a group of thirty-five students might
look like the one in Figure 13.5.

[Figure: scatterplot. Horizontal axis: Reading Speed (in Seconds), 0 to
400. Vertical axis: vocabulary test scores, 0 to 100. The plotted points
are not reproduced here.]

FIGURE 13.5 A hypothetical scatterplot of vocabulary knowledge and
hypothetical reading speed

In Figures 13.4 and 13.5, the relationship between the two variables under
consideration is depicted by the points on the scatterplot. However, we
often need a more concise way of representing this relationship. When the
two variables are both measured on interval scales, we use the Pearson's
correlation statistic to calculate a numerical index that represents the
relationship. The outcome of that statistic is called the correlation
coefficient. It is obtained by using this formula (the raw score formula)
with our data:

$$
r = \frac{N(\sum XY) - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}
$$

This formula may seem like a monster, but if you look closely, you'll see
that you have the knowledge to figure it out. You already know that N
stands for number and Σ tells us to sum what follows. Here X represents the
score on the X-variable and Y stands for the score on the Y-variable.
Remember that parentheses tell you where to start. The square brackets work
like parentheses.
In order to solve this formula by hand, it is convenient to set up a table,
just as we did for the chi-square test, which shows us the steps. We will
use the following (hypothetical) data to illustrate the process.
X Variable: Vocabulary test scores out of 30 points possible
Y Variable: Reading speed in seconds
Table 13.13 shows the column headings and uses a small data set to illustrate:

TABLE 13.13 Steps in calculating the raw score formula for Pearson's r

ID Number   X Variable   X-squared   Y Variable   Y-squared   X times Y

1           25           625         30           900         750
2           20           400         40           1,600       800
3           12           144         50           2,500       600
4           10           100         60           3,600       600
5           8            64          70           4,900       560

Totals      75           1,333       250          13,500      3,310

When we complete the multiplication and addition we get the following
setup:

$$
r = \frac{5(3{,}310) - (75)(250)}{\sqrt{[5(1{,}333) - (75)^2][5(13{,}500) - (250)^2]}}
$$

And continuing with the calculations, we get the following values:

$$
r = \frac{16{,}550 - 18{,}750}{\sqrt{[6{,}665 - 5{,}625][67{,}500 - 62{,}500]}}
  = \frac{-2{,}200}{\sqrt{[1{,}040][5{,}000]}}
  = \frac{-2{,}200}{\sqrt{5{,}200{,}000}}
  = \frac{-2{,}200}{2{,}280.35}
  = -.9648
$$

So our correlation coefficient is approximately -.96, but what does it
mean? The minus sign indicates that we have a negative correlation, and the
magnitude (.96) indicates that it is a very strong correlation. When we
check a table of critical values for Pearson's correlation coefficient
(e.g., J. D. Brown, 1988), we see that when alpha is set at .05 and n = 5,
the r_critical is .8783, so we can say that this r_observed is
statistically significant. In other words, we can have confidence (p < .05)
that our findings are stable. As vocabulary scores increase, the time
needed to read the passage (reading speed measured in seconds) decreases.
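
If you would rather let a computer do the arithmetic, the raw score formula translates almost line for line into Python. The sketch below is our own illustration (the function name pearson_r_raw_score is hypothetical); it uses the five pairs of scores from Table 13.13 and builds the same sums that appear in the table's "Totals" row.

```python
import math

def pearson_r_raw_score(x_scores, y_scores):
    """Pearson's r computed with the raw score formula."""
    n = len(x_scores)
    sum_x = sum(x_scores)
    sum_y = sum(y_scores)
    sum_x_squared = sum(x * x for x in x_scores)
    sum_y_squared = sum(y * y for y in y_scores)
    sum_xy = sum(x * y for x, y in zip(x_scores, y_scores))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x_squared - sum_x ** 2) * (n * sum_y_squared - sum_y ** 2)
    )
    return numerator / denominator

# The hypothetical data from Table 13.13.
vocabulary = [25, 20, 12, 10, 8]        # X variable
reading_seconds = [30, 40, 50, 60, 70]  # Y variable
r = pearson_r_raw_score(vocabulary, reading_seconds)
print(round(r, 4))  # -0.9648
```

Because a correlation coefficient can never be larger than 1 or smaller than -1, a result outside that range is a signal to recheck the arithmetic.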

Spearman's Rank Order Correlation Coefficient


The second common form of correlation is called Spearman's rank order
correlation coefficient, or Spearman's rho, or just Spearman's r. It is
symbolized by the Greek symbol rho (ρ), or by the word rho, or sometimes by
an r with the subscript s for Spearman's (r_s). Spearman's rank order
correlation coefficient, as the name implies, is used when one or both of
the variables under investigation consists of ordinal data. If one of the
variables has been measured on an interval scale, those data can be
rank-ordered and thus converted to ordinal data before the statistic is
used.
Suppose that some of your EFL students were finalists in a regional speech
contest, and that two judges rank-ordered the finalists. You might want to
know the extent to which the two judges agreed with one another in their
ranking of the finalists. In this situation, you would use the formula for
Spearman's r. Likewise, if a teacher rank-ordered her students in terms of
her impressions of their speaking fluency and wanted to see if there was a
correlation between their speaking fluency and their TOEFL scores, she
could use Spearman's r to do this. She would first rank-order the students'
TOEFL scores to create ordinal data and then correlate their TOEFL ranks
with their fluency ranks.
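
The speech contest example can be sketched in Python as follows. We should be clear about our assumptions: the chapter does not present the Spearman formula, so the sketch uses the widely taught version, rho = 1 - 6Σd²/(n(n² - 1)), which applies only when there are no tied ranks. The function name and the two judges' rankings are hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho for two sets of ranks, assuming there are no ties."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("Both judges must rank the same number of finalists.")
    n = len(ranks_a)
    # d is the difference between the two ranks given to the same finalist.
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical rankings of five finalists by two judges.
judge_one = [1, 2, 3, 4, 5]
judge_two = [2, 1, 3, 5, 4]
rho = spearman_rho(judge_one, judge_two)
print(f"rho = {rho:.2f}")
```

If the judges had agreed on every finalist, every d would be zero and rho would be exactly 1.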

Point-Biserial Correlation Coefficient


The third main type of correlation statistic is called the point-biserial
correlation. It is used under some circumstances when one variable is
interval data and the other is dichotomous nominal data. This term refers
to categorical (nominal) variables that have only two levels, such as
right/wrong, yes/no, present/absent, and so on. The word point in the name
of the statistic refers to which of the two nominal categories is entered
in the data set, and the term serial refers to the series in the interval
data set. The symbol that is usually used to represent this statistic is r
with the subscript pbi (r_pbi). This statistic is very important in
language assessment and test development, but it is not used very often in
language classroom research.
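
For readers who want to see how interval data and dichotomous data are combined, here is a small Python sketch. The formula is not given in this chapter, so treat this as our assumption: we use a common textbook version in which the difference between the mean interval scores of the two categories is divided by the standard deviation of all the scores and multiplied by the square root of p times q (the proportions of cases in each category). The function name and data are hypothetical.

```python
import math
import statistics

def point_biserial(interval_scores, dichotomy):
    """Point-biserial r; dichotomy holds 1 or 0 for each case (e.g., yes = 1, no = 0)."""
    group_1 = [s for s, d in zip(interval_scores, dichotomy) if d == 1]
    group_0 = [s for s, d in zip(interval_scores, dichotomy) if d == 0]
    p = len(group_1) / len(interval_scores)  # proportion coded 1
    q = 1 - p                                # proportion coded 0
    # Standard deviation of all the interval scores (N in the denominator).
    sd_all = statistics.pstdev(interval_scores)
    mean_gap = statistics.mean(group_1) - statistics.mean(group_0)
    return mean_gap / sd_all * math.sqrt(p * q)

# Hypothetical data: four test scores and whether one item was answered correctly.
total_scores = [10, 20, 30, 40]
item_correct = [0, 0, 1, 1]
r_pbi = point_biserial(total_scores, item_correct)
print(f"r_pbi = {r_pbi:.2f}")
```

This version gives the same value you would get by coding the two categories as 0 and 1 and applying Pearson's formula.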

Interpreting Correlation Coefficients


No matter which correlation formula you use, the correlation coefficient
(the numerical index you get when you use a particular formula with your
data) is interpreted in the same way. There are three main concepts
represented in any correlation coefficient. These are the magnitude,
statistical significance, and directionality.
The magnitude of a correlation coefficient is its size, which represents the
strength of the association between the two variables. The r value is read
as a decimal approaching the absolute value of the whole number 1. (Hence,
a correlation coefficient of .99 is "stronger than" one of .75.) A
correlation coefficient of r = 1.00 would be a "perfect" correlation and
the dots in the scatterplot would line up perfectly on the diagonal line.
As was the case in the discussion of significant difference studies, we want
to know whether the findings of a correlation study are consistent and
trustworthy. We want to guard against the possibility that the outcome may
be in error. So, as we did before, we check to see if the correlation
coefficient we obtain is statistically significant. We do this either by
checking the value of r_observed against the value of r_critical in the
table of critical values, or, if we are using a software package to compute
the statistic for us, by checking the value of p (probability) that the
computer program produces.
Correlation coefficients also have direction. Directionality refers to the
fact that correlations can be either positive or negative, as we saw in
Figures 13.4 and 13.5, respectively. If a correlation coefficient is
negative, that fact is indicated by a minus sign in front of the r value,
as in "r = -.76, p < .05." If a correlation is positive, no minus sign is
used (e.g., "r = .76, p < .05").
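
The three concepts can be pulled apart in a short Python sketch. It is a simplification under stated assumptions: the function name is hypothetical, and the critical value has to be looked up by you in a table for your own alpha level and sample size (the value .8783 used below is the one cited in this chapter for alpha = .05 and n = 5).

```python
def interpret_correlation(r_observed, r_critical):
    """Return (direction, magnitude, significant) for a correlation coefficient."""
    if r_observed > 0:
        direction = "positive"
    elif r_observed < 0:
        direction = "negative"
    else:
        direction = "none"
    # Magnitude ignores the sign; significance compares it with the table value.
    magnitude = abs(r_observed)
    significant = magnitude >= r_critical
    return direction, magnitude, significant

direction, magnitude, significant = interpret_correlation(-0.96, 0.8783)
print(direction, magnitude, significant)
```

Notice that the sign is set aside before the comparison with the critical value: a strong negative correlation is just as "strong" as a positive one of the same size.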

REFLECTION

How do you interpret the following statements?

1. "... and we found that r = .92, p < .001 ..."
2. "... but in this case, r = -.84, p < .01 ..."
3. "... and it revealed that r = .24, p > .05 ..."

Due to space constraints, we will not present the formulae for the
Spearman's rank order correlation or the point-biserial correlation
statistics here. However, you can find them in many books in our field
(see, e.g., Bailey, 1998b; J. D. Brown, 1988; 2005; Brown and Rodgers,
2002; and Hatch and Lazaraton, 1991).

REFLECTION

What do you think it means if a researcher reports that for a particular
correlation coefficient "p < .01"?

USING TECHNOLOGICAL TOOLS TO ANALYZE QUANTITATIVE DATA

Computer technology has greatly eased the burden and increased the speed of
doing quantitative analyses. For example, the search function in
word-processing programs can help you do frequency counts for key words if
you are looking for patterns in field notes, or for particular linguistic
forms in transcripts. Spreadsheet programs, such as Excel in the Microsoft
Office program, will compute the descriptive statistics and many of the
most commonly used inferential statistics for you. There are also several
Web sites that offer you free access to tools for computing the most
frequently used statistical formulae.
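
As one illustration of a frequency count for key words, here is a brief Python sketch. It is an alternative to the word-processor search function mentioned above, not a description of it, and the function name, the key words, and the sample field notes are all hypothetical.

```python
import re
from collections import Counter

def keyword_counts(text, keywords):
    """Count how often each key word occurs in the text, ignoring capital letters."""
    # Split the text into lowercase words (letters and apostrophes only).
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {keyword: counts[keyword.lower()] for keyword in keywords}

# A hypothetical excerpt from classroom field notes.
field_notes = "Teacher asks a question. Student answers. Teacher repeats the question."
tallies = keyword_counts(field_notes, ["teacher", "question", "student"])
print(tallies)
```

A count like this only matches exact word forms, so "questions" would not be counted as "question" unless you add it to the list.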
Of course, learning to use these procedures takes time and a certain amount
of effort. Why should we bother to learn to use these technological tools?
First, computer analyses greatly increase the speed of our computations,
especially if you are working with a complicated formula, like ANOVA, or
with large data sets. Secondly, the use of computer programs to calculate a
formula greatly increases the accuracy of your computations. The likelihood
of making an error on a hand calculator, for instance, is greatly reduced
by the computer's systematic programming.
Likewise, computer programs make it easy to check a data set, either in a
printout or on screen, before you run the formula. Using a computer is more
convenient than using a calculator to add new data if cases are added to
the data set and easier to correct data if errors are detected. In
addition, it is convenient to do new analyses with the same data set as new
variables of interest emerge or as you receive feedback and suggestions
about how to improve your analyses. Various computer programs are also
helpful in creating professional-looking tallies, graphs, and figures to
represent your data.
Finally, there are the practical factors of storage and transportation.
Computer technology makes it relatively convenient to store records and
back-up sets of data. Once they have been stored, those data are easily
transportable, whether physically or electronically. In the form of e-mail
attachments, quantitative data can easily be "shipped" to a colleague in a
spreadsheet just as qualitative data can easily be sent as Word documents.

QUALITY CONTROL ISSUES IN QUANTITATIVE ANALYSES

There are several issues to keep in mind as we calculate and interpret
quantitative analyses. The concern that is first and foremost is whether
the right statistical procedure has been chosen. Table 13.14 below provides
a summary of the issues that influence a decision about what sort of
statistic to use when you are looking for significant differences between
or among groups. For instance, having posed a hypothesis or research
question about significant differences, the next concern is what type of
data is being compared. We have simplified this initial presentation to
involve just interval-scale measurements and frequency counts. (There are
other statistics that determine, for instance, significant differences
between groups where the dependent variable is based on ordinal data.)
These are just a few of the most basic statistical tests of significant
differences. A good statistics course will teach you much more about how to
analyze quantitative data to test various kinds of hypotheses and answer
research questions. However, the procedures discussed here are often used
in language classroom research and you are likely to come across them in
your reading of the research literature in our field.
The choice of the right statistic to use is just as important in correlation
studies. The characteristics of the three correlation statistics we have studied are
summarized in Table 13.15. Once again, the choice of formula depends on the
type of data involved.

TABLE 13.14 Commonly used statistical tests of significant differences

Type of Data       Number of   Moderator  Sample  Matched  Statistical Test
                   Groups      Variable   Size    Scores

Interval Measure   2           No         Small   No       Independent Samples t-Test
Interval Measure   2           No         Small   Yes      Dependent Samples t-Test
Interval Measure   2 or more   No         Large   No       One-Way ANOVA
Interval Measure   4 or more   Yes        Large   No       Two-Way ANOVA
Frequency Data     2 or more   No                 No       One-Way Chi-Square
Frequency Data     4 or more   Yes                No       Two-Way Chi-Square
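
The decisions summarized in Table 13.14 can also be written out as a small Python function. This is only a sketch of the table's logic under our own simplifying assumptions: the function name and argument names are hypothetical, and the "Sample Size" column is ignored, so with two groups and no moderator variable the sketch always returns a t-test.

```python
def choose_difference_test(data_type, number_of_groups, moderator_variable,
                           matched_scores=False):
    """Suggest a test of significant differences, following Table 13.14."""
    if data_type == "frequency":
        return "two-way chi-square" if moderator_variable else "one-way chi-square"
    if data_type != "interval":
        raise ValueError("Table 13.14 covers only interval and frequency data.")
    if moderator_variable:
        return "two-way ANOVA"
    if number_of_groups == 2:
        # Matched (paired) scores call for the dependent samples t-test.
        if matched_scores:
            return "dependent samples t-test"
        return "independent samples t-test"
    return "one-way ANOVA"

print(choose_difference_test("interval", 3, moderator_variable=False))
```

A function like this cannot replace your own judgment about the design of a study, but writing the choices out step by step can be a useful check on your reasoning.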

TABLE 13.15 Types of data involved in three major correlation
coefficients

Correlation Procedure        Symbol(s)         Types of Data Involved

Pearson's product-moment     r                 Two sets of interval data
correlation coefficient
Spearman's rank order        ρ or r_s or rho   Two sets of ordinal data (or one
correlation coefficient                        set of interval data and one set
                                               of ordinal data)
Point-biserial correlation   r_pbi             One set of interval data and one
coefficient                                    set of dichotomous nominal data

REFLECTION

For each of the following situations, decide which correlation statistic
is called for: Pearson's r, Spearman's r, or the point-biserial correlation
coefficient.

1. A group of secondary school French students are finalists in an essay
contest, and the winner will receive a scholarship. Two judges
independently rank-order the students' essays. Their combined rankings will
determine who wins the scholarship. The students' teachers wish to
determine whether there is a clear relationship between the two judges'
rankings.
2. A secondary school French teacher wishes to see if there is a significant
relationship between her intermediate students' scores on a 100-point
French vocabulary test she designed and on a standardized French
reading test.
3. A researcher is investigating American university students' tolerance
for the accentedness of non-native speaking content-area professors.
She has the students rate tape-recorded samples of several non-native
speaking professors reading the instructions for an assignment. The
students rate the speech samples on a scale of 1 to 30, using scale
descriptors. The students also must say yes or no in response to the
question, "Should this professor be hired to teach introductory courses in
his or her field at this university?" The researcher wants to determine the
relationship between the ratings on the thirty-point interval scale and
the yes/no votes for employment.

This chapter has described just a few of the inferential statistics that are
commonly used in language classroom research, and for even these few, we
have just presented introductory concepts. If you would like to do some
substantial quantitative analyses, we recommend that you take an
introductory course, but there are also several good books that you can
consult to help you make the right choices. (See the Suggestions for
Further Reading at the end of this chapter.)
A SAMPLE STUDY

In Israel, an EFL context, a researcher named Bejarano (1987) wanted to
see which of three different instructional methods would be most effective.
The three methods were called (1) discussion groups, (2) student teams, and
(3) the whole-class method. (In her paper, Bejarano operationally defines
these terms.) We will use some information from this study to reinforce the
concepts introduced in this chapter.
There were eighteen teachers and a total of 665 third-year secondary school
students involved in this study. The teachers were randomly assigned to use one
of the three teaching methods. This was a process-product study in the sense
that it included an observational component: "Three trained observers visited
every class twice during the experimental period" (ibid., p. 491).
The dependent variable in the study was a test that was given twice: once
as a pre-test and once as a post-test. Bejarano wanted to know whether the
difference in the means of the control group (the whole-class group) and
the two experimental groups (the discussion groups and the student teams)
was a big enough difference to conclude that one or both of the treatments
had been effective. Table 13.16 shows some results of the main descriptive
statistics in Bejarano's research project:

TABLE 13.16 Descriptive statistics for pre-test and post-test scores
(total test) of students in three teaching situations
(adapted from Bejarano, 1987, p. 492)

                     Discussion Groups     Student Teams         Whole Class
                     (n = 229)             (n = 198)             (n = 238)
                     Pre-Test  Post-Test   Pre-Test  Post-Test   Pre-Test  Post-Test

Mean                 50.52     57.28       53.94     61.39       57.58     62.45
Standard Deviation   19.59     20.87       19.94     21.35       17.49     18.31

As you can see from Table 13.16, all three groups improved between the
pre-test and the post-test. (Unfortunately, the researcher did not report
how many points there were on the total test, so we cannot tell what these
mean scores indicate in any absolute sense, only what they suggest in
comparison to one another.)
We can see from the descriptive statistics in Table 13.16 that the students in
the whole-class group started out the highest (their mean pre-test scores) and
ended up the highest (their mean post-test scores). But did they make the most
progress? Answering that question shows the benefit of a pre-test post-test
design like the one Bejarano used. Having pre- and post-test data allows us to
calculate the groups' gain scores—the difference between their performances on
the pre-test and the post-test.



REFLECTION

Use the data in Table 13.16 to answer the following questions:

1. Looking just at the pre-test means, which group started out with the
lowest scores? Which group started out with the highest scores?
2. Looking just at the post-test means, which group ended up with the
lowest scores? Which group ended up with the highest scores?
3. Compare the standard deviations for the three groups. Which group
showed the least variation in its pre-test scores? Which group showed
the least variation in its post-test scores?

Gain scores can be interpreted as a measure of students' progress between
the pre-test and the post-test. But keep in mind that sometimes students'
scores actually decrease over time. When that happens, we say that the gain
scores are negative. The gain scores for the three groups in Bejarano's
study are presented in Table 13.17.

REFLECTION

Use the data in Table 13.17 to answer the following questions:


1. Which group or groups improved the most? (Compare the gain scores
for the three groups.)
2. Are the differences between the pre- and post-test scores big enough to
say that all three groups made substantial progress?

Bejarano's report also provided some data about how well the students did
on the various subtests. One set of data that is particularly interesting
consists of the three groups' listening subtest scores. These are shown in
Table 13.18. (The researcher reports that there was a range of 0 to 57
points in these scores, but she does not tell us the range for each group
or the total points possible.)

TABLE 13.17 Gain scores of students in three teaching situations
(adapted from Bejarano, 1987, p. 492)

                   Discussion Groups     Student Teams         Whole Class
                       (n = 229)           (n = 198)            (n = 238)
                   Pre-Test Post-Test   Pre-Test Post-Test   Pre-Test Post-Test
Mean                50.52     57.28      53.94     61.39      57.58     62.45
Gain Score               6.76                 7.45                 4.87
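
The arithmetic behind a gain score is simple enough to check for yourself. The following is a minimal sketch in Python, using the means from Table 13.17; the function and variable names are our own and are not part of Bejarano's report.

```python
# Pre-test and post-test means copied from Table 13.17 (Bejarano, 1987, p. 492).
means = {
    "Discussion Groups": (50.52, 57.28),
    "Student Teams": (53.94, 61.39),
    "Whole Class": (57.58, 62.45),
}


def gain_score(pre_mean, post_mean):
    """Return post-test mean minus pre-test mean (negative if scores fell)."""
    return round(post_mean - pre_mean, 2)


# One gain score per group, keyed by group name.
gains = {group: gain_score(pre, post) for group, (pre, post) in means.items()}

for group, gain in gains.items():
    # The "+" format flag shows the sign, so a negative gain would stand out.
    print(f"{group}: {gain:+.2f}")
```

Note that the group with the highest post-test mean is not necessarily the group with the largest gain, which is exactly why the pre-test is needed.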



TABLE 13.18 Descriptive statistics for pre-test and post-test scores
(listening subtest) of students in three teaching situations
(adapted from Bejarano, 1987, p. 492)

                     Discussion Groups     Student Teams         Whole Class
                     Pre-Test Post-Test   Pre-Test Post-Test   Pre-Test Post-Test
Mean                  31.51     36.00      34.46     38.80      36.74     38.97
Standard Deviation    11.78     12.91      11.83     13.02      10.47     11.15
Gain Scores           ______________      ______________      ______________

REFLECTION

Use the data in Table 13.18 to answer the following questions:

1. Looking just at the pre-test means, which group started out with the
lowest scores? Which group started out with the highest scores?
2. Looking just at the post-test means, which group ended up with the
lowest scores? Which group ended up with the highest scores?

ACTION

Now calculate the gain scores on the listening subtest for these three
groups. Fill in the blanks in the row labeled "Gain Scores" in Table 13.18.
Which group(s) made the greatest improvement?

Bejarano (1987, p. 492) reported the following findings regarding the data
presented in Table 13.16 and Table 13.18, respectively:

Pupils in the discussion group had greater gains than those in the whole-
class situation on the total test: F (1, 465) = 4.23 (p < .05).
Pupils in the discussion group classes had greater gains than those in
the whole-class situation on the listening subtest: F (1, 465) = 11.99
(p < .001).

If you were reading Bejarano's report, how would you interpret these
statements? They are written in a kind of code or shorthand that is familiar to
people who have studied statistics but can be quite daunting to people who
have not. Let's crack the code.
First of all, what does F represent in these claims? As noted above, F stands
for the F ratio: the ratio we get when we divide the difference between
the groups by the difference within the groups. The F value reported here is
F observed, that is, the value of the statistic that Bejarano obtained when she
put her data into the computer program that calculates Analysis of Variance.
(Computing ANOVA is a very long and complicated process. People almost
always do it with a computer instead of by hand because there are so many
chances for error if you use a calculator.) According to the two statements
above, the F values Bejarano obtained were 4.23 for the difference on the
whole test and 11.99 for the difference on the listening subtest. But what do
these F values mean? In a nutshell, if the value of F greatly exceeds the whole
number 1, the differences between groups were greater than the differences
within the groups.
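
To make the idea of "between" and "within" concrete, here is a minimal sketch of the F ratio for a one-way ANOVA, written in Python. The scores are hypothetical and were invented for illustration (they are not Bejarano's data), and the function name is our own. The sketch assumes the standard definition of F as the between-group mean square divided by the within-group mean square.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k = len(groups)       # K: number of groups
    n = len(all_scores)   # N: total number of subjects

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of individual scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical gain scores for two small groups (invented for illustration).
group_a = [4, 5, 6, 7, 8]
group_b = [1, 2, 3, 4, 5]

f_observed = f_ratio([group_a, group_b])
print(round(f_observed, 2))
```

In this invented example the two group means are far apart relative to the spread inside each group, so F comes out well above 1. If the two groups had identical scores, the numerator (and therefore F) would be zero.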
Once again, to determine the statistical significance of the F ratio, we will
compare the observed value of the statistic with the critical values printed in a
particular statistical table (and embedded in the computer's statistical program).
But when you consult the table of critical values for Analysis of Variance, you
need to have two different values in mind for degrees of freedom in order to
locate F critical. The two statements above both include the parenthetic statement
(1, 465) and, as you may have guessed, these values represent degrees of
freedom. What do they mean? The 1 comes from the fact that two groups were
being compared (the discussion group context and the whole-class format). The
1 in the parentheses is the number of groups in the comparison, minus one. This
term is often represented by K - 1, where K = the number of groups.
What about the other value reported in these findings, that is, 465? It
represents the number of subjects minus the number of groups (N - K, where N =
the number of subjects). When we check Bejarano's report, we see that there
were 229 students in the discussion group format and 238 in the whole-class format.
When we add these two numbers we get 467, the value of N in this case. And
when we subtract K from N, we get 465.
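
The two degrees-of-freedom values can be reproduced with a few lines of Python. This is only a sketch of the K - 1 and N - K arithmetic described above; the function name is our own invention.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return (K - 1, N - K) for a one-way ANOVA."""
    k = len(group_sizes)   # K: number of groups being compared
    n = sum(group_sizes)   # N: total number of subjects
    return k - 1, n - k


# Group sizes reported by Bejarano (1987): discussion groups and whole class.
df_between, df_within = anova_degrees_of_freedom([229, 238])
print(df_between, df_within)
```

The same function works for any pair (or larger set) of groups, so it can be used to check the degrees of freedom reported in other ANOVA results.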

ACTION

Interpret this paraphrase of one result from Bejarano's report (1987):


Pupils in the student teams group had greater gains than those in the
whole-class situation on the total test: F (1, 434) = 6.27 (p < .05).
Look inside the parentheses. Where did the number "1" come from?
Where did the number "434" come from? (Hint: There were 198 students
in the student teams format and 238 students in the whole class format.)
(p. 492)



Once again, in this statement, the 1 in the phrase "F (1, 434)" tells us that
F observed here is based on a comparison of two groups: the student teams group
and the whole-class group (2 - 1 = 1). The number 434 tells us that scores from
436 students were involved in these two groups, because 2 (the number of
groups) added to 434 in the phrase above is 436. Knowing the degrees of
freedom for the number of groups and the number of subjects enables us to check
the critical values table or interpret a computer printout. Since F observed was
much greater than F critical, Bejarano was able to conclude that her findings
were statistically significant (p < .05).
We have only reported a few of Bejarano's findings here in order to illustrate
some of the concepts associated with descriptive statistics and with the
inferential statistic called ANOVA. We recommend you read the full article and we
hope that you feel empowered to interpret many statistical findings in this and
other research reports. But please keep in mind that we have only provided an
introduction to some of the statistical procedures that are commonly used in
language classroom research. There are many others to learn about.

PAYOFFS AND PITFALLS

There are several payoffs associated with analyzing quantitative data. The first is
that this approach to investigation is widely used around the world and is
understood by researchers trained in the psychometric tradition. Once you are
familiar with the various statistics and how to interpret them, you will find that you
can understand (well-written) quantitative research articles in professional books
and journals.
The second is that, for better or for worse, people often find statistical
results convincing. (This fact is unfortunate because statistics can be misused.
Newspaper reports are notorious for providing partial data and/or the results of
inappropriate statistical procedures. They can get away with such shoddy
practices because interpreting statistical evidence is so foreign to most people.)
Another benefit of quantitative analyses is that they are compact. Reporting
the mean and the standard deviation for a group of learners speaks volumes to
those who can interpret descriptive statistics, so a great deal of information can
be conveyed in a small space, an important economic fact for the publishers of
books and print-based journals. For example, look back at the sample study in
Chapter 9, where the authors (Lynch and Maclean, 2000) characterized two
students, Alicia and Daniela, very succinctly in terms of their scores on the
TOEFL, the IELTS, and a dictation.
In addition, regardless of the approach you choose (psychometric,
naturalistic, or action research), numerical results are informative. We will return to
this concept in Chapter 15, where we consider mixed methods studies, those that
involve both quantitative and qualitative data analyses.
As usual, the pitfalls are related to the payoffs. The first is that while people
trained in interpreting statistics can understand quantitative analyses, people
who lack this kind of experience and training may find statistical results
confusing, or even uninterpretable. No matter what statistically significant differences
or statistically significant correlations you may have found, it is important to
report your findings in clear prose, with ample explication, so that readers
unfamiliar with statistical procedures and the logic of hypothesis testing can still benefit
from your findings.
Also, while statistical results often seem impressive and convincing, they can
be easily misused. For example, what does it mean if a serious person in a white
lab coat in a TV advertisement says, "No toothpaste has been found to be more
effective than Shiny-White toothpaste"? This statement can be interpreted in at
least three ways:

1. Research has been conducted that found no other toothpaste was more
effective than Shiny-White, because Shiny-White performed the same as
the other brands.
2. In the research that was conducted, Shiny-White scored lower than the
others but the difference was not statistically significant.
3. No research has been conducted comparing Shiny-White to other brands
of toothpaste; hence, no other toothpaste has been shown to be more
effective.

It is this kind of multiplicity and ambiguity of interpretation that makes many
people mistrust statistics.

REFLECTION

Watch for examples in the popular press where statistics are used to support
a certain point of view. Do you find the examples credible? Why or
why not?

Another issue is that almost all of the inferential statistics are based on
assumptions about the data, and these assumptions may not be met in language
classroom research. For example, as we noted above, the independent samples t-test
can only be used to compare the means of two groups on an interval scale when
the two groups' scores are independent of (not influenced by) each other.
Another example is found with the chi-square test. In cases where df = 1 (as in
Sato's [1982] data), a variation of the chi-square statistic called Yates' correction for
continuity, or Yates' correction factor, must be used, though some researchers have
skipped this step. Likewise, in working with Pearson's correlation coefficient, the
measures on the two variables being correlated must meet the following
assumptions (Hatch and Lazaraton, 1991, pp. 549-550):

1. The two data sets being correlated are interval in nature.
2. The measures have equal reliability (i.e., they have been shown to be
consistent).
3. The two data sets being correlated are independent.
4. Using data from a restricted range of scores (e.g., using data from just
intermediate learners without including data from beginners or advanced
students) may mask a correlation (Jaeger, 1993, p. 69).

Given these assumptions, we must proceed with caution in using and
interpreting such analyses. We recommend that you read further and take a good basic
course in statistics.
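
The point about a restricted range can be seen in a small sketch. The Python code below computes Pearson's r twice: once for a full set of scores and once for the middle of the range only. The scores are hypothetical and were invented for illustration, and the variable names are our own. The calculation assumes the usual product-moment formula.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for 30 learners (invented for illustration):
# measure B tracks measure A, plus a fixed +/-3 "noise" that alternates.
measure_a = list(range(1, 31))
measure_b = [a + (3 if a % 2 == 0 else -3) for a in measure_a]

r_full = pearson_r(measure_a, measure_b)

# Keep only the "intermediate" learners (A scores from 11 to 20).
middle = [(a, b) for a, b in zip(measure_a, measure_b) if 11 <= a <= 20]
r_restricted = pearson_r([a for a, _ in middle], [b for _, b in middle])

print(f"full range r = {r_full:.2f}, restricted range r = {r_restricted:.2f}")
```

With these invented scores the relationship between the two measures is identical for every learner, yet the coefficient for the intermediate learners alone comes out lower than the coefficient for the whole group. Nothing about the learners has changed; only the range of scores has.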
Finally, let us return to the metaphor we used at the beginning of this
chapter. Don't let yourself get overwhelmed by all this mathematical terminology!
Learning to interpret and to compute statistics is very much like learning a new
language when you enter a new culture. There is a great deal of vocabulary to be
learned, and there are ordering rules to follow (just like syntax) as you calculate
the formulae. There are also cultural values associated with statistical logic. We
have only scratched the surface of this complex and fascinating new culture, but
we hope the ideas presented here will help you be a more confident consumer of
quantitative analyses and encourage you to try some quantitative procedures (as
appropriate) with your own data.

QUESTIONS AND TASKS


1. Read a research report on a topic that interests you that uses quantitative
data. Does it incorporate any of the procedures discussed in this chapter?
If so, check your ability to interpret the information provided about
critical values, observed values, the degrees of freedom, and the probability
levels.
2. In the study you read, check to see if the author(s) used the appropriate
statistic. Identify (1) the hypothesis being tested, (2) the type of data collected,
(3) the number of groups, (4) the presence or absence of a moderator
variable, and so on.
3. Look at the data in Table 13.5 about the enrollment patterns (French =
250 Ss, Spanish = 200 Ss, and German = 150 Ss). Using the layout in Table
13.7, calculate the value of chi-square observed.
4. Compare your results to the critical values given in Table 13.8. Are the
results statistically significant? How do you interpret your findings?
5. Draw a scatterplot of the data in Table 13.13.
6. Here is another statement from Bejarano's report. How would you
interpret it, particularly the mathematical parts?

Pupils in the student teams group had greater gains than those in
the whole-class situation on the listening subtest: F (1, 434) = 8.60
(p < .005).



SUGGESTIONS FOR FURTHER READING
There are many excellent statistics books and books that introduce novice
researchers to reading quantitative analyses. For instance, we recommend the
second edition of Jaeger's (1993) Statistics: A Spectator Sport, though his examples are
from general education rather than language teaching. J. D. Brown (1988) and
Perry (2005) also provide good introductions for people who want to learn about
interpreting statistics. Their examples come from the field of language teaching
and applied linguistics.
To learn to actually calculate the various statistics, we recommend Shavelson
(1981; 1996) and Hatch and Lazaraton (1991). For guidelines on using SPSS
with data from applied linguistics studies, see Dörnyei (2007). J. D. Brown
(2005) gives clear guidance on computing basic statistics using Excel with data
from language tests.



CHAPTER

14

Qualitative Data Analysis

I have a rule of thumb for judging the value of a piece of art. Does it give
me energy, or take energy away? When I staggered out of United 93, this
rule had lost traction. I realised I had spent most of the screening
crouching forward half out of my seat, with my hand clamped around
my jaw. Something in me had been violently shifted off centre. ... I'm
[left with] the same old haunting question: why do stories matter so
terribly to us that we will offer ourselves up to, and later be grateful for
an experience that we know is going to fill us with grief and despair?
(Garner, 2006, p. 62)

INTRODUCTION AND OVERVIEW

In this chapter, we look at techniques for qualitative data analysis. Whereas
quantitative data have to do with measuring, qualitative data have to do with
meanings. Qualitative data have an immediacy and ways of touching us that
quantitative data typically do not, as the reaction by Helen Garner to the
dramatized documentary United 93 so vividly attests. We begin the chapter by
defining qualitative data. We then look at specific techniques for finding patterns
through meaning condensation and grounded analysis. We then briefly compare
discourse analysis, conversational analysis, and interaction analysis before
considering the use of technology in qualitative data analysis. We examine some
quality control issues in doing qualitative analyses. As usual, the chapter ends
with the summary of a sample study and a discussion of the pitfalls and payoffs of
this approach to analyzing data.

What Is/Are Qualitative Data?
Both quantitative and qualitative data are important in language classroom
research, but when it comes to making sense of research, qualitative data come
first. In saying this, we mean that while qualitative data can be quantified, all
quantitative data must ultimately be referenced against the qualitative
sources that gave rise to them in the first place. For example, a researcher
investigating the 'good language learner' might collect test scores from a sample of
secondary school language learners. 'Good learners' might be operationally
defined as those students who scored better than two standard deviations above the
mean of the sample as a whole. In order for the study to have any value, the
researcher would then need to identify what it was that gave rise to the superior
scores in the first place. (Did these students devote more time to language study?
Did they attempt to activate their language out of class? Did they use a greater
range of learning strategies?) Answers to these questions can only be determined
through the analysis of qualitative data: learner diaries, focused interviews,
responses to open-ended items on questionnaires, and so on.
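
The operational definition in this example is easy to apply mechanically, which is part of its appeal. Here is a minimal sketch in Python. The learner labels and scores are hypothetical (invented for illustration), the function name is our own, and we have assumed the sample standard deviation; a researcher could equally well choose the population formula.

```python
from statistics import mean, stdev


def good_learners(scores, cutoff_sds=2.0):
    """Return learners scoring more than cutoff_sds SDs above the sample mean."""
    values = list(scores.values())
    # Assumption: sample standard deviation (n - 1 in the denominator).
    threshold = mean(values) + cutoff_sds * stdev(values)
    return sorted(name for name, score in scores.items() if score > threshold)


# Hypothetical test scores for ten learners (invented for illustration).
scores = {
    "S01": 50, "S02": 52, "S03": 48, "S04": 51, "S05": 49,
    "S06": 50, "S07": 53, "S08": 47, "S09": 50, "S10": 90,
}

print(good_learners(scores))
```

The code can tell us which learners cleared the threshold, but not why they did so. That second question is the one that sends the researcher back to diaries, interviews, and other qualitative sources.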
In the subheading to this section of the chapter, we use both the singular and
plural form of the verb to be separated by a slash. We do so to reflect a
useful distinction drawn by Holliday (2002), who suggests that quantitative research
consists of counting occurrences across large populations:

Data [therefore] are essentially plural: the number of Ford or Peugeot
cars sold, a number of questionnaire responses, or the number of times a
teacher asks a question in class. Qualitative data is conceived very
differently. It is what happens in a particular social setting, in a particular
place or amongst a particular group of people. I use the word 'data' as an
uncountable noun, e.g. 'data is' instead of 'data are'. The uncountable
singular form is in popular use but considered less correct by many
qualitative as well as quantitative researchers. I use the uncountable form
because to me it signifies a body of experience. This is a conceptual break
from quantitative research which sees data as a number of items. (p. 69)
Qualitative data in second language classroom research can take many
forms, including the following:

• Narrative accounts of a typical school day by a teacher or a student
• Observers' notes about lessons
• Maps showing the position(s) of the participants and furniture
• Transcripts of lessons
• Lesson plans and teachers' notes
• Open-ended questionnaire responses
• Video or audio recordings of classroom interaction
• Stimulated recall responses from students or teachers based on viewing
  video recordings of lessons
• Focused interview protocols
• Entries in teachers' or learners' diaries
• Copies of students' work

There is also a type of information called archival data: existing (i.e., archived)
records that may shed light on an investigation. In language classroom research,
these can include enrollment data, course syllabi and descriptions, school policy
statements, memos to parents, and so on.
The basis of all these data sources is language. As Freeman (1996b) notes in
his coda to a collection of research studies into teacher education:

Language provides the pivotal link in data collection between the
unseen mental worlds of the participants and the public world of the
research process. In the study of teacher knowledge and the cognitive
processes that are part of teaching, this issue is a central one since
language is always used to express, and to represent, thought. Thus, the
ways in which language data come about in a particular study are
intimately connected to the purposes of the study; there is a crucial link
between means and ends. (p. 367)
Table 14.1 provides Freeman's useful synthesis of issues in data collection and
analysis.
Although it is not exhaustive, Table 14.1 illustrates the richness and diversity
of qualitative data sources and forms of analysis. The data can be recorded in
written, tape-recorded, or videotaped form. They can provide insider (emic) or
outsider (etic) perspectives, and interpretation can be bottom-up (grounded) or
top-down (guided).
All qualitative data can be quantified in some way. In other words, things can
be counted in qualitative data. In fact, there is almost no limit to the things that
can be counted in qualitative data sets. Consider a lesson transcript. Here is a
nonexhaustive list of some of the things that can be quantified:
• The number of display questions versus the number of referential questions
• The number of times errors are corrected
• The amount of time spent focusing on form versus the amount of time
  spent focusing on meaning
• The number of positive evaluations by the teacher versus the number of
  negative evaluations
• The amount of time the target language is used during a lesson
• The amount of time spent 'on task' versus the amount of time spent 'off
  task'
• The length of 'wait time' a teacher allows between asking a question and
  requiring a response
• The amount of time spent on drill and reproductive language work versus
  the amount of time spent on communicative and creative language work

TABLE 14.1 Summary of research methodology in the study of teacher
learning (adapted from Freeman, 1996b, p. 368)

How are the data gathered?              How are the data analyzed?
Data source       Gathering stance      Analysis stance    Analysis process   Analysis categories

Observation/      Time: "real"          What is the        How are data       How is the
field notes       versus ex post        researcher's       linked to          interpretation of
                  facto collection      relationship       analysis?          the data arrived
Interviews                              to the study?                         at and by whom?

Documentary       Relation of           Participatory      Linear/            Emic
analysis          researcher to              |             Iterative            |
                  data:                      |                                Grounded
Stimulated                                   |                                  |
recall            Emic                  Collaborative                         Negotiated
(videotape)         |                        |                                  |
                  Self-generated             |                                Guided
Classroom           |                        |                                  |
discourse         Collaborative         Declaratory                           A priori
language data       |                                                           |
(audiotape)       Documentary                                                 Etic
                    |
Survey data/      Etic
questionnaire

• The percentage of time spent on listening, speaking, reading, and/or writing
• Whether (and if so when) the teacher responds to the code or to the
  message in a student's utterance
• Whether students' responses are extended or restricted to a single word,
  clause, or sentence
• The number of opportunities that students take to initiate the discourse
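
Once a transcript has been coded, counts like those in the list above take only a few lines to produce. The following Python sketch is a hypothetical illustration: the transcript, the code labels, and the one-word rule for "restricted" responses are all our own assumptions, not categories taken from a published scheme.

```python
from collections import Counter

# Hypothetical coded lesson transcript (invented for illustration).
# Each turn is (speaker, code, utterance). The code labels are assumed:
# "display" = the teacher already knows the answer,
# "referential" = the teacher does not know the answer.
transcript = [
    ("T", "display", "What is the past tense of 'go'?"),
    ("S1", "response", "Went."),
    ("T", "positive_evaluation", "Good."),
    ("T", "referential", "What did you do last weekend?"),
    ("S2", "response", "I went to the beach with my cousins."),
    ("T", "display", "How do we spell 'beach'?"),
    ("S3", "response", "B-E-A-C-H."),
    ("S2", "initiation", "Teacher, can I say 'go swimming'?"),
]

# Tally how often each code was assigned.
code_counts = Counter(code for _, code, _ in transcript)

# Assumed rule: a student response of more than one word counts as "extended".
student_responses = [text for _, code, text in transcript if code == "response"]
extended = sum(1 for text in student_responses if len(text.split()) > 1)

print(code_counts["display"], "display vs", code_counts["referential"], "referential")
print(extended, "of", len(student_responses), "responses extended")
print(code_counts["initiation"], "student initiation(s)")
```

The counting itself is trivial. The analytical work lies in the coding decisions that come first, such as deciding whether a given question is display or referential, and those decisions are qualitative.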

REFLECTION

What are some of the things that could be counted in the following data
sets? Why might a researcher want to count those particular things?
• A narrative account of a typical school day by a teacher or a student
• Observers' notes on a lesson
• Lesson plans and teachers' notes
• Open-ended questionnaire responses
• Stimulated recall responses from students or teachers based on viewing
  a video of a lesson
• Focused interview protocols
• Teachers' or learners' diaries
• Copies of students' work

WORKING WITH QUALITATIVE DATA


Analyzing qualitative data is an iterative process of reading, thinking, rereading,
posing questions, searching through the records, and trying to find patterns.
Some authors have described the process as sifting or combing or searching. It is
somewhat intuitive and involves a number of mental processes that are rather
difficult to explain. In this section, we will describe ways that we and other
researchers have worked with qualitative data from language classrooms.

Finding Patterns in the Data


When people write about analyzing qualitative data, they often say that they
look for patterns. But how does a researcher find patterns in a data set?
One way is to look for repeated themes or key words. In this procedure,
the data are scanned for words or phrases that appear to be significant. These are
highlighted. Related concepts are then grouped together, given a
superordinate heading, and then tabulated. Here is an example from an unpublished
language learning diary by one of the authors.

I tried creating little dialogs from the phrases I was trying to learn in
order to give them some context, but didn't get very far. I don't even
know how to say 'yes'. From what I know of other Asian languages, I
guess there won't be a single word as there is in English. (Nunan's diary
entry, December 5, 2002)

Trying the 'creative construction' technique, I come up with the
following question form for 'what is this?' 'matyeh hei ni?' I bet it's
wrong. I'll check with a native speaker before I try to learn it. (Nunan's
diary entry, November 11, 2002)

Superordinate heading: Independent learning strategy
  dialog creation
  creative construction
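
The scan, group, and tabulate steps just described can be sketched in Python. The excerpts are shortened versions of the two diary entries above. The coding scheme (which phrases to look for, and which heading they belong under) is our own assumption for illustration: in practice the analyst arrives at it by reading and rereading the data, not by writing it down in advance.

```python
from collections import defaultdict

# Shortened excerpts from the two diary entries quoted above, keyed by date.
entries = {
    "December 5, 2002": (
        "I tried creating little dialogs from the phrases I was trying to learn"
    ),
    "November 11, 2002": (
        "Trying the 'creative construction' technique, I come up with the "
        "following question form"
    ),
}

# Assumed coding scheme: key phrase -> (concept label, superordinate heading).
coding_scheme = {
    "creating little dialogs": ("dialog creation", "Independent learning strategy"),
    "creative construction": ("creative construction", "Independent learning strategy"),
}


def tabulate(entries, coding_scheme):
    """Group the concepts found in each entry under their superordinate heading."""
    table = defaultdict(list)
    for date, text in entries.items():
        for phrase, (concept, heading) in coding_scheme.items():
            if phrase in text.lower():
                table[heading].append((concept, date))
    return dict(table)


table = tabulate(entries, coding_scheme)
for heading, concepts in table.items():
    print(heading)
    for concept, date in concepts:
        print(f"  {concept} ({date})")
```

A simple phrase match like this one only retrieves what the analyst has already decided is significant. It will miss the same idea expressed in different words, so it supports careful reading rather than replacing it.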

Another way to find patterns is to look for parallel or connected comments.
For example, in her analysis of eleven diary studies, Bailey (1983b) noticed that
comments related to competitiveness often appeared with remarks about anxiety.
After combing through a great deal of qualitative data, she posited a relationship
between competitiveness and anxiety among adult second language learners.
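
If diary entries have been coded by theme, one rough way to check for connected comments is to count how often two codes occur in the same entry. The Python sketch below uses hypothetical coded entries invented for illustration (they are not Bailey's data), and the function name is our own.

```python
# Hypothetical coded diary entries (invented for illustration).
# Each entry is the set of theme codes an analyst assigned to it.
coded_entries = [
    {"competitiveness", "anxiety"},
    {"competitiveness", "anxiety", "teacher"},
    {"anxiety"},
    {"competitiveness"},
    {"teacher"},
    {"competitiveness", "anxiety"},
]


def co_occurrence(entries, code_a, code_b):
    """Count the entries containing code_a, code_b, and both codes together."""
    with_a = sum(1 for codes in entries if code_a in codes)
    with_b = sum(1 for codes in entries if code_b in codes)
    with_both = sum(1 for codes in entries if code_a in codes and code_b in codes)
    return with_a, with_b, with_both


with_comp, with_anx, with_both = co_occurrence(
    coded_entries, "competitiveness", "anxiety"
)
print(f"{with_both} of {with_comp} competitiveness entries also mention anxiety")
```

A count like this can suggest where to look more closely, but it does not show that one theme causes the other. The relationship still has to be argued from the content of the entries themselves.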
Looking for metaphoric uses of language can also help you find patterns in
qualitative data. In Bailey's (1983b) study of competitiveness and anxiety, she
found language associated with racing imagery. Phrases such as falling behind and
can't keep up the pace were associated with anxiety-inducing episodes in the diary
data.
Sometimes there are natural divisions or stages in longitudinal data. For
example, in a diary study about learning Spanish at a language school in Mexico,
Campbell (1996) provides journal excerpts identified by the particular week in
the two-month course. In Schmidt and Frota's (1986) diary study of Schmidt's
learning Portuguese, there were three periods that emerged in the data, which
were kept over a period of twenty-two weeks. When Schmidt first went to Brazil,
he was not taking a Portuguese class and he had no interaction in Portuguese in
the environment. Then there was a period of time when he was taking a
Portuguese course as well as interacting with Portuguese speakers outside of
class. Later the class ended and he continued learning only through interaction
with native speakers.
Qualitative researchers also look for turning points and highly salient events
that seem important but that may not be part of a pattern. For instance, Bailey
(1981) describes a heated discussion after a French test that the students were
very angry about and how that argument seemed to clear the mounting tension
in the class. Similarly, in his study of anxiety among sixteen- and seventeen-year-old
ESL students in a residential school in Singapore, Hilleson (1996) used a
range of qualitative data to track the students' experiences and feelings. He notes
an important turning point:

Students at all proficiency levels reported an almost magical
transformation when they commenced the second term (about four months
after beginning the course). Many had been away on holiday, some in a
non-English-speaking environment, but they all remarked on the
difference in their attitude after the break. They felt as if a barrier had been
lifted and suddenly their confidence had returned. . . . Students were
more relaxed after the break because the demands they were making on
themselves were more realistic. (pp. 273-274)

It is also important to look for contrasts, inconsistencies and/or unanswered
questions. For example, Bailey (1984) conducted a study of the classroom
communication problems of international teaching assistants (TAs) who were
teaching math and science courses at a U.S. university in English, their second
language. She made audio recordings and handwritten field notes while she
observed twelve native speaking and twelve non-native speaking TAs over the
course of a ten-week semester. Through a series of qualitative analytical
procedures, Bailey was able to identify five types of teaching styles exhibited by
twenty-one of these TAs: (1) inspiring cheerleaders, (2) entertaining allies,
(3) knowledgeable helpers and casual friends, (4) mechanical problem solvers,
and (5) active but unintelligible TAs. But there were three TAs who did not quite
fit among the types. For example, one of the native-speaking math TAs was a
casual friend but not a knowledgeable helper to his students since he himself
was unable to do their math homework problems. Such anomalies help to clarify
strengths and problems in categorization systems.
Another way to organize your data analysis procedures is to contrast the
viewpoints of various constituencies in the culture you are studying. For
instance, Block (1996) contrasted students' interpretations of EFL lessons in Spain
with the teacher's intended purposes. Harrison (1996) provides extensive quotes
from both regional inspectors and EFL teachers in the Sultanate of Oman in
analyzing the outcomes of a curriculum renewal project.

Meaning Condensation
Working with qualitative data is a different matter from using the kinds of
quantitative analytical procedures described in Chapter 13. When we are analyzing
numerical data, we can use mathematical procedures to calculate the mean, the
standard deviation, and the other descriptive statistics that concisely characterize
the sample in a study. The parallel activity in analyzing qualitative data is a major
challenge: reducing large amounts of text (whether spoken, written, or graphic)
to manageable proportions that allow for patterns in the data to emerge.
One way to accomplish this data reduction is through a technique known as
meaning condensation, which involves abridging free-form questionnaire
responses, interview transcripts, observers' field notes, and so on into shorter
formulations. Long statements are compressed into briefer statements in which the
main sense of what is said is rephrased in a few words. Meaning condensation
thus involves a reduction of large quantities of text into briefer, more succinct
formulations. This process results in condensed statements that are then
subjected to further analysis.
Here is an example of a condensed narrative. It was constructed from a
life-history interview with a second language learner called 'Gloria' (a pseudonym).
In this study, Gloria was identified as a "good language learner" because she had
received a grade of A on the Use of English exam at the end of high school. Her
retrospective story was part of an investigation into the lifelong language
learning experiences of fifty language learners (Nunan, 2007b). The interview ran to
thousands of words. The condensed narrative is just over a thousand words.

A Condensed Narrative: The Case of Gloria


I first encountered English in kindergarten. I really don't remember if I ever
heard it before then. I remember that the first thing we learned was the
alphabet—A, B, C, A is for apple, that kind of thing. It wasnothing special, just
one more subject. But I didn't think it was a very important subject.
I don't remember whether my primary school was supposed to be Chinese- or
English-medium. I don't think it was ever said. All subjects were taught in
Chinese, evenEnglish. The mainfocus of the lessons was vocabulary and simple

418 EXPLORING SECOND LANGUAGE CLASSROOM RESEARCH


conversations. Hello, I'm Gloria. Who are you?—that sort of thing. I remember
that it was pretty boring. We had a book, and had to follow alongas the teacher
read. Now and then, he'd ask us to spell words. Most of the time in primary
school was spent copying stuff out. It didn't matter what the subject was. In
English class, the teacher would give you a sentence and tell you to write it out
several times.

I had no contact with English at all out of class, unless you consider doing
English homework as contact. Extra-curricular activities after school were
mainly sports. There was nothing in English.
When I got to years 5 and 6,1 still didn't think that English was very important.
We prepared for the Academic AptitudeTest, but the emphasis was on Chinese
and mathematics. We didn't have any special preparation for English or extra
homework, so I didn't think that it was important. I remember that the focus in
class was on grammar—memorizing tenses and that sort of thing.
After primary school, I went to an English-medium secondary school. In the
beginning, what that meant was that for many subjects the textbook was in
English. In class, the teachers spoke Chinese because their job was to makesure
we understood, and the best way to do that was through Chinese.
Although we had a School English Society, my friends and I never thought of
joining it on our own initiative. We thought more about what sports we would
play when we joined the Sports Club. English wasn't an activity that you could
use or have fun with, it was a subject that you had to study and learn.
When I started in high school, I had more contact with English because it was an
English-medium school and the teacher more-or-less had to speak English.
Then my view of English began to change. I began to see that in addition to
being a subject to be studied, it could also be used as a tool to study other
subjects. For example, I studied history, and classes were conducted in English, so
English became more important. In most classes the teachers used a mixture of
Cantonese and English—probably fifty-fifty. There was a lot of switching
between languages. Some people say this is bad, but the main thing is that the
teachers use language that we can understand. What's the point of teaching a
perfect lesson in English if we can't understand? So Chinese played an important
part, even in English class.
In senior high school, the most important influences were the public
examinations and preparing for them. English was now more important than other
subjects because I needed it to learn the other subjects. Also, the English exams were
different. In the past, you only had to know grammar and vocabulary, but now
you needed a much deeper understanding because you were tested on listening
and speaking. The public exams completely dominated my life because my
future depended on getting good results, and getting good results required good
English. Everything we did was based on the exams. What it tested, we learned!
But I also started to see the importance of English out of class. I realized that I
needed the language if I wanted to communicate with other people. When I was
young, it never occurred to me that I would talk with a foreigner in English. The
teacher also stressed the importance of using English out of class. She
encouraged us to watch English television and subscribe to English language
newspapers. But I hardly ever did these things, I was too lazy. I couldn't see how
they would help me pass the public examination. English was important because
of the exams. Sometimes I would read a newspaper if it was required for an
assignment, but that's all.
Then in form seven, I had an experience that changed my attitude. I took a
summer job at Philips and because it was on Hong Kong Island, I came into contact
with a lot of foreigners. I was the only one in the store who could speak much
English at all, and it made me feel superior. But speaking with foreigners made
me realize my deficiencies. I sometimes had to get them to repeat three or four
times before I could understand. And I noticed that the English that foreigners
spoke was different from the English that Chinese people spoke. This experience
made me realize that I really did need to learn English more wholeheartedly, that
I would have a need to communicate with other people one day, and that English
is really very important.
Now that I'm at university, I think of English in a very different way from when
I was in school. I don't have the pressure of an English exam hanging over me,
and I use English, not because I have to take an exam, but because it's the
medium of communication. Many of my lecturers are foreigners, so if I talk to
them I have to use English. You have to write, speak and think in English. It's
part of daily life. Also, if you're good at English you feel superior, and other
people look at you as though you're superior. One of the differences between
English and other subjects, such as geography, is that I don't look at people who
are good at geography as that smart, necessarily, but I think of someone who is
fluent in English as very smart.

REFLECTION

What patterns/insights occur to you as you read the condensed narrative
account of Gloria's English learning history?

Insights that came to Nunan (2007b) as he created and reflected on the
condensed narratives included the following: 'Good' language learners such as
Gloria (1) reflect on their experiences as language users and use these reflections
as the basis for further learning; (2) see language as a tool for communicating
rather than as a body of content to be memorized; and (3) integrate
inside-the-classroom learning opportunities with outside-the-classroom opportunities to
activate their language. Nunan also noted that (4) language learning and
attitudes towards language learning are unstable and change over time. In addition,
(5) collecting data from informants about events that occurred over a prolonged
period revealed that as learners accumulated experiences and developed their
proficiency, their beliefs and attitudes changed. Finally, from looking at many
such life histories, (6) it is clear that learner difference is a complex construct that
cannot be reduced to the influence of isolated variables.
Through the process of meaning condensation, an interview that was over
6,000 words long was condensed to a 1,000-word narrative and subsequently
woven into the six-point summary above, based on data from all the informants.
Meaning condensation enables insights to emerge that may not have been readily
apparent in the original data. The danger, of course, is that these insights may in
some ways be the result of the meaning condensation process. In the process of
condensing the data and eliminating hesitations, false starts, repetitions, and other
infelicities of natural speech in the original interview, the researcher may have
highlighted some issues that are not particularly salient in the original data, while
overlooking or downplaying others. As a safeguard against this problem, Gloria
and the other participants in the study were each asked to review the condensed
versions of their stories to verify that the summary was accurate and complete.

A 'Grounded' Approach to Data Analysis
Lincoln and Guba (1985) called their pioneering approach to qualitative
research grounded because the analytical categories emerge from the data rather
than being imposed on them. Analysts working within this tradition use
inductive reasoning processes, in contrast with deductive approaches. Deductive
reasoning begins with a theory and looks for data to confirm or disconfirm that
theory (as in experimental research). In contrast, inductive reasoning begins with
data and ends up with a theory:
[In the] grounded theory approach, the researcher begins with the data
and through analysis (searching for salient themes or categories and
arranging these to form explanatory patterns) arrives at an understanding
of the phenomenon under investigation. These themes and patterns do
not simply jump out at the researcher—discovering them requires a
systematic approach to analysis based on familiarity with related literature
and research experience. (Ellis and Barkhuizen, 2005, pp. 254-255)
Ellis and Barkhuizen go on to point out that inductive and deductive approaches
should be seen as either end of a continuum, rather than as a pair of binary
opposites. Qualitative research (or any other kind of research, for that matter)
cannot be entirely based on one or the other.

ACTION

The following statements are some data David Nunan collected in a
workshop he ran for English teachers in Hong Kong. (The teachers were asked
to describe three beliefs they have about language development that
influence the way they teach.) Do a key word analysis and assign the data to
categories. (Hint: In the original study, these statements were organized into
six categories.)

Spoken language should be mastered before written.
Children need to be immersed in all types of writing/reading literature.
Children learn by using the language.
Children need to be a part of a rich language environment.
All children benefit from immersion of [sic] the written print.
Children's language develops through experiences so in order for the
children to gain the most out of any given lesson, many experiences should
be given.
[Language] occurs across the curriculum and therefore should not be seen
as a separate subject.
I believe grammar, spelling and reading are the basis for language
development.
Language develops through all curriculum areas.
A child needs to be aware of basic grammatical structures.
There is a strong relationship between oral language development and
expression and the ability to express oneself in writing.
Children learn best when there is a positive encouraging environment.

Here is the categorization of these comments from the original study (Nunan,
1993):

IMMERSION
Children need to be immersed in all types of writing/reading literature.
All children benefit from immersion of [sic] the written print.
LEARNING BY DOING/EXPERIENTIAL LEARNING
Children's language develops through experiences so in order for the children to
gain the most out of any given lesson, many experiences should be given.
Children learn by using the language.
LANGUAGE ACROSS THE CURRICULUM
It occurs across the curriculum and therefore should not be seen as a separate
subject.
Language develops through all curriculum areas.
GRAMMAR, STRUCTURE, CORRECTNESS
A child needs to be aware of basic grammatical structures.
I believe grammar, spelling and reading are the basis for language development.
ORAL/WRITTEN LANGUAGE RELATIONSHIPS
Spoken language should be mastered before written.
There is a strong relationship between oral language development and
expression and the ability to express oneself in writing.
CREATION OF RICH, POSITIVE ENVIRONMENT
Children need to be a part of a rich language environment.
Children learn best when there is a positive encouraging environment.

REFLECTION

Compare the categories you developed with those listed above. How
similar were they? Did they overlap or diverge? In cases where there was
divergence, what explains the differences? If you are working with classmates or
colleagues, compare your categories to theirs.

Discourse Analysis, Conversational Analysis, and Interaction Analysis


Discourse analysis is a very broad term that covers a range of methods, techniques,
and approaches. In a recent book on language, it is defined as "the systematic
study of language in context" (Nunan, 2007a, p. 208). Discourse analysis is
sometimes contrasted with text analysis, which focuses on analyzing the formal
properties of language.
Conversation and interaction analysis are closely related to discourse
analysis. In fact, some linguists see them as part of discourse analysis although there
are differences of emphasis. Discourse and interaction analysts work with both
elicited and naturalistic data, while conversation analysts work with naturalistic
data. Discourse analysts work with either spoken or written language, while
conversation and interaction analysts only work with spoken data. Discourse
analysts very often use categorical schemes in their analysis, while conversation and
interaction analysts carry out detailed interpretive analyses—sometimes of quite
limited samples of language.

Developing Themes in Qualitative Data


One of the most straightforward ways of analyzing condensed statements was
developed by Lieblich, Tuval-Mashiach, and Zilber (1998). While their research
focused on the analysis of life-history data, the procedure itself can be used to
identify patterns in any qualitative data. The procedure has five steps as follows:
1. Read the material several times until a pattern emerges. . . . There are
aspects of the life story to which you might wish to pay special attention, but
their significance depends on the entire story and its contents. Such aspects
are, for example, the opening of the story, or evaluations . . . of the parts of
the story that appear in the text.
2. Put your initial and global impressions of the case into writing.
3. Decide on special foci of content or themes that you want to follow in the
story as it evolves from beginning to end.
4. Using colored markers . . . mark the various themes in the story, reading
separately and repeatedly for each one.
5. Follow each theme throughout the story and note your conclusion. Be
aware of where a theme appears for the first and last times, the transitions
between the themes, the context for each one, and their relative salience in
the text. (ibid., pp. 62-63)

An illustration of this procedure is found in a study by a teacher who wished
to support her teaching decisions with data (O'Farrell, 2003). O'Farrell was
teaching an English listening comprehension course to international scholars
studying nuclear nonproliferation. Her students had to listen to videotaped
lectures by experts on this topic. The teacher transcribed the lectures and then used
two procedures to analyze the language used. First, she identified frequently
used grammatical structures so she could include those structures in her lessons.
Secondly, she used the color-highlighting function of her word-processing
system to identify and code the speech acts the lecturers used. As a result, she
could easily see in the data the various speech acts that were most important in
the lectures her students would have to understand.

The Card Sort Technique


A similar procedure is the card sort technique initially developed by Lincoln and
Guba (1985). (See also Strauss, 1988.) This procedure has five steps.
1. Place each individual statement on an index card and place the cards in a pile.
2. Select the first card in the pile, read it and note its contents. The first card
represents the first entry in the yet-to-be-named category. Place it to one
side.
3. Select the second card, read it and note its contents. Make a determination
on tacit or intuitive grounds whether this second card is a "look-alike" or
"feel-alike" with Card 1, that is, whether its contents are "essentially"
similar. If so, place the second card with the first and proceed to the third card;
if not, the second card represents the first entry in the second yet-to-be-
named category.
4. Continue with successive cards. For each card, decide whether it is a
"look/feel alike" of cards that have already been placed in some provisional
category or whether it represents a new category. Proceed accordingly.
5. After some cards have been processed, the analyst may feel that a new card
neither fits any of the provisionally established categories nor seems to form
a new category. Other cards may also be recognized as possibly irrelevant to
the developing set. These cards should be placed into a miscellaneous pile;
they should not be discarded at this point, but should be retained for later
review. (adapted from Lincoln and Guba, 1985, pp. 347-348)

From this process of creating categories, you can then write descriptors of each
category. These category definitions should be specific enough that someone
unfamiliar with your data set could use them to sort the cards and come up with
the same groupings you generated.
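
For readers who keep their statements in electronic form, the sorting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of Lincoln and Guba's procedure. The names `card_sort`, `keyword_judge`, and `KEY_STEMS` are hypothetical, and the "look-alike" judgment, which the procedure leaves to the analyst's tacit knowledge, is replaced here by a crude key word match so that the example can run.

```python
# Hypothetical sketch of the card sort steps. The "judge" stands in for
# the analyst's look-alike/feel-alike decision.

NEW_CATEGORY = -1    # the card opens a new provisional category
MISCELLANEOUS = -2   # the card fits nowhere yet; keep it for later review

# Assumption: key word stems an analyst might pick for the teacher statements.
KEY_STEMS = ("immers", "gramma", "curricul")


def key_stem(card):
    """Return the first key word stem found in the card, or None."""
    text = card.lower()
    for stem in KEY_STEMS:
        if stem in text:
            return stem
    return None


def keyword_judge(card, categories):
    """Crude stand-in for the analyst: match cards on a shared key word stem."""
    stem = key_stem(card)
    if stem is None:
        return MISCELLANEOUS
    for index, members in enumerate(categories):
        if key_stem(members[0]) == stem:
            return index          # look-alike of an existing category
    return NEW_CATEGORY


def card_sort(cards, judge):
    """Work through the pile once, building categories as in steps 2 to 5."""
    categories = []       # each category is a list of cards
    miscellaneous = []    # retained for later review, never discarded
    for card in cards:
        if not categories:
            categories.append([card])   # first card, first category
            continue
        decision = judge(card, categories)
        if decision == NEW_CATEGORY:
            categories.append([card])
        elif decision == MISCELLANEOUS:
            miscellaneous.append(card)
        else:
            categories[decision].append(card)
    return categories, miscellaneous


cards = [
    "Children need to be immersed in all types of writing/reading literature.",
    "Language develops through all curriculum areas.",
    "All children benefit from immersion of the written print.",
    "A child needs to be aware of basic grammatical structures.",
    "Children learn best when there is a positive encouraging environment.",
    "It occurs across the curriculum and therefore should not be seen as a separate subject.",
]

categories, miscellaneous = card_sort(cards, keyword_judge)
for number, members in enumerate(categories, start=1):
    print(f"Category {number}: {members}")
print(f"Miscellaneous: {miscellaneous}")
```

In real use the `judge` function would prompt a human analyst rather than match key words; the sketch shows only the bookkeeping of provisional categories and the miscellaneous pile.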

REFLECTION

Think about a research question that has interested you as you think
about conducting your own classroom-based or classroom-oriented studies.
How might you use the card sort technique to work with the data in that
investigation?

USING TECHNOLOGY FOR QUALITATIVE DATA ANALYSIS

Technology has become ubiquitous. In research, it is extremely valuable. It is
difficult to imagine doing research without word processors, statistical packages,
spreadsheets, and so on. Technology is indispensable for collecting, recording,
and analyzing large sets of data. In the electronic classroom, where learners
interact with texts and with each other through their keyboards, computers are
able to capture, record, and analyze large amounts of data:
[C]omputers can collect information about the user's actions at the
keyboard, such as by recording each single keystroke in real time. These
can be played back by the researcher to see, for example, successive
drafts in the writing process and use (and abuse) of tools such as spelling
and grammar checkers and templates. (Beatty, 2003, p. 178)
In this section we will look briefly at some techniques that exploit
technology for data analysis. The two main techniques that we will consider in this
section are data tagging and concordancing. We give more space to concordancing
than tagging because it is a technique that is relatively easy to use. Tagging
requires either programming skills or expensive software. What they share is the
capability of revealing patterns, regularities and relationships in large sets of
data, called corpora. (Some corpora currently being assembled in Europe contain
several hundred thousand words.) Because of the amount of data involved,
patterns are not immediately apparent without the aid of technology.

Data Tagging
Data tagging is a procedure in which information about a piece of text is tagged and
embedded into the text. When the text is analyzed, the computer can be instructed
to find and extract items of interest. For example, particular grammatical features
can be tagged. This process could be useful in investigating patterns of learner
errors in large samples of classroom data, student writing samples, and so on.
In a study into the errors made by French-speaking learners of English,
Dagneaux, Denness, and Granger (1998) worked with a database of 150,000
words. The database was tagged for errors using the following comprehensive
error tagging and classification system. In terms of the major categories, F and G
represented formal and grammatical problems, respectively. X stood for lexico-grammatical issues,
R for register problems, and S for style issues. The tag W represented a missing
word, redundant word, or a word order problem. These major categories also
contained subcategories:
For the grammatical category, for instance, the first subcode refers to
the word category: GV for verbs, GN for nouns, GA for articles. This
code is in turn followed by any number of subcodes. For instance, the
GV category is further broken down into GVV (voice errors) etc. The
system is flexible: analysts can add or delete subcodes to fit their data.
(ibid., p. 166)
These tags are inserted into the text to be analyzed. Here is an example of a
tagged text. The dollar signs indicate the correct version of the item.

There was a forest with dark green dense foliage and pastures where a herd
of tiny (FS) braun $brown$ cows was grazing quietly (XVPR) watching at
$watching$ the toy train going past (ibid.).

The tags can then be retrieved using a procedure similar to that used in
concordancing (see below). The search lists all of the errors bearing a certain code,
along with the immediate linguistic environment in which they
appear. Here are some lines from the output of a search for errors bearing the
code XNPR.

1. complemented by other (XNPR) approaches of $approaches to$ the
subject. The written
2. are concerned. Yet the (XNPR) aspiration to $aspiration for$ a more
equitable society
3. can walk without paying (XNPR) attention of $attention to$ the (LSF)
circulation $traffic$
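
A search of this kind does not necessarily require specialized software. The following Python sketch shows one way tagged errors might be retrieved with a regular expression. It is an illustration only, not the tool used by Dagneaux, Denness, and Granger. The function name `find_tagged_errors` and the pattern are assumptions, and the pattern presumes that every error code in parentheses is followed by the learner's wording and then by a correction between dollar signs, as in the examples above.

```python
import re

# Assumed tag layout: (CODE) learner wording $corrected wording$
TAG_PATTERN = re.compile(
    r"\((?P<code>[A-Z]+)\)\s*"       # error code in parentheses, e.g. (XNPR)
    r"(?P<error>[^$]+?)\s*"          # the learner's original wording
    r"\$(?P<correction>[^$]*)\$"     # the corrected form between dollar signs
)


def find_tagged_errors(text, code=None, context=25):
    """List tagged errors, optionally filtered by code, with nearby context."""
    hits = []
    for match in TAG_PATTERN.finditer(text):
        if code is not None and match.group("code") != code:
            continue
        hits.append({
            "code": match.group("code"),
            "error": match.group("error"),
            "correction": match.group("correction"),
            # a few characters on either side of the tagged item
            "left": text[max(0, match.start() - context):match.start()].strip(),
            "right": text[match.end():match.end() + context].strip(),
        })
    return hits


sample = (
    "can walk without paying (XNPR) attention of $attention to$ "
    "the (LSF) circulation $traffic$"
)

for hit in find_tagged_errors(sample, code="XNPR"):
    print(hit["left"], "|", hit["error"], "->", hit["correction"], "|", hit["right"])
```

Calling the function without a `code` argument returns every tagged error in the text, which is a convenient starting point for counting how often each code occurs.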

One of the most widely used pieces of tagging software is a program called
NUD*IST. The name stands for Non-Numerical Unstructured Data Indexing,
Searching and Theorizing (Weitzman and Miles, 1995). This program was
developed for analyzing texts in research into the social sciences in general, rather
than in applied linguistics or language teaching. Unfortunately, the program is
expensive and is difficult to use without extensive training. However, it is both
powerful and flexible. In some ways, it is the electronic equivalent of the card
sort technique discussed above. Once the data are tagged, the program can pull
out, rearrange, and cluster pieces of text that are thematically related. It is
therefore much more flexible and time saving than the use of the card sort
technique, in which the researcher has to go through the relatively tedious and
time-consuming process of transferring data on to cards, losing the context in
the process.

Concordancing
As we have already seen, concordancing is a procedure used in corpus linguistics (the
study of extremely large sets of language data) to extract and show in context a
single linguistic item. It can reveal patterns that are not immediately apparent
from a casual inspection of the data. This process is illustrated by a search for the
word got (Carter, Goddard, Reah, Sanger, and Bowering, 2001, p. 161):

1. He couldn't turn the water on. And he got badly burned. It happened in Mar
2. anyone in the neighborhood who got broken into recently? I know
3. any extra precautions since the car got broken into last time? Er well, I
4. he jilted her at the altar. So she got brought up by her grandmother
5. she's been a bit nervous ever since we got burgled and dark nights
6. you know of? They got burgled. They got burgled once. Yeah. That was a while
7. by crime that you know of. They they got burgled. They got burgled once.
8. done that so I suppose I could have got caned. Yeah. And as you've gone
9. fool for being honest. You know he got called an idiot for being honest

Searching for naturally occurring instances of a word in context can yield
interesting insights. For example, Nunan (2007a) points out that from this database
we find that people tend to use get as a passive voice rather than active
voice construction to describe things that happened to them personally
rather than to describe what is done to impersonal things—the only
inanimate object in the above sample is the car in line 3. . . . Another
feature of the data is that get appears to be associated with emotionally
charged, even violent, actions and events—burned, broken, burgled, caned,
deported. (p. 53)

Concordancing programs reveal patterns like this one far more quickly and easily
than a manual inspection of the data would.
In second language research, concordancing has been used to investigate
error patterns in large samples of learner data. In his review of the computer-
based analyses of learner corpora, Barlow (2005) makes the point that storing
data digitally greatly facilitates error classification, form-function mappings, and
other forms of linguistic analysis. It is simple to extract particular lexical items
and grammatical patterns along with their accompanying linguistic context.
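
To make the idea of extracting an item "along with its accompanying linguistic context" concrete, here is a minimal key-word-in-context sketch in Python. It is not the software used in any of the studies cited. The function name `concordance` and the `width` parameter are assumptions made for illustration, and a dedicated concordancing program will offer far more (sorting, collocation counts, and so on).

```python
import re


def concordance(text, keyword, width=30):
    """Return one line per whole-word match, with context on both sides."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    flat = " ".join(text.split())   # collapse line breaks and repeated spaces
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()].rstrip()
        right = flat[match.end():match.end() + width].lstrip()
        # right-align the left context so the key word lines up in a column
        lines.append(f"{left:>{width}}  {match.group(0)}  {right}")
    return lines


transcript = """He couldn't turn the water on. And he got badly burned.
She's been a bit nervous ever since we got burgled.
I forgot to lock the car."""

for line in concordance(transcript, "got"):
    print(line)
```

Because the pattern matches whole words only, forgot in the last sentence is not reported as an instance of got. Whether such forms should be included is an analytic decision that the researcher would need to make explicitly.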

QUALITY CONTROL ISSUES IN ANALYZING QUALITATIVE DATA
As we have noted in the chapters about collecting data qualitatively, there are
concerns about reliability and validity in naturalistic inquiry and action research,
just as there are in the psychometric approach. Writing about the life history
method in qualitative inquiry, Plummer (1983) makes this contrast:
Reliability is primarily concerned with technique and consistency—
with ensuring that if the study was conducted by someone else similar
findings would be obtained; while validity is concerned with making
sure that the technique is actually studying what it is supposed to. A
clock that was consistently ten minutes fast would hence be reliable but
invalid since it did not tell the correct time. In general, reliability is the
preoccupation of 'hard' methodologists getting the attitude scale or
the questionnaire design as technically replicable as possible through
standardization, measurement, and control—while validity receives
relatively short shrift, (p. 101)
He notes that reliability can be achieved more easily than validity, which
becomes harder to demonstrate as we investigate more complex phenomena. Still,
there are some widely used procedures for demonstrating reliability and validity
in qualitative analyses and we will discuss some of them here.

Coder Agreement Indices
One concern about analyzing qualitative data that relates to reliability is the issue
of subjectivity. How much of what we find has come out of the data and how
much of the interpretation has been inserted by the researcher? Would different
researchers find the same thing in a data set? One way to sort out this problem is
to determine intercoder agreement—an index of the consistency with which
different people categorize the same data. (This construct is analogous to interrater
reliability in quantitative research.) A simple percentage is calculated by dividing
the number of items upon which coders agree by the total number of items that
were coded and multiplying the result by 100. The general rule of thumb is that
intercoder agreement should be at least 85% for readers to have confidence in the
reported findings (Allwright and Bailey, 1991).
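
The calculation is simple enough to do by hand, but a short Python sketch may help to make the steps explicit. The function name `percent_agreement` and the sample codes are invented for illustration, and the sketch assumes that both coders have coded the same items in the same order.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items to which two codings assign the same category."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both codings must cover the same items.")
    if not codes_a:
        raise ValueError("At least one coded item is needed.")
    agreed = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * agreed / len(codes_a)


# Invented example: two coders categorize the same ten statements.
coder_1 = ["IMMERSION", "GRAMMAR", "GRAMMAR", "ENVIRONMENT", "IMMERSION",
           "CURRICULUM", "ORAL/WRITTEN", "GRAMMAR", "ENVIRONMENT", "CURRICULUM"]
coder_2 = ["IMMERSION", "GRAMMAR", "ORAL/WRITTEN", "ENVIRONMENT", "IMMERSION",
           "CURRICULUM", "ORAL/WRITTEN", "GRAMMAR", "ENVIRONMENT", "CURRICULUM"]

agreement = percent_agreement(coder_1, coder_2)
print(f"Agreement: {agreement:.1f}%")
print("Meets the 85% rule of thumb" if agreement >= 85 else "Below the 85% rule of thumb")
```

The same function can be applied when a single researcher codes the same data on two occasions: the two lists are then that person's first and second codings rather than the codings of two different people.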

A parallel issue is intracoder agreement—the extent to which a single person
codes or categorizes the same set of data consistently over time. The same
mathematical steps are used, and again, a minimum of 85% consistency is expected. If
you are working with a team of researchers or can find a colleague to help you
code your data, you should use the usual intercoder agreement procedure. If
you are conducting classroom research alone and do not have someone to help
you with the coding, you should compute intracoder agreement.
The following quote details the procedures used to determine intercoder
agreement by researchers at the University of Hawaii in coding transcripts of
classroom data:
Four pages of two transcripts were coded according to the guidelines by
three coders who reached consensus on the codes. Three- to four-page
segments of four additional transcripts were coded separately by two of
these coders until 90% agreement was reached, and the remaining
differences were used to establish modifications and clarifications of
the guidelines and a standard set of codes for the entire set of training
transcripts. The first two transcripts were then discussed in a group with
all the coders to assure understanding of the guidelines. (Chaudron,
1988, p. 24)

Using these steps to establish indices of coder agreement can help you locate
holes or ambiguities in your coding categories. In addition, acceptably high
inter- or intracoder agreement indices can give your readers confidence in the
categories you use to analyze your data.
You can use the card sort technique as a quality control mechanism.
When you have used the card sort technique to establish categories, you can
have a colleague sort the same cards and see if the same categories emerge. As a
different way of using the card sort to check your categories, you can provide your
colleague with the descriptors for the categories you wish to use and have that
person distribute all the cards into those categories. Where there are cards that do
not fit well in an existing category, discuss these items with your colleague. You
may find that you need to adjust the descriptors and/or add new ones.

Member Validation

One procedure for checking the validity of qualitative data analyses is called
member validation or member checking. It is used to determine whether qualitative
data are convincing as evidence. (Some researchers [e.g., Dornyei, 2007] also use
the terms respondent feedback or respondent checking.) This step involves asking
people (the members) in the culture under investigation (whether it is a school,
a department, or a classroom) to review the data and the interpretation thereof
to provide the researcher with feedback. So, for example, when Nunan (2007b)
had Gloria and the other good language learners in his study verify his
condensations of their lengthy narratives, he was using a member checking strategy.
According to Richards, such validation "involves more than simply asking
members to confirm what we have written about them: as participants in the
research process, they have a wider call on our attention and it may be worthwhile
to involve them in other ways" (p. 264).
Dornyei (2007) states that participants should comment on the conclusions
of a study: "They can, for example, read an early draft of the research report or
listen to a presentation of some tentative results or themes, and can then express
their views in what is called a 'validation interview'" (pp. 60-61). If the
researchers and the participants agree on the findings, then the researchers can
have confidence in the validity of their results.
What happens in cases where the researchers and the participants disagree
on the interpretation of the data? Dornyei points out that "even though the
participants are clearly the 'insiders,' there is no reason to assume that they can
interpret their own experiences or circumstances correctly" (ibid., p. 61). As an
example, he notes that in heated situations, such as family arguments, "insiders
often have conflicting views about the same phenomenon" (ibid.).

Qualities of Qualitative Analyses


Working with ideas from Coffey and Atkinson (1996), Richards describes
seven important characteristics of qualitative analyses. These are reprinted as
Table 14.2.

A SAMPLE STUDY

As mentioned above, Bailey (1982) investigated the classroom communication
problems of non-native speaking teaching assistants in a U.S. university—the
University of California at Los Angeles (UCLA). The study had both a
quantitatively oriented language assessment component and a qualitative component,
but here we will focus on just the qualitative data analysis.
The research question was, "What are the classroom communication
problems of non-native speaking teaching assistants?" This question arose from a
highly politicized situation in which many undergraduate American students had
complained (to parents, administrators, and legislators) that their TAs didn't
speak English well enough to be teaching at UCLA. (The legislators were
involved because UCLA is a publicly funded university—its budget is derived from
public taxes.)
To determine what the communication problems were, Bailey observed
physics and math courses over a ten-week term. Physics and math were chosen
because these two disciplines had numerous international teaching assistants and
many of the complaints lodged by the undergraduates had been about courses in
these departments. Since men far outnumbered women among the international
graduate students who received teaching assistantships in math and physics, all
the TAs observed were men.

TABLE 14.2 Essential qualities of qualitative analysis

Quality                   Explanation
Artful                    Successful analysis is founded on good technique, but this
                          in itself is not enough and there is always more to be
                          learnt.
Imaginative               Analysis is not mechanistic. In order to penetrate beneath
                          the surface of things, the researcher must make time to
                          stand back and find different ways of seeing the data.
Flexible                  Where necessary, the researcher must be prepared to find
                          alternative approaches to organizational and interpretive
                          challenges, which means not adhering too rigidly to any
                          one approach.
Reflexive                 In order to make the best of opportunities to advance the
                          process of discovery, the researcher should keep in review
                          the continually evolving interrelationship between data,
                          analysis, and interpretation.
Methodical                Although feeling and instinct may play an important part
                          in all of the above, the researcher must decide on
                          appropriate analytical methods and continue to reflect on
                          these as they are applied.
Scholarly                 Analysis should take place in the context of a wider
                          understanding of the relevant literature, whether relating
                          to analytical or interpretive issues.
Intellectually Rigorous   The researcher must be prepared to make available the
                          workings of the analytical process and take account of all
                          the available evidence, including discrepant cases. This
                          also means that the researcher has to resist the temptation
                          to reduce everything to a single explanation.

REFLECTION

Answer the following questions about this study, based on what you know
so far:

1. What is the design of the study? (This is a trick question. Even though
the data were qualitatively collected and analyzed, the investigation
used one of the research designs from the psychometric approach.)
2. What is the independent variable and how many levels did it have?
3. What are some control variables in this study?
4. Based on your reading and your own experiences as a teacher and a
student, what do you think the NNS TAs' communication problems
might have been?

Chapter 14 Qualitative Data Analysis 431


TABLE 14.3 Sampling decisions in comparing NNS and NS teaching
assistants

Department/Course              NNS TAs    NS TAs

Math, Basic to Advanced
Math Totals
Physics, Basic to Advanced
Physics Totals

For comparison purposes, the observations were conducted in classes taught
by non-native speaking TAs (NNSs) and also native English speaking TAs (NSs),
in a range of basic to advanced classes. Each NNS TA in the sample was paired,
in the analysis, with a NS TA who taught the same course during the same
semester. These pairings are shown in Table 14.3.
Time sampling involves decisions about when you collect data in a qualitative
study. Bailey observed each teaching assistant three times: once at the
beginning, once at the middle, and once at the end of the ten-week term. The
handwritten field notes she took during the classes were fleshed out as soon as
possible afterwards. That is, abbreviations were spelled out, examples were
added, and ambiguities were clarified. Then the field notes were rewritten in a
format that put the prose data on the right two-thirds of the page while the left
third of the page remained blank, as space for coding, inserting questions, and
so on.

The analytic procedure that finally enabled Bailey to capture meaningful
interpretations of the data involved a form of meaning condensation to discover
the themes. First, each set of field notes was summarized. Bailey forced herself to
distill the copious and detailed lesson descriptions into a two-page summary.
(Two colleagues helped her with these summaries, as a check on the data
condensation process.) Then the summaries about each TA were combined and
reworked to generate a prose profile of that TA. This process is summarized in
Figure 14.1:

Observation 1      Observation 2      Observation 3
of TA 1            of TA 1            of TA 1
     |                  |                  |
     v                  v                  v
Summary 1          Summary 2          Summary 3
of TA 1            of TA 1            of TA 1
     |                  |                  |
     +------------------+------------------+
                        |
                        v
                     Profile
                     of TA 1

FIGURE 14.1 Summarizing lengthy field notes to create prose
profiles of TAs (adapted from Bailey, 2006, p. 111)

When the twenty-four TA profiles were available for comparison, types of
teacher styles quickly emerged from the data. In a process analogous to the card
sort technique, Bailey compared and contrasted the twenty-four summaries to
discover types or categories of teaching styles, as depicted in Figure 14.2:

Profile      Profile      Profile      Profile      Profile
of TA 1      of TA 2      of TA 3      of TA 4      of TA 5
                             |
                             v
             Type A       Type B       Type C

FIGURE 14.2 Deriving a TA typology from prose profiles of TAs
(adapted from Bailey, 2006, p. 111)

In the process, five clear types of teaching styles emerged among these
twenty-four teaching assistants. As mentioned above, these were the (1)
inspiring cheerleaders, (2) entertaining allies, (3) knowledgeable helpers and casual
friends, (4) mechanical problem solvers, and (5) active but unintelligible TAs. In
the research report, each type is described and then illustrated by a profile of
one of the TAs who represented that type. This process of condensation and
comparison allowed Bailey to interpret the field notes and make claims about
the types of TAs that were more and less successful in the eyes of their
undergraduate students.

REFLECTION

Based on the five category labels above, which types of TAs do you predict
would be seen as successful and which would be seen as less successful in
the opinion of the undergraduate students whom these TAs taught?

PAYOFFS AND PITFALLS

As noted at the outset of this chapter, qualitative data are powerful. They are
more accessible to readers without statistical training than are the kinds of
sophisticated quantitative analyses that appear in many published journal articles.
They can be used to explain important concepts to teachers, administrators,
journalists, and parents in human terms, while quantitative data sometimes seem
too abstract and detached or, conversely, too concrete and impersonal.
There are many pitfalls in choosing to collect and then in analyzing
qualitative data. The first, especially for novice researchers, is the possibility of getting
overwhelmed by the sheer volume of data. Detailed field notes about an
hour-long lesson can run to thousands of words.
If you are generating handwritten observational field notes during a lesson,
you may choose to word process the data before you analyze them. (We recommend
that you do so.) While word processing is initially time-consuming, having the
data stored electronically can save you time in the long run and will make the
analysis easier.
The length of qualitative data sometimes creates problems when researchers
try to publish their findings. Most journals and many anthologies impose page
restrictions on manuscripts submitted for publication. It is very difficult to
provide a convincing analysis of reams and reams of data in a short article.
Sometimes researchers choose to focus on very particular issues in reporting their
findings. In some cases, lengthy studies have been published as journal articles or
book chapters. For example, Schmidt and Frota's (1986) analysis of Schmidt's
learning of Portuguese in Brazil is eighty-nine pages long.
Sometimes qualitative data analyses don't pan out. For example, Bailey
originally coded her observational field notes on math and science teaching
assistants' lessons using a category system designed to analyze classroom
discourse (Sinclair and Coulthard, 1975). The categories had seemed promising in
terms of addressing the research question, but they did not really reveal the core
issues separating the successful and less successful TAs. It was not until she tried
the summarizing process described above that Bailey was able to arrive at a
satisfying explanation of the communication problems experienced by the
non-native speaking TAs.
As noted above, transcription is an important tool in classroom research,
but it is a tedious and time-consuming process. Allwright and Bailey (1991)
described the data collection procedure used by R. L. Allwright (1980), which
involved suspending stereo audio microphones from a wire above the students'
desks. The stereo function allowed the researchers to tune out some of the
extraneous noise as they were trying to transcribe the data, and having the
microphones suspended (rather than sitting on the desks) minimized some of
the bumps and scrapes, as people moved chairs, opened books, and so on.
Researchers working with children have sometimes used wireless recording
microphones clipped to the students' clothing to get better quality audio
recordings. Doing so is helpful in making transcription easier, but it is also costly.
In sum, time and the propensity to drown in the data are the biggest
problems associated with doing qualitative analyses. Yet, it is partly the element of
extensive time spent with the data that gives qualitative research its power to
illustrate, to illuminate, and to convince readers. So, if you are considering doing
a study that involves qualitative data collection and analysis, plan to spend ample
time in working with the data.

QUESTIONS AND TASKS


1. Read a published article on classroom research that utilized qualitative
data. How were the data analyzed? What difficulties did the researcher(s)
report? Can you think of other difficulties that might have occurred but
that were not discussed by the author(s)?
2. In the article you read, what were the benefits of the qualitative analyses
undertaken by the author(s)?

3. Think about a study you would like to conduct. Focus on the research
question(s). What kinds of qualitative data would you want to collect? How
would you want to analyze those data? Sketch out your ideas and share
them with a classmate or colleague.
4. Ask a language learner you know to tape-record or write his or her
language learning history. Then create a condensed narrative from that
narrative.

5. Analyze some of your own qualitative data using the card sort technique. If
possible, compare your results with another reader. What similarities and
differences are there between the two analyses?
6. If you are teaching or taking a language class, try to write a report of a
typical day. (For an example, turn to Chapter 7 for van Lier's [1996a]
description of a typical day in a bilingual school in Peru.) As you do so, be aware
of the mental processes you use in deciding what information to include
and what to exclude.

7. Go to a concordancing program on the Internet (the one below is free) and
complete a search for a word, idiom, or grammatical item that interests
you. What insights can you derive from the data? (The aim of this task is
to give you an experience of using a concordancing program rather than
analyzing learner data.) If you are teaching a language class, how could you
use these data in developing a unit or lesson plan?
British National Corpus Sample Queries:
http://sara.natcorp.ox.ac.uk/lookup.html
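
For readers who want to see what a concordancing program does before trying one online, the keyword-in-context (KWIC) display behind Task 7 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name kwic, the width parameter, and the sample sentence are all invented here, and the sketch does not reproduce the interface or the data of the British National Corpus.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword."""
    lines = []
    # \b restricts matches to whole words; the search ignores case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


# Hypothetical sample text, invented for illustration only.
sample = (
    "The teacher asked a question. The students discussed the question "
    "in pairs, and one student asked the teacher to repeat it."
)

for line in kwic(sample, "asked", width=25):
    print(line)
```

Each printed line shows one occurrence of the search word with a fixed amount of text on either side, which is the format in which a concordancer lets you scan for patterns of use.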

SUGGESTIONS FOR FURTHER READING

We recommend K. Richards' (2003) book Qualitative Inquiry in TESOL. It
contains clear explanations and many helpful examples.
We also suggest that you consult Holliday's (2002) Doing and Writing
Qualitative Research. While this book is written by an applied linguist, it provides
a general introduction to planning, organizing, and writing up qualitative
research. Although many of the examples are taken from disciplines outside of
applied linguistics such as sociology, management, and health care, the
principles and techniques introduced in the book can be readily applied to applied
linguistics and language education.
Dörnyei (2007) has a very fine treatment of issues related to analyzing
qualitative data.
Another helpful resource is the special issue of TESOL Quarterly edited by
Davis and Lazaraton (1995), entitled Qualitative Research in TESOL.
If you would like to learn more about concordancing as a research tool, see
Biber, Conrad, and Reppen's (1998) Corpus Linguistics: Investigating Language
Structure and Use. See also Barlow (2005).
Weitzman and Miles (1995) explain many computer software programs for
analyzing qualitative data. They discuss the various programs in the categories of
text retrievers, text-base managers, programs for coding and retrieving,
code-based theory builders, and conceptual network builders.



CHAPTER 15

Putting It All Together

What we learn in the university is not scientific theory and certainly not
a theory of how to do science. We are exposed to practices: the practices
of our teachers in the classroom and laboratory and the practices they
admire, which we read about in the articles they assign to us. The theory
of how science should be done is almost never taught. And even the
theory that explains the practices and articles to which we are exposed
and that gives the discipline some coherence is constructed after the fact.
It is not always taught directly, is always incomplete, and is often
internally contradictory. (Piore, 2006, pp. 143-144)

INTRODUCTION AND OVERVIEW

In this concluding chapter, we build upon some of the main themes that have
been presented earlier in the book. Looking back, you will recall that after
providing an overview of language classroom research in Chapters 1, 2, and 3, we
turned to issues related to planning and implementing research (Chapters 4
through 8). The focus then shifted to data collection (in Chapters 9, 10, and 11)
and data analysis (Chapters 12, 13, and 14). We realize that we have covered
quite a bit of material in the foregoing pages, so the intent in this chapter is to
help you put it all together.
To that end, in this chapter we review some issues raised earlier and discuss
ways of combining qualitative and quantitative procedures in your own proposals
and subsequent investigations. We also look at the practicalities of doing
research and suggest some steps that you can take to make the enterprise more
successful and more satisfying. Finally, we address issues related to reporting on
your research findings, whether in formal or informal contexts, both in writing
and in oral presentations.
Research as it is presented in scholarly books, journal articles, and formal
monographs is in many ways a misrepresentation. These publications seem to
suggest that research is neat and tidy, and that it flows logically and irrevocably
from abstract to conclusion. But such reports are products. Some of the
processes by which the products were arrived at are reported, but some can only
be inferred. The missteps, blind alleys, false starts, and frustrations that the
researchers encountered in the process of arriving at the final products are rarely
discussed (see Gass and Schachter, 1996). Dingwall (1984) contrasts the
messiness of the research process with the resulting product (be it a book, paper, or
conference presentation) as follows:

It may be too strong to suggest that there is a 'conspiracy of silence'
among academics about the problems, the possibilities, the limitations,
and the pressures of research practice; but certainly for most graduate
researchers working in comparative isolation, it is painful to discover
the extent of compromise and ambiguity inherent in their work. (p. 1)
Neophyte researchers who are unaware of these ambiguities often experience a
sense of demoralization—if not outright distress—at what they perceive to be
their own inadequacies and shortcomings when they come to conducting and
writing up their own research.

ACTION

Skim through two or three articles about language classroom research.
What problems, if any, do the authors report? Share your findings with a
classmate or colleague.

The messiness of the research process is mirrored in the history of scientific
inquiry. Just as individual research projects do not grow neatly from conception
through gestation to birth, neither does the advancement of scientific
knowledge. Rather it is marked by both incremental changes and seismic intellectual
earthquakes and paradigm shifts (Kuhn, 1996). The broadening out of
approaches to doing language classroom research is one such shift. The field has
now moved from a rather narrow early focus on the psychometric approach and
experimental attempts to control variables, to the methodological eclecticism of
naturalistic inquiry, and on to the legitimization of action research. Thus,
language classroom research has seen substantial changes in the ways that data are
collected and analyzed as well as in the types of issues that are investigated.
The way research methods texts talk about doing research has changed too.
Kuhn (1996) writes about reconstructed logic and logic in use. The former term
refers to the articulated, codified procedures that science uses and replicates,
while the latter refers to those judgment calls, hunches, and ideas we try out as we
go about the tricky and sometimes frustrating business of conducting research.

In fact, although we have emphasized planning, systematicity, and careful
decision making based (in part) on reviewing the available literature, we
acknowledge that intuition, judgment, insight, and inspiration also have a role to play in
research. This claim may be especially apt in language classroom research, where
real students and real teachers come together to grapple with the challenges of
language teaching and learning.
In this book, we have tried to go a bit beyond the often-described
procedures of experimental science to develop discussions of case studies,
ethnographies, surveys, classroom observation techniques, diary studies, and other
introspective processes for gathering and analyzing data. We close with a brief
discussion of the melding of approaches in what has come to be called mixed
methods research.

COMBINING QUANTITATIVE AND QUALITATIVE APPROACHES

Throughout this book, we have argued that the qualitative/quantitative divide is
unnecessary and unhelpful. Neither approach is inherently superior to the other,
and referring to research as either quantitative or qualitative is, in most cases, an
oversimplification. The types of data that are collected and the types of analyses
that are performed on those data should be driven by the research questions that
are posed as the study unfolds, not by some preconceived notion of one approach
being automatically superior to another.
Some studies, particularly those seeking causal relationships or correlations
between variables, will lead the researcher towards an experimental research
design that is likely to involve quantitatively collected data and quantitative
analyses. Other studies, which focus more on description and portrayal rather
than on determining causal or correlational relationships, will incorporate data
that are qualitatively collected and qualitatively analyzed. Still others will involve
a mixed methods or hybrid design that combines qualitative and quantitative data
collection and analysis procedures. Dörnyei (2007) defines a mixed methods
study as one that "involves the collection or analysis of both qualitative and
quantitative data in a single study with some attempts to integrate the two
approaches at one or more stages of the research process" (p. 163).

REFLECTION

Are you familiar with any mixed methods studies? If so, think about one
that interests you. What combination of qualitative and quantitative
procedures did it involve?

Some studies that are predominantly naturalistic in their approach have
nevertheless used quantified data to support the authors' claims and provide
information about the participants. For example, in the sample study by Lynch and
Maclean (2000) described in Chapter 9, the English proficiency of two students
was characterized as follows: Alicia had a 4.0 on the IELTS and a 400 on the
TOEFL, and had scored only 5% on a dictation administered at the start of the
course. Daniela had a score of 7.0 on the IELTS and a 600 on the TOEFL. Her
dictation score was 97% at the beginning of the course. Even if we know
nothing about the dictation, the IELTS, or the TOEFL, the contrast between the
two learners is striking, and the quantitative characterization helps us to imagine
their English proficiency.
Other predominantly qualitative studies have included quantitative data as
part of the findings. For example, van Lier's (1996a) ethnography of the Spanish-
Quechua bilingual program in the Peruvian highlands (summarized at the end of
Chapter 7) included percentage data on the teachers' language use (Spanish,
Quechua, or both), as well as test results on the children's knowledge of colors,
numbers, and body parts in both languages. These quantitative data allowed van
Lier to state that "as the students were learning Spanish, they were unlearning
Quechua. In other words, a process of subtractive bilingualism was underway"
(p. 382).
Still other studies interweave qualitative and quantitative procedures as key
parts of the analyses. As an example, let us return to Bailey's research on the
classroom communication problems of non-native speaking (NNS) teaching
assistants, described in Chapter 14 (Bailey, 1982; 1984). Her original proposal had
called for the coding of observational field notes using a modification of Sinclair
and Coulthard's (1975) category system for analyzing classroom interaction, but
although that analysis was done, it failed to reveal any meaningful outcomes.
Additional quantitative data were collected and further analyses were conducted that
involved both quantitative and qualitative procedures, as shown in Table 15.1.
After many false starts and problems applying coding categories to her data,
Bailey ended up reducing the voluminous observational field notes in a
sequential meaning condensation process that involved summarizing, comparing, and
profiling (see Table 15.1 and Figures 14.1 and 14.2). Through these procedures,
Bailey produced a typology of teaching assistants. In this grounded theory
approach, the categories arose from the data rather than being imposed upon them.

TABLE 15.1 An example of a mixed methods study

               Data Collection                       Data Analysis

Qualitative    Field notes written during and        Field notes summarized and
               after observations in the regularly   profiles generated for each TA;
               scheduled classes of 24 TAs           24 profiles compared in order
               (12 NSs and 12 NNSs)                  to identify types of TAs
Quantitative   Students' end-of-course               Mean scores computed for each
               evaluations of the TAs collected      TA type; testing for statistically
               using the university's regular        significant differences across the
               numerical rating scales               ratings of the various types of TAs

The five clear TA types that emerged among this cohort were labeled with
metaphors that described their behavior and apparent attitudes toward their
students. These labels were (1) the inspiring cheerleaders, (2) the entertaining allies,
(3) the knowledgeable helpers and casual friends, (4) the mechanical problem
solvers, and (5) the active but unintelligible TAs. Twenty-one of the twenty-four
TAs could be categorized using these labels, but three could not. Those three did
not quite fit the categories for a variety of reasons. For example, one math TA (a
native speaker of English) was a "casual friend" to his students but couldn't be
considered a "knowledgeable helper" since he was unable to solve the students'
homework problems, and they left his classes frustrated and confused.
In fact, after rereading the field notes many times, Bailey realized that
during the classes she had observed, the students had responded differently to the
TAs who adopted these various teaching styles. So, she turned to the students'
numerical ratings of these TAs and matched the quantitative data to the TA
typology. The university's regular end-of-course evaluation system included
categories for evaluating TAs' overall effectiveness and their helpfulness to the
students outside of class. The mean ratings reported in Table 15.2 are based
on a scale in which the average is 50 and the standard deviation is 10. (Higher
numbers indicate more positive ratings.)

REFLECTION

What pattern do you notice in Table 15.2 when you examine the mean
scores from the students' evaluations across the five TA groups in terms of
their overall effectiveness and outside helpfulness?

The stable pattern of change across the mean scores in the student
evaluation data shows that the students did indeed evaluate teaching assistants who
used some teaching styles in the typology more favorably than they did others.
That is, the inspiring cheerleaders were rated more highly than the entertaining
allies, who in turn were rated more highly than the knowledgeable helpers and
casual friends, and so on.

TABLE 15.2 Mean ratings on overall effectiveness and outside
helpfulness of five TA types (adapted from Bailey, 1982,
p. 140)

                                               Mean Overall     Mean Outside
TA Type                                   N    Effectiveness    Helpfulness

Inspiring Cheerleaders                    2        62.7             59.7
Entertaining Allies                       2        59.0             57.2
Knowledgeable Helpers & Casual Friends    8        55.0             53.2
Mechanical Problem Solvers                6        42.7             46.5
Active Unintelligible TAs                 3        36.8             36.1
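
The pattern in Table 15.2 can also be checked with a short sketch. The ratings below are copied from the table; the helper names (sd_units, strictly_decreasing) are hypothetical, and the conversion assumes only what the text states, namely that the rating scale has an average of 50 and a standard deviation of 10. The sketch describes the ordering of the means; it is not the significance testing mentioned in Table 15.1.

```python
# Mean ratings copied from Table 15.2:
# (TA type, N, overall effectiveness, outside helpfulness)
ratings = [
    ("Inspiring Cheerleaders", 2, 62.7, 59.7),
    ("Entertaining Allies", 2, 59.0, 57.2),
    ("Knowledgeable Helpers & Casual Friends", 8, 55.0, 53.2),
    ("Mechanical Problem Solvers", 6, 42.7, 46.5),
    ("Active Unintelligible TAs", 3, 36.8, 36.1),
]

SCALE_MEAN = 50.0  # scale average stated in the text
SCALE_SD = 10.0    # scale standard deviation stated in the text


def sd_units(score):
    """Express a rating as standard deviations above (+) or below (-) the scale average."""
    return (score - SCALE_MEAN) / SCALE_SD


def strictly_decreasing(values):
    """Return True if every value is lower than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


overall = [row[2] for row in ratings]
outside = [row[3] for row in ratings]

for name, n, eff, helpf in ratings:
    print(f"{name:<40} N={n}  overall {sd_units(eff):+.2f} SD  outside {sd_units(helpf):+.2f} SD")

print("Overall effectiveness falls at every step:", strictly_decreasing(overall))
print("Outside helpfulness falls at every step:", strictly_decreasing(outside))
```

Read this way, the two highest-rated groups sit roughly one standard deviation above the scale average on overall effectiveness, and the lowest-rated group sits more than one standard deviation below it.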
Bailey did not set out to find these five types, or styles, among the teaching
assistants. As noted in the sample study in Chapter 14, her original research
question was, "What are the classroom communication problems of non-native
speaking teaching assistants?" In the proposal stage, she and her advisory
committee envisioned a discussion of the linguistic elements (e.g., pronunciation,
vocabulary, syntax, morphology, etc.) that influenced the NNS TAs' abilities to
communicate with their students. While these linguistic issues did in fact play an
important role, the differences among the students' mean ratings of the TAs
representing these five types suggest that how the TAs related to their students was
part of the communication problem, in addition to the language competence of
the TAs. (See Rounds, 1987, for further discussion of this typology in her own
classroom research.)
Since the early 1980s, it has become more common to incorporate both
quantitative and qualitative data collection and analysis procedures in the same
investigation (see Chaudron, 1986). In recent years, researchers have employed
hybrid designs that involve both qualitative and quantitative data collection and
interpretation. As Nunan (2005) points out, "Classroom researchers appear to be
increasingly reluctant to restrict themselves to a single data collection
techniques, or even to a single research paradigm" (p. 237). Although the
qualitative-quantitative debate introduced at the beginning of this book is still alive and well,
the issue has lost a great deal of the heat that it had some years ago.
We will now return to issues covered earlier in order to synthesize the
concepts presented in recent chapters. Our hope is to help you integrate ideas about
quantitative and qualitative data collection and analysis in your own research
projects.

DEVELOPING AND CARRYING OUT A RESEARCH PLAN

If you are completing a master's degree or doctoral studies, you will probably
have to create a research plan for your proposed study and get it approved by a
committee or your research supervisor before you proceed. If you are a
researcher seeking either internal or external funding, you will have to convince
the funding agency that you have a clear plan as well as the knowledge and skills
to carry it out. Even if you are working completely independently and do not
need approval from any other individual or group to carry out your
investigation, we encourage you to develop a research plan before you begin. Doing so
will save you time and trouble later.
A formal plan for research is called a proposal because the author proposes
the study to a group or an advisor by submitting a document that clearly
articulates the plan. In graduate programs, the proposal must typically be approved by
a faculty committee before the researcher begins the study. In other contexts,
researchers present proposals to funding agencies in hopes of getting grants to
support the research.
Although there is variation in the format required of proposals in different
contexts, a research proposal normally consists of some predictable, standard
sections. The introduction usually includes a literature review, including
identifying the research gap (Cooley & Lewkowicz, 2003), which helps provide the
motivation for the new study (see Chapter 2 above). The research questions
and/or hypotheses are clearly articulated and key terms are operationally
defined. The next section may be called "Procedures" or "Methods." It includes a
description of who will participate in the study, the steps to be taken to collect
and analyze the data, and a description of the instruments to be used. If a study
involves a questionnaire or a structured interview, copies of these tools will
typically be included in an appendix to the proposal. The likely outcomes and the
potential value of the study are also discussed.

Steps in the Research Process


You will recall from Chapters 2 and 3 that the research process begins with an
area of interest or concern. This focus can come from reading theory or learning
about what others have set out to explore or establish. More often than not, in
the case of language classroom research, it comes from our own teaching
experience. Some of us are interested in the effects of our own actions and practices
(such as the kinds of questions we ask or the classroom climate we set up) on
classroom dynamics and learning outcomes. Others wish to focus on aspects of
student behavior (such as preferred learning styles and strategies, task types, and
student-student interaction). For others, the experience of being a language
learner provides a stimulus for research. (See, for example, Campbell's [1996]
diary study based on her experiences as a student of Spanish in Mexico.)
After establishing an area of interest, the crucial next step is to frame that
interest as a research question. Pinning down what is initially a vague prospect
into a specific question can be extraordinarily difficult. We have been involved in
projects, either as researchers or research supervisors, in which it has taken
weeks, and sometimes even months, to get the question right.
Framing a research question properly will force you to be precise. Imagine,
for instance, that your area of interest is learning styles and strategies. This is a
huge general area that has spawned an enormous number of research questions,
including the following:
• Do effective language learners share a particular set of learning strategies?
• Is there a relationship between cultural background and learning style?
• Is there a relationship between previous learning experiences and learning
strategy preferences?
• What are the preferred classroom learning strategies of learners who have
been identified as having particular learning styles?
• What effect on students' language learning does deliberate teaching of
learning strategies have?

REFLECTION

Look back at the research on learning styles by Nunan and Wong (2006),
which served as the sample study in Chapter 5. What research question(s)
did the authors pose? Did they use qualitative or quantitative data
collection and analysis procedures, or both?

Having established a research question, the hard work of planning a research
design begins. Is the study to be an exploratory one or one that confirms or
contests prior research? Is it going to test the strength of relationships between two
or more variables or is it to be more open-ended and descriptive? Is it going to
involve the collection of one large set of data at a single point in time or is it to be
an iterative process in which multiple data sets are collected over a period of time?
Considering questions such as those posed above may lead you to adopt one
of the two 'pure' research paradigms discussed in Chapters 1, 4, 7, and 10
(Grotjahn, 1987). One of these, you may recall, is the analytical-nomological
paradigm, which involves an experimental or quasi-experimental design,
quantitatively collected data, and statistical analyses. The other 'pure' paradigm is the
exploratory-interpretive kind, in which selection and intervention are eschewed
in favor of naturalistic inquiry. This research paradigm entails a
nonexperimental design, qualitatively collected data, and interpretive analyses.
Returning to the theme of mixed methods research, it is quite possible that
you will adopt a 'hybrid' design involving both qualitative and quantitative data
collection and analysis. For example, an investigation into learning styles and
strategies might involve the administration of a questionnaire that would yield
quantitative data that could be analyzed statistically. You could combine those
results with a series of follow-up interviews that would provide qualitatively
collected data to be interpreted qualitatively.
In Chapter 1 we introduced Grotjahn's two 'pure' research designs: the
psychometric approach (experimental design, quantitative data, statistical
analysis) and the naturalistic approach (non-experimental design, qualitative data,
interpretive analysis). The remaining mixed forms of Grotjahn's classification
system are listed in Table 15.3.
Some studies invoke more than one paradigm in a single investigation. As
Dörnyei (2007) has noted, "mixed methods research offers researchers the
advantage of being able to choose from the full repertoire of methodological
options, producing as a result many different kinds of creative mixes" (p. 168).

ACTION

Look back at the sample studies that conclude the chapters of this book.
Try to categorize two or three of them using Grotjahn's classification
system for identifying research paradigms (see Table 15.3).



TABLE 15.3 The mixed methods paradigms in Grotjahn's research
classification (adapted from Grotjahn, 1987, pp. 59-60)

Paradigm Number and Label                   Research Design                      Data Collection   Data Analysis

1. Analytical-nomological                   Experimental                         Quantitative      Statistical
2. Exploratory-interpretive                 Nonexperimental                      Qualitative       Interpretive
3. Experimental-qualitative-interpretive    Experimental or quasi-experimental   Qualitative       Interpretive
4. Experimental-qualitative-statistical     Experimental or quasi-experimental   Qualitative       Statistical
5. Exploratory-qualitative-statistical      Nonexperimental design               Qualitative       Statistical
6. Exploratory-quantitative-statistical     Nonexperimental design               Quantitative      Statistical
7. Exploratory-quantitative-interpretive    Nonexperimental design               Quantitative      Interpretive
8. Experimental-quantitative-interpretive   Experimental or quasi-experimental   Quantitative      Interpretive
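
Readers who like to see a classification spelled out mechanically may find the following sketch useful. It is only an illustration, written in Python, of how the eight paradigms in Table 15.3 arise from crossing three two-way choices (design, data collection, data analysis). The variable and function names are our own assumptions and do not come from Grotjahn (1987).

```python
from itertools import product

# The three two-way choices that underlie Table 15.3.
DESIGNS = ("experimental", "nonexperimental")
DATA = ("quantitative", "qualitative")
ANALYSES = ("statistical", "interpretive")

# The two 'pure' paradigms have their own labels in the table.
PURE = {
    ("experimental", "quantitative", "statistical"): "analytical-nomological",
    ("nonexperimental", "qualitative", "interpretive"): "exploratory-interpretive",
}


def label(design, data, analysis):
    """Return the Table 15.3 style label for one combination of choices."""
    combo = (design, data, analysis)
    if combo in PURE:
        return PURE[combo]
    # Mixed forms are named design-data-analysis, with nonexperimental
    # designs labeled 'exploratory' as in the table.
    prefix = "experimental" if design == "experimental" else "exploratory"
    return f"{prefix}-{data}-{analysis}"


# Crossing the three choices gives 2 x 2 x 2 = 8 paradigms.
paradigms = {combo: label(*combo) for combo in product(DESIGNS, DATA, ANALYSES)}

for combo, name in paradigms.items():
    kind = "pure" if combo in PURE else "mixed"
    print(f"{name:40s} {kind}")
```

Classifying a sample study then amounts to answering the three questions (What design? What kind of data? What kind of analysis?) and reading off the label.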

Having decided on the research question(s), the design, the type(s) of data to
be collected, and the type(s) of analysis to be done, the next step is to identify the
informants for the study. These people are often called the subjects in experimental
research, the cohort in naturalistic inquiry, the respondents when surveys or
interviews are involved, and the participants in action research.
Sometimes a very well-constructed project with a well-articulated research
question never gets beyond the planning stage because appropriate subjects
cannot be located. Or it may be that the subjects are available, but the conditions
cannot be arranged that will enable you to carry out the study to your satisfaction.
This problem arises, for instance, in classroom-based experiments requiring the
assignment of subjects to experimental and control groups. In many school
contexts, the randomization required of the true experimental designs is simply not
feasible. In fact, "in our experience, in the world of real schools, real teachers,
and real students, this almost never happens" (Spada, Ranta, and Lightbown,
1996, p. 38).
Another important preliminary step is to define and operationalize the key
constructs underlying the study. Imagine, for instance, that motivation emerges
as a problem in your teaching situation. You may feel that the motivation of your
students is declining over the course of the academic year, and you would like to
carry out a study to determine whether there is any evidence to support your
perception. Having reviewed the literature, you might decide on a definition of
language learning motivation along the lines of "the internal drive to persist in
language learning over a protracted period of time."



The process of operationalizing the construct involves creating procedures
and tools that will enable you to collect data on the construct. For instance,
you might decide that motivated students are those who take steps to
improve their language outside of the formal classroom environment. You
therefore decide to keep a record of the frequency and duration of their visits
to the school's self-access language center. Your operational definition of
motivation has thus become "the frequency and duration of visits to the self-access
language center." This operationalization could be problematic in terms of
internal validity, however, because it is possible to posit alternative reasons for
the students' actions. Some may visit the center because it is a comfortable and
quiet place to check e-mails, chat with friends, read magazines, watch videos,
etc. (activities that might have little to do with the students' motivation for
language learning).
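
To make the idea of an operational definition concrete, here is a minimal sketch in Python of how the visit records described above might be tallied. The student identifiers, the visit log, and the function name are all hypothetical, invented purely for illustration. Note that the tally captures only frequency and duration; it cannot tell us why a student visited, which is exactly the internal validity concern raised above.

```python
from collections import defaultdict

# Hypothetical log of self-access center visits: (student id, minutes spent).
visit_log = [
    ("s01", 30),
    ("s02", 45),
    ("s01", 20),
    ("s03", 10),
    ("s01", 25),
]


def summarize_visits(log):
    """Tally the number of visits and the total minutes for each student."""
    summary = defaultdict(lambda: {"visits": 0, "minutes": 0})
    for student, minutes in log:
        summary[student]["visits"] += 1
        summary[student]["minutes"] += minutes
    return dict(summary)


# Under this operational definition, these two numbers per student
# stand in for the construct 'motivation'.
motivation_proxy = summarize_visits(visit_log)

for student, record in sorted(motivation_proxy.items()):
    print(student, record["visits"], "visits,", record["minutes"], "minutes")
```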
In Chapter 2, we discussed the importance of doing a literature review. One
function of the literature review is to help the researcher operationally define the
constructs under investigation. In many cases, it is perfectly acceptable (in fact,
often desirable) to operationalize your constructs in the same way that other
researchers have done. Under some circumstances, you may choose to adapt
others' operational definitions somewhat to suit your study. Either way, you need
to cite your sources accurately and appropriately.
One type of print-medium resource that can be particularly helpful for people
getting started on a research project is the genre known as the handbook. This
type of book is a compendium of key information, usually written by established,
credible authors with recognized expertise on the topic. For example, Hinkel
(2005) has edited the thousand-plus-page Handbook of Research in Second
Language Teaching and Learning, a collection of short explanations of key topics
in the field. The equally voluminous Handbook of Qualitative Research (Denzin
and Lincoln, 2000) is not solely about language learning, but it is helpful for
language classroom researchers. Burns and Richards (in press) have edited the
Cambridge Guide to Language Teacher Education, while Long and Doughty
(in press) have produced a collection called the Handbook of Second and Foreign
Language Teaching. Two other books, which were not designed specifically for
research purposes but will be helpful nonetheless, are Carter and Nunan's (2001)
The Cambridge Guide to Teaching English to Speakers of Other Languages and
Nunan's (2003) Practical English Language Teaching. The chapters in all these
handbooks are typically written to be concise and clear. They identify key issues
on the topics and can be helpful sources of definitions and of bibliographic
information as well.

ACTION

Locate one of the handbooks described above. Skim through it to see
which operational definitions provided by the authors might be helpful in
research projects that you would be interested in conducting.



Deciding on how your data are going to be analyzed is the next step in the
process. Qualitative data, as we saw in Chapter 14, are usually voluminous and
often must be condensed, sorted, and categorized before patterns emerge. If the
study involves quantification, you need to determine what statistical tools are
appropriate for working with the data in question. Are you looking at significant
differences in mean scores? Correlations between different data sets?
Differences in the frequency with which various events occur? All of these require
different statistical procedures, as we discussed in Chapter 13.

Ethical Concerns in Doing Language Classroom Research


In addition to articulating all the procedures involved in planning a study, we
must also be concerned about ethical issues. These include being honest and fair
with the participants in research projects and meeting professional standards for
how the study will be carried out.
Before you actually begin collecting any data, you need to address such
ethical considerations. Most universities these days have ethics committees (also
called human subjects committees) that are required to vet and approve all research
proposals before a study begins. The procedures usually require researchers to
inform potential subjects about the study and obtain formal written approval
from them prior to any data being collected. This process is normally called
obtaining informed consent. If you are observing or interviewing children, you
must typically get the informed consent of their parents or guardians.

ACTION

In the context where you would like to conduct research, find out if there
is a policy about involving people in research. If there is, what steps must
you take with regard to informing the potential participants in your study?

Some journals require contributing authors to comply with informed consent
procedures. For instance, the TESOL Quarterly has two ways that potential
authors can meet the informed consent guidelines. Contributors can either
document that they have followed the human subjects review procedures of
their home institutions or that they have complied with a list of conditions for
informing the people in the study. These steps can be found in the last few pages
of the TESOL Quarterly, along with guidelines about publishing data from
students' work.
Obtaining informed consent can be problematic in the case of investigations
into language learning and use. Telling the subjects what we are looking for may
prompt them to change their behavior, thus invalidating our results. As Labov
(1972) pointed out through his concept of the observer's paradox, the purpose of
much language-related research is to document how people use language when
they are not being observed, but the main way to collect such data is through
observation, which may influence how they use language.



Another interesting ethical challenge for researchers wishing to conduct
investigations in second language contexts is to make sure that the informed
consent process is understood by the participants in the study. That is, written
documents that explain the study must either be translated into the subjects' first
language or written at a level that will be clear to them, so they will fully
understand what they are agreeing to do.

ACTION

Think of a study that you would like to conduct. Draft the statement about
the research that you would give potential participants in order to obtain
their informed consent to be involved in the study.

Evaluating the Research Plan


At all stages in doing research, it is important to keep the research plan and
the data collection and analysis procedures constantly under review. Here is a
nonexhaustive list of questions that can help guide the planning and
implementation of your research. This list has been developed from a set originally
proposed by Nunan (1992). You can use these questions to reflect on your study and
self-evaluate your work as the investigation evolves. These questions can also
guide a peer review process. As shown in Figures 15.1, 15.2, and 15.3, some
questions are most logically posed before the study, others during the
investigation, and some as you are concluding the research.
We wish to stress again the importance of addressing these issues before you
begin collecting your data. More time spent planning will pay off in time saved
later. Remember the carpenter's rule: Measure twice, cut once.
Not all the problems that may occur can be identified in advance, however.
Figure 15.2 lists six key questions to ask during the investigation.
Finally, as your research project is coming to a close, there are many issues
to consider as you interpret the data. These issues will influence how and where
you report on your study. Questions about the results and how to present them
are listed in Figure 15.3.
Although we have separated these questions into those that are posed at the
beginning of a project, during the data collection and analysis, and after the study
has been conducted, it is actually a good practice to anticipate all these questions
at the outset. Doing so will help you save time and minimize frustrations.

Getting Support and Going Public


As noted in Chapter 1, we believe there is value in 'going public' with the results
of your study. If you do decide to share your research findings (and we hope
you will), a good way to start is by presenting your findings to a supportive
audience. We typically have our research students present both their initial plans and
then later their findings to groups of their peers.



Research Questions
Are the research questions worth investigating?
Are the questions feasible, given available resources?
Do the research questions imply a strong causal or correlational relationship
between two or more variables?
What are the constructs underlying the questions?
How are these constructs to be operationalized?

Subjects
Will it be possible to obtain the requisite number of relevant subjects?
What sampling strategies, if any, will be used to obtain subjects?
Does the research design require random assignment of subjects to different
conditions? If so, will the required randomization be feasible?
If the study entails longitudinal data collection, will subjects be available for a
sufficiently long period of time?

Data Collection Method
What data collection methods can be used for investigating the questions?
Which of these are feasible, given available resources and expertise?
Is it possible and/or desirable to utilize more than one data collection method?
Given the chosen data collection method(s), what threats are there to the internal
reliability of the study?
Given the chosen data collection method(s), what threats are there to the external
reliability of the study?
Given the chosen data collection method(s), what threats are there to the internal
validity of the study?
Given the chosen data collection method(s), what threats are there to the external
validity of the study?
How will the data collection tools and procedures be field tested prior to the
actual data collection?

Data Analysis Procedures
Does the research entail statistical analyses, interpretive analyses, or both?
Given the research questions and the type(s) of data, what statistical and/or
analytical tools are appropriate?
Is it necessary or desirable to quantify qualitative data? If so, how will this
quantification be done?
Do you, as a researcher, have the necessary skills to carry out the statistical
analysis? If not, is consulting help available?

FIGURE 15.1 Questions to pose before the study formally begins
(adapted from Nunan, 1992, p. 227)



What practical problems are emerging as the research proceeds?
What solutions to these problems are available?
In what way(s) is the research changing and evolving during the course of the
data collection process?
Are alternative questions or issues emerging as the data are collected?
Is the nature of the research changing as the research proceeds?
Are additional data or data collection procedures required?

FIGURE 15.2 Questions to pose during the study (adapted from
Nunan, 1992)

Results
What are the actual outcomes of the research?
Does the investigation answer the research question(s) originally posed?
Does it answer other questions as well (or instead)?
Are the results consistent with the findings of similar studies?
Are there any contradictory findings? If so, how can these be accounted for?
What are the implications of the findings for practice?
What additional questions and suggestions for further research are prompted by
the research?

Presentation
How can the research best be presented?
Who is/are the appropriate audience(s) for this study?
What form(s) will the research report take: monograph, thesis, journal article,
conference presentation?

FIGURE 15.3 Questions to pose after conducting the study (adapted
from Nunan, 1992)

Our research students have found it useful to get evaluative feedback from
fellow students at different stages in the research process. Peer feedback can be
extremely valuable, particularly if it is from students who are a stage or two
ahead in the research process. In their guide to writing qualitative research
proposals, Marshall and Rossman (1995) state,
The experiences of our graduate students suggest that the support of
peers is crucial for the personal and emotional sustenance that students
find so valuable in negotiating among faculty whose requests and
demands may be in conflict with one another. Graduate seminars or
advanced courses in qualitative methods provide excellent structures
for formal discussions as students deal with issues arising from role
management to grounded theory-building in their dissertations.
Student support groups also build in a commitment to others. ... By
establishing deadlines and commitments to one another, students
become more efficient and productive. (p. 136)
They add that "these groups bridge the 'existential aloneness' of the conduct of
dissertation research" (ibid.).
Collaborative feedback can also be provided in round-table discussions
among students who are at the same stage in their research. In addition to
providing substantive feedback from people who are 'in the same boat' as it
were, round-table discussions can reduce the sense of isolation that research
students often experience. The questions listed in Figures 15.1 through 15.3
above can be adapted for peer evaluations and round-table discussions as well
as for reflective self-evaluation.
For those who are ready to 'go public,' we often encourage our research
students to give presentations at professional conferences. Depending on the
culture of the professional organization you choose, you will find that most people
who attend such conferences do so because they are interested and wish to
learn. In addition, if you are presenting in a concurrent session (i.e., in a time slot
when there are several different presentations scheduled), conference goers can
choose the session that most appeals to them. As a result, the people who attend
your talk are likely to be motivated and interested audience members. As such,
they can be quite supportive of research students and teachers reporting on their
original investigations.

REFLECTION

What are the possible advantages and disadvantages of giving a public
presentation about your research? Brainstorm a list with a colleague or
classmate.

How does one get to give a presentation at a conference? Normally, conference
organizers require potential presenters to submit an abstract: a concise,
informative summary that will appear in the program booklet so that attendees can
decide which session will be most beneficial for them to attend. Sometimes a
longer text (e.g., 200 words) is reviewed by a selection committee and a very
short version (40 or 50 words) appears in the printed program. These summaries
are usually subjected to a blind review process, which means that the referees who
review proposals don't know who has written them and the authors don't know
who reads them. The blind review procedure is used to promote fairness and
impartiality in selecting the papers to be presented at the conference.
Conference presentations can take the form of workshops, demonstrations,
or formal papers that are read aloud. One format that we have found particularly
useful for novice researchers is the poster presentation. In this context, the
presenter creates a poster representing his or her project. Such posters can include
the research question(s), a list of the data collection and analysis procedures
used, flowcharts of how the study proceeded, tables or graphs depicting the
findings, and a list of recommendations based on the results. At the conference,
instead of reading a paper aloud to an audience, the presenter stands by the poster
for a period of time and explains the information to interested people who come
to view the poster. In this way, novice presenters get the opportunity to talk to
individuals or small groups in an interactive fashion. In addition, presenting the
ideas more than once can build confidence and help the speaker anticipate the
kinds of questions that will arise.

Some Practicalities of Doing Research


We close this section by summarizing some of the practical advice that has been
offered throughout this volume. Specifically, we will discuss anticipating
problems, making sure the project is feasible, letting the questions lead the way, and
knowing where to go to get help.
First, if you can anticipate problems, they can often be forestalled. In
Chapter 2 we listed the following problems identified in a survey of graduate students
and teachers involved in action research (Nunan, 1992). However, these issues
can be problems in undertaking a study using any research method:
1. Lack of time (a particular problem for those who are working full time)
2. Lack of expertise, particularly with critical phases of the research, such as
formulating the research question and determining the appropriate
research design and statistical tools
3. Difficulty in identifying research subjects
4. Problems in negotiating access to research sites
5. Issues of confidentiality
6. Ethical questions relating to collecting data
7. Problems flowing from the growth of the project after its initiation
8. Sensitivity of reporting negative findings, particularly if these relate to
worksites or individuals with whom one is associated
9. Preparation of a written report of the research. (p. 219)
Of these, lack of time was seen as the most important.
The time issue is, of course, related to whether or not the project is feasible.
Some of the most exciting research ideas founder because of practical
impediments. These are many and various. They can include lack of time or expertise
on the part of the researcher, ethical considerations, cost factors, or the inability
to find sufficient numbers of appropriate subjects. It is therefore important to
develop a detailed and carefully considered plan before launching into the
project proper. A plan that is detailed, well considered and, if possible, reviewed by
someone who has research experience will reveal pitfalls that may not be
immediately apparent to less experienced researchers.
It is also important to let the questions lead the way. One point we have
made a number of times in this book is that no research tradition or method is
inherently superior to any other. The procedures that you adopt should be
determined by the research questions you pose. The research questions are the
lynchpins of the entire investigative enterprise, which is why they are so
important. This is not to say that the questions should be set in stone. They may very
well evolve as you begin to collect and interrogate your data, particularly in
naturalistic inquiry and action research.
This point is vividly illustrated by Freeman (1992). In this classroom-based
study, the researcher set out to investigate issues of teacher discipline and control
in secondary school French-as-a-foreign-language classrooms in the United
States. However, as the research proceeded, he discovered that teacher discipline
was less of an issue than student self-discipline and self-control. In the course of
his investigation, he found an intriguing connection between control over the
target language and self-discipline on the part of the students.
Finally, throughout the life of an investigation, knowing where to get help
when it is needed is important. Sources of help include information available in
books, journals, and on the Internet; other researchers; and peers. If the research
is part of a formal degree, the research supervisor has an important role to play
in reviewing and evaluating the research. As Marshall and Rossman (1995) state,
In planning qualitative dissertation research, support from university
faculty to make judgments about the adequacy of the proposal is crucial.
At least one committee member, preferably the chair, should have had
experience conducting qualitative studies. Such experience should also
help in making decisions about how to allocate time realistically to
various tasks, given that all-important idea that qualitative research often
takes much more time than one might predict. Faculty support and
encouragement are critical for developing research proposals that are
substantial, elegant, and doable, and for advocacy in the larger university
community to legitimize this particular study and qualitative research
generally. (pp. 135-136)

QUALITY CONTROL ISSUES


The quality control issues in mixed methods studies are an amalgam of those
related to quantitative research (e.g., trying to ensure both internal and external
reliability as well as internal and external validity) and those related to more
qualitative research. Rather than repeat what we said in Chapters 13 and 14 on
these points, let us just suggest a metaphor here. That is, doing a mixed methods
study is rather like a balancing act. There are strengths and weaknesses to both
approaches, and it is generally felt that using procedures from both may help to
address the weaknesses.
Sometimes researchers, especially novice researchers, can get bogged down
in the sheer volume of data that emerges in a mixed methods study. A technique
we have found helpful is to take a conceptual step back and try to review the big
picture. Is your study primarily quantitative in nature, with the qualitative data
(observational field notes, interview transcripts, etc.) meant to explicate the
numerical outcomes? Or is it, like van Lier's (1996a) study of bilingual education
in Peru, primarily qualitative in design with some quantified data that can be
used to illustrate and substantiate patterns observed in and interpretations
suggested by the qualitative data?
Sometimes the qualitatively and quantitatively collected data share equal
weight in the research equation. To mix our metaphors just a bit, in this situation
it is almost as if the two sorts of data are dance partners, but the lead changes
continuously. As illustrated in Bailey's struggle with the international TA data,
sometimes the qualitative data point the way, and then the quantitative data take
over and point you in a different direction. Being able to cope with this sort of
ambiguity and being open to the nudges from the data are useful characteristics
for mixed methods researchers.
One practical technique is to create a chart for each research question you
are addressing, which lists the quantitative data collection procedures on one axis
and the qualitative data collection procedures on the other. By filling in the cells
in the chart, you can keep track of what information is emerging and what
remains elusive. Sometimes the quantitatively and qualitatively collected data
dovetail nicely, and other times they diverge. In either case, you will need to
explain the patterns that emerge.
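
For researchers who keep their project notes electronically, the chart described above could be kept as a simple grid. The following Python sketch is one possible way to do so, offered only as an illustration. The procedure names, the research question, and the function names are hypothetical; substitute the ones from your own study.

```python
# Hypothetical procedures for one research question.
quantitative = ["questionnaire", "test scores"]
qualitative = ["interviews", "field notes"]


def make_chart(question, quant_procedures, qual_procedures):
    """Create an empty chart with one cell per quantitative/qualitative pairing."""
    cells = {(q, l): [] for q in quant_procedures for l in qual_procedures}
    return {"question": question, "cells": cells}


def add_note(chart, quant_procedure, qual_procedure, finding):
    """Record what the two data sources, taken together, seem to show."""
    chart["cells"][(quant_procedure, qual_procedure)].append(finding)


def empty_cells(chart):
    """List the pairings for which nothing has been recorded yet."""
    return [pair for pair, notes in chart["cells"].items() if not notes]


chart = make_chart(
    "Do learning styles predict strategy use?", quantitative, qualitative
)
add_note(
    chart,
    "questionnaire",
    "interviews",
    "Interview comments dovetail with questionnaire preferences",
)

# Cells that are still empty show what remains elusive.
for pair in empty_cells(chart):
    print("No notes yet for:", pair)
```

A paper chart serves the same purpose; the point is simply to see at a glance which combinations of data sources have been examined and which have not.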

SAMPLE STUDY

As our final sample study, we will summarize research by Katz (1996). We chose
this mixed methods study because it uses both quantitative and qualitative data
collection methods as well as qualitative and quantitative analyses. It also
illustrates some interesting problems in operationalizing classroom constructs.
Katz (ibid.) studied the teaching styles of four teachers who all taught different
sections of the same ESL composition course at a large university in the
United States. The course assignments were intended "to prepare students for
the academic writing in their university content courses" (p. 59). The students in
these classes represented forty-three different first-language backgrounds. The
report "describes the interaction between teachers and university students as
students are engaged in learning how to improve their writing skills in a second
language" (p. 57). The four instructors involved in the study were selected
because of (1) their interest in participating in the project, (2) their recognized
excellence as teachers, and (3) their teaching style.
Katz particularly wished to investigate this last construct, teaching style.
Initially she defined this concept as "the manner in which the teacher interprets
his or her role within the context of the classroom in creating the culture of the
classroom" (p. 58). She notes, however, that teaching style "is a slippery
construct" since it involves both teachers' behaviors (which are observable) and their
beliefs (which may not be directly observable) (p. 61).
This study employed multiple data collection procedures. Katz tape-recorded
two interviews with each of the four teachers, one at the beginning
and one at the end of the semester. In addition, she observed their regularly
scheduled classes throughout the semester: "two weeks at the beginning of the
semester, one in the middle, and one at the end, averaging eleven hours per
teacher" (p. 60). During the observations, she made both written field notes and
audio recordings. Katz also kept notes in her researcher's journal throughout the
project.
To compare these four teachers, Katz examined teaching activities and
transcripts of the teachers' talk. She contrasted classroom events in the four teachers'
lessons across the dimensions of class openings, taking roll, the teachers' use
of space, the dominant structures of talk during lessons, turn selection, use of
narratives, use of rhetorical questions, and the teachers' policies on students
arriving late. She examined these data in order to discover
the similarities and differences across teachers as they engage in shaping
their writing classrooms. The portraits that follow combine an analysis
of these behaviors with excerpts from teacher interviews as we talked
about their goals and beliefs about writing and teaching, discussing
specific aspects of what occurred in their classrooms (p. 61).
Using these procedures, Katz selected metaphors to identify four styles among
the teachers she observed. Meg was characterized as the Choreographer, Sarah as
the Earth Mother, Ron as the Entertainer, and Karen as the Professor. The
qualitative analyses yielded a prose profile of each. Katz also used quantitative data to
help explain the differences among these four styles, as shown in Table 15.4.
TABLE 15.4 Percent of class time various interactional groupings were
used by four teachers (adapted from Katz, 1996, p. 67)

                                Meg               Sarah            Ron             Karen
                                (Choreographer)   (Earth Mother)   (Entertainer)   (Professor)

Teacher-to-class                79.1              64.0             65.3            79.7
Individual work                 11.4              2.2              6.0             6.5
Group                           9.5               28.7             9.0             13.8
Student-to-student (brief)      -                 5.1              3.3             -
Student-to-student (extended)   -                 -                16.4            -

Katz (ibid.) used these (and other) quantitative data to help characterize the
four teachers and explain how they used their lesson time. But the quantitative
data alone do not fully convey the stable style differences that emerged. For
example, if we compare the percent of teacher-to-class interaction across the four
teachers, we find that Meg, the Choreographer, and Karen, the Professor, appear
to be quite similar, with 79.1% and 79.7% of their class time, respectively, given
over to the teacher addressing the entire class. Katz points out that Karen's and
Meg's classrooms "actually differ dramatically in terms of how each teacher plays
out her lessons during these frames of teacher-to-class interaction" (p. 82). As
shown in numerous transcripts, Meg's talk time consisted largely of "carefully
shaped question and answer sequences," while Karen "delivered the lesson
content by means of extensive lectures" (ibid.). It would not have been possible to
make this important contrast confidently without the qualitatively collected data
(the transcripts and observation field notes).
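
The numerical comparison in this paragraph can be checked with a few lines of Python. The sketch below simply re-enters the figures from Table 15.4; the dictionary keys and the function name are our own. One assumption should be flagged: the placement of the student-to-student values (brief for Sarah and Ron, extended for Ron only) is inferred from the fact that each teacher's column then sums to 100%.

```python
# Percent of class time per grouping, re-entered from Table 15.4.
# Assumption: student-to-student values are assigned to Sarah and Ron
# so that every column sums to 100 percent.
class_time = {
    "Meg": {"teacher_to_class": 79.1, "individual": 11.4, "group": 9.5},
    "Sarah": {"teacher_to_class": 64.0, "individual": 2.2, "group": 28.7,
              "s2s_brief": 5.1},
    "Ron": {"teacher_to_class": 65.3, "individual": 6.0, "group": 9.0,
            "s2s_brief": 3.3, "s2s_extended": 16.4},
    "Karen": {"teacher_to_class": 79.7, "individual": 6.5, "group": 13.8},
}


def gap(teacher_a, teacher_b, grouping):
    """Absolute difference, in percentage points, on one grouping."""
    a = class_time[teacher_a].get(grouping, 0.0)
    b = class_time[teacher_b].get(grouping, 0.0)
    return round(abs(a - b), 1)


# Sanity check: each teacher's class time should total 100 percent.
for teacher, shares in class_time.items():
    print(teacher, round(sum(shares.values()), 1))

# Meg and Karen look nearly identical on this one measure.
print(gap("Meg", "Karen", "teacher_to_class"))
```

A gap of well under one percentage point is what makes Meg and Karen appear so similar in the quantitative data, and it is precisely why the transcripts were needed to show how differently they used that time.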
Here we have summarized just a small portion of Katz's study of teacher
style, but we hope these excerpts give you the flavor of how one classroom
researcher was able to combine quantitative and qualitative data collection and
analysis procedures to produce a rich picture of classroom interaction. In a
standard experimental design, these four teachers might have been considered to be
quite similar: They all taught from the same curriculum in the same program
with students drawn from the same population. Yet the teaching styles they
adopted and the ways they enacted that curriculum differed markedly, as shown
by the rich data sources Katz employed in this mixed methods study.

REFLECTION

Katz (1996) warns against the problem that a metaphor might "shape
perceptions rather than clarifying them" (p. 62). Look back at the sample
study described at the end of Chapter 4. That process-product study
involved Bailey and some colleagues as observers in low-level language
classes at a U.S. military installation. Consider this anecdote:

    There was one teacher whom Bailey and the other observers
    privately thought of as the "Gestapo" because her error treatment
    strategies were so intense. The teacher would interrupt students and
    emphatically model the correct form if they mispronounced words or
    made grammar or vocabulary errors during lessons. She sometimes
    belittled students about their errors. However, one day after an
    observation, a student who had just been the object of the teacher's
    scathing error treatment made the unsolicited comment to the
    observer that he really liked this teacher because she prepared him
    and his classmates so well for the exams they would have to take.
    Since those exams emphasized accuracy, he appreciated her
    consistent emphasis on the learners being correct.

What metaphor do you think this student would have used to characterize
this teacher?

PAYOFFS AND PITFALLS

It has been said that "books that deal with classroom research do little to help
researchers and future researchers understand the complexities and the problems
of conducting classroom-based research" (Schachter and Gass, 1996, p. viii). We
hope that this book has proven to be an exception to this claim (with which we
generally agree). The difficulty with being candid is that we don't want to
discourage anyone from getting started on classroom research. With that caveat in
mind, let us turn to the last set of "payoffs and pitfalls" in this book, starting with
potential problems.
The first pitfall associated with undertaking a mixed methods study has to
do with data. It is easy to get overwhelmed with masses of information any time
you are working with qualitatively collected data, and this problem can be
compounded by having quantitative data added to the study. We would suggest that
you develop techniques for managing the data in your particular research
project that will enable you to see clearly what has been done and what needs to be
done next. It is essential that you label all data with the date, time, and place of
collection, and that you make back-up files immediately of any information you
cannot afford to lose.
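
One way to make this labeling and backing up routine is to script it. The
following is only a minimal sketch in Python, not a prescribed procedure: the
function names, the file-naming pattern, and the folder arguments are all
assumptions made for illustration, and you would adapt them to your own project.

```python
import shutil
from datetime import datetime
from pathlib import Path


def labeled_name(original: str, collected: datetime, place: str) -> str:
    """Build a file name recording the date, time, and place of collection.

    The naming pattern (date_time_place_original-name) is an illustrative
    assumption, not a standard.
    """
    path = Path(original)
    stamp = collected.strftime("%Y-%m-%d_%H%M")
    safe_place = place.strip().lower().replace(" ", "-")
    return f"{stamp}_{safe_place}_{path.stem}{path.suffix}"


def store_with_backup(source: Path, archive: Path, backup: Path,
                      collected: datetime, place: str) -> Path:
    """Copy a data file into the archive under a labeled name,
    then immediately copy it again into a separate backup folder."""
    archive.mkdir(parents=True, exist_ok=True)
    backup.mkdir(parents=True, exist_ok=True)
    target = archive / labeled_name(source.name, collected, place)
    shutil.copy2(source, target)                # labeled working copy
    shutil.copy2(target, backup / target.name)  # immediate back-up copy
    return target
```

For instance, field notes collected in a hypothetical "Room 12" at 9:30 on 14
March 2009 would be stored as 2009-03-14_0930_room-12_fieldnotes.txt, so the
date, time, and place travel with the file itself. Ideally the backup folder
would sit on a different disk or device from the archive.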
Another issue related to managing copious qualitative and quantitative data
sets is connected to the ethical issues discussed above. If you have promised that
your data will be treated confidentially, then you must lock them away, either
electronically or physically, and shield the identities of the participants. If you
have promised the participants anonymity, you must provide pseudonyms that
neither give hints as to their identities nor obscure relevant information. You
need to develop a written key about the actual names and the pseudonyms you
use for your participants in any written reports or oral presentations. Keep that
list hidden and safe and only use the pseudonyms any time you discuss your
project.
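
If your transcripts and field notes are stored electronically, the written key
can also be applied mechanically. The sketch below is one possible approach in
Python, under stated assumptions: the participant names and pseudonyms are
invented, the function names are hypothetical, and the key is saved as a simple
CSV file that you would still need to store securely and apart from the data.

```python
import csv
import re


def build_key(real_names, pseudonyms):
    """Pair each participant with one distinct pseudonym."""
    if len(real_names) != len(pseudonyms):
        raise ValueError("Provide exactly one pseudonym per participant.")
    if len(set(pseudonyms)) != len(pseudonyms):
        raise ValueError("Pseudonyms must be distinct.")
    if set(real_names) & set(pseudonyms):
        raise ValueError("A pseudonym must not be a participant's real name.")
    return dict(zip(real_names, pseudonyms))


def save_key(key, path):
    """Write the key to a CSV file; keep this file hidden and safe."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["actual_name", "pseudonym"])
        writer.writerows(key.items())


def pseudonymize(text, key):
    """Replace every whole-word occurrence of a real name with its pseudonym."""
    # Longer names are handled first so that a full name is replaced
    # before any shorter name it happens to contain.
    for real in sorted(key, key=len, reverse=True):
        pattern = r"\b" + re.escape(real) + r"\b"
        text = re.sub(pattern, key[real], text)
    return text


# Invented example names, used only to show the idea.
key = build_key(["Maria Lopez", "Tom Becker"], ["Dana", "Lee"])
print(pseudonymize("Maria Lopez asked Tom Becker a question.", key))
```

A script like this only catches names exactly as they are listed in the key.
Nicknames, misspellings, and other identifying details (a hometown, a job
title) still have to be found by reading the data yourself.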
Yet another major pitfall associated with mixed methods studies is the problem
of time. Many of our research students over the years have eschewed statistical
research because they were nervous about mathematical analyses or didn't want to
spend the time needed to learn about statistical procedures. This choice is ironic
since, in our experience, nice, neat quantitative analyses usually take much less
time than does the laborious process of analyzing qualitative data or working back
and forth between qualitative and quantitative data sets in the same study.
One reason why mixed methods studies can be so time-consuming is that
different sorts of data may suggest different interpretations of the results. For
instance, if there is no significant difference between the pre-test and post-test
scores of a group of learners we have observed in a study, does this finding
indicate that no learning has occurred? If we have observational field notes and
transcripts of audio recordings we have made in a language class, we may be able to
look at those data and find evidence that what the students learned was not
addressed in the test, but doing so is very time-consuming.
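
To make the quantitative half of this example concrete, here is a minimal
sketch in Python of a paired-samples t-test on pre-test and post-test scores,
using only the standard library. The scores are invented for illustration, the
function name is an assumption, and the paired t-test is offered as one common
choice for this kind of comparison rather than as the only appropriate one.

```python
import math
import statistics


def paired_t(pre, post):
    """Return the paired-samples t statistic and its degrees of freedom."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("Need matching pre/post scores for at least two learners.")
    # Each learner's gain score: post-test minus pre-test.
    gains = [after - before for before, after in zip(pre, post)]
    mean_gain = statistics.mean(gains)
    sd_gain = statistics.stdev(gains)  # sample standard deviation of the gains
    if sd_gain == 0:
        raise ValueError("All gains are identical; t is undefined.")
    t = mean_gain / (sd_gain / math.sqrt(len(gains)))
    return t, len(gains) - 1


# Hypothetical scores for six learners (not data from any study).
pre_test = [52, 60, 47, 71, 65, 58]
post_test = [54, 59, 49, 70, 66, 58]

t, df = paired_t(pre_test, post_test)
print(f"t = {t:.2f}, df = {df}")
```

With these invented scores, t is roughly 0.89 with 5 degrees of freedom, well
below the two-tailed critical value of 2.571 at the .05 level, so the
difference would not be significant. That is exactly the situation described
above: the numbers alone cannot tell us whether nothing was learned or whether
the test simply did not address what was learned.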
This last point, of course, provides us with a segue into the payoffs. Combining
quantitative and qualitative data collection and analysis procedures, as Katz (1996)
and van Lier (1996a) did, can help us understand anomalies in our results. The
qualitative findings often shed light on the statistical outcomes, but at the same
time, numerical information can flesh out patterns observed in qualitative data.
An advantage of using quantitative data is that, for better or for worse,
policy makers, legislators, and administrators are often impressed by "hard
data": statistical results showing significant gains or differences or
relationships. An advantage of using the results of qualitative data is that they are often
more memorable and more intelligible to laypersons or to teachers who may lack
formal statistical training.
We predict that as technological tools for collecting and analyzing data
continue to increase in number and improve, there will be more mixed methods
language classroom studies in the future. If you would like to try your hand
at a project that involves both quantitative and qualitative data collection and
analyses, we suggest you start small. Investigate an issue in a class you are
teaching or taking that can utilize the existing or naturally occurring data sets.
For instance, if you are a teacher and you wish to document the development
of your secondary school language students' confidence and their writing skills
over time, photocopy the homework and in-class writing assignments they
submit to you. These data are naturally collected as part of the work you must
do anyway, and they can provide a rich source of information about the
students' emerging syntactic, morphological, lexical, and discourse development.
Combined with test scores or questionnaire data, they could help you
determine whether the learners' writing skill and confidence have increased in the
course of a school year.

CONCLUSION

In this final chapter, we have revisited some of the themes, issues, and concerns
that run through this book. We have considered combining quantitative and
qualitative data collection and data analysis procedures to produce mixed
methods studies. We have suggested a sequence of steps to take in developing a
research plan and reviewed the steps involved in planning and conducting
language classroom research. Our hope is that this chapter has both introduced
new ideas and reminded you of concepts introduced earlier in order to help you
put it all together in designing and carrying out your own language classroom
research.

QUESTIONS AND TASKS


1. Both Bailey (1982; 1984) and Katz (1996) used metaphors to describe
different teaching styles. Following the work of Miles and Huberman
(1984), Katz notes that using metaphors helps to reduce massive amounts
of qualitative information by turning particularities into generalizations
and thereby creating patterns. She also notes, however, that using
metaphors runs the risk of "shaping perceptions rather than clarifying
them" (p. 62). What do you think? Can you envision a study in which you
would want to characterize participants in some way that employed
metaphor?
2. Think of teachers that you have observed. Based on your experience, can
you recall a teacher who could be characterized as an "Entertainer" or a
"Choreographer" or an "Earth Mother" or a "Professor"? These labels are
from Katz's (1996) research (see Table 15.4). We can ask the same question
about the metaphors Bailey (1982; 1984) used to describe the teaching
assistants in her study. In other words, do these metaphors seem trustworthy
or broadly applicable based on your own experience (as an observer, a
student, a researcher, or a teacher)?
3. Prepare a detailed research plan for a project of your choice. Include
information on the following aspects of the research. (The following list is
adapted from Nunan, 1992, pp. 227-228.)

I. Background
    General area
    Title
    The problem or issue
    Purpose of the study
    Likely importance of the study
    Background to the study
    The aims and justification of the study
    Limitations of the study
    What the study proposes to do
II. Literature review
    A plan for the literature review including headings and subheadings
    Resources (books, journals, Web sites, etc.) to be consulted
III. Research design
    The research questions or hypotheses
    Definitions of terms
    Sampling strategies
    Subjects, participants, or informants
    Data collection methods
    Data analysis procedures
IV. Presentation and dissemination
    Statement on how the results will be presented
    Suggestions as to which conferences and/or publications would
    be appropriate

4. Grotjahn's (1987) two 'pure paradigms' were introduced in Chapter 1 and
discussed further in Chapters 7, 4, and 10. In this final chapter, we have
introduced the other combinations of the variables in his framework. Given
the study you sketched out in the task above, which of Grotjahn's research
paradigms would you use? Think about (a) the research involved, (b) the
types of data to be collected, and (c) the types of analyses to be done.
5. Review and evaluate your plan. What problems or difficulties do you
anticipate? How might these be dealt with?
6. Share your plan with a classmate or colleague for feedback and discuss any
issues that may arise.
7. Visit the Web sites of at least three professional organizations that interest
you. Find out when their upcoming conferences are scheduled and what
you must do in order to propose presentations that you could offer at those
conferences.
8. From the Web sites you visit, choose the conference that appeals to you
most. Draft an abstract for a presentation you would like to make
following the organization's guidelines. Share your abstract with a classmate or
colleague for feedback.
9. Visit the Web sites of some organizations that fund research in our field.
We suggest these three:
    A. The Spencer Foundation:
       www.spencer.org
    B. The Social Sciences and Humanities Research Council of Canada:
       http://sshrc.ca
    C. The International Research Foundation for English Language
       Education (TIRF):
       www.tirfonline.org
Find out what kinds of research they support and what their application
processes involve.
10. Compare the submission guidelines for at least three professional journals
that interest you. (The kinds of journals that accept language classroom
research reports include the TESOL Quarterly, Modern Language Journal,
Prospect: A Journal of Australian TESOL, The ELT Journal, Language
Teaching Research, Asian Journal of English Language Teaching, and so on.) How do
their submission requirements differ? Which journal might be the best for
you to consider if you would like to submit a research article for publication?
11. Here is a 200-word summary that could be submitted to a conference
adjudicating committee. Space in conference program booklets is usually
quite limited, however, and often all that appears there is a forty- or fifty-
word abstract. Imagine that you are the author of this study. Using the
information below, compose an interesting and informative fifty-word
abstract which would encourage conference attendees to come to your
presentation.

Summary: This paper reports on the results of a yearlong investigation
of ESL courses at community colleges in the United States. The study
addressed the issue of how teachers using content-based instruction shift
the focus of the lesson from content foci to language foci. Content-based
instruction (CBI) is defined as an approach to language curriculum design
that entails "the integration of content learning with language teaching
aims" (Brinton, Snow, & Wesche, 2003, p. ix). Also, "the form and
sequence of language presentation are dictated by content material"
(Brinton et al., 2003, p. ix). Thus, the research question posed in this
language classroom research was what strategies do teachers use to teach
language in content-based English as a second language classes? The
researcher observed classes in Arizona, California, Hawaii, Florida,
Illinois, and Texas, taking running field notes to document the activities
involved. In some cases, teachers were interviewed. The refined field
notes were sent to the teachers for member checking. This
presentation will concentrate on the patterns observed in the lessons of six
teachers whose classes were the most clearly content-based. A question-
and-answer period will follow the presentation and participants will
be given Internet access to a lengthy reference list on content-based
instruction.
Reference: Brinton, D. M., Snow, M. A., & Wesche, M. (2003). Content-
based second language instruction. Ann Arbor: University of Michigan
Press.

12. If you were a member of the conference selection committee that reviews
such summaries for the conference program, what questions or comments
would you have for the author of the summary above? What suggestions
would you have for improving the summary?

SUGGESTIONS FOR FURTHER READING

In an earlier chapter, we suggested you read the book Second Language Classroom
Research: Issues and Opportunities, edited by Schachter and Gass (1996). We
remind you of that book here because the authors in that volume intentionally
discussed the problems they encountered in their studies in hopes that other
researchers would benefit from a candid discussion of the sorts of difficulties that
arise. In particular, the chapters by Markee (1996) and Polio (1996) are useful.
For helpful advice on writing research proposals and reports, we
recommend Dornyei (2007), Galvan (1999), and Hatch and Lazaraton (1991). For
guidance on planning qualitative research, see Marshall and Rossman (1995;
1999; and 2006).
Some ethical considerations in proposing and conducting language
classroom research are discussed by McKay (2006).
For an interesting and influential discussion about the qualitative/
quantitative debate, see Smith and Heshusius (1986).
In recent years, teacher research has become more widely accepted. The
volume edited by Beaumont and O'Brien (2000) provides many examples of
teacher research in our field, as does the one edited by Edge and Richards
(1993). Some teacher research has been motivated by Allwright's work. Several
authors contributed to Understanding the Language Classroom (Gieve & Miller,
2006), a book inspired by his ideas. See also Burnaford, Fischer, and Hobson
(1996), Nunan (1997a, 1997b), and Stewart (2006). In general education,
Kincheloe (1991) and Lankshear and Knobel (2004) are good resources.
For two 'user-friendly' books see K. E. Johnson's Teachers Understanding
Teaching (1998) and Understanding Language Teaching: Reasoning in Action
(1999). For research on language teachers' decision making, see Woods (1996).
In addition, the TESOL organization has published a series of edited
research volumes that focus on studies conducted by teachers in various parts of
the world, including Asia (Farrell, 2006), Europe (Borg and Farrell, 2007), the
Americas (McGarrell, 2007), the Middle East (Coombe and Barlow, 2007), and
Australia and New Zealand (Burton and Burns, 2008). We recommend these
volumes because they provide numerous interesting examples of research done
by teachers in their own professional contexts.



REFERENCES

Acheson, K. A., & Gall, M. D. (1997). Techniques in the clinical supervision
    of teachers: Preservice and inservice applications (4th ed.). New York:
    Longman.
Adelman, C., Jenkins, D., & Kemmis, S. (1976). Rethinking case study:
    Notes from the second Cambridge conference. Cambridge Journal of
    Education, 6(3), 139-150.
Aljaafreh, A., & Lantolf, J. P. (1994). Negative feedback as regulation and
    second language learning in the zone of proximal development. Modern
    Language Journal, 78, 465-483.
Allen, P. J., Frohlich, M., & Spada, N. (1984). The communicative orientation
    of second language teaching. In J. Handscombe, R. Orem, & B. Taylor
    (Eds.), On TESOL '83: The question of control (pp. 231-252). Washington,
    DC: TESOL.
Allison, D. (1998). Investigating learners' course diaries as explorations of
    language. Language Teaching Research, 2(4), 24-47.
Allwright, D. (1983). Classroom-centered research on language teaching and
    learning: A brief historical overview. TESOL Quarterly, 17(2), 191-204.
Allwright, D. (1988). Observation in the language classroom. New York:
    Longman.
Allwright, D. (1997). Quality and sustainability in teacher-research. TESOL
    Quarterly, 31(2), 368-370.
Allwright, D. (2001). Three major processes of teacher development and
    the appropriate design criteria for developing and using them. In B.
    Johnston & S. Irujo (Eds.), Research and practice in language teacher
    education: Voices from the field (pp. 115-133). Minneapolis, MN:
    Center for Advanced Research on Language Acquisition.
Allwright, D. (2003). Exploratory practice: Rethinking practitioner research
    in language teaching. Language Teaching Research, 7, 113-141.
Allwright, D. (2005). Developing principles for practitioner research: The
    case of exploratory practice. Modern Language Journal, 89(3), 353-363.
Allwright, D., & Bailey, K. M. (1991). Focus on the classroom: An introduction
    to classroom research for language teachers. Cambridge: Cambridge
    University Press.
Allwright, D., & Lenzuen, R. (1997). Exploring practice: Work at the
    Cultura Inglesa, Rio de Janeiro, Brazil. Language Teaching Research,
    1(1), 71-79.
Allwright, R. L. (1980). Turns, topics, and tasks: Patterns of participation in
    language learning and teaching. In D. Larsen-Freeman (Ed.), Discourse
    analyses in second language research (pp. 165-187). Rowley, MA: Newbury
    House.
Alreck, P., & Settle, R. (1985). The survey research handbook. Homewood, IL:
    Irwin.
Amanpour, C. (2005). Commentary on National Public Radio.
Andrews, R. (Ed.). (1993). The Columbia dictionary of quotations. New York:
    Columbia University Press.
Appel, J. (1995). Diary of a language teacher. Oxford: Heinemann English
    Language Teaching.

Applewhite, A., Evans III, W., & Frothingham, A. (2003). And I quote:
    The definitive collection of quotes, sayings, and jokes for the contemporary
    speechmaker. New York: Thomas Dunne Books.
Asher, J. J., Kusudo, J. A., & de la Torre, R. (1993). Learning a second language
    through commands: The second field test. In J. W. Oller, Jr. (Ed.), Methods
    that work: Ideas for literacy and language teachers (2nd ed., pp. 13-21).
    Boston: Heinle & Heinle.
Auerbach, E. (1994). Participatory action research. In A. Cumming, Alternatives
    in TESOL research: Descriptive, interpretive and ideological orientations.
    TESOL Quarterly, 28(4), 673-703.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing
    and developing useful language tests. Oxford: Oxford University Press.
Bailey, K. M. (1981). An introspective analysis of an individual's language
    learning experience. In S. Krashen & R. Scarcella (Eds.), Issues in second
    language acquisition: Selected papers of the Los Angeles Second Language
    Research Forum (pp. 58-65). Rowley, MA: Newbury House.
Bailey, K. M. (1982). Teaching in a second language: The communicative
    competence of non-native speaking teaching assistants. Ph.D. dissertation
    in Applied Linguistics, University of California, Los Angeles.
Bailey, K. M. (1983a). Some illustrations of Murphy's Law from classroom
    centered research on language use. TESOL Newsletter, 17(4), August.
    (Reprinted in Selected articles from the TESOL Newsletter, 1966-1983,
    pp. 81-85, by J. F. Haskell, Ed., Washington, DC: TESOL).
Bailey, K. M. (1983b). Competitiveness and anxiety in adult second language
    learning: Looking at and through the diary studies. In H. W. Seliger &
    M. H. Long (Eds.), Classroom oriented research in second language
    acquisition (pp. 67-102). Rowley, MA: Newbury House.
Bailey, K. M. (1984). A typology of teaching assistants. In K. M. Bailey,
    F. Pialorsi, & J. Zukowski/Faust (Eds.), Foreign teaching assistants in
    U.S. universities (pp. 110-125). Washington, DC: NAFSA.
Bailey, K. M. (1990). The use of diary studies in teacher education programs.
    In J. C. Richards & D. Nunan (Eds.), Second language teacher education
    (pp. 215-226). Cambridge: Cambridge University Press.
Bailey, K. M. (1991). Diary studies of classroom language learning: The
    doubting game and the believing game. In E. Sadtono (Ed.), Language
    acquisition and the second/foreign language classroom (Anthology Series 28),
    (pp. 60-102). Singapore: SEAMEO Regional Language Center.
Bailey, K. M. (1992). The processes of innovation in language teacher
    development: What, why and how teachers change. In J. Flowerdew,
    M. Brock, & S. Hsia (Eds.), Perspectives on second language teacher
    education (pp. 253-282). Hong Kong: City Polytechnic of Hong Kong.
Bailey, K. M. (1996). The best laid plans: Teachers' in-class decisions to depart
    from their lesson plans. In K. M. Bailey & D. Nunan (Eds.), Voices from the
    language classroom: Qualitative research on language education
    (pp. 15-40). New York: Cambridge University Press.
Bailey, K. M. (1998a). Approaches to empirical research in instructional
    language settings. In H. Byrnes (Ed.), Learning foreign and second
    languages: Perspectives in research and scholarship (pp. 75-104). New York:
    Modern Language Association of America.
Bailey, K. M. (1998b). Learning about language assessment: Dilemmas, decisions
    and directions. Boston: Heinle & Heinle.

Bailey, K. M. (2001a). Action research, teacher research, and classroom
    research in language teaching. In M. Celce-Murcia (Ed.), Teaching English
    as a second or foreign language (3rd ed., pp. 489-498). Boston: Heinle &
    Heinle.
Bailey, K. M. (2001b). What my EFL students taught me. The PAC Journal,
    7, 7-31.
Bailey, K. M. (2005). Looking back down the road: A recent history of language
    classroom research. Review of Applied Linguistics in China: Issues in
    Language Learning and Teaching, 1, 6-47.
Bailey, K. M. (2006). Language teacher supervision: A case-based approach. New
    York: Cambridge University Press.
Bailey, K. M., Bergthold, B., Braunstein, B., Fleischman, N. J., Holbrook,
    M. P., Tuman, J., Waissbluth, X., & Zambo, L. (1996). The language
    learner's autobiography: Examining the "apprenticeship of observation."
    In D. Freeman & J. C. Richards (Eds.), Teacher learning in language
    teaching (pp. 11-29). New York: Cambridge University Press.
Bailey, K. M., Curtis, A., & Nunan, D. (2001). Pursuing professional
    development: The self as source. Boston: Heinle & Heinle.
Bailey, K. M., & Nunan, D. (Eds.) (1996). Voices from the language classroom:
    Qualitative research on language education. New York: Cambridge
    University Press.
Bailey, K. M., & Ochsner, R. (1983). A methodological review of the diary
    studies: Windmill tilting or social science? In K. M. Bailey, M. H. Long,
    & S. Peck (Eds.), Second language acquisition studies (pp. 188-198).
    Rowley, MA: Newbury House.
Bailey, K. M., & Saunders, S. (1998, March). Relationships among course
    objectives, self-assessment, and achievement in a learning strategies
    course. Paper presented at the Language Testing Research
    Colloquium, Monterey, California.
Barlow, M. (2005). Computer-based analyses of learner language. In R.
    Ellis & G. Barkhuizen, Analyzing learner language (pp. 335-357).
    Oxford: Oxford University Press.
Bateson, G. (1972). Steps to an ecology of mind. New York: Ballantine.
Beatty, K. (2003). Teaching and researching computer-assisted language learning.
    London: Longman.
Beaumont, M., & O'Brien, T. (2000). Collaborative research in second language
    education. Stoke on Trent, England: Trentham Books.
Bejarano, Y. (1987). A cooperative small-group methodology in the language
    classroom. TESOL Quarterly, 21(3), 483-504.
Bell, J. (1987). Doing your research project. Milton Keynes, England: Open
    University Press.
Benson, P., & Nunan, D. (2005). Learners' stories: Difference and diversity in
    language learning. New York: Cambridge University Press.
Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating
    language structure and use. Cambridge: Cambridge University Press.
Birch, G. J. (1992). Language learning case study approach to second
    language teacher education. In J. Flowerdew, M. N. Brock, & S. Hsia
    (Eds.), Perspectives on second language teacher education (pp. 283-294).
    Hong Kong: City Polytechnic of Hong Kong.
Bley-Vroman, R., & Chaudron, C. (1994). Elicited imitation as a measure
    of second-language competence. In E. E. Tarone, S. M. Gass, & A. D.
    Cohen (Eds.), Research methodology in second-language acquisition
    (pp. 245-262). Hillsdale, NJ: Lawrence Erlbaum Associates.

Block, D. (1996). A window on the classroom: Classroom events viewed from
    different angles. In K. M. Bailey & D. Nunan (Eds.), Voices from the
    language classroom: Qualitative research on language education
    (pp. 168-194). New York: Cambridge University Press.
Block, D. (1998). Tale of a language learner. Language Teaching Research,
    2(2), 148-176.
Borg, S., & Farrell, T. S. C. (Eds.). (2007). Language teacher research in Europe.
    Alexandria, VA: TESOL.
Bowers, R. (1980). Verbal behaviour in the language teaching classroom.
    Unpublished doctoral dissertation, University of Reading, England.
Braine, G. (1994). Comments on A. Suresh Canagarajah's "Critical
    ethnography of a Sri Lankan classroom." TESOL Quarterly, 28(3),
    609-623.
Brinton, D., & Holten, C. (1989). What novice teachers focus on: The
    practicum in TESL. TESOL Quarterly, 23, 343-350.
Brinton, D. M., Snow, M. A., & Wesche, M. (2003). Content-based second
    language instruction. Ann Arbor: University of Michigan Press.
Brock, C. (1986). The effect of referential questions on ESL classroom
    discourse. TESOL Quarterly, 20(1), 47-58.
Brock, M. N., Yu, B., & Wong, M. (1992). 'Journaling' together:
    Collaborative diary-keeping and teacher development. In J. Flowerdew,
    M. N. Brock, & S. Hsia (Eds.), Perspectives on second language teacher
    development (pp. 295-307). Hong Kong: City University of Hong Kong.
Brown, C. (1985a). Two windows on the classroom world: Diary studies and
    participant observation differences. In P. Larson, E. L. Judd, & D. S.
    Messerschmitt (Eds.), On TESOL '84: Brave new world for TESOL
    (pp. 121-134). Washington, DC: TESOL.
Brown, C. (1985b). Requests for specific language input: Differences between
    older and younger adult language learners. In S. Gass & C. Madden
    (Eds.), Input in second language acquisition (pp. 272-284). Rowley, MA:
    Newbury House.
Brown, H. D. (2004). Language assessment: Principles and classroom practices.
    White Plains, NY: Longman.
Brown, J. D. (1988). Understanding research in second language learning: A
    teacher's guide to statistics and research design. Cambridge: Cambridge
    University Press.
Brown, J. D. (2001). Using surveys in language programs. Cambridge:
    Cambridge University Press.
Brown, J. D. (2005). Testing in language programs: A comprehensive guide to
    English language assessment. New York: McGraw-Hill.
Brown, J. D., & Rodgers, T. S. (2002). Doing second language research. Oxford:
    Oxford University Press.
Brown, R. (1973). A first language: The early stages. London: Allen and Unwin.
Brumfit, C., & Mitchell, R. (1990a). The language classroom as a focus for
    research. In C. Brumfit & R. Mitchell (Eds.), Research in the language
    classroom: ELT Documents, 133 (pp. 3-15). London: Modern English
    Publications and British Council.
Brumfit, C., & Mitchell, R. (Eds.). (1990b). Research in the language
    classroom: ELT Documents, 133. London: Modern English Publications and
    British Council.
Bruner, J. (1983). Child's talk: Learning to use language. New York: Norton.
Burling, R. (1978). Language development of a Garo and English-speaking
    child. In E. M. Hatch (Ed.), Second language acquisition: A book of readings
    (pp. 54-75). Rowley, MA: Newbury House.
Burnaford, G., Fischer, J., & Hobson, D. (1996). Teachers doing research:
    Practical possibilities. Mahwah, NJ: Lawrence Erlbaum Associates.
Burns, A. (1995). Teacher-researchers: Perspectives on teacher action
    research and curriculum renewal. In A. Burns & S. Hood (Eds.), Teachers'
    voices: Exploring course design in a changing curriculum (pp. 3-29).
    Sydney: NCELTR, Macquarie University.
Burns, A. (1997). Valuing diversity: Action researching disparate learner
    groups. TESOL Journal, 7(1), 6-9.
Burns, A. (1999). Collaborative action research for English language teachers.
    Cambridge: Cambridge University Press.
Burns, A. (2000). Facilitating collaborative action research: Some insights
    from the AMEP. Prospect: A Journal of Australian TESOL, 15(3), 23-34.
Burns, A. (2004). Action research. In E. Hinkel (Ed.), Handbook of research in
    second language teaching and learning (pp. 241-256). Mahwah, NJ:
    Lawrence Erlbaum Associates.
Burns, A., & Burton, J. (Eds.). (2008). Language teacher research in Australia
    and New Zealand. Alexandria, VA: TESOL.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1995). Exploring course design
    in a changing curriculum: Teachers' voices 1. Sydney: National Centre for
    English Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1997). Teaching disparate
    learner groups: Teachers' voices 2. Sydney: National Centre for English
    Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1999a). Teaching critical
    literacies: Teachers' voices 3. Sydney: National Centre for English
    Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1999b). Staying learner-centred
    in a competency-based curriculum: Teachers' voices 4. Sydney: National
    Centre for English Language Teaching and Research, Macquarie
    University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1999c). Teaching casual
    conversation: Teachers' voices 6. Sydney: National Centre for English
    Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (2000). A new look at reading
    practices: Teachers' voices 5. Sydney: National Centre for English
    Language Teaching and Research, Macquarie University.
Burns, A., & Richards, J. C. (in press). Cambridge guide to language teacher
    education. Cambridge: Cambridge University Press.
Burton, J., & Burns, A. (Eds.). (2008). Language teacher research in Australia
    and New Zealand. Alexandria, VA: TESOL.
Busch, M. (1993). Using Likert scales in L2 research. TESOL Quarterly, 24(4),
    733-736.
Campbell, C. C. (1996). Socializing with the teachers and prior language
    learning experience: A diary study. In K. M. Bailey & D. Nunan (Eds.),
    Voices from the language classroom: Qualitative research on language
    education (pp. 201-223). New York: Cambridge University Press.
Campbell, D. T., & Stanley, J. C. (1963). Experimental and non-experimental
    designs for research. Washington, DC: American Educational Research
    Association.
Canagarajah, A. S. (1993). Critical ethnography of a Sri Lankan
classroom: Ambiguities in student Chan, Y. H. (1996). Action research as
opposition to reproduction through professional development for ELT
ESOL. TESOL Quarterly, 27(4), practitioners. Working Papers in ELT
601-625. and Applied Linguistics, 2(1), 17-28.
Carless, D. (1999). Catering for individual Hong Kong: Hong Kong Polytechnic
learner differences: Primary school University.
teachers' voices. Hong Kong Journal of Chaudron, C. (1986). The interaction
Applied Linguistics, 4(2), 15-40. of quantitative and qualitative
Carr, W., & Kemmis, S. (1986). Becoming approaches to research: A view of the
critical: Education, knowledge and action second language classroom. TESOL
research. London: Falmer. Quarterly, 20(4), 709-717.
Carrasco, R. L. (1981). Expanded Chaudron, C. (1988). Second language
awareness of student performance: A classrooms: Research on teaching and
case study in applied ethnographic learning. Cambridge: Cambridge
monitoring of a bilingual classroom. University Press.
In H. T. Trueba, G. P. Guthrie, & Choi, J. (2006). A narrative analysis of
H. P. Au (Eds.), Culture and the second language acquisition and identity
bilingual classroom: Studies in classroom formation. Unpublished master's
ethnography (pp. 153-177). Rowley, of science dissertation, Anaheim
MA: Newbury House. University, Anaheim, California.
Carroll, M. (1994). Journal writing as a Christison, M. A. (2003). Learning styles and
learning and research tool in the strategies. In D. Nunan (Ed.), Practical
adult classroom. TESOL Journal, 4, English language teaching (pp. 267-288).
19-22. New York: McGraw-Hill.
Carson, J. G., & Longhini, A. (2002). Christison, M. A., & Bassano, S. (1995).
Focusing on learning styles and strategies: Action research: Techniques for
A diary study in an immersion collecting data through surveys and
setting. Language Learning, 52(2), interviews. The CATESOL Journal,
401-438. 8(1), 89-103.
Carter, R., Goddard, A., Reah, D., Christison, M. A., & Nunan, D. (2001,
Sanger, K., & Bowering, M. (2001). March). Pedagogical functions in
Working with texts: A core introduction synchronous e-learning interaction:
to language analysis (2nd ed.). London: The online conversation class.
Routledge. Paper presented at the international
Carter, R., & Nunan, D. (Eds.). (2001). TESOL Convention, St. Louis,
The Cambridge guide to teaching Missouri.
English to speakers of other languages. Clark, J. L. D. (1969). The Pennsylvania
Cambridge: Cambridge University Project and the audiolingual vs.
Press. traditional question. Modern Language
Celce-Murcia, M. (1978). The simultaneous Journal, 53, 388-396.
acquisition of English and French Clarke, M. A. (1995, March). Ideology,
in a two-year-old child. In E. M. method, style: The importance of particularizability.
Hatch (Ed.), Second language acquisition: Paper presented at the
A book of readings (pp. 38-53). international TESOL Convention,
Rowley, MA: Newbury House. Long Beach, California.
Chamot, A. U. (1995). The teacher's Cleghorn, A., & Genesee, F. (1984). Languages
voice: Research in your classroom. in contact: An ethnographic
ERIC/CLL News Bulletin, 19(2), 1, 5. study of interaction in an immersion

school. TESOL Quarterly, 18(4), Curtis, A. (1999). Use of action research
in exploring the use of spoken English
Coffey, A., & Atkinson, P. (1996). Making in Hong Kong classrooms. In Y. M.
sense of qualitative data. Thousand Cheah & S. M. Ng (Eds.), IDAC
Oaks, CA: Sage. Monograph: Language instructional
issues in the Asian classroom (pp. 78-88).
Cohen, A. D., & Hosenfeld, C. (1981).
Some uses of mentalistic data in
Newark, DE: International Reading
Association.
second language research. Language
Learning, 31(2), 285-313. Curtis, A., & Bailey, K. M. (in press).
Cohen, J. M., & Cohen, M. J. (1980). The Diary studies. On CUE Journal.
Penguin dictionary of modern quotations Dagneaux, E., Denness, S., & Granger, S.
(2nd ed.). London: Penguin. (1998). Computer-aided error analysis.
Cohen, L., & Manion, L. (1985). Research System, 26, 163-174.
methods in education. London: Croom Damon, W., & Phelps, E. (1989). Critical
Helm. distinctions among three approaches
Cole, R., Raffier, L. M., Rogan, P., &
to peer education. International Journal
Schleicher, L. (1998). Interactive of Educational Research, 58, 9-19.
group journals: Learning as a dialogue Danielson, D. (1981, March). Views of
among learners. TESOL Quarterly, language learning from an "older
32(3), 556-568. learner" (Part II). CATESOL
Newsletter, 12(5), 6 and 16.
Cooley, L., & Lewkowicz, J. (2003).
Dissertation writing in practice: Turning Davidson, F. (1998). The ordinal-interval
ideas into text. Hong Kong: Hong distinction reconsidered. Language
Kong University Press. Testing Update, 23, 56-64.
Coombe, C., & Barlow, L. (Eds.). (2007). Davis, K. A., & Lazaraton, A. (Eds.).
Language teacher research in the Middle (1995). Qualitative research in ESOL.
East. Alexandria, VA: TESOL. TESOL Quarterly, 29(3).
Crago, M. B. (1992). Communicative Delaney, A. E., & Bailey, K. M. (2000,
interaction and second language March). Teaching journals: Writing
acquisition: An Inuit example. TESOL for professional development. ESL
Quarterly, 26(3), 487-505. Magazine, 16-18.
Crookes, G. (1993). Action research for Denzin, N. K. (1978). The research act:
second language teaching: Going A theoretical introduction to sociological
beyond teacher research. Applied methods (2nd ed.). New York:
Linguistics, 14, 130-142. McGraw-Hill.

Crookes, G. (1998). On the relationship Denzin, N. K., & Lincoln, Y. S. (Eds.).


between second and foreign language (2000). Handbook of qualitative research
teachers and research. TESOL Journal, (2nd ed.). Thousand Oaks, CA: Sage
7(3), 6-11. Publications.

Crookes, G. (2005). Resources for incorporating Dingwall, S. (1984). Critical self-reflection
action research as critique and decisions in doing research: The
into applied linguistics graduate case of a questionnaire survey of EFL
education. Modern Language Journal, teachers. In S. Dingwall & S. Mann
89(3), 467-475. (Eds.), Methods and problems in doing
Curtis, A. (1998). Action research: What, research (pp. 3-30). Lancaster,
how, and why. The English Connection, England: Department of Linguistics
3(1), 12-14.
and Modern English Language,
University of Lancaster.

Donato, R. (2000). Sociocultural contributions Duff, P. A. (1991b). The efficacy of dual-language
to understanding the foreign education in Hungary: An
and second language classroom. In J. P. investigation of three Hungarian-English
Lantolf (Ed.), Sociocultural theory and programs. (Final Report for Year-Two
second language learning (pp. 27-50). [1990-91] of the project). Los Angeles,
Oxford: Oxford University Press. CA: University of California, Los
Donato, R., & Adair-Hauck, B. (1992). Angeles, The Language Resource
Discourse perspectives on formal Program.
instruction. Language Awareness, 1(2), Duff, P. A. (1995). An ethnography of
73-89. communication in immersion classrooms
Donato, R., Antonek, J. L., & Tucker, in Hungary. TESOL Quarterly,
G. R. (1994). A multiple perspectives 29(3), 505-537.
analysis of a Japanese FLES program. Duff, P. A. (1996). Different languages,
Foreign Language Annals, 27(3), different practices: Socialization of
365-378. discourse competence in dual-language
Dörnyei, Z. (2003). Questionnaires in school classrooms in Hungary. In K.
second language research: Construction, M. Bailey & D. Nunan (Eds.), Voices
administration, and processing. Mahwah, from the language classroom:
NJ: Lawrence Erlbaum Associates. Qualitative research on language
Dörnyei, Z. (2007). Research methods in education (pp. 407-443). New York:
applied linguistics. Oxford: Oxford Cambridge University Press.
University Press. Duff, P. A. (2008). Case study research in
Dörnyei, Z., & Murphey, T. (2003). applied linguistics. New York: Lawrence
Group dynamics in the language Erlbaum Associates/Taylor & Francis.
classroom. Cambridge: Cambridge Dulay, H., & Burt, M. (1973). Should we
University Press. teach children syntax? Language
Doughty, C., & Pica, T. (1986). "Information Learning, 23, 245-258.
gap" tasks: Do they facilitate Duterte, A. (2000). A teacher's investigation
second language acquisition? TESOL of her own teaching. Applied
Quarterly, 20(2), 305-325. Language Learning, 11(1), 99-122.
Dowsett, G. (1986). Interaction in the Edge, J. (Ed.). (2001). Action research.
semi-structured interview. In M. Alexandria, VA: TESOL.
Emery (Ed.), Qualitative research Edge, J., & Richards, K. (Eds.). (1993).
(pp. 50-56). Canberra: Australian Teachers develop teachers research:
Association of Adult Education. Papers on classroom research and teacher
Duff, P. A. (1990). Developments in the development. Oxford: Heinemann
case study approach to SLA research. International.
In T. Hayes & K. Yoshioka (Eds.), Edwards, J. A., & Lampert, M. D. (Eds.).
Proceedings of the 1st Conference on (1993). Talking data: Transcription and
Second Language Acquisition and coding in discourse research. Hillsdale,
Teaching (pp. 34-87). Tokyo: NJ: Lawrence Erlbaum Associates.
International University of Japan. Ehlich, K. (1993). HIAT: A transcription
Duff, P. A. (1991a). Innovations in system for discourse data. In J. A.
foreign language education: An Edwards & M. D. Lampert (Eds.),
evaluation of three Hungarian-English Talking data: Transcription and coding
dual-language programs. Journal of in discourse research (pp. 123-148).
Multilingual and Multicultural Hillsdale, NJ: Lawrence Erlbaum
Development, 12, 459-476. Associates.

Ellis, R. (1984). Second language classroom suggested three-stage approach to
development. Oxford: Pergamon. exploratory practice. In S. Gieve &
Ellis, R. (1985). Understanding second I. K. Miller (Eds.), Understanding the
language acquisition. Oxford: Oxford language classroom (pp. 175-199).
University Press. Basingstoke, Hampshire, England:
Ellis, R. (1988). Classroom second language
Palgrave Macmillan.
development. London: Prentice Hall. Farrell, T. S. C. (Ed.). (2006). Language
teacher research in Asia. Alexandria, VA:
Ellis, R. (1989). Classroom learning styles
TESOL.
and their effect on second language
acquisition: A study of two learners. Fetterman, D. M. (1989). Ethnography:
System, 17, 249-262. Step by step. Newbury Park,
Ellis, R. (1990a). Researching classroom CA: Sage.
language learning. In C. Brumfit & Flanders, N. (1970). Analyzing teaching
R. Mitchell (Eds.), Research in the language behavior. Reading, MA: Addison-Wesley.
classroom (pp. 54-70). London:
Modern English Publications. Flick, U. (1998). An introduction to
Ellis, R. (1990b). Instructed second language qualitative research. London: Sage.
acquisition: Learning in the classroom. Fowler, P. (1988). Survey research methods.
Oxford: Basil Blackwell. Newbury Park, CA: Sage.
Ellis, R., & Barkhuizen, G. (2005). Fox, K. (2004). Watching the English:
Analysing learner language. Oxford: The hidden rules of English behaviour.
Oxford University Press. London: Hodder and Stoughton.
Ericsson, K. A., & Simon, H. A. (1984). Fraser, B., Rintell, E., & Walters, J.
Protocol analysis: Verbal reports as data. (1980). An approach to conducting
Cambridge, MA: MIT Press. research on the acquisition of
Ericsson, K. A., & Simon, H. A. (1987). pragmatic competence in a second
Verbal reports on thinking. In language. In D. Larsen-Freeman
C. Faerch & G. Kasper (Eds.), Introspection (Ed.), Discourse analyses in second
in second language research language research (pp. 75-91). Rowley,
(pp. 24-53). Clevedon, England: MA: Newbury House.
Multilingual Matters. Freeman, D. (1989). Teacher training,
Ericsson, K. A., & Simon, H. A. (1993). development, and decision making:
Protocol analysis: Verbal reports as data A model of teaching and related
(Rev. ed.). Cambridge, MA: MIT Press. strategies for language teacher
Faerch, C., & Kasper, G. (Eds.). (1987). education. TESOL Quarterly, 23(1),
Introspection in second language research. 27-45.
Clevedon, England: Multilingual Freeman, D. (1992). Collaboration:
Matters. Constructing shared understandings
Fanselow, J. F. (1977). Beyond in a second language classroom. In
Rashomon—conceptualizing and D. Nunan (Ed.), Collaborative language
describing the teaching act. TESOL learning and teaching (pp. 56-80).
Quarterly, 11(1), 17-39. Cambridge: Cambridge University
Press.
Fanselow, J. F. (1987). Breaking rules—
generating and exploring alternatives in Freeman, D. (1996a). Redefining the
language teaching. White Plains, NY: relationship between research and
Longman.
what teachers know. In K. M. Bailey
& D. Nunan (Eds.), Voices from the
Fanselow, J. F., & Barnard, R. language classroom: Qualitative research
(2006). Take 1, take 2, take 3: A

on language education (pp. 88-115). New Gass, S. M., & Mackey, A. (2007). Data
York: Cambridge University Press. elicitation for second and foreign language
Freeman, D. (1996b). The "unstudied research. Mahwah, NJ: Lawrence
problem": Research on teacher Erlbaum Associates.
learning in language teaching. In D. Gass, S. M., & Schachter, J. (Eds.). (1996).
Freeman & J. C. Richards (Eds.), Second language classroom research: Issues
Teacher learning in language teaching and opportunities. Mahwah, NJ:
(pp. 351-378). Cambridge: Lawrence Erlbaum Associates.
Cambridge University Press. Gieve, S., & Miller, I. K. (Eds.). (2006).
Freeman, D. (1998). Doing teacher Understanding the language classroom.
research: From inquiry to understanding. Basingstoke, Hampshire, England:
Boston: Heinle & Heinle. Palgrave Macmillan.
Fry, J. (1988). Diary studies in classroom Giroux, H. (1983). Theory and resistance in
SLA research: Problems and prospects. education: A pedagogy for the opposition.
JALT Journal, 9, 158-167. South Hadley, MA: Bergin and Garvey.
Gaies, S. J. (1980, July). Classroom centered Gliksman, L., Gardner, R. C., & Smythe,
research: Some consumer guidelines. P. C. (1982). The role of the integrative
Paper presented at the TESOL motive on students' participation
Summer Meeting, Albuquerque, NM. in the French classroom. Canadian
Gaies, S. J. (1983). The investigation of Modern Language Review, 38, 625-647.
language classroom processes. TESOL Grandcolas, B., & Soule-Susbielles, N.
Quarterly, 17(2), 205-217. (1986). The analysis of the foreign
Galda, D. (in press). "My words is big language classroom. Studies in Second
problem": The life and learning Language Acquisition, 8, 293-308.
experiences of three elderly Eastern Green, J., & Wallat, C. (Eds.). (1981).
European refugees studying ESL at a Ethnography and language in educational
community college. In K. M. Bailey & settings. Norwood, NJ: Ablex Publishing
M. G. Santos (Eds.), Research on Corporation.
English as a second language in U.S. Grotjahn, R. (1987). On the methodological
community colleges: People, programs and basis of introspective
potential. Ann Arbor, MI: University of methods. In C. Faerch & G. Kasper
Michigan Press. (Eds.), Introspection in second language
Galvan, J. L. (1999). Writing literature research (pp. 54-81). Clevedon,
reviews: A guide for students of the social England: Multilingual Matters.
and behavioral sciences. Los Angeles: Gu, P. Y., & Wen, Q. (2005). How often
Pyrczak Publishing. is often? Reference ambiguities of the
Garner, H. (2006). The rules of engagement: Likert scale in language learning strategy
Paul Greengrass's United 93. research. Review of Applied Linguistics
The Monthly Online. Retrieved in China: Issues in Language Learning
January 25, 2008, from and Teaching, 1, 61-80.
http://www.themonthly.com.au/tm/node/271 Halbach, A. (2000). Finding out about
Gass, S. M. (1997). Input, interaction, and students' learning strategies by looking
the second language learner. Mahwah, at their diaries: A case study.
NJ: Lawrence Erlbaum Associates. System, 28, 85-96.
Gass, S. M., & Mackey, A. (2000). Halliday, M. A. K. (1975). Learning
Stimulated recall methodology in second how to mean: Explorations in the
language research. Mahwah, NJ: development of language. London:
Lawrence Erlbaum Associates. Edward Arnold.

Hammersley, M., & Atkinson, P. (1983). learning. Mahwah, NJ: Lawrence
Ethnography: Principles in practice. Erlbaum Associates.
London: Tavistock Publications. Ho, B., & Richards, J. (1993). Reflective
Harklau, L. (1994). ESL versus mainstream thinking through teacher journal
classes: Contrasting L2 learning writing: Myths and realities. Prospect:
environments. TESOL Quarterly, A Journal of Australian TESOL, 8(3),
28(2), 241-272. 7-24.
Harklau, L. (2000). From the "good kids" Holliday, A. (2002). Doing and writing
to the "worst": Representations of qualitative research. London: Sage.
English language learners across educational Holmes, O. W. (1906). The professor at
settings. TESOL Quarterly, the breakfast table. London: J. M. Dent
34(1), 35-67. & Co.
Harrison, I. (1996). Look who's talking Hornberger, N. (1988). Bilingual
now: Listening to voices in curriculum education and language maintenance:
renewal. In K. M. Bailey & D. Nunan A Southern Peruvian case study.
(Eds.), Voices from the language classroom: Dordrecht: Foris Publications.
Qualitative research on second Huang, J. (2005). A diary study of
language education (pp. 283-303). New difficulties and constraints in EFL
York: Cambridge University Press. learning. System, 33, 609-621.
Hatch, E. M. (Ed.). (1978). Second Hughes, A. (1989). Testing for language
language acquisition: A book of readings. teachers. Cambridge: Cambridge
Rowley, MA: Newbury House. University Press.
Hatch, E. M., & Farhady, H. (1982). Hunt, K. W. (1970). Syntactic maturity in
Research design and statistics for applied school children and adults. Chicago, IL:
linguistics. Rowley, MA: Newbury University of Chicago Press.
House.
Jaeger, R. (1988). Survey research
Hatch, E. M., & Lazaraton, A. (1991).
methods in education. In R. M. Jaeger
The research manual: Design and statistics (Ed.), Complementary methods for
for applied linguistics. New York: research in education (pp. 303-338).
Newbury House. Washington, DC: American
Heath, S. B. (1983). Ways with words. Educational Research Association.
Cambridge: Cambridge University Jaeger, R. M. (1993). Statistics—a spectator
Press.
sport (2nd ed.). Newbury Park, CA:
Henze, R. C. (1995). Guides for the Sage.
novice qualitative researcher. TESOL Jarvis, J. (1992). Using diaries for reflection
Quarterly, 29(3), 595-599. on in-service courses. English
Hilleson, M. (1996). "I want to talk to Language Teaching Journal, 46(2),
them but I don't want them to hear": 133-143.
An introspective study of second- Jepson, K. (2005). Conversations—and
language anxiety in an English- negotiated interaction—in text and
medium school. In K. M. Bailey & voice chat rooms. Language Learning
D. Nunan (Eds.), Voices from the language and Technology, 9(3), 79-98.
classroom: Qualitative research in second
Johnson, D. (1992). Approaches to research
language education (pp. 248-275).
in second language learning. White
Cambridge: Cambridge University
Press.
Plains, NY: Longman.
Hinkel, E. (Ed.). (2005). Handbook of Johnson, K. E. (1992a). Learning to
teach: Instructional actions and
research in second language teaching and

decisions of preservice ESL teachers. Kebir, C. (1994). An action research look
TESOL Quarterly, 26(3), 507-535. at the communication strategies of
Johnson, K. E. (1992b). The instructional adult learners. TESOL Journal, 4(1),
decisions of pre-service English as a 28-31.
second language teachers: New Kemmis, S., & Henry, C. (1989). Action
directions for teacher preparation research. IATEFL Newsletter, 102, 2-3.
programs. In J. Flowerdew, M. Brock, Kemmis, S., & McTaggart, R. (1982). The
& S. Hsia (Eds.), Perspectives on action research planner. Victoria: Deakin
second language teacher development University.
(pp. 115-134). Hong Kong: City Kemmis, S., & McTaggart, R. (1988).
University of Hong Kong. The action research planner (3rd ed.).
Johnson, K. E. (1995). Understanding Victoria: Deakin University.
communication in second language Kennedy, M. (1990). Policy issues in teacher
classrooms. Cambridge: Cambridge education. East Lansing, MI: National
University Press. Center for Research on Teaching.
Johnson, K. E. (1996). The vision versus Kincheloe, J. L. (1991). Teachers as
the reality: The tensions of the researchers: Qualitative inquiry as a path
TESOL practicum. In D. Freeman & to empowerment. London: The Falmer
J. C. Richards (Eds.), Teacher learning Press.
in language teaching (pp. 30-49).
Knezedvoc, B. (2001). Action research.
Cambridge: Cambridge University
IATEFL Teacher Development SIG
Press.
Newsletter, 1(1), 10-12.
Johnson, K. E. (1998). Teachers understanding
Knowles, T. (1990). Action research: A
teaching. Boston: Heinle &
way to make our ideas matter. The
Heinle.
Language Teacher, 14(7).
Johnson, K. E. (1999). Understanding
Kramsch, C. (2000). Social discursive
language teaching: Reasoning in action.
constructions of self in L2 learning.
Boston: Heinle & Heinle.
In J. P. Lantolf (Ed.), Sociocultural
Jones, F. R. (1994). The lone language theory and second language learning
learner: A diary study. System, 22(4), (pp. 133-153). Oxford: Oxford
441-454. University Press.
Jones, F. R. (1995). Learning an alien Krishnan, L. A., & Hoon, L. H. (2002).
lexicon: A teach-yourself case study. Diaries: Listening to 'voices' from the
Second Language Research, 11(2), multicultural classroom. English
95-111. Language Teaching Journal, 56(3),
Jourdenais, R. (2001). Cognition, 227-239.
instruction, and protocol analysis. In Kuhn, T. (1996). The structure of scientific
P. Robinson (Ed.), Cognition and second revolutions (3rd ed.). Chicago:
language learning (pp. 354-375). University of Chicago Press.
Cambridge: Cambridge University
Kumaravadivelu, B. (1994). The
Press.
post-method condition: (E)merging
Katz, A. (1996). Teaching style: A way to strategies for second/foreign language
understand instruction in language teaching. TESOL Quarterly, 28(1),
classrooms. In K. M. Bailey & 28-49.
D. Nunan (Eds.), Voices from the
Kumaravadivelu, B. (2003). Beyond
language classroom: Qualitative research
methods: Macrostrategies for language
on language education (pp. 57-87). New
teaching. New Haven, CT: Yale
York: Cambridge University Press.
University Press.

Kwan, T. Y. L. (1993). Contexts for Education Conference, University of
action research development: The Hong Kong, Hong Kong.
case for Hong Kong. Curriculum Leopold, W. F. (1978).A child's learning
Forum, 3(3), 11-23. of two languages. In E. M. Hatch
Labov, W. (1972). Some principles of (Ed.), Second language acquisition: A
linguistics methodology. Language in book of readings (pp. 23-32). Rowley,
Society, 1, 97-120. MA: Newbury House.
Lankshear, C., & Knobel, M. (2004). Levine, H., Gallimore, R., Weisner, T. S.,
A handbook for teacher research: From & Turner, J. L. (1980). Teaching participant
design to implementation. New York: observation research methods:
Open University Press. A skills-building approach. Anthropology
Lantolf, J. P. (Ed.). (2000). Sociocultural and Education Quarterly, 11(1),
theory and second language learning. 38-45.
Oxford: Oxford University Press. Lewin, K. (1946). Action research and
Larimer, R. E., & Schleicher, L. (Eds.). minority problems. Journal of Social
(1999). New ways in using authentic Issues, 2, 34-46.

materials in the classroom. Alexandria, Lewin, K. (1948). Resolving social conflicts.


Virginia: TESOL. New York: Harper and Row.
Larsen-Freeman, D. (1996). The Lieblich, A., Tuval-Mashiach, R., &
changing nature of second language Zilber, T. (1998). Narrative research:
classroom research. In J. Schachter & Reading, analysis, and interpretation.
S. Gass (Eds.), Second language classroom Thousand Oaks, CA: Sage.
research: Issues and opportunities Liebscher, G., & Dailey-O'Cain, J.
(pp. 157-170). Mahwah, NJ: (2003). Conversational repair as a
Lawrence Erlbaum Associates. role-defining mechanism in classroom
Law, B., & Eckes, M. (1995). Assessment interaction. Modern Language Journal,
and ESL: A handbook for K-12 educators. 87(3), 375-390.
Winnipeg: Peguis Publishers. Likert, R. (1932). A technique for the
Lazaraton, A. (2004). Gesture and speech measurement of attitude scales.
in the vocabulary explanations of one Archives ofPsychology, 140, 1-55.
ESL teacher: A microanalytic inquiry. Lin, A. M. Y. (1999). Doing-English-
Language Learning, 54, 79-118. lessons in the reproduction or transformation
LeCompte, M., & Goetz, J. (1982). of social worlds? TESOL
Problems of reliability and validity in Quarterly, 33(3), 393-412.
ethnographic research. Review of Lin, Y., & Hedgcock, J. (1996). Negative
Educational Research, 52(1), 31-60. feedback incorporation among high-
Lee, E., & Lew, L. (2001). Diary studies: proficiency and low-proficiency
The voices of nonnative English Chinese-speaking learners of Spanish.
speakers in a master of arts program in Language Learning, 46, 567-611.
teaching English to speakers of other Lincoln, Y. S., & Guba, E. G. (1985).
languages. CATESOL Journal, 13(1), Naturalistic inquiry. Newbury Park,
135-149. CA: Sage.
Legutke, M. (2000, December). Long, M. H. (1980). Inside the 'black
Redesigning the foreign language box': Methodological issues in
classroom: A critical perspective on research on language teaching and
information technology (IT) and educational learning. Language Learning, 30(1),
change. Plenary presentation 1-42. (Reprinted in Classroom oriented
at the International Language in research in second language acquisition,

pp. 3-35, by H. W. Seliger and M. H. Mackey, A., & Gass, S. M. (2005). Second
Long, Eds., 1983, Rowley, MA: language research: Methodology and design.
Newbury House). Mahwah, NJ: Lawrence Erlbaum
Long, M. H. (1983a). Does second language Associates.
instruction make a difference? Markee, N. (1996). Making second
TESOL Quarterly, 17(3), 359-382. language classroom research work. In
Long, M. H. (1983b). Linguistic and J. Schachter & S. Gass (Eds.), Second
conversational adjustments to non-native language classroom research: Issues and
speakers. Studies in Second opportunities (pp. 117-155). Mahwah,
Language Acquisition, 5, 177-193. NJ: Lawrence Erlbaum Associates.
Long, M. H. (1984). Process and product Markee, N. (2000). Conversational analysis.
in ESL program evaluation. TESOL Mahwah, NJ: Lawrence Erlbaum
Quarterly, 18(3), 409-425. Associates.

Long, M. H. (1985). A role for instruction Markee, N. (2003). Qualitative research
in second language acquisition: guidelines (conversational analysis).
Task-based language training. In K. TESOL Quarterly, 37, 169-172.
Hyltenstam & M. Pienemann (Eds.), Markee, N. (2005). Conversational analysis
Modelling and assessing second language for second language acquisition. In
acquisition (pp. 77-100). Clevedon, E. Hinkel (Ed.), Handbook of research
England: Multilingual Matters. in second language teaching and learning
Long, M. H. (1996). The role of the (pp. 355-374). Mahwah, NJ:
linguistic environment in second Lawrence Erlbaum Associates.
language acquisition. In W. Ritchie & Marshall, C., & Rossman, G. B. (1995).
T. Bhatia (Eds.), Handbook of research Designing qualitative research (2nd ed.).
on second language acquisition (pp. Thousand Oaks, CA: Sage.
413-469). New York: Academic Press. Marshall, C., & Rossman, G. B. (1999).
Long, M. H., & Doughty, C. J. (Eds.). Designing qualitative research (3rd ed.).
(in press). Handbook of second and Thousand Oaks, CA: Sage.
foreign language teaching. Oxford: Marshall, C., & Rossman, G. B. (2006).
Blackwell. Designing qualitative research (4th ed.).
Lowe, T. (1987). An experiment in role Thousand Oaks, CA: Sage.
reversal: Teachers as language learners. Martyn, E. (1996). The influence of task-type
English Language Teaching Journal, on the negotiation of meaning in
4(2), 89-96. small group work. Paper presented at
Lozanov, G. (1979). Suggestology and the Annual Pacific Second Language
outlines of Suggestopedia. New York: Research Forum, Auckland, New
Gordon and Breach Science Publishers, Zealand.
Inc. Martyn, E. (2001). The effects of task
Lozanov, G. (1982). Suggestology and type on negotiation of meaning in
Suggestopedia. In R. W. Blair (Ed.), small group work. Unpublished
Innovative approaches to language doctoral dissertation, University of
teaching (pp. 146-159). Rowley, MA: Hong Kong, Hong Kong.
Newbury House. Mason, J. (1996). Qualitative researching.
Lynch, T., & Maclean, J. (2000). Exploring London: Sage.
the benefits of task repetition and Matsuda, A., & Matsuda, P. (2001).
recycling for classroom language Autonomy and collaboration in
learning. Language Teaching Research, teacher education: Journal sharing
4(3), 221-250. among native and nonnative

English-speaking teachers. CATESOL Miles, M. B., & Huberman, A. (1994).
Journal, 13(1), 109-121. Qualitative data analysis: An expanded
Matsumoto, K. (1987). Diary studies of sourcebook. Thousand Oaks, CA: Sage.
second language acquisition: A critical Mingucci, M. (1999). Action research in
overview. JALT Journal, 9, 17-34. ESL staff development. TESOL
Matsumoto, K. (1989). An analysis of a Matters, 9(2), 16.
Japanese ESL learner's diary: Factors Mitchell, M., & Jolley, J. (1988). Research
involved in the L2 learning process. design explained. New York: Holt
JALT Journal, 11, 167-192. Rinehart Winston.
Maxwell, J. A. (2005). Qualitative research Mok, A. (1997). Student empowerment
design: An interactive approach (2nd ed.). in an English language enrichment
Thousand Oaks, CA: Sage. programme: An action research project
McCarthy, M., & Walsh, S. (2003). in Hong Kong. Educational Action
Discourse. In D. Nunan (Ed.), Research, 5(2), 305-320.
Practical English language teaching Moore, T. (1977). An experimental language
(pp. 173-195). New York: handicap: Personal account.
McGraw-Hill. Bulletin of the British Psychological
McKay, S. L. (2006). Researching second Society, 30, 107-110.
language classrooms. Mahwah, NJ: Morgan, D. (1997). Focus groups as qualitative
Lawrence Erlbaum Associates. research (2nd ed.). Thousand
McDonough, J. (1994). A teacher looks at Oaks, CA: Sage.
teachers' diaries. English Language Moskowitz, G. (1968). The effects of
Teaching Journal, 48(1), 57-65. training foreign language teachers in
McGarrell, H. M. (Ed.). (2007). Language Interaction Analysis. Foreign Language
teacher research in the Americas. Annals, 1(3), 218-235.
Alexandria, VA: TESOL. Moskowitz, G. (1971). Interaction
McPherson, P. (1997). Action research: Analysis—a new modern language for
Exploring learner diversity. Prospect: supervisors. Foreign Language Annals,
A Journal of Australian TESOL, 12(1), 5, 211-221.
50-62. Moskowitz, G. (1976). The classroom
Merriam, S. B. (1988). Case study research interaction of outstanding foreign
in education: A qualitative approach. San language teachers. Foreign Language
Francisco: Jossey-Bass. Annals, 9, 125-143 and 146-157.
Merriam, S. B. (1998). Qualitative Nassaji, H., & Cumming, A. (2000).
research and case study applications in What's in a ZPD? A case study of a
education (2nd ed.). San Francisco: young ESL student and teacher interacting
Jossey-Bass. through dialogue journals. Language
Michonska-Stadnik, A., & Szulc-Kurpaska, Teaching Research, 4(2), 95-121.
M. (Eds.). (1997). Action research in Nisbett, R., & Wilson, T. (1977). Telling
the lower Silesia cluster colleges: more than we can know: Verbal reports
Developing learner independence. on mental process. Psychological
Orbis Linguarum, 2. Legnica, Poland: Review, 84, 231-259.
Nauczycielskie Kolegium Języków Numrich, C. (1996). On becoming a language
Obcych and the British Council. teacher: Insights from diary
Miles, M. B., & Huberman, A. (1984). studies. TESOL Quarterly, 30(1),
Qualitative data analysis: A sourcebook 131-151.
of new methods. Beverly Hills, CA: Nunan, D. (1988). The learner-centred
Sage. curriculum: A study in second language

teaching. Cambridge: Cambridge Nunan, D. (1997a). Developing standards
University Press. for teacher-research in TESOL.
Nunan, D. (1989). Understanding TESOL Quarterly, 31(2), 365-367.
language classrooms: A guide for teacher-initiated Nunan, D. (1997b). Research, the teacher
action. New York: Prentice and classrooms of tomorrow. In G. M.
Hall. Jacobs (Ed.), Language classrooms
Nunan, D. (1990). Action research in the of tomorrow: Issues and responses
language classroom. In J. C. Richards (pp. 183-194). Singapore: SEAMEO
& D. Nunan (Eds.), Second language Regional Language Center.
teacher education (pp. 62-81). Nunan, D. (1999).Second language teach
Cambridge: Cambridge University ingandlearning. Boston:Heinle &
Press. Heinle.
Nunan, D. (1991a). Methods in second Nunan, D. (Ed.). (2003). Practical English
language classroom-oriented research: language teaching. New York:
A critical review. Studies in Second McGraw-Hill.
Language Acquisition, 13(2), 249-274. Nunan, D. (2004). Task-based language
Nunan, D. (1991b). Second language teaching. Cambridge: Cambridge
acquisition research in the language University Press.
classroom. In E. Sadtono (Ed.), Lan Nunan, D. (2005). Classroom research.
guage acquisition and the second/foreign In E. Hinkel (Ed.), Handbook of
language classroom (AnthologySeries research in second language teaching and
28, pp. 1-24). Singapore: SEAMEO learning (pp. 225-240).Mahwah, NJ:
Regional Language Center. Lawrence Erlbaum Associates.
Nunan, D. (1991c). Language teaching Nunan, D. (2007a). What is this thing
methodology. London: Prentice Hall. called language? London: Palgrave
Nunan, D. (1992). Research methods in Macmillan.
language learning. Cambridge: Nunan, D. (2007b, September). Diverse
Cambridge University Press. voices: What we can learn from listen
Nunan, D. (1993a). Action research in ing to our learners? Plenary presenta
language education.In J. Edge & tion at the English Australia Confer
K Richards (Eds.), Teachers develop ence, Sydney, Australia.
teachers research: Papers onclassroom Nunan, D., & Lamb, C. (1996). The self-
research andteacher development directed teacher: Managing thelearning
(pp. 39-50). Oxford: Heinemann process. Cambridge: Cambridge
International. University Press.
Nunan, D. (1993 b). Teachers' interactive Nunan, D., & Wong, L. (2003, Decem
decision-making. Sydney:National ber). Learning styles and strategies:
Centre for English Language An empirical investigation. Paper
Teaching and Research. presented at the Chulalongkorn
Nunan, D. (1996). Hidden voices: University Language Institute
Insiders' perspectiveson classroom International Conference, Bangkok,
interaction. In K. M. Bailey & Thailand.
D. Nunan (Eds.), Voicesfrom the Nunan, D., & Wong, L. (2006). The
language classroom: Qualitative research good language learner: An empirical
onlanguage education (pp. 41-56). investigation. Unpublished manu
New York: Cambridge University script, University of Hong Kong, The
Press. English Centre.

O'Farrell, A. (2003). The language of
nonproliferation studies. Unpublished
manuscript, Monterey Institute of
International Studies, Monterey,
California.
Ohta, A. S. (2000). Rethinking interaction
in SLA: Developmentally appropriate
assistance in the zone of proximal
development and the acquisition
of L2 grammar. In J. P. Lantolf (Ed.),
Sociocultural theory and second language
learning (pp. 51-78). Oxford: Oxford
University Press.
Otto, F. M. (1969). The teacher in the
Pennsylvania project. Modern Language
Journal, 53, 411-420.
Oxford, R. (2001). Language learning
strategies. In R. Carter & D. Nunan
(Eds.), The Cambridge guide to teaching
English to speakers of other languages
(pp. 166-172). Cambridge:
Cambridge University Press.
Oxford, R. (2002). Language learning styles
and strategies. In M. Celce-Murcia
(Ed.), Teaching English as a second or
foreign language (3rd ed., pp. 359-383).
Boston: Heinle & Heinle.
Palmer, C. H. (1992). Diaries for self-assessment
and INSET programme
evaluation. European Journal of Teacher
Education, 15(3), 227-238.
Palmer, G. M. (1992). The practical
feasibility of diary studies for INSET.
European Journal of Teacher Education,
15(3), 239-254.
Parkinson, B., & Howell-Richardson, C.
(1990). Learner diaries. In C. Brumfit
& R. Mitchell (Eds.), Research in the
language classroom: ELT Documents,
133 (pp. 128-140). London: English
Publications and the British Council.
Peck, S. (1980). Language play in child
second language acquisition. In D.
Larsen-Freeman (Ed.), Discourse
analysis in second language research
(pp. 154-164). Rowley, MA: Newbury
House.
Peck, S. (1996). Language learning
diaries as mirrors of students' cultural
sensitivity. In K. M. Bailey & D.
Nunan (Eds.), Voices from the language
classroom: Qualitative research in second
language education (pp. 236-247).
Cambridge: Cambridge University
Press.
Pennington, M. C., & Richards, J. C.
(1997). Reorienting the teaching universe:
The experience of five first-year
English teachers in Hong Kong.
Language Teaching Research, 1(2),
149-178.
Perecman, E., & Curran, S. (2006). A
handbook for social science field research.
Thousand Oaks, CA: Sage.
Perry, F. L. (2005). Research in applied
linguistics: Becoming a discerning
consumer. Mahwah, NJ: Lawrence
Erlbaum Associates.
Peyton, J. K. (1990). Beginning at the
beginning: First-grade ESL students
learning to write. In A. Padilla, H. H.
Fairchild, & C. M. Valadez (Eds.),
Bilingual education: Issues and strategies
(pp. 195-218). Newbury Park, CA: Sage.
Pica, T. (1994). Research on negotiation:
What does it reveal about second language
learning conditions, processes
and outcomes. Language Learning, 44,
493-527.
Pica, T. (1997). Second language teaching
and research relationships: A North
American view. Language Teaching
Research, 1(1), 48-72.
Pica, T., & Doughty, C. (1985a). The
role of group work in classroom
second language acquisition. Studies in
Second Language Acquisition, 7(2),
233-248.
Pica, T., & Doughty, C. (1985b). Input
and interaction in the communicative
language classroom: A comparison of
teacher-fronted and group activities.
In S. Gass & C. Madden (Eds.),
Input in second language acquisition
(pp. 115-132). Rowley, MA: Newbury
House.
Pike, K. L. (1964). Language in relation to
a unified theory of structures of human
behavior. The Hague: Mouton.
Piore, M. (2006). Combining qualitative
and quantitative tools: Qualitative
research—does it fit in economics? In
E. Perecman & S. Curran (Eds.), A
handbook for social science field research
(pp. 143-157). London: Sage.
Plummer, K. (1983). Documents of life: An
introduction to the problems and literature
of a humanistic method. London:
George Allen and Unwin.
Polio, C. (1996). Issues and problems in
reporting classroom research. In J.
Schachter & S. Gass (Eds.), Second
language classroom research: Issues and
opportunities (pp. 61-79). Mahwah, NJ:
Lawrence Erlbaum Associates.
Polio, C., & Wilson-Duffy, C. (1998).
Teaching ESL in an unfamiliar
context: International students in a
North American TESOL practicum.
TESOL Journal, 7(4), 24-29.
Popper, K. (1968). The logic of scientific
discovery. London: Hutchinson.
Popper, K. (1972). Objective knowledge.
Oxford: Oxford University Press.
Porter, P. A., Goldstein, L. M.,
Leatherman, J., & Conrad, S. (1990).
An ongoing dialogue: Learner logs for
teacher preparation. In J. C. Richards
& D. Nunan (Eds.), Second language
teacher education (pp. 227-240).
Cambridge: Cambridge University
Press.
Porto, M. (2007). Learning diaries in the
English as a foreign language classroom:
A tool for accessing learners'
perceptions of lessons and developing
learner autonomy and reflection.
Foreign Language Annals, 40(4),
672-696.
Quirke, P. (2001). Hearing voices: A
robust and flexible framework for
gathering and using student feedback.
In J. Edge (Ed.), Action research
(pp. 81-91). Alexandria, VA: TESOL.
Radecki, W. (2002). Student and teacher
preferences in the high-tech classroom.
Unpublished manuscript, Zayed
University, Dubai, United Arab
Emirates.
Reid, J. (1990). The dirty laundry of ESL
survey research. TESOL Quarterly,
24(2), 323-338.
Richards, J. C., & Lockhart, C. (1994).
Reflective teaching in second language
classrooms. Cambridge: Cambridge
University Press.
Richards, K. (1992). Pepys into a TEFL
course. English Language Teaching
Journal, 46(2), 144-152.
Richards, K. (2003). Qualitative inquiry in
TESOL. Houndsmill, UK: Palgrave
Macmillan.
Rivers, W. (1979). Learning a sixth
language: An adult learner's diary.
Canadian Modern Language Review,
36(1), 67-82.
Rivers, W. (1983). Communicating
naturally in a second language: Theory
and practice in language teaching.
Cambridge: Cambridge University
Press.
Roebuck, R. (2000). Subjects speak out:
How learners position themselves in a
psycholinguistic task. In J. P. Lantolf
(Ed.), Sociocultural theory and second
language learning (pp. 79-95). Oxford:
Oxford University Press.
Rounds, P. L. (1987). Characterizing
successful classroom discourse for
NNS teaching assistant training.
TESOL Quarterly, 21(4), 643-671.
Rounds, P. L., & Schachter, J. (1996).
The balancing act: Theoretical,
acquisitional and pedagogical issues
in second language research. In J.
Schachter & S. Gass (Eds.), Second
language classroom research: Issues
and opportunities (pp. 99-116).
Mahwah, NJ: Lawrence Erlbaum
Associates.
Rowntree, D. (1981). Statistics without
tears: A primer for non-mathematicians.
London: Longman.
Rowsell, L. V., & Libben, G. (1994). The
sound of one hand clapping: How to
succeed in independent language
learning. Canadian Modern Language
Review, 50(4), 668-687.
Rubin, J., & Henze, R. (1981, February).
The foreign language requirement: A
suggestion to enhance its educational
role in teacher training. TESOL
Newsletter, 15, 17, 19, 24.
Ruiz de Gauna, P., Diaz, C., Gonzalez,
V., & Garaizar, I. (1995). Teachers'
professional development as a process
of critical action research. Educational
Action Research, 3(2), 183-194.
Ruso, N. (2007). The influence of task
based learning on EFL classrooms.
Asian EFL Journal, 18. Retrieved
July 11, 2007, from
http://www.asian-efl-journal.com/pta_February_2007_nr.php
Santana-Williamson, E. (2001). Early
reflections: Journaling a way into
teaching. In J. Edge (Ed.), Action
research (pp. 33-44). Alexandria, VA:
TESOL.
Sato, C. (1982). Ethnic styles in classroom
discourse. In M. Hines & W.
Rutherford (Eds.), On TESOL '81
(pp. 11-24). Washington, DC: TESOL.
Scales, J., Wennerstrom, A., Richard, D.,
& Wu, S. H. (2006). Language learners'
perceptions of accent. TESOL
Quarterly, 4(4), 715-737.
Schachter, J., & Gass, S. (Eds.). (1996).
Second language classroom research: Issues
and opportunities. Mahwah, NJ:
Lawrence Erlbaum Associates.
Scherer, G. A. C., & Wertheimer, M.
(1964). A psycholinguistic experiment in
foreign language teaching. New York:
McGraw-Hill.
Schmidt, R. W. (1983). Interaction,
acculturation, and the acquisition of
communicative competence: A case
study of an adult. In N. Wolfson &
E. Judd (Eds.), Sociolinguistics and
language acquisition (pp. 137-174).
Rowley, MA: Newbury House.
Schmidt, R. W. (1984). The strengths
and limitations of acquisition: A case
study of an untutored language
learner. Language Learning and
Communication, 3(1), 1-16.
Schmidt, R. W., & Frota, S. N. (1986).
Developing basic conversational ability
in a second language: A case study
of an adult learner of Portuguese. In
R. R. Day (Ed.), Talking to learn: Conversation
in second language acquisition
(pp. 237-326). Rowley, MA: Newbury
House.
Schrank, A. (2006). Bringing it all back
home: Personal reflections on friends,
findings and fieldwork. In E. Perecman
& S. Curran (Eds.), A handbook for social
science field research (pp. 217-225).
Thousand Oaks, CA: Sage.
Schumann, F. (1980). Diary of a language
learner: A further analysis. In S.
Krashen & R. Scarcella (Eds.),
Research in second language acquisition:
Selected papers of the Los Angeles Second
Language Research Forum (pp. 51-57).
Rowley, MA: Newbury House.
Schumann, F. E., & Schumann, J. H.
(1977). Diary of a language learner:
An introspective study of second
language learning. In H. D. Brown,
R. H. Crymes, & C. A. Yorio (Eds.),
On TESOL '77: Teaching and learning
English as a second language—trends in
research and practice (pp. 241-249).
Washington, DC: TESOL.
Schumann, J. (1978a). The Pidginization
Hypothesis: A model for second language
acquisition. Rowley, MA: Newbury
House.
Schumann, J. (1978b). Second language
acquisition: The Pidginization
Hypothesis. In E. M. Hatch (Ed.),
Second language acquisition: A book of
readings (pp. 256-271). Rowley, MA:
Newbury House.
Seliger, H., & Shohamy, E. (1989). Second
language research methods. Oxford:
Oxford University Press.
Seliger, H. W. (1977). Does practice
make perfect? A study of interaction
patterns and L2 competence.
Language Learning, 27, 263-278.
Seliger, H. W. (1983a). Classroom-centered
research in language teaching:
Two articles on the state of the
art. TESOL Quarterly, 17(2), 189-190.
Seliger, H. W. (1983b). The language
learner as linguist: Of metaphors and
realities. Applied Linguistics, 4,
179-191.
Shamim, F. (1996). In or out of the action
zone: Location as a feature of interaction
in large ESL classes in Pakistan.
In K. M. Bailey & D. Nunan (Eds.),
Voices from the language classroom:
Qualitative research on language
education (pp. 123-144). New York:
Cambridge University Press.
Shavelson, R. J. (1981). Statistical reasoning
for the behavioral sciences. Boston:
Allyn and Bacon.
Shavelson, R. J. (1996). Statistical reasoning
for the behavioral sciences (3rd ed.).
Boston: Allyn and Bacon.
Shaw, P. A. (1983). A sociolinguistic
analysis of spoken discourse in
undergraduate engineering classes.
Unpublished doctoral dissertation,
University of Southern California,
Los Angeles.
Shaw, P. A. (1996). Voices for improved
learning: The ethnographer as co-agent
of pedagogic change. In K. M.
Bailey & D. Nunan (Eds.), Voices
from the language classroom: Qualitative
research on language education
(pp. 318-337). New York: Cambridge
University Press.
Shaw, P. A. (1997). With one stone:
Models of instruction and their curricular
implications in an advanced content-based
foreign language program. In
S. B. Stryker & B. L. Leaver (Eds.),
Content-based instruction in foreign
language education: Models and methods
(pp. 259-281). Washington, DC:
Georgetown University Press.
Shuy, R. W. (1993). Using language
functions to discover a teacher's implicit
theory of communicating with students.
In J. K. Peyton & J. Staton (Eds.),
Dialogue journals in the multilingual
classroom: Building language fluency and
writing skills through written interaction
(pp. 127-154). Norwood, NJ: Ablex.
Simard, D. (2004). Using diaries to promote
metalinguistic reflection among
elementary school students. Language
Awareness, 13(1), 34-48.
Sinclair, J., & Coulthard, M. (1975).
Toward an analysis of discourse. London:
Oxford University Press.
Smith, L. M., & Geoffrey, W. (1968). The
complexities of an urban classroom: An
analysis toward a general theory of teaching.
New York: Holt, Rinehart and
Winston.
Smith, J. K., & Heshusius, L. (1986).
Closing down the conversation: The
end of the quantitative-qualitative
debate among educational inquirers.
Educational Researcher, 15(1), 4-12.
Smith, P. D. (1970). A comparison of the
cognitive and audiolingual approaches to
foreign language instruction: The
Pennsylvania foreign language project.
Philadelphia: Center for Curriculum
Development.
Snow, M. A., Hyland, J., Kamhi-Stein, L.,
& Yu, J. H. (1996). U.S. language minority
students: Voices from the junior
high classroom. In K. M. Bailey & D.
Nunan (Eds.), Voices from the language
classroom: Qualitative research on language
education (pp. 304-317). New
York: Cambridge University Press.
Spada, N. (1990). Observing classroom
behaviours and learning outcomes in
different second language classrooms.
In J. C. Richards & D. Nunan (Eds.),
Second language teacher education
(pp. 293-310). Cambridge:
Cambridge University Press.
Spada, N., & Frohlich, M. (1995).
Communicative Orientation of Language
Teaching observation scheme: Coding
conventions and applications. Sydney:
NCELTR Macquarie University.
Spada, N., Ranta, L., & Lightbown,
P. M. (1996). Working with teachers in
second language acquisition research.
In J. Schachter & S. Gass (Eds.),
Second language classroom research: Issues
and opportunities (pp. 61-79). Mahwah,
NJ: Lawrence Erlbaum Associates.
Spradley, J. P. (1979). The ethnographic
interview. New York: Holt, Rinehart
and Winston.
Spradley, J. P. (1980). Participant observation.
New York: Holt, Rinehart and
Winston.
Springer, S. E. (2003). Contingent language
use and scaffolding in a project-based
ESL course. Unpublished
manuscript, Monterey Institute of
International Studies, Monterey,
California.
Springer, S. E., & Bailey, K. M. (2006,
April). Diversity in reflective teaching
practices: International survey results.
Paper presented at the CATESOL
State Conference, San Francisco,
California.
Sreedharan, N. (2006). Sura's quotations
of wit and wisdom. Chennai, India:
Sura Books.
Stake, R. E. (1988). Case study methods
in educational research: Seeking sweet
water. In R. M. Jaeger (Ed.), Complementary
methods for research in education
(pp. 253-266). Washington, DC:
American Educational Research
Association.
Stake, R. E. (1995). The art of case
research. Newbury Park, CA:
Sage.
Stenhouse, L. (1983). Case study in
educational research and evaluation.
In L. Bartlett, S. Kemmis, & G.
Gillard (Eds.), Case study: An overview
(pp. 11-54). Geelong, Australia:
Deakin University Press.
Stewart, D., & Shamdasani, P. (1990).
Focus groups: Theory and practice.
Newbury Park, CA: Sage.
Stewart, T. (2006). Teacher-researcher
collaboration or teachers' research?
TESOL Quarterly, 40(2), 421-430.
Storch, N. (2002). Patterns of interaction
in ESL pair work. Language Learning,
52(1), 119-158.
Strauss, A. (1988). Teaching qualitative
research methods: A conversation
with Andrew Strauss. Qualitative
Studies in Education, 1(1), 91-99.
Strong, M. (1986). Teachers' language to
limited English speakers and submersion
classes. In R. R. Day (Ed.), Talking
to learn: Conversation in second language
acquisition (pp. 53-63). Rowley,
MA: Newbury House.
Sullivan, P. N. (2000). Playfulness as
mediation in communicative language
teaching in a Vietnamese classroom.
In J. P. Lantolf (Ed.), Sociocultural
theory and second language learning
(pp. 115-131). Oxford: Oxford
University Press.
Swaffar, J., Arens, K., & Morgan, M.
(1982). Teacher classroom practices:
Redefining method as task hierarchy.
Modern Language Journal, 66(1), 24-33.
Swain, M. (2000). The output hypothesis
and beyond: Mediating acquisition
through collaborative dialogue. In J. P.
Lantolf (Ed.), Sociocultural theory and
second language learning (pp. 97-114).
Oxford: Oxford University Press.
Szostek, C. (1994). Assessing the effects
of cooperative learning in an honors
foreign language classroom. Foreign
Language Annals, 27, 252-261.
Thorpe, J. (2004). Coal miners, dirty
sponges, and the search for Santa:
Exploring options in teaching listening
comprehension through TV news
broadcasts. Unpublished manuscript,
Monterey Institute of International
Studies, Monterey, California.
Tinker Sachs, G. (2000). Teacher and researcher
autonomy in action research.
Prospect: A Journal of Australian
TESOL, 15(3), 35-51.
Tinker Sachs, G. (2002). Learning
Cantonese: Reflections of an EFL
teacher educator. In D. C. S. Li (Ed.),
Discourses in search of members: In
honor of Ron Scollon (pp. 509-540).
Lanham, MD: University Press of
America.
Trueba, G., Guthrie, P., & Au, K. H. P.
(Eds.). (1981). Culture and the bilingual
classroom: Studies in classroom ethnography.
Rowley, MA: Newbury House.
Tsang, W. K. (2003). Journaling from internship
to practice teaching. Reflective
Practice, 4(2), 221-240.
Tsui, A. B. M. (1995). An introduction to
classroom interaction. London: Penguin.
Tsui, A. B. M. (1996). Reticence and
anxiety in second language learning.
In K. M. Bailey & D. Nunan (Eds.),
Voices from the language classroom:
Qualitative research on language education
(pp. 145-167). New York:
Cambridge University Press.
Tsui, A. B. M. (2003). Understanding
expertise in teaching: Case studies of
second language teachers. Cambridge:
Cambridge University Press.
Tuckman, B. (1999). Conducting educational
research (5th ed.). Ft. Worth, TX:
Harcourt Brace College Publishers.
Turner, J. (1993). Another researcher
comments. TESOL Quarterly, 24(4),
736-739.
Tyacke, M., & Mendelsohn, D. (1986).
Student needs: Cognitive as well as
communicative. TESL Canada Journal
(Special Issue 1), 171-183.
Ulichny, P. (1996). Performed conversations
in an ESL classroom. TESOL
Quarterly, 30(4), 739-764.
van Lier, L. (1988). The classroom and the
language learner: Ethnography and second
language classroom research. London:
Longman.
van Lier, L. (1989). Reeling, writhing,
fainting and stretching in coils: Oral
proficiency interviews as conversation.
TESOL Quarterly, 23(3), 489-508.
van Lier, L. (1990a). Ethnography:
Bandaid, bandwagon, or contraband?
In C. Brumfit & R. Mitchell (Eds.),
Research in the language classroom: ELT
Documents, 133 (pp. 33-53). London:
Modern English Publications and
British Council.
van Lier, L. (1990b). Classroom research
in second language acquisition. Annual
Review of Applied Linguistics, 10,
73-186.
van Lier, L. (1992). Not the nine o'clock
linguistics class: Investigating contingency
grammar. Language Awareness,
1(2), 91-108.
van Lier, L. (1994a). Action research.
Sintagma, 6, 31-37.
van Lier, L. (1994b). Some features of a
theory of practice. TESOL Journal,
4(1), 6-10.
van Lier, L. (1996a). Conflicting voices:
Language, classrooms, and bilingual
education in Puno. In K. M. Bailey &
D. Nunan (Eds.), Voices from the language
classroom: Qualitative research on
language education (pp. 363-387). New
York: Cambridge University Press.
van Lier, L. (1996b). Interaction in the language
curriculum: Awareness, autonomy,
and authenticity. New York: Longman.
van Lier, L. (1998). Constraints and resources
in classroom talk: Issues of
equality and symmetry. In H. Byrnes
(Ed.), Learning foreign and second
languages: Perspectives in research and
scholarship (pp. 157-182). New York:
Modern Language Association of
America.
van Lier, L. (2000). From input to affordance:
Social-interactive learning
from an ecological perspective. In J. P.
Lantolf (Ed.), Sociocultural theory and
second language learning (pp. 245-259). Oxford:
Oxford University Press.
van Lier, L. (2001). Language awareness.
In R. Carter & D. Nunan (Eds.),
The Cambridge guide to teaching
English to speakers of other languages
(pp. 160-165). Cambridge: Cambridge
University Press.
van Lier, L. (2005). Case study. In
E. Hinkel (Ed.), Handbook of research
in second language teaching and learning
(pp. 195-208). Mahwah, NJ:
Lawrence Erlbaum Associates.
Verity, D. P. (2000). Side affects: The
strategic development of professional
satisfaction. In J. P. Lantolf (Ed.),
Sociocultural theory and second language
learning (pp. 179-197). Oxford:
Oxford University Press.
Vygotsky, L. S. (1978). Mind in society.
Cambridge, MA: Harvard University
Press.
Vygotsky, L. S. (1986). Thought and language.
Cambridge, MA: Massachusetts
Institute of Technology.
Wajnryb, R. (1992). Classroom observation
tasks: A resource book for language
teachers and trainers. Cambridge:
Cambridge University Press.
Wallace, M. J. (1998). Action research for
language teachers. Cambridge:
Cambridge University Press.
Walsh, S. (2006). Investigating classroom
discourse. London: Routledge Taylor &
Francis Group.
Wang, J. (2003). Students' needs and
teachers' dilemmas: A case of one
university. Hong Kong Journal of
Applied Linguistics, 8(1), 33-50.
Warden, M., Swain, M., Lapkin, S., &
Hart, D. (1995). Adolescent language
learners on a three-month exchange:
Insights from their diaries. Foreign
Language Annals, 28(4), 537-550.
Watson-Gegeo, K. A. (1988).
Ethnography in ESL: Defining the
essentials. TESOL Quarterly, 22(4),
575-592.
Weitzman, E. A., & Miles, M. B. (1995).
A software sourcebook: Computer programs
for qualitative data analysis.
Thousand Oaks, CA: Sage.
Wesche, M. B. (1983). Communicative
testing in a second language. Modern
Language Journal, 67(1), 41-55.
Whyte, W. F. (1981). Street corner society:
The social structure of an Italian slum
(3rd ed.). Chicago: University of
Chicago Press.
Wiersma, W. (1986). Research methods in
education. Boston: Allyn and Bacon.
Willing, K. (1988). Learning styles in adult
migrant education. Sydney: National
Centre for English Language Teaching
and Research.
Willing, K. (1990). Teaching how to learn.
Sydney: National Centre for English
Language Teaching and Research.
Winer, L. (1992). "Spinach to chocolate":
Changing awareness and attitudes in
ESL writing teachers. TESOL
Quarterly, 26(1), 57-79.
Woodfield, H., & Lazarus, E. (1998).
Diaries: A reflective tool on an INSET
language course. English Language
Teaching Journal, 52(4), 315-322.
Woods, D. (1989). Studying ESL
teachers' decision-making: Rationale,
methodological issues and initial
results. Carleton Papers in Applied
Language Studies, 6, 107-123.
Woods, D. (1996). Teacher cognition in
language teaching: Beliefs, decision-making
and classroom practice.
Cambridge: Cambridge University
Press.
Yahya, N. (2000). Keeping a critical eye
on one's own teaching practice. EFL
teachers' use of reflective teaching
journals. Asian Journal of English
Language Teaching, 10, 1-18.
Yin, R. (1984). Case study research: Design
and methods. Beverly Hills, CA: Sage.
Yin, R. (2003). Case study research: Design
and methods (3rd ed.). Thousand Oaks,
CA: Sage Publishing.
Youngman, M. B. (1986). Analyzing
questionnaires. Nottingham, UK:
University of Nottingham, School of
Education.

INDEX

Abstract 251, 338, 438, 451, 460
Acheson, K. A. 272
Achievement 16, 29, 31, 65, 125, 236, 258,
320, 390, 393
Act 265, 271, 341, 347
ACTFL 324
Action research 1, 5-7, 17-20, 23-25, 36,
44, 47, 55, 63, 67, 77, 81-82, 120, 127,
129, 135, 158, 226-237, 239-241,
243-253, 372, 379, 408, 428, 438, 445,
452-453
Adair-Hauck, B. 162, 184
Adelman, C. 158, 159, 166, 184, 185
Affective factors 29
Aljaafreh, A. 212
Allen, P. J. 14, 268
Allison, D. 294
Allwright, D (R. L.) 48, 79, 166, 169, 202,
253, 257, 263, 266, 275, 279, 283, 369,
428, 435
Alreck, P. 140, 142, 155
Alternative directional hypothesis 58, 59, 73
Alternative hypothesis 57, 58-59, 73, 114
Amanpour, C. 186, 219
American Psychological Association (APA) 34
Analysis of variance (ANOVA) 338, 378,
392-395, 401-402, 407-408
Analytical nomological paradigm 11, 444-445
Andrews, R. 371
Annotated bibliography 2, 34-35
Anthropology 8, 187-188, 211-212, 222, 297
Antonek, J. L. 11, 163
Anxiety 29, 41, 204, 247, 306, 416-417
Appel, J. 295, 296
Applewhite, A. 26
Applied linguistics 3, 39, 125, 133, 148,
157-158, 167, 183, 185, 187, 222, 285,
305, 411, 426, 436
Arens, K. 13
Asher, J. J. 45
Asian Journal of English Language Teaching
11, 460
Atkinson, P. 163, 211, 430
Attrition 182, 205-206
Au, K. H. P. 189
Auerbach, E. 236
Audiolingualism 12-14
Audio-recording 11, 16, 169, 182, 193, 200,
259-260, 263, 267, 271, 277, 281,
287-289, 306, 339, 413, 415, 417, 435,
455, 457
Authenticity 325, 347, 391
Autobiographical research 284, 297, 300,
307-308
Bachman, L. F. 325, 335
Back translation 124, 146
Bacon, R. 371
Bailey, K. M. 9-11, 18, 19, 25, 32, 79, 116,
118, 124, 130, 136, 140, 143, 154,
195-198, 258, 263, 266, 272-273, 277,
286, 289, 292-296, 299, 300, 302-306,
322, 335, 363, 369, 391, 401, 416-417,
428, 430-435, 440-442, 456, 458-459
Barkhuizen, G. 349-350, 370, 421
Barlow, L. 462
Barlow, M. 428, 436
Barnard, R. 253
Bassano, S. 237
Bateson, G. 16
Beany, K. 425
Beaumont, M. 462
Bejarano, Y. 404-408, 410
Bell curve 109, 112-113, 373
Bell, J. 132
Benchley, R. 3
Benson, P. 199, 297, 332
Bergthold, B. 299
Biber, D. 436
Bilingual education 31, 125, 435, 440, 454
Bilingual Syntax Measure (BSM) 325-326
Bimodal distribution 374
Biographical research 255, 284, 297, 300,
307-308
Birch, G. J. 294, 295
Black box 13-14, 275
Bley-Vroman, R. 334
Blind review process 451
Block, D. 169, 253, 279, 296, 302, 418
Borg, S. 462
Bounded 161-164, 183, 190
Bowering, M. 427
Bowers, R. 263, 264, 265, 267, 341-343, 369
Braine, G. 224
Braunstein, B. 299
Brinton, D. M. 296, 461
Brock, C. 296, 359
Brock, M. N. 296
Brown, C. 294, 300
Brown, H. D. 335
Brown, J. D. 33, 63, 79, 87, 123, 126, 133, 137,
156, 335, 384, 399, 401, 411
Brown, R. 167
Brumfit, C. 24-25
Bruner, J. 304
Burling, R. 169
Burnaford, G. 462
Burns, A. 226, 253, 446, 462
Burt, M. 325
Burton, J. 462
Busch, M. 133, 136, 156
Campbell, C. C. 294-295, 302, 417, 443
Campbell, D. T. 102
Canagarajah, A. S. 188, 191, 202-204, 206,
219, 224
Card sort activity 329-330, 333, 424-425, 427,
429, 423, 435
Carless, D. 168
Carr, W. 17, 18, 226, 253
Carrasco, R. L. 159
Carroll, M. 294
Carson, J. G. 294, 295
Carter, R. 362, 427, 446
Case study 8-9, 47-48, 53, 55, 63, 81-82, 90,
92, 94, 100-102, 157-185, 190, 198,
200-201, 209, 217, 219, 229, 234, 250,
297, 439
Category system 260, 261, 311, 341, 363, 434,
440-441
Causality 44, 47, 61, 67, 73, 162, 198, 210,
439, 449
Celce-Murcia, M. 169
Chamot, A. U. 253
Chan, Y. H. 253
Chaudron, C. 14, 25, 54, 194, 266, 334, 360,
429, 442
Chi square 34, 330, 338, 372, 381-389, 398,
402, 409-410
Choi, J. 297-299, 311
Christison, M. A. 21, 152, 237
Clarification requests 22, 43, 326
Clark, J. L. D. 13
Clarke, M. A. 172
Classroom, definition of 15-16
Classroom context mode 343-344,
346, 369
Classroom management 249, 269
Classroom-based research 4, 17, 55, 312, 334,
335, 361, 364, 425, 445, 453, 456
Classroom-centered research 4, 16, 258
Classroom observation 3, 33, 49, 255,
257-259, 263, 268, 275, 282, 334, 439
Classroom-oriented research 9, 17, 55, 155,
175, 315, 332, 425
Cleghorn, A. 191
Code-mixing and code-switching 28
Coding 62-63, 76, 87, 176-178, 184, 255,
260, 263, 266, 268, 271, 276, 278,
281-282, 287, 339-340, 346, 362-363,
366, 369-370, 424, 426, 428-429, 434,
436, 440
Coffey, A. 430
Cognitive code learning 13-14
Cohen, A. D. 285
Cohen, J. M. 257, 284
Cohen, L. 10, 125, 128, 170, 228
Cohen, M. J. 257, 284
Cohort 3lk 316, 319, 441, 445
Collaboration 18-19, 178, 228, 229, 366,
368, 415, 451
Communication feature 14
Communicative language teaching 14-15, 268,
362, 414
Communicative Orientation to Language
Teaching (COLT) 14-15, 268-270, 283
Comparability 76, 131, 172
Comprehension checks 22, 43, 326
Comprehensive stage 192
Conceptual research 35
Concordancing 425-428, 437
Conferences 3, 279, 338, 353, 438, 450-452,
459-461
Confirmability 209
Confirmation checks 22, 43
Confounding (extraneous) variables 60-61, 63,
66, 6^, 75, 78, 83-84, 89, 103, 118, 120,
123, 393
Conrad, S. 296, 436
Construct 2, 39-42, 44, 53, 56, 60, 64-65,
75, 126, 142-143, 421, 428, 445-446, 449,
454
Content-based instruction 31, 461
Contingency 212, 214
Control 60-61, 68, 75, 78, 91, 93, 117, 123,
160, 162, 358, 366, 428, 431, 438
Control group 44, 46-47, 53, 70, 85-86,
93-l6o, 103-104, 106-107, 111, 114-115,
117, 324, 374, 376-378, 380, 390-392,
404, 445
Conversational analysis 339, 347, 352-356,
359, 363, 369-370, 412, 423
Conversational replay 165, 356
Cooley, L. 33, 53, 443
Coombe, C. 462
Corpora 425, 427-428,436 Distance learning 15-16
Correction move 165, 356 Donato, R. 11-12,162-163,184, 363
Correlation 2, 56, 70-74, 77-78, 89, 100-101, Dornyei, Z. 126-127, 135, 138, 141-142,
114, 120, 338, 372, 382, 396-401,439, 147-148,154,156, 370,411,429-430,
447,449 436, 439,444,461
Coulthard, M. 270-271, 340-341, 343, 350, Doughty, C. J. 327,446
434,440 Dowsett, G. 314
Crago,M. B. 189 Duff, P. A. 158,161-162,165,167-168,170,
Credibility 201,209, 224, 248 172, 173, 175, 181-182,185,188,189,
Criterion groups design 2, 70, 71, 73, 77, 78, 191,214-217,223,349
89,100,150,319,324,381 Dulay, H. 325
Critical paradigm 350 Duterte, A. 253
Critical values 380, 384-385, 388, 399-400, Eckes, M. 335
407-408,410 EdgeJ.226,253,462
Crookes, G. 253 Edwards, J. A. 370
Culture 9, 23, 27-28, 31-32, 52, 73, 77, 84, Ehlich, K. 347
100,116,119,121,143,168-169,174, Elicitation, eliciting 41, 87,124-126,142,
187-192, 194-197, 206-208, 211, 217, 154, 232-237, 255-256,265-266,
222,224-225, 298, 303, 312,314-315, 307-308, 312-313, 315, 317-318, 320,
328,371,410,418,429,451,454 323,325-326, 330, 332-335, 341, 350,
Cumming, A. 175-181, 184, 212 359,423
Curran, S. 334 Ellis, R. 188, 275-277, 294, 302, 333,
Curtis, A. 197,253,293,303 348-350,370,421
Dagneaux, E. 426 Empirical (data-based) research 1, 3-6, 9-10,
Dailey-O'Cain, J. 359-360, 370 32-33, 35, 57, 90, 155, 161, 205,223,
Damon, W. 366 276,353
Danielson, D. 294, 295 English Language Teaching Journal (ELT
Data tagging 425-427 Journal) 460
Davidson, F. 156 Equality 366, 368
Davis, K. A. 225, 436 Equivalent time samples design 96-97, 100
Degrees of freedom (df) 372, 379, 384-385, Ericsson, K. A. 285, 287-288, 305, 311
388-389, 407-408, 410 Error treatment 116, 198, 222, 346, 357,
de la Torre, R. 45 359-361,363,370,456
de Silva Joyce, H. 253 Ethical issues 51, 82, 124, 141, 147-148, 240,
Delaney, A. E. 296 338,447-448,452,457,461
Denness, S. 426 Ethnography 8-9,28, 81-82,164-165,
Denzin, N. K. 211, 234, 279,446 173,186-195, 198-200,202-211,
Dependability 209, 224 214, 217-219, 222-223,244, 349,
Dependent variable 59-61, 63-64, 68-69, 71, 439-440
73, 75, 78, 84, 86, 88, 91,93, 97, 99,103, Ex post facto designs 56, 70-71, 77, 89,100,
117, 126, 149, 210, 319, 324, 381, 389, 150,381,396
402, 404 Exchange 271, 340, 343, 345-346,
Dialog journals 65,175-179, 181, 184 352-353,355
Diary study 169, 234, 255, 279, 284, 286, Expectancy threat (researcher or subject
292-299, 301-308, 310-311,416-417, expectancy) 85,119,141,153
439,443 Experience bias factors 84-85
Diaz, C. 253 Experimental group 2,47, 61, 68-69, 71, 95,
Dingwall, S. 438 100, 104, 106-107, 109, 111, 114-115,
Directing 265, 341,343 117-118, 373-376, 378-379, 391,404
Directional hypothesis 58, 59, 73 Experimental research 2, 6,18,44,46-47, 49,
Directionality 72,400 57, 67, 70, 81-82, 100-101,106,126,158,
Discourse analysis 21, 270, 340, 368, 370,412, 162, 166, 172-173,182-183, 186, 188,
423,434 194, 196,198, 200, 204-208, 210-211,
Discourse completion task 138-139, 227-231, 249, 252, 297, 371, 379, 421,
321-322,333 438-439, 454-456

Exploratory interpretive paradigm 11, Gliksman, L. 147
444-445 Goddard, A. 427
Exploratory practice 253 Goetz, J. 200-201, 203-207, 223
Evans in, W. 26 Goldstein, L. M. 296
F ratio 393, 407 Gonzalez, V. 253
Factorial studies 61, 78, 100, 319, 394-396 Graffiti board 233
Færch, C. 311 Grandcolas, B. 294-296
Falsification 57, 174,194 Granger, S. 426
Fanselow, J. F. 253, 267-268, 283 Green, J. 189
Farhady, H. 375 Grotjahn, R. 11, 81, 83, 186, 292, 311,
Farrell, T. S. C. 462 444-445, 460
Fetterman, D. M. 334 Grounded theory 194, 218, 338,412,414-415,
Field notes 10, 182, 190, 201, 203, 212-213, 421-423, 440, 451
216, 218, 258-259,274, 277-279, Gu, P. Y. 156
281-282, 289, 317,401,415,417^418, Guba,E.jG. 172,209,421,424
432-434,440-441, 453, 455^157,461 Guthrie, P. 189
Fischer, J. 462 Halbach, A. 294
Flanders, N. 266, 339 Halliday, M. A. K. 167
Fleischman, N.J. 299 Hammersley, M. 163,211
Flick, U. 314, 334 Harklau, L. 168, 188
Focus on Communications Used in Contexts Harrison, I. 418
(FOCUS) 267-268,283 Hart, D. 294
Foreign language contexts 48 Hatch, E. M. 16, 27, 110, 156, 185, 374-375,
Foreign language in the elementary school 384,401,409,411,461
(FLES) 11-12, 164 Hawthorne effect 88, 123
Foreign Language Interaction Analysis (FLint) Heath, S. B. 187, 188, 196, 200
266, 339-340 Hedgcock, J. 42, 43
Fowler, F. 129 Henry, C. 230, 253
Fowler, G. 26 Henze, R. C. 9, 255, 294-295
Fox, K. 195 Heshusius, L. 462
Fraser, B. 322-323 High-inference categories 202
Freeman, D. 53, 224, 274, 284, 414-415, 453 Hilleson, M. 294, 302, 417
Frequency distribution 81 Hinkel, E. 446
Frohlich, M. 14,268,283 Histogram 108,123
Frota, S. N. 294-295, 302,417,434 History threat 85
Frothingham, A. 26 Ho, B. 296
Fry, J. 311 Hobson, D. 462
Gaies, S. J. 24, 339 Holbrook, M. P. 299
Gain scores 91, 94, 99, 404-406 Holliday, A. 413, 436
Galda, D. 315-318 Holmes, O. W. 284
Gall, M. D. 272 Holten, C. 296
Gallimore, R. 219 Hood, S. 253
Galvan, J. L. 53,461 Hoon, L. H. 294
Garaizar, I. 253 Hornberger, N. 224
Gardner, R. C. 147 Hosenfeld, C. 285
Garner, H. 412 Howell-Richardson, C. 294
Gass, S. M. 54, 137-138, 153-154, 162, 259, Huang, J. 294
311, 335, 362, 438, 456, 461 Huberman, A. 161, 458
Generalizability 64-65, 67, 69, 87, 123, Hughes, A. 335
170,172,174,184, 207, 245, Hunt, K. W. 177
250-251,458 Hyland, J. 328
Genesee, F. 189,191 Hypothesis 2,16, 18, 27-28,42, 56-58,69-70,
Geoffrey, W. 187 77, 91, 93, 99, 120-122,126,168,
Gieve, S. 462 174-175,193-194, 229, 231, 298, 350,
Giroux, H. 192 390,402,443,459

Hypothesis-oriented stage 193 Jepson, K. 21, 22, 25, 34, 42-43, 51, 63,
Hypothesis testing 2, 18, 28, 56, 77, 174, 229, 78, 326
372,381,402,409-410 Johnson, D. 125,161
IELTS 280, 324,408,440 Johnson, K. E. 185, 283, 352,363,
ILR Oral Proficiency Interview 324 369-370,462
Incorporation 22, 43, 269, 326 Jolley, J. 63, 123, 135-136
Independent variable 59-61, 70, 73, 75, 78, 84, Jones, F. R. 294-295
91,93,99,103,117, 123, 149,319,324, Jourdenais, R. 311
380,394-395,431 Kamhi-Stein, L. 328
Individualized instruction 41 Kasper, G. 311
Insight 7, 19, 23, 67, 75, 155, 162-173, Katz, A. 454-459
184, 203, 207, 218, 238, 241-242, 245, Kebir, C. 253
247, 284, 289, 295, 297,299, 303, 307, Kemmis, S. 17-18, 158, 226, 228-230, 253
311,316-318,329,332,420-421,427, Kennedy, M. 300
436, 439 Kincheloe, J. L. 462
Instability of measures 87 Knezedvoc, B. 253
Institutional talk 353 Knobel, M. 462
Instructional episode 162 Knowles, T. 253
Instructions 42,145,155,221,261,288, 301, Kramsch, C. 363
304, 307-308, 321-323, 333-334, 367, Krishnan, L. A. 294
369,403 Kuhn, T. 438
Instrumentation threat 84-85, 87, 122 Kumaravadivelu, B. 236, 253
Intact groups design 46, 73, 77, 89, 91-94, Kusudo, J. A. 45
100, 102-103, 381 Kwan, T. Y. L. 253
Interaction 4,6, 14, 16, 21-22, 28-29, 34, Labov, W. 196,281,305,447
87, 125, 159-160, 163-164, 166, 169, Lamb, C. 348
175-176, 180, 182, 186-191,200,203, Lampert, M. D. 370
208, 215-216, 249-250, 255, 259-266, Language Contact Profile 125
268, 270-274, 276, 280-283, 289, 291, Language Learning 11, 25, 34, 79
303,306,312,314,317-318,321, Language Teaching Research 11,460
326-328, 337, 339-341, 343, 345-347, Lankshear, C. 462
349, 353, 355-357, 359-370, 413, 417, Lantolf, J. P. 212, 271, 362
440, 443,454-456 Lapkin, S. 294
Interaction analysis 266, 339, 412, 423 Larimer, R. E. 235
Interaction effect 87-88, 395-396 Larsen-Freeman, D. 67, 157,170,172
Interaction effects of selection bias 87-88 Law, B. 335
Interactive combination of factors 86 Lazaraton, A. 16, 27, 110, 136, 156, 225, 356,
Inter-coder agreement 63, 177, 277, 340, 370, 374, 384,401,409,411,436,461
363, 370 Lazarus, E. 294
Interlocutor 21,43, 125, 138,269,279, Leatherman, J. 296
326, 360 Learning strategies 29, 152,292, 310, 317,
Interpretive paradigm 350,444-445 332,413,416,443
Interval data 2, 38-39,40, 59, 114-115, LeCompte, M. 200-201,203-205, 207, 223
135-136, 381, 389-390, 393, 396-397, Lee, E. 296
399-403, 409 Legutke, M. 21
Interviews 47, 62, 186, 192-193, 205, 221, Lenzuen, R. 253
223, 312-321, 324, 328-330, 332, 334, Leopold, W. F. 163
350, 353,413-416,418, 421,430, Lesson plan 29,234, 249, 289, 291, 304,
444-445,454-455 319-320,332,413,415,436
Introspection 255, 259, 284-311, 439 Levine, H. 219
IRF (or IRE or QAC) pattern 270, 271, 340, Lew, L. 296
345, 350 Lewin, K. 226
Jaeger, R. M. 126, 161, 374,410-411 Lewin, L. 226
Jarvis, J. 296 Lewkowicz, J. 33, 53,443
Jenkins, D. 158 Libben, G. 294

Lieblich, A. 297, 423 Member checking
(also member validation)
Liebscher, G. 359-360, 370 363,429-430,461
Lightbown, P.M. 445 Mendelsohn, D. 294
Likert, R. 133 Merriam, S. B. 35, 161, 163
Likert scale 131, 133-134,136, 156 Michonska-Stadnik, A. 253
Lin, A. M. Y. 189 Microanalysis 356, 370
Lin, Y. 42,43 Micro-ethnography 208, 217
Lincoln, Y. S. 172, 209, 421, 424, 446 Miles, M. B. 161, 426, 436, 458
Literature review 2, 27, 32-36,42,44, 52-53, Miller, I. K. 462
55, 59-60, 73, 127, 212, 235, 327, 368, Mingucci, M. 226, 251, 253
421,431,439,443,445-446,459 Mitchell, M. 63,123,135,136
Lockhart, C. 227 Mitchell, R. 24,25
Logic 438 Mixed methods research 11, 371,408,
Long, M. H. 4, 13-14, 24, 42, 260, 275, 439-440, 444-445, 453-458
326, 362,446 Mode 110, 111, 112,114,168, 372-374, 379
Longhini, A. 294,295 Moderator variable 60-61,67, 70, 89, 100,
Longitudinal research 8, 69, 86,158, 163, 121-123, 319, 324, 381, 385-389
166-167, 173, 178, 180, 182, 189-190, Modern Language Journal 11, 460
193,198, 205, 208, 217-218, 224, 297, Mok, A. 253
314, 316, 350, 417, 449 Moore, T. 294
Lowe, T. 294-295 Morgan, D. 315
Low-inference categories 202,204, 222, 266 Morgan, M. 13
Lozanov, G. 116 Mortality threat 69, 85-86, 182
Lynch, T. 279, 281,408,439 Moskowitz, G. 266, 339-340
Mackey, A. 137-138, 153-154, 162, 259, Motivation 17, 29, 37, 41, 60, 103, 192, 202,
311,335 204, 231, 297, 300,443,445-446
Maclean, J. 279, 281, 408, 440 Move 22, 34, 42-43, 63, 271-272, 274,
Main effects 395 356, 370
Managerial mode 343, 346, 369 Multiple perspectives analysis 11-12, 350
Manion, L. 18,125,128,170, 228 Multiple treatment interaction 89
Marginal frequencies 386 Murphey, T. 370
Markee, N. 340, 352-356, 370, 461 Mutuality 366, 368
Marshall, C. 225,450-451, 453,461 Nassaji, H. 175,176-181, 184, 212, 226, 328
Martyn, E. 326-328 Naturalistic inquiry (or naturalistic research)
Mason, J. 334 1, 7-9, 11, 14, 18, 20, 23-24, 36, 44,
Materials mode 343, 345, 369 47-48, 63, 67, 77,90,120,127, 155, 158,
Matsuda, A. 296 160, 164, 167, 170, 173, 183, 186,
Matsuda, P. 296 188-190, 199, 202, 228, 230-231, 334,
Matsumoto, K. 294, 309, 311 363, 372, 379,423,428,438-439,
Maturation threat 85, 86, 87, 206 444-445, 453
Maxwell, J. A. 363 Negative feedback 22, 42-43, 233
McCarthy, M. 341, 343-346, 369 Negotiated interaction (negotiation of
McDonough, J. 296 meaning) 21-22, 34, 42-43, 326-327, 362
McGarrell, H. M. 462 Nisbett, R. 288
McKay, S.L. 25,461 Nominal data 37-39, 381, 383, 389,400,403
McPherson, P. 253 Nonequivalent comparison groups design
McTaggart, R. 227, 229, 253, 288 93-94, 100, 102-103
Mean 86, 106, 110-115,118,121, 123, Nonparticipant observer 193-195, 197,213,
129, 372-375, 377-379, 389-390, 223-224, 259
392-394, 404-406, 408-409, 413, 418, Nonverbal behavior 250, 340-341, 350, 355,
440-442,447 362-363
Meaning condensation 412,418-421,440 Normal curve (normal distribution) 83, 110,
Measures of central tendency 110, 372-375 112-113,123,129,373
Measures of dispersion 110, 372-373, 376-379 Normative paradigm 349-350
Median 110-112, 114, 372-375, 379 Null hypothesis 57-59, 73,114

Numrich, C. 296 Participant observer 165, 193-195, 197, 203,
Nunan, D. 9, 16-17, 21,25, 27, 66, 79,124, 213,219,223-224,259,279
148,150,152,154,161,194, 197,199, Particularization 171-172
201, 204, 206, 228, 231, 250, 253, 264, Peck, S. 168, 294
269, 276, 282, 289, 291, 297, 308, 326, Pennington, M. C. 296
328, 332-333, 342-343, 348, 361-363, Pennsylvania Project 25
369,416,418,420-423,427, 429,442, Perecman, E. 334
444, 446,448-450,452, 459, 462 Perry, F. L. 411
O'Brien, T. 462 Peyton, J. K. 175
Observation 3-4, 7,10-11, 14-15, 33, 35,49, Phelps, E. 366
50, 57, 76-77, 87-88, 102,161, 163, 165, Pica, T. 24, 327, 362
170, 173-174,182,186-187,189-199, Pike, K. L. 197
201, 205, 211-213, 221-223, 228, 234, Piloting 87, 124,140-141, 145
255, 257- 261,263-264, 266-270, Piore,M.437
272-273, 275-279, 281-283, 289, Plummer, K. 297,428
297-299, 305, 321, 325, 332-334, 365, Polio, C. 295, 296,461
415,432-434,439-440,447, 453, Popper, K. 57, 174
455-457 Population 2, 44, 46-48, 53, 63-67, 70-71,
Observation schedule (or scheme or system) 4, 83-84, 86, 88-89, 92, 94,98, 104-106,
11, 15, 87, 164, 212, 234, 255, 258-261, 110, 113-115,124-125,127-129,
264,266-268,270,279,339 144-145,150,152-153,158,170-172,
Observer's paradox (reactivity) 196, 205, 216, 183,196,199,207,372,381,393,
280-281, 306, 212, 234, 258, 447 413,456
O'Farrell, A. 424 Porter, P. A. 296
Ochsner, R. 292 Porto, 294
Ohta, A. S. 363 Post-test only control group design 98-99,
One-group pre-test post-test design 90, 92,94, 100, 380, 391
100, 365, 381, 392 Practice effect 85, 87, 123,192
One-shot case study 90, 92, 94, 100-102,158, Pre-experimental designs 100
183,229 Presenting 265, 341
Open-ended items 136-137,140-141, 153, Pre-test post-test control group design
155,413,416,444 99-100,103, 380
Operational definitions 2, 36, 41-42, 56, 63, Probability 372, 380, 384-385,400,410
120, 326, 358, 363, 370, 375, 404,413, Process studies 1,13-14, 117, 257, 275
443,445-446,449,454 Production tasks 256, 312, 321, 326-327, 333
Ordinal data 38-39, 108, 136, 156, 375, 381, Process-product studies 1, 14, 117, 172, 257,
399,402-403 275,404, 456
Ordinary conversation 353 Product studies 1, 14,117,172, 257,
Organizing 264-265, 341, 343 275,281
Otto, F. M. 25 Profiles 10,432-433,440,455
Outcome 7,13-15,19,45-46, 56, 58, 60-62, Prompt 140, 179-180, 263, 290, 321-323, 325,
66, 68-69, 85, 88, 94-95,97, 103,117, 330, 345
119-120,158, 167,171,186, 203-206, Proof 57, 67
227-229, 231, 242, 247-250, 275, 279, Proposal 18, 130, 225, 249, 338,437,440,
281, 302, 323, 359, 364, 370, 380, 391, 432-443,447,450-451,453,461
397, 400, 418, 440, 443, 450, 454, 457 Prospect-a Journal of Australian TESOL 11,
Oxford, R. 243, 362 119,460
Palmer, A. S. 325,335 Psychometric research 6, 7,11, 23,46,67, 73,
Palmer, C. H. 295 202,230, 408,428, 431,438, 444
Palmer, G. M. 295 Publishing 11,13,16, 27, 33, 53, 75,120, 227,
Paradigm 11, 12, 81, 83, 170, 207, 211, 228, 337, 364, 434-435, 447-452, 459-460
292, 349-350, 438, 442, 444-445, 460 Qualitative data analysis 10-11, 19, 153, 156,
Paraphrase 22,43,407 164,179, 190,198-199, 249, 292-293,
Parkinson, B. 294 304, 330, 337, 359,408,412-436,442,
Participant bias factor 85 444,447, 456-458

Qualitative data collection 10-11, 19, 63, 75, Reactive effects of experimental arrangements
129-130, 163, 178, 190, 198-199, 211, 87-88, 123, 196, 205
249, 292, 356, 371, 401, 435, 442, 444, Reah, D. 427
449, 453-454, 456-458 Reference list 20, 35, 461
Qualitative research 5, 7,9-11, 14-15, 19,161, Reid, J. 156
170, 173, 178-179, 186, 189, 200-202, Reliability 1-2, 62-63, 65, 67, 75, 77, 79, 82,
207-209, 211, 218-219, 224-225, 314, 84, 87,137, 170, 183,187, 198-204,
350,412-437,439-442,446,450,453, 207-209, 222, 224, 277, 281-282, 288,
457-458,461 305-306, 324,409,428,449,453
Quality control 2, 42, 62, 65, 67, 84, 87, Repair 22, 34, 42-43, 63, 353, 359-360,
119,124,141,154,163,170,173, 368,370
183-184, 187,198,207,209,211, Replicability, replication 34,41-42,63, 65, 78,
222, 224, 226, 234-235, 255, 277, 300, 120, 155,170,199-201, 204, 222,428
339, 363,402-403,412,428-430, Reppen, R. 436
453-454 Research gaps 33, 178,443
Quantitative research 1, 5-7, 9-11, 14-15, 19, Research question 2, 5, 11,12, 22,24, 27-30,
34, 48, 51, 56, 71, 73, 83, 126, 129, 141, 32-34, 42, 44-45, 47, 49, 52-53, 55-56,
153, 164, 166,178-179, 184, 190, 202, 58-59, 69-72, 75, 77-78, 81, 89,91, 93,
218-219, 236, 292, 318, 330, 337-338, 97,99, 117,121-122, 126, 129-130, 132,
350,408,413,428,437,439-442, 139,145, 149, 151, 155, 167, 177,
444-445,453-458,462 182-183, 191, 193-194, 196, 213, 216,
Quasi-experimental designs 11, 97-98, 100, 224, 239, 231-232, 234-235, 252, 260,
210,292,444-445 266, 275, 280, 282-283, 287, 290, 298,
Questionnaire 11, 16, 33,41-42,49, 55, 82, 307-308, 310, 314-315, 318, 328,
87-88, 102, 116, 124-156, 164,203,212, 333-334, 356, 360-361, 346, 363-364,
234-235, 237-238, 241, 243-246, 252, 368-369, 381,402,425,430,434-435,
256, 279-280, 302, 312-314, 316, 439, 442-445, 449-450, 452-454,
318-321, 332-334, 350, 413, 415-416, 458-461
418, 428, 443-444, 458 Respondents 126, 130-138, 140-144, 148,
Questions 29, 31, 37,43, 62, 76,100, 126, 151,153-156, 313, 317-320, 322-323,
130-133, 136-138, 140-145, 147, 151, 429, 445
153-154,162, 178-179, 202, 204, 206, Responding 341, 356, 359
215-216, 232-233, 236-237, 239, Response set 135
242-245, 251, 261-262, 264-266, Retrospection 213, 255, 284-286, 288-289,
269-270, 274, 279, 282, 291-292, 299, 297-299,305-307,311,418
304, 309-310, 313-314, 316-319, Richard, D. 372
304, 309-310, 313-314, 316-319, Richards, J. C. 227, 296, 446
358-360,403,413-414,416,443, Richards, K. 9, 194-195,198, 207-209, 218,
455-456 225, 294-295, 430-431,436,462
Quirke, P. 232-234 Rintell, E. 322
Radecki, W. 144 Rivers, W. 295
Raffier, L. M. 296 Rodgers, T. S. 401
Raised design (or two-phase design) 320, 324 Roebuck, R. 363
Randomization 44,46, 53, 71, 85-86, 92, Rogan, P. 296
94, 97-101, 103-104, 116, 120, 127-128, Role play 64, 98, 256, 269, 312, 322-323, 329,
164, 172, 199, 205, 210, 236, 381, 391, 332-334
404,445,449 Rossman, G. B. 225,450-451,453,461
Range 106-107, 110-111, 114, 372-375, 376, Rounds, P. L. 119,442
379,405,410 Rowntree, D. 83, 105
Ranking 6, 39-41, 59,106-108, 132, 329, Rowsell, L. V. 294
393, 403 Rubin, J. 294-295
Ranta, L. 445 Ruiz de Gauna, P. 253
Rater reliability 62, 363, 428 Ruso, N. 294, 296
Rating 6, 59, 62,150-151, 232, 239-240, 324, Sample 2, 5, 10,44, 46-49, 53, 64-65, 67,
363, 403, 440-442 69-70, 74, 83-84, 88-89, 101, 104-106,

109-110,115,117,119,124-125, Social Sciences and Humanities Research
127-129,145-146, 150, 152, 154-155, Council of Canada 460
162,164,172,174,177, 182-183, 207, Sociating 341, 343
319-320, 334, 372, 381, 388-390, 402, Sociocultural theory 177-181, 271, 362
413,418,432 Sociolinguistics 50, 131, 134, 162, 214, 269
Sanger, K. 427 Soule-Susbielles, N. 294-296
Santana-Williamson, E. 296 Spada, N. 14-15, 268, 283,445
Sato, C. 74-78, 122, 250, 388-389 Spencer Foundation 460
Saunders, S. 391 Spradley, J. P. 193, 314, 334
Scaffolding 178-179, 181-182, 184,212-214, Springer, S. E. 49-52, 130,136, 154, 212-214
304-305, 362-363, 368 Sreedharan, N. 3
Scales, J. 372 Stake, R.E. 172-173, 183
Scatterplot, scattergram 396-398,400, 410 Standard deviation 106, 110-115, 153,
Schachter, J. 54,119,438,456,461 373, 376-379, 390, 392,405,408,413,
Scherer, G. A. C. 12-13,275 418,441
Schleicher, L. 235, 296 Stanley, J. C. 102
Schmidt, R. W. 173-174, 294-295, 302, Statistical regression 85-86
417, 434 Statistical significance 6, 10, 12-13, 15, 22-23,
Schrank, A. 312 34,44, 57-58,68, 71-73, 76, 83,114-117,
Schumann, F. E. 294-295, 302 120, 152, 281, 330, 348, 350, 359, 372,
Schumann, J. H. 168, 174, 294-295, 302 379-396, 399-400, 402, 440
Scientific method 83,123, 188, 230 Statistics 10-11, 38, 51, 58, 61, 70-71, 73, 75,
SCORE data 272-273, 282, 339 77, 79, 81, 83, 86, 100-102, 104-115,
Second language acquisition 6, 14, 20-21, 119-120, 123, 129, 136, 152, 154, 156,
30, 32,49-50, 124,138, 157, 167-168, 164-166, 172, 174,208,218-219,247,
174-175, 183, 185, 212, 268, 271, 292, 249, 292, 304, 330, 333, 338, 350,
297, 299, 311-312, 321, 326, 333-334, 372-381, 384, 393, 396-411, 418, 447
349, 353 Stenhouse, L. 165
Second language context 48, 356, 448 Stewart, D. 314
Secondary research (or library research) Stewart, T. 462
33,35 Stimulated recall 35, 255, 259, 281, 284, 286,
Selection threat 85, 87-88, 103, 205-206 288-292, 306-308, 311, 350, 369,413,
Self-repetition 22,43 415-416
Self-report 42, 147, 150, 153, 280, 287, 302, Storch, N. 364-368, 370
305, 350 Strauss, A. 424
Seliger, H. W. 24, 30-31, 125, 311 Strong, M. 356
Semantic differential scale 133-135, 167 Subjectivity 7, 24, 119, 173, 195, 200, 243,
Settle, R. 140, 142, 155 270,323,428
Shamdasani, P. 314 Suggestopedia 116-119
Shamim, F. 274 Sullivan, P. N. 363
Shavelson, R. J. 100, 123,411 Summarizing 10, 33,421, 432-434,440
Shaw, P. A. 189,193-194 Survey 13, 50, 81-82, 102, 124-156, 162,
Shohamy, E. 30-31 164, 174, 222, 297, 313, 365,415,439,
Shuy, R.W. 176 445, 452
Simard, D. 194 Swaffar, J. 13-14
Simon, H. A. 285, 287-288, 305, 311 Swain, M. 294, 363
Sinclair, J. 270-271, 340-341, 343, 350, System 11
434,440 Szostek, C. 253
Skills and systems mode 343-345, 369 Szulc-Kurpaska, M. 253
SLEP 324 t-test 115, 338, 389-392
Smith, J. K. 462 T-unit 177-179
Smith, L.M. 187 Task 3-6, 9, 14, 16, 24, 28-30, 35, 76-77, 114,
Smith, P. D. 25 135,138,154-155,159-160,169, 184,
Smythe, P.C. 147 202,212-214, 222, 237,242-243, 245,
Snow, M. A. 328, 330,461 248-249, 251, 256-257, 262-263,

265-266, 272, 279, 282-283,286-289, Trouble source 359, 370
293, 304-308, 311-312, 323,325-330, True experimental design 44,46, 53, 70-71,
334, 340-341, 357, 361-366, 369, 73,77,89,92, 94, 98-100,380,445
396-397,410,414, 443 Trueba, G. 189
Teacher research 4, 19-20, 24, 53, 59, 82, 203, Tsang, W. K. 296
263,364,462 Tsui, A. B. M. 185, 250-252, 360, 370
Teaching assistants (TAs) 196, 273-274, Tucker, G.R. 11,163
417-418, 430-435, 440-442, 459 Tuckman, B. 41-42, 84-87, 95-96, 123, 136
Teaching style 10,89,222, 274,300,417,433, Tuman, J. 299
441, 454, 456, 458 Turner, J. L. 136, 156, 219
Technology 5,15, 20-22,24-25, 34,50, Turn-taking 63,74, 76, 122,125, 250,
212,314,318,338,340,401,412, 259, 3£3
425-428 Tuval-Mashiach, R. 297,423
TESOL 49, 119, 436, 462 Tyacke, M. 294
TESOL Quarterly 11, 225, 436, 447, 460 Ulichny, P. 164-165, 356
Testing threat 86-87, 99, 103, 123 Validity 1-2, 61-70, 75, 77-79, 81-82, 84-85,
The International Research Foundation for 87, 89, 99-100,103,116,118-119,
English Language Education (TIRF) 460 122-123, 128, 152, 155, 157-158,
Theory 18, 19, 52, 57-58, 73, 159, 175, 181, 170-171, 173-174, 182-184, 187, 194,
185, 192, 194-195, 208, 211, 213, 218, 198-200, 204-210, 222-224, 249-250,
227, 235, 249, 421, 436, 437, 440, 443, 277, 281, 305-306, 324-325, 333,
451 428-430,446,449,453
Think-aloud protocols 255, 284, 286-289, Variables 1-2, 6-7, 15, 17, 27, 36-41, 44,
305-308, 311, 350 46-48, 53, 56, 59-61, 63-64, 66-73, 75,
Thorpe, J. 135, 235, 236-246, 251 77-78, 83-84, 86, 88-89, 91, 93, 97,
Threats to validity and/or reliability 56, 67-70, 99-100,103,110,114,117-121,123,126,
75,77-78, 81, 84-89, 99,103, 116-119, 149,158,160,162,171-172,194,198,
122-123,128,152,155,171,173,182, 210,230,249,297,319,324,328,350,
202,204-206,333,449 366,372, 380-383, 385, 389, 393-402,
Time sampling 432
Time series design 94-97, 100-101 404,409-410,421,431,438-439,444,
Tinker Sachs, G. 253, 294-295 449, 460
TOEFL 90-96, 98-99, 280, 324, 399,408 Variance 110,114, 338,372,376,378-379
440 392-393, 407
TOEIC 324 Video recording 16, 74, 76, 87, 121, 159,
Topic-oriented stage 192 162-163, 187, 190, 200, 203, 208, 216
Total physical response (TPR) 45, 57.53 218,235-236, 238, 242, 244, 246, 252'
Transaction 8,271 258-260, 267-268, 271, 277-278
Transcripts, transcription 8, 11, 16, 20, 35, 281-282, 287, 289-290, 306, 315, 332,
162-163, 165, 182,193, 199, 216-217 339-340, 34, 3686, 353, 362-363,' 39l'
236-237, 242-244, 259, 263-264, 396, 413-416, 424, 446
270-271, 273, 280-281, 283, 289, 291, Virtual classroom 15, 20, 21, 257
306, 337, 340, 344, 346-349, 352-356, van Lier, L. 7-8, 15, 18, 20, 25, 55-56,
362-363, 365-367, 369-370, 401 67, 119, 170, 172, 185, 188, 198,203
413-414, 418, 424, 429, 435, 454, 207, 208-212, 218-219, 221, 223-224,
455-457 226, 229-231, 245, 246, 251, 253, 257,
Transferability 171-173, 184, 209, 224 279, 282, 322, 362-363, 366, 435, 440,
Treatment 44-46, 53, 60-61, 63-64, 66, 454, 457
68-70, 74, 85-89, 92, 94-97, 100, Verity, D. P. 295-296, 304
Vygotsky, L. S. 178, 362
102-104, 115, 119, 126, 158, 202, 204, Waissbluth, X. 299
210, 218, 231, 236, 380-381, 391, 393, Wait-time 29, 37, 47, 55, 202, 251, 414
395, 404
Wajnryb, R. 283
Triangulation 82, 163, 202, 211-214, 223, Wallace, M. J. 25, 226, 253
234-235, 278, 280-282, 302, 307, 318, Wallat, C. 189
320, 332, 350
Walsh, S. 341, 343-346, 369-370

Walters, J. 322 Winer, L. 296
Wang, J. 168 Wong, L. 148, 150, 152, 154,444
Warden, M. 294 Wong, M. 296
Watson-Gegeo, K. A. 186, 189, 190-192, Woodfield, H. 194
194-195, 197, 207, 224 Woods, D. 289-290
Weisner, T. S. 219 Wu, S. H. 372
Weitzman, E. A. 426,436 Yahya, N. 296
Wen, Q. 156 Yates correction factor 409
Wennerstrom, A. 372 Youngman, 132
Wertheimer, M. 12-13, 275 Yin, R. 161, 166, 170-171,174
Wesche, M.B. 323,461 Yu, B. 296
Whyte, W. F. 187 Yu, J. H. 328
Wiersma, W. 30-31, 35 Zambo, L. 299
Willing, K. 150 Zilber, T. 297, 423
Wilson, T. 288 Zone of proximal development (ZPD)
Wilson-Duffy, C. 295-296 177-178,362

Text Credits
Ch. 3: P. 75, adapted from Nunan, D. 1992. Research Methods in Language Learning. Cambridge University Press.
Reprinted with permission.
Ch. 7: P. 190, 204, 206, adapted from Nunan, D. 1992. Research Methods in Language Learning. Cambridge
University Press. Reprinted with permission. P. 213, adapted from Springer, S. E. (2003). Contingent language use
and scaffolding in a project-based ESL course. Unpublished manuscript, Monterey Institute of International
Studies, Monterey, California. Reprinted with permission.
Ch. 8: P. 88, adapted from Quirke, P. (2001). Hearing voices: A robust and flexible framework for gathering and using
student feedback. In J. Edge (Ed.), Action research (pp. 81-91). Alexandria, VA: TESOL. Reprinted with permission.
P. 75, adapted from Nunan, D. 1992. Research Methods in Language Learning. Cambridge University Press. Reprinted
with permission. P. 231, adapted from Quirke, P. (2001). Hearing voices: A robust and flexible framework for
gathering and using student feedback. In J. Edge (Ed.), Action research (pp. 81-91). Alexandria, VA: TESOL. Reprinted
with permission. P. 238-240, adapted from Thorpe, J. (2004). Coal miners, dirty sponges, and the search for Santa:
Exploring options in teaching listening comprehension through TV news broadcasts. Unpublished manuscript,
Monterey Institute of International Studies, Monterey, California. Reprinted with permission.
Ch. 9: P. 258, 273, adapted from Ba[...] University Press. P. 111. Reprinted with permission. P. 269, 276, adapted
from [...] University Press, P. 51. Reprinted with permission.
Ch. 12: P. 347-348, adapted from Nunan and Lamb [...] Self-Directed Teacher. Cambridge University Press.
Reprinted with permission.
Ch. 13: P. 404-406, adapted from [...] A cooperative small-group methodology in the language
classroom. TESOL Quarterly, 21, 483-504. [...]
Ch. 14: P. 431, adapted from Richards, K., Qualitative Inquiry in TESOL, 2003, Palgrave Macmillan. Reproduced
with permission of Palgrave Macmillan.
Ch. 15: P. 445, adapted from On the [...] introspective methods. From C. Færch & G. Kasper
(Eds.), Introspection in second language [...] Publications. Reprinted with
permission. P. 449, 459, 450-452, adapted from Nunan, D. [...]
University Press. Reprinted with permission. [...] & D. Nunan (Eds.), Voices from the language classroom.
