Exploring Second Language Classroom Research PDF
Classroom Research
A COMPREHENSIVE GUIDE
David Nunan
University of Hong Kong/Anaheim University
Kathleen M. Bailey
Monterey Institute of International Studies
HEINLE
CENGAGE Learning
Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States
Heinle
25 Thomson Place
Boston, MA 02210
USA
Printed in Canada
For my research students, who helped me refine my thinking
by trying out these ideas and activities.
David Nunan
For Rob McMillan,
Sarah Springer,
Will Radecki,
Marie-Lise Bouscaren,
Analisa Bouscaren Radecki,
Noel Isaiah Cortez Radecki Bouscaren,
and Fady—
thanks for being here these past few years.
Kathi Bailey
CONTENTS
Preface
Acknowledgments
References
Index
Text Credits
PREFACE
ACKNOWLEDGMENTS
Like most authors, we are grateful for the support of the many people who
PART I
Second Language
Classroom Research
An Overview
There are two kinds of people in this world—those that divide the world
into two kinds of people and those that don't. (attributed to Robert Benchley,
as cited in Sreedharan, 2006, p. 54)
Overriding Objectives
The book has two overriding objectives. The first of these is to provide an
overview and introduction to language classroom research. To this end, we look
at both substantive issues (the topics and questions that have been investigated
by classroom researchers) and methodological issues (the techniques and methods
that researchers have employed for collecting data, analyzing and interpreting
their data, and presenting the results). The second objective is to help readers
develop confidence and practical skills for carrying out their own empirical
investigations. To this end, we will suggest topical foci and provide tasks to help
readers identify their own areas of interest. We will also provide guidance and
activities for developing research skills.
Many years ago, Long (1983a) advanced several reasons for the study of second
language classrooms, particularly on the part of teachers in preparation.
First, Long argued that classroom-centered research can provide a great
deal of useful information about how foreign language instruction is actually
carried out (in contrast to what people imagine happens in classrooms). Second,
classroom-centered research can promote self-monitoring by classroom practitioners.
Third, the various observation schemes for classifying classroom interaction
can be used by teachers to investigate their own classes and the classes of
colleagues. Finally, involvement of teachers in classroom research can help them
to resist the temptation to jump on the various methodological bandwagons that
come rolling along from time to time. Descriptive studies of what actually goes
on in classrooms can help teachers evaluate the competing claims of different
syllabi, materials, and methods.
Our hope, then, is that having reached the end of this book, readers will have
a clear idea of the state of the art in terms of what researchers (including teacher
researchers) have looked at and how they have gone about their investigations.
We also hope that teachers and future teachers who read this book will be
encouraged to examine their own classroom contexts systematically. Having read
the book and completed the questions and tasks set out in the various chapters,
readers should understand key concepts and methods in classroom-based research.
They should also be able to relate these concepts and issues to their own
teaching situations.
Recurring Themes
This book has several recurring themes. The first is that empirical research matters.
The second is that there are many roles for teachers in the research process.
Even those teachers who do not plan to do research themselves should be
knowledgeable, informed, and critical consumers of others' research. The third is that as
classrooms are specifically constituted to facilitate language learning, we should
develop skills for systematically finding out what goes on in them. Fourth, we
argue that no single approach to language classroom research is inherently
superior to others; instead, the choice of a research method should be determined by
the purpose of the study—a point we will return to throughout this volume.
REFLECTION
When we began our own teaching careers (more years ago now than we wish
to admit), there was a preoccupation with a search for the one best method, and
We have used the term empirical research in the text above. What does this
phrase mean to you? Is it the same as experimental research?
Although van Lier's comments were written more than two decades ago, they
remain true today.
Naturalistic research is actually a cover term for several different research
methods. Two of the methods that have frequently been used in language classroom
research are ethnographies and case studies. Both of these methods will
be covered in detail in future chapters, but here we will briefly consider them to
illustrate naturalistic inquiry.
Large-scale, long-term studies aimed at investigating classrooms as cultural
systems are called ethnographies. The roots of the ethnographic tradition in
language classroom research can be traced to anthropology:
In anthropology, the ethnographer observes a little-known or 'exotic'
group of people in their natural habitat and takes fieldnotes. In addition,
working with one or more informants is often necessary, if only to
describe the language. Increasingly, recording is used for description and
analysis, not just as a mnemonic device, but more importantly as an
estrangement device, which enables the ethnographer to look at phenomena
(such as conversations, rituals, transactions, etc.) with detachment.
The same ways of working are applied in classrooms. However, recording
(and subsequent transcription) is of even greater importance here
than in anthropological field work, since many more things go on at the
same time and in rapid succession, and since the classroom is not an
exotic setting for us but rather a very familiar one, laden with personal
meaning. (van Lier, 1988, p. 37)
REFLECTION
classrooms empirically. In fact, Allwright and Bailey (1991) argue against the
oversimplistic contrast of quantitative-versus-qualitative research. They state
that classroom data can be collected either quantitatively or qualitatively, and
that it can also be analyzed quantitatively or qualitatively. Figure 1.1 shows
some examples from language classroom research.
REFLECTION
Based on what you already know about language classroom research and
about yourself, do you have a predisposition for either quantitative or
qualitative data collection? What about quantitative or qualitative data
analysis? If you do favor one approach over another, explain your position to a
colleague or classmate.
In this section, we will take a historical approach and briefly review some
illustrative investigations into language acquisition in classroom settings. These
studies typify the development of second language classroom research and show
how the different research traditions have been realized over the years.
An early example of classroom research in the psychometric tradition was
conducted by Scherer and Wertheimer (1964). The project was a classic
'methods-comparison' study in which the researchers set out to compare the
grammar-translation method of teaching with the then innovative audiolingual
method. The research question guiding the study was as follows: Is audiolingualism
a more effective method of learning a foreign language for college-level
learners than grammar-translation?
The subjects in this study were two groups of college students learning
German as a foreign language. One group was instructed in listening, speaking,
reading, and writing using translation and grammar studies. The other group
was taught by the innovative audiolingual method, in which the emphasis was on
listening and speaking rather than on reading and writing. In audiolingualism,
translation was avoided and grammatical rules were learned inductively rather
than deductively. At the end of the two-year experimental period, both groups
were tested, and the scores were analyzed to decide whether differences were
statistically significant.
REFLECTION
REFLECTION
What does the term action research mean to you? Have you read any action
research reports in the past? If so, what do you recall about this approach?
If not, what can you predict?
ACTION RESEARCH
To note that action research has a practical focus does not demean its value. As van
Lier (1994a) has observed, "We must never forget that it is ... important to do
research on practical activities and for practical purposes, such as the improvement
of aspects of language teaching and learning" (p. 31). (See also van Lier, 1994b.)
In this section, we have given a brief account of action research, a topic that
is elaborated upon and described in greater detail later in this volume. To summarize,
it is a method of conducting research that involves the participants (such
Type of
Research What Who
Think of a study—one that you have read or that you could imagine
doing—that would involve classroom action research conducted by a
teacher (or a team of teachers). Explain what makes that study simultaneously
(1) action research, (2) classroom research, and (3) teacher research.
ACTION
Skim the reference list for this book. Put a check mark by any items which
sound particularly interesting to you. Can you tell from the titles which
items are likely to involve the psychometric tradition, naturalistic inquiry,
or action research?
At the present time, technology is having a profound impact on all aspects of life.
In education, the ease with which computers can bring people together across
time and space is forcing a redefinition of the classroom. In the opening section,
we discussed ideas from D. Allwright (1983) and van Lier (1988), who suggested
that classrooms could be defined as places where individuals were gathered
together for purposes of teaching and learning. However, through technology, this
"gathering together" no longer requires the individuals to inhabit the same
physical space.
The following vignettes illustrate some of the changes wrought by technology
in language education and teacher preparation:
REFLECTION
Have you ever taken or taught a course online? If you have, what were the
differences in the online interaction and what you might experience in a
similar course taught in a face-to-face classroom context? If you haven't,
what do you think the differences might be?
SAMPLE STUDY
The repair moves Jepson studied had been identified in previous research. They
consisted of two main categories: negotiation of meaning and negative feedback.
The negotiation of meaning category had five types: (1) clarification requests,
(2) confirmation checks, (3) comprehension checks, (4) self-repetitions and
paraphrases, and (5) incorporations. The negative feedback category also had five
types: (1) recasts, (2) explicit correction, (3) questions, (4) incorporations, and
(5) self-corrections.
The data for this study were recorded simultaneously in the voice chat
rooms and the text chat rooms of an online English program using two computers.
Jepson audio-recorded five minutes of voice chats at the same time he saved
the texts generated by the participants in the typed chat rooms. He repeated this
procedure five times, for a total of twenty-five minutes of typed chat and
twenty-five minutes of voice chat. Jepson later transcribed the interaction in the voice
chat rooms. He counted the number of times the participants engaged in negotiation
of meaning and in negative feedback.
REFLECTION
The results of this study are interesting and complex and too numerous to
list in detail here. The main findings can be summarized as follows: (1) There
were more repair moves in general in the voice chats than in the text chats.
(2) Likewise, there was more negotiation of meaning in the voice chats than in
the text chats. (3) There were fewer negative feedback repairs than negotiation
of meaning repairs. All of these differences were statistically
significant.
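Counts of this kind are commonly compared with a chi-square test. The following is only a minimal sketch of the arithmetic, not a reconstruction of Jepson's analysis: the counts are invented for illustration, and the function name `chi_square_gof` is our own. It assumes equal expected counts for the two chat modes, which is plausible here only because both modes were recorded for the same amount of time.

```python
def chi_square_gof(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# HYPOTHETICAL counts of repair moves, invented for illustration.
# These are NOT the figures reported in the study.
repair_moves = {"voice chat": 60, "text chat": 30}

statistic = chi_square_gof(list(repair_moves.values()))

# Critical value of chi-square with 1 degree of freedom at alpha = .05.
CRITICAL_VALUE_DF1 = 3.841

print("chi-square statistic:", round(statistic, 2))
print("exceeds critical value:", statistic > CRITICAL_VALUE_DF1)
```

If the statistic exceeds the critical value, a difference this large would be unlikely to arise by chance alone when the two modes truly produce repair moves at the same rate.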
CONCLUSION
The aim of this chapter has been to set the groundwork for the rest of the book.
We have described the main characteristics of two predominant traditions—
psychometric research and naturalistic inquiry. We have considered how those
traditions have underpinned the evolution of classroom research. We have
provided a brief historical overview, as well as a definition (and redefinition) of
If you would like to learn more about the early history of language classroom
research, we recommend the articles by D. Allwright (1983), Brumfit and Mitchell
(1990a), Gaies (1983), Long (1983a; 1984), Pica (1997), and Seliger (1983a). For
Getting Started on
Classroom Research
Writing is easy. All you do is sit and stare at a blank sheet of paper until
drops of blood form on your forehead. (Fowler, as cited in Applewhite, Evans, &
Frothingham, 2003, p. 300)
Most books on classroom research eventually get around to the practical side of
the business, usually culminating with a chapter on doing research. However, we
decided to jump in and introduce some of the practicalities, along with the
difficulties, of doing research from the very beginning of this book. There are several
reasons for this decision. First, we want you to start thinking about the research
process from your own perspective right at the outset. We also hope that you
might "get your feet wet" by doing some relatively informal problem posing,
data collection, and data analysis before you get to the end of the book. Thirdly,
we hope that this chapter will provide a lens through which you can view the rest
of the book. We want you to be able to read about data collection and analysis
issues with a strong grounding in the practicalities of the research process.
In essence, this chapter outlines some of the typical procedures that underpin
the planning and implementation of a classroom research project, regardless
of the research tradition you choose. In any such project, whether it is carried out
by experienced university researchers with large grants or classroom teachers
who are simply interested in understanding and improving the quality of what
goes on in their classrooms, researchers need to consider the following questions:
• What aspect of classroom teaching and learning am I interested in, and
what specifically is it about this issue that I really want to know?
• Does anybody else have an answer to my question?
• How do I get started?
• What kinds of data will be relevant to my research interest and question?
• How will I gather those data?
• What techniques exist for analyzing my data?
• How can I make my research available to anyone else who might be interested?
These questions correspond to the following phases of a research project:
(1) area identification, (2) question formation, (3) literature review, (4) planning
and implementation, (5) data collection, (6) data analysis, and (7) reporting. In
this chapter, we will deal with each of these areas although we will mainly focus
on area identification, question formation, the literature review, and defining
variables during the planning stages of research. Additional issues in data
collection and analysis will be dealt with further in subsequent chapters about
particular research methods.
We also want to introduce the idea that research is a kind of culture. It has
its own values, norms, artifacts, rules, and procedures. Like other sorts of
cultures, research has subcultures. If you keep this metaphor in mind as you read
this book, you can imagine yourself as being an anthropologist, exploring the
culture(s) of language classroom research. We will return to this theme from
time to time as we cover the concepts of research design and research methods.
REFLECTION
If you are having difficulty coming up with ideas for research, you might
follow Hatch and Lazaraton's (1991) suggestion that you keep a research journal.
Such a journal can be a useful resource when it comes to defining what interests
you and identifying a research area. Here is how such a journal might be
developed:
Each time that you think of a question for which there seems to be no
ready answer, write the question down. Someone may write or talk
about something that is fascinating, and you wonder if the same results
would obtain with your students, or with bilingual children, or with a
different genre of text. Write this in your journal. Perhaps you take
notes as you read articles, observe classes, or listen to lectures. Place a
star or other symbol at places where you have questions. These ideas
will then be easy to find and transfer to the journal. Of course, not all
these ideas will evolve into research topics. Like a writer's notebook,
these bits and pieces of research ideas will reformulate themselves
almost like magic. Ways to redefine, elaborate or reorganize the
questions will occur as you reread the entries. (pp. 11-12)
Other sources of ideas can be found at the ends of articles, theses, or dissertations.
These reports usually include a section called "Suggestions for Further Research."
Table 2.1 contains some general research areas and topics for classroom research.
This list is very selective and is intended only to be illustrative since there
are many other topics that could form the basis for classroom investigation.
As we have suggested, research questions can come from many different
sources—a problem that arises in the classroom, an interesting article,
discussions with colleagues, a journal of research ideas that we have kept over time, and
so on. One advantage of reading about the topic in question is that we may well
find a study that can help us take our research interests forward. For example,
although we are particularly interested in listening comprehension, we may find
a study on the effect of background knowledge on reading comprehension. This
study might prompt us to ask whether the effects of background knowledge
ACTION
Select one of the topics in Table 2.1 that interests you and turn it into a
researchable question.
ACTION
REFLECTION
REFLECTION
Here is a story that Kathi Bailey tells her research students. Read the story
and see what it has to do with posing appropriate research questions.
Novice researchers often equate the process of doing a literature review with doing
research. It is true that reviewing the literature is important, but locating and
summarizing what others have written about a topic is often called library research or
secondary research (J. D. Brown, 1988). In this book, we are concerned with empirical
research—that is, investigations in which the researchers collect and analyze original
data of their own in order to answer the research questions they have posed.
A literature review can be published as a stand-alone piece or as part of a
report on empirical research. There are at least four main reasons for doing a
literature review when conducting empirical studies. We will examine each in turn.
The first reason for doing a literature review is to obtain background information
on the area that you have chosen to investigate. A systematic literature
review will acquaint you with previous work in the field and should alert you to
problems and potential pitfalls in your chosen area. It will also help you locate
clear definitions of key terms.
The second reason is to help you to identify research gaps (Cooley and
Lewkowicz, 2003), which involve work that hasn't yet been done. Having
reviewed and summarized the work of others, you will be able to spell out a
research space or gap in the research literature that you propose to fill. Finding
and articulating such gaps is part of developing the rationale for your research.
The third reason is to discover tools that could help you answer your own
research question. For example, you may find a questionnaire someone else has
developed that is directly related to your interest, or you might learn about a
classroom observation instrument that would be useful in your study. So, as long
as you choose judiciously and cite your sources properly, it is acceptable to
utilize research procedures and tools developed by others.
ACTION
Visit the APA Web site and note the format for citing books, articles in
journals, and chapters in books.
Jepson, Kevin. (2005). Conversations—and negotiated interaction—in text and voice chat
rooms. Language Learning and Technology, 9(3), 79-98.
Jepson wanted to investigate the "quality of interaction among English L2 speakers in
conversational text or voice chat rooms" (p. 79). He compared the voice chats and text (typed)
chats of non-native speakers interacting on the Internet. He investigated which types of
repair moves occur in text and voice chats, and looked for any differences between the repair
moves in the typed chats and the voice chats. Jepson recorded ten 5-minute, synchronous chat
room sessions (five of text chats and five of voice chats). "Significant differences were found
between the higher number of total repair moves made in voice chats and the smaller number
in text chats" (p. 79). The repairs in the voice chats were often related to students'
pronunciation issues. He used chi-square to analyze the quantitative data.
In our experience, these guidelines can be very helpful, especially to novice writers.
You can also use these guidelines as criteria in evaluating literature reviews
written by other researchers.
ACTION
Find a journal article or a book chapter about classroom research. Skim the
literature review. How have the authors organized the information in their
literature review?
ACTION
Think about the following statements made by teachers and identify the
variables implicated in the statements.
Teacher A: I always used to teach grammar inductively, but I've recently
started experimenting with deductive techniques.
Teacher B: Your grades on the listening test I gave you Friday are much
improved.
Teacher C: I used to teach in an all girls' school, but I now teach in a coed
school.
Teacher D: I just can't get my reading group motivated.
Teacher E: I've been trying to wait longer between asking a question and
repeating the question or giving students the answer.
Teacher F: I seem to have a class of holistic learners this semester.
Here are the variable labels we would apply to these teachers' comments:
Teacher   Variable
A         Instructional method
B         Listening test scores
C         Students' gender
D         Learners' motivation
F         Learning style
These variables are different in kind because the data that comprise them are
different. The simplest kind of data is known as nominal data. The word nominal
is an adjective formed from noun. A nominal variable is something that can be
named. Nominal data are also called categorical data because they involve putting
things into categories to create variables. With nominal variables, we can create
a series of 'boxes' that can be given a label and to which instances of each
REFLECTION
Not all variables are nominal to begin with although they can be turned into
nominal variables. Consider test scores. Such data will typically be a sequence of
numbers: 26, 74, 33, 52, 88, 81, 45, 62, and 49. These data cannot be assigned to
boxes in the way gender and instructional method can. We could, if we wished,
turn them into nominal variables by creating two (or more) boxes, say "high
achievers" and "low achievers," and deciding, somewhat arbitrarily, that all
learners who scored 50 and above would be placed in the high achievers' box,
and those scoring less than 50 would be placed into the low achievers' box.
Notice that this data reduction exercise results in less detailed information than
do the original scores. Saying that Billy is a low achiever, while Nancy is a high
achiever gives us less precise information than providing the students' actual
scores. If we knew, for instance, that Billy had scored 49 and Nancy had scored
51, our interpretation of the students' proficiency would be much different than
if Billy had scored 24 and Nancy 89.
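The data reduction described above can be sketched in a few lines. This is only an illustration: the scores and the cut-off of 50 come from the passage, while the function name `to_category` and the label strings are our own choices.

```python
# Test scores listed in the passage (interval-like data).
scores = [26, 74, 33, 52, 88, 81, 45, 62, 49]

# The cut-off of 50 is the somewhat arbitrary threshold used in the text.
CUTOFF = 50

def to_category(score, cutoff=CUTOFF):
    """Return the nominal label ('box') for a single score."""
    return "high achiever" if score >= cutoff else "low achiever"

labels = [to_category(s) for s in scores]

# Count how many learners fall into each box.
counts = {}
for label in labels:
    counts[label] = counts.get(label, 0) + 1

print(counts)

# Billy and Nancy receive the same labels in both scenarios from the text,
# which is exactly the loss of detail the passage describes.
print(to_category(49), to_category(51))
print(to_category(24), to_category(89))
```

Once the scores are replaced by labels, a one-point gap and a 65-point gap become indistinguishable.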
Test scores and other measurements are typically considered to be interval
data because they are measured on what is called an interval scale. This concept is
easy to understand in terms of measurements of length, such as kilometers or
inches. An inch is an inch whether it is the difference between fourteen inches
and fifteen inches or between sixty-five inches and sixty-six inches. A kilometer
remains the same length whether we are talking about the distance between four
and five kilometers, or between 320 and 321 kilometers. The unit of measure is
a constant interval. In the psychometric tradition, researchers often try to collect
interval data because there are powerful statistical analyses that can be done with
interval (or interval-like) data.
A third concept that is important in our field is ordinal data—that is, data
that are represented in an order. In the example above, we can say that Billy
scored lower than Nancy, whether their scores were 49 and 51, respectively, or
24 and 89, respectively. Ordinal data lack the precision of interval data, but they
are often useful.
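A short sketch can make the contrast with interval data concrete. It uses the Billy and Nancy scores from the example above; the helper `rank_descending` is a hypothetical name of ours, and it makes no special provision for tied scores.

```python
def rank_descending(scores):
    """Map each learner to a rank, where 1 is the highest score.

    Tied scores are not handled specially in this sketch.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

# The two scenarios described in the text.
close_scores = {"Billy": 49, "Nancy": 51}
distant_scores = {"Billy": 24, "Nancy": 89}

print(rank_descending(close_scores))
print(rank_descending(distant_scores))
```

Both calls yield the same ranking, since ranks preserve the order of the scores but discard the size of the gap between them.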
Ordinal measures are sometimes used in classroom research when precise
counts or measurements are not appropriate or are not available. For example, if
REFLECTION
Think of three more examples of interval data and three more of ordinal
data that might occur in studies of language learning and teaching.
A final type of scale for measuring variables is the ratio scale, which measures
absolute values such as temperature. Ratio scales are very important in the
physical sciences, but they are of little value in applied linguistics because constructs
such as language proficiency do not exist in absolute quantities. (Someone
who obtains a score of 50 on a grammar test does not necessarily know twice
as much grammar as someone who obtains a score of 25.) As they are of little
practical interest to us in classroom research, we will not deal with ratio scales
any further in this book.
REFLECTION
ACTION
Rank the students in order from highest to lowest in terms of their grammar
test scores.
Now divide the students into those whose names start with letters in
the first half of the alphabet and those whose names start with letters in the
second half of the alphabet.
REFLECTION
Answer the following questions and compare your answers with a colleague
or classmate.
1. What type of data were the students' actual French grammar test
scores?
2. What type of data did you get when you ranked the students' names
according to their grammar test scores (but did not include the scores
themselves)?
3. What type of data did you have when you divided the students into those
whose names start with letters in the first half of the alphabet and
those whose names start with letters in the second half of the alphabet?
We have spent a fair amount of time discussing variables and the sorts of
data you might use in your research (or read about in other people's reports)
because these concepts influence a great many decisions about how we will collect
and analyze our data. Another related and important issue is that of
constructs and how they are operationalized, the topic of our next section.
ACTION
Look at a research article on a topic that interests you. What key terms
have been defined? Which of the approaches described by Tuckman (1999)
were used to write the operational definitions?
ACTION
Write the research question for the study about using TPR with the
beginning-Japanese students. What terms would you need to operationally
define in preparing to carry out this study?
You could, of course, use TPR with both beginner classes and then see how
well they do on the examination, but that would give you no point of comparison.
You could test all the students about their knowledge of Japanese vocabulary
before the TPR lessons and again after the six weeks of TPR lessons, but there
would be little reason to do so since you know the students are true beginners.
So, you decide to use TPR techniques with one class of Japanese learners and to
use your regular teaching method with the other group. That way you can compare
the vocabulary test scores of the two groups of students—those with whom
you used TPR and those with whom you did not. The purpose of withholding
the experimental treatment from one group is to be able to compare the learners'
test scores and see if those who receive the TPR lessons outperform those who
do not. (In other words, you want to see if using TPR causes a difference in the
learners' Japanese vocabulary test scores.)
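The comparison described above can be sketched as follows. The scores are invented purely for illustration (no data from such a study appear in the text), and the names `tpr_group`, `comparison_group`, and `mean_difference` are our own.

```python
from statistics import mean

# HYPOTHETICAL vocabulary test scores, invented for illustration only.
tpr_group = [78, 85, 69, 90, 74]          # class taught with TPR
comparison_group = [72, 80, 65, 77, 70]   # class taught with the regular method

def mean_difference(treatment, control):
    """Difference between the treatment group mean and the control group mean."""
    return mean(treatment) - mean(control)

difference = mean_difference(tpr_group, comparison_group)

print(f"TPR group mean: {mean(tpr_group):.1f}")
print(f"Comparison group mean: {mean(comparison_group):.1f}")
print(f"Difference in means: {difference:.1f}")
```

A positive difference would favor the TPR class, but a difference in means on its own says nothing about statistical significance, and it supports a causal reading only if the two groups are otherwise comparable.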
REFLECTION
Imagine that as you are planning your research project about TPR lessons
and Japanese vocabulary learning, you find out that athletic practice for
school sports will be held at 3 P.M. every afternoon, so any beginning
students of Japanese who wish to participate in sports will enroll in your
8 A.M. class. You also learn that orchestra, band, and choir practices are
held every morning from 7 to 9, so any of the music students who wish to
take beginning Japanese will be in your afternoon class.
What are the implications of these facts for interpreting the outcomes
of your study?
REFLECTION
Think about some classroom research that you would like to do. What case
or sample might you study? What population would be represented?
In other situations, you may choose a particular case as the object of your
investigation because that entity is somehow unique or special. For example, in an
early classroom investigation, R. L. Allwright (1980) wanted to investigate how
ESL learners got speaking turns in class. In the process, he found that some
learners got many more than the typical number of turns and others got far
fewer. So, after discussing the quantitative analysis of the number and length of
turns taken by all the students, Allwright investigated a particular learner who
got far more than his "fair share" of speaking turns in class. The close analysis of
that learner's conversation with the teacher allowed Allwright to discover how he
got so many turns. In this case, the learner's uniqueness was what caused the
researcher to focus on him.
Sometimes sampling decisions are influenced by practical issues such as the
availability of subjects or of a research setting. If you are living and working in
São Paulo, for instance, it will be much easier and cheaper to observe and
interview teacher trainees working there than in New York City. If you are teaching
adult EFL learners in Seoul, it will be much more practical to learn about their
beliefs regarding language learning and teaching than about the beliefs of
learners in Cairo. Making sampling decisions on the basis of availability is known as
opportunistic sampling.
In the experimental tradition, opportunistic sampling is frowned upon because
it is not as powerful as random sampling in controlling variables. However,
in some naturalistic inquiry, opportunistic sampling can increase our access to
subjects and add depth to the database. For example, many of the earliest case
studies of children acquiring two languages simultaneously were done by those
children's parents. The parents had broad exposure to the children's emerging
speech and could record their utterances at any hour of the night or day. In those
The way you articulate your research question(s) will clearly influence your
data collection (whether you do so through questionnaires, classroom
observations, etc.).
To summarize, research design issues, such as sampling, are important in
nonexperimental research as well as in experimental studies. The design you
choose is directly related to how you word your research question(s). The design
will also be influenced by the literature you review.
SAMPLE STUDY
In this chapter, rather than summarize an entire study, we will report on how one
classroom researcher got started on an investigation. This study was by Sarah
Springer, a teacher who had enrolled in a graduate program in TESOL (Teaching
English to Speakers of Other Languages). Through her course readings, she
became interested in what factors promoted second language acquisition. Here
are Springer's (2003) comments about choosing a research focus:
My own training and previousexperience as an EFL teacher in Europe
had involved courses organized around grammatical syllabi that pre
sented discrete linguistic forms in a linear progression. The first
opportunity I had to work with groups of students using content- and
project-based syllabi was in the summer of 2002. The... summer
program had been designed to provide both ja cultural exchange and
academic research experience forJapanese university students who over
REFLECTION
Have you ever had an experience like Springer's, where a job change or a
new opportunity led you to reflect on what you were teaching and why? If
so, what issues arose for you? If not, can you imagine a context in which you
might face such issues? (If you are a new teacher with limited experience, it
would be worthwhile to ask an experienced teacher these questions.)
Predict what Springer might have done to address these elaborated
questions. Sketch out a plan of how you would proceed to investigate these
issues in a language classroom.
There are many practical problems that occur in language classroom research.
Our purpose here is not to intimidate or worry you about conducting your own
research but rather to help you prepare to overcome such problems by
anticipating them. Careful planning in the design stages will help you avoid difficulties
and disappointments in the data collection and analysis stages.
In fact, once you get to the point of fleshing out your research ideas, it is a
good practice to spend some time anticipating the practical problems likely to
get in the way of successfully completing the project. Anticipating problems and
thinking about solutions can help smooth the research path. Problems
encountered by our own graduate students include the following:
1. Lack of time (a particular problem for students who are also working or
taking several courses)
2. Lack of expertise, particularly at critical points in the research process such
as formulating a researchable question, determining the appropriate
research design, and, in the case of quantitative research, selecting the
appropriate statistical tool
3. Identifying subjects willing to take part in the research
4. Negotiating access to research sites (Unless you are collecting data in your
own classroom or your own school, getting permission to collect data can
be both time-consuming and frustrating.)
5. Issues of confidentiality
6. Ethical questions relating to data collection, which become acute when
you want to collect data without alerting your subjects beforehand
7. The sensitivity of reporting negative findings, particularly if these relate to
individuals you work with or know well
8. The difficulty of actually writing up the research
The last problem can become acute, particularly for researchers who are not
native speakers of the reporting language or who lack confidence (and sometimes
ACTION
Think of some research you would like to conduct. Make a list of the
problems you think might arise in your own research situation.
CONCLUSION
In this, the final chapter of Section 1 of this book, we will continue to build on
the ideas presented in Chapters 1 and 2, with a special focus on planning your
research design so that you can make good decisions about your data collection
and analysis procedures. Having posed a research question and done your
literature review, you are well set to begin planning your study, but what you actually
do depends on the research questions or hypotheses you have posed and what
your goals are in conducting the investigation.
For instance, if you wanted to conduct an action research study, like the
possible investigation of a teacher's wait time described in Chapter 2, your research
questions and procedures would be different from those you would use if you
developed an experiment on wait time in which one group of teachers increased
their wait time and another group did not, and you subsequently assessed the
quantity and quality of verbal turns taken by the students of those two sets of
teachers. Likewise, if you wanted to do a case study of one particular teacher
trainee's experiences while doing his or her practice teaching in an inner-city
school, the procedures would be quite different than if you wished to contrast
the questionnaire responses of 200 trainees in Manchester and another 200 in
Johannesburg. This questionnaire study would be more like classroom-oriented
research than actual classroom-based research (see Chapter 1), but it would still
be relevant and important to teacher educators and trainees.
REFLECTION
Based on what you've read so far and what you already knew about
research, think of a classroom-based or classroom-oriented study that you
would like to do. Keep that idea in mind as you read this chapter.
The key point here is that there is no perfect research design or research
method. As van Lier (1988) notes, the tools you choose depend on the goals you
wish to accomplish. So, as we proceed through the next sections, please keep in
mind that we do not advocate any of the main research traditions over the
others out of context. However, we will begin this chapter with concepts associated
primarily with research design in the psychometric tradition for two reasons:
(1) historically it has been (until recently) the dominant tradition in classroom
research, and (2) some of the vocabulary associated with quantitative data
collection and analysis is used in other approaches as well. So, the psychometric
tradition serves as a useful point of departure for discussions about several
approaches to classroom research. For this reason, we will begin with the
concept of hypotheses and hypothesis testing. We will then consider the sorts of
threats to research design that can invalidate a study. Finally, we will discuss the
ex post facto class of designs. This class consists of two important research
designs, one of which enables us to investigate correlations of two or more
variables. The other leads to nonexperimental comparisons of different groups.
These designs are useful in language classroom research, and there are several
examples of both in the published literature of our field.
HYPOTHESIS TESTING
ACTION
Skim a research report in our field. Did the author(s) take any steps to
ensure the reliability of data collection instruments, of ratings, or of data
coding processes?
ACTION
Read the following vignette and identify why the study has weak internal
validity.
An ESL teacher in Australia wanted to see if using role plays would help
improve the speaking fluency of his intermediate students in his adult
evening class. So, he used role plays in class once a week for the second to
fourteenth weeks of the semester. There were twenty-three students at the
beginning of the term but only eighteen at the end of the semester, as two
of the older students had to drop out due to health reasons and three of the
younger ones had to stop, either because the class was too difficult for them
or because they had work-related issues that prevented them from coming
to class. At the end of the course, the teacher rated the students on their
speaking fluency during the final role-play activities. He tape-recorded the
role plays and asked another teacher to discuss with him her impressions of
the students' fluency. Based on his success with role plays in that semester,
he decided to incorporate role plays in his beginners' course the next term.
ACTION
Read the following vignette and identify at least three issues that
contribute to the study's weak external validity.
A teacher of French as a foreign language at a prestigious secondary
school in a wealthy neighborhood believed her use of dialog journals was
helping the students in her two honors classes improve their spelling and
grammatical accuracy in French. (Dialog journals are personal writings by
the students in the target language, to which the teacher replies in an
ongoing exchange. Typically, the teacher responds to the content and does
not overtly correct the students' errors.) The students in the honors classes
were selected because of their high achievement in French the previous
year. To test her assumption, the teacher used dialog journals with the six
students in her 9 A.M. class, but not with the eight students in her 10 A.M.
class, for the entire fifteen-week semester. She read and responded to the
six students' dialog journals three times each week. At the end of the
semester, the students in both classes submitted twenty-page term papers
written in French. Three of this teacher's colleagues each read and rated
all the term papers without knowing about either the investigation or
that only one of the two honors classes had engaged in the dialog journal
project.
REFLECTION
How important are reliability and validity to you when you are reading a
research report?
ACTION
THREATS TO VALIDITY
Can you see how each of these conditions might influence the outcomes of any
sort of grammar test you could give the two classes? All of these conditions
represent threats to the internal validity of the study because each one could
influence how well the two groups do on the test (the dependent variable). There may
have been differences attributed to gender (1 above). All of the students got some
inductively oriented input from the textbook (2 above). The quality of your own
1. A kindergarten teacher finds that TPR is very effective with the bilingual
five- and six-year-olds in her morning class. There are twelve pupils in the
class and all twelve have tested quite high on a test of intelligence for children.
2. A pronunciation teacher works for a small company that handles the
telephone service requests for an international computer manufacturer. The
teacher conducts a study that shows that a particular software package used
by the individual employees of the telephone service firm is highly effective
in improving their English pronunciation. Each employee is required to
use the software package ten hours per week for a period of three months
as a condition of his or her employment.
3. A teacher at an international secondary school uses extended reading as a
way of building her pupils' target language vocabulary and reading skills.
The students are predominantly children of diplomats, and most of them
have lived in four or five countries even though they are just teenagers.
Each of these situations includes interesting issues and for each we can imagine
a viable research question that could be asked or a reasonable hypothesis that
could be posed, but each situation is problematic in terms of its generalizability.
For each of the three situations described above, identify the threat(s) to
external validity inherent in the context. If you are working with a group,
compare your ideas to those of a classmate or colleague from a different group.
The comments above about experimental research design describe the strongest
true experimental design. It has two or more groups, at least one of which gets the
treatment and one of which serves as the control group. You will recall that to
qualify as a true experimental design, the study must involve random selection (from
the population to the sample) and random assignment (from the subject pool to the
groups). The groups in the experiment are determined by the research question
or hypothesis—and specifically by the levels of the independent variable. There
may also be groups defined by the levels of one or more moderator variables.
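To make the two kinds of randomization concrete, here is a minimal Python sketch. It assumes a hypothetical population of 200 learner IDs and a sample of 40. The function name select_and_assign and all the numbers are our own illustrations, not part of any study discussed in this book.

```python
import random

def select_and_assign(population, sample_size, n_groups=2, seed=None):
    """Randomly select a sample, then randomly assign it to groups."""
    rng = random.Random(seed)
    # Random selection: every member of the population has an equal
    # chance of ending up in the sample.
    sample = rng.sample(population, sample_size)
    # Random assignment: shuffle the sample, then deal it out like cards.
    rng.shuffle(sample)
    return [sample[i::n_groups] for i in range(n_groups)]

# Hypothetical population of 200 learners, identified only by ID.
population = [f"learner_{i:03d}" for i in range(1, 201)]
control, treatment = select_and_assign(population, sample_size=40, seed=1)
print(len(control), len(treatment))
```

The seed is fixed here only so that the illustration is repeatable.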
But some very important types of research do not involve a treatment and
cannot be called "experiments" in the strongest sense of the term. In some
situations, rather than comparing groups, we want to determine the relationship, or
correlation, between two or more variables as measured in one group of people.
The research design for this situation is called a correlation design, and there are
several statistical procedures that can be used to detect correlations across
variables. Still other studies involve the use of statistical logic or other analytic
procedures to look for differences among groups defined by preexisting conditions
rather than by experimental treatments. Such studies are called criterion groups
designs because they compare groups defined by some criterion.
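One widely used statistic for a correlation design is the Pearson product-moment correlation coefficient. The sketch below computes it from scratch for two sets of scores measured on the same group of people. The scores and variable names are invented for illustration, and other correlation procedures may suit other kinds of data better.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for eight learners on two measures (X and Y).
x_scores = [12, 15, 9, 20, 17, 11, 14, 18]
y_scores = [30, 41, 25, 52, 44, 27, 35, 50]
print(round(pearson_r(x_scores, y_scores), 3))
```

A value near +1 or -1 indicates a strong relationship, and a value near 0 indicates a weak one. The coefficient by itself does not show that one variable causes the other.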
Correlation designs and criterion groups designs are both part of the ex post
facto class of designs. The phrase "ex post facto" is used because the researchers
investigate the possible influence of conditions after the fact. In this section we will
examine both of these designs, but we will begin with the criterion groups design
because—like the other designs we have discussed so far—it allows us to compare
two or more groups (even though there is no formal experiment going on).
1. Two secondary school teachers share the opinion that their female students
are naturally better at foreign languages than their male students. They
plan a study to investigate this possibility systematically.
2. The members of an ESL faculty observe that students from some first
language backgrounds seem to have less difficulty with English spelling than
do students from other first languages. They decide to test this hypothesis.
In the examples given above, the researchers did not make the students male or
female; the researchers did not cause them to be native speakers of particular
languages, nor did the researchers cause the subjects to be left-handed or
right-handed. Instead, the researchers will study the possible influences of these
conditions on some dependent variable(s) after the fact, that is, after the conditions
of interest in the comparison already exist. In other words, the students are
already male or female (Situation 1), they already come from a particular L1
background (Situation 2), and they are already left-handed or right-handed
(Situation 3) before these studies begin. That is why we say that the criterion
groups design is part of the ex post facto class of designs.
ACTION
Write the null hypotheses being tested in the three criterion groups
designs described above.
You will recall one key characteristic of the true experimental designs is that
they use random selection from the population to the sample and random
assignment from the sample pool to the various groups in the sample (the control and
experimental groups). In criterion groups designs, we may have random
selection but we cannot have random assignment. Why? Think about it for a
moment. In the study comparing the pronunciation of left-handed and right-
handed language students (Situation 3 above), we might be able to randomly
select left-handed and right-handed students from the entire population of
foreign language learners. But we could not randomly assign them to groups
in the study because the groups are defined by the subjects' handedness. This
is, in fact, the defining characteristic of criterion groups designs: Subjects
are grouped according to the criterion of interest, not according to random
assignment.
When we use the ex post facto criterion groups design to test hypotheses or
answer research questions with quantitatively collected data in the psychometric
tradition, we often compare the average scores of the groups on the dependent
variable. Statistical tools are used to make inferences that help us decide whether
those differences are significant. In other words, statistics can be used to
compare groups defined by preexisting conditions just as they can be used to identify
significant differences between control and experimental groups in formal
experiments. We will return to this point in subsequent chapters where we discuss
quantitative data analysis.
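As a rough illustration of comparing group averages, the sketch below computes a two-sample t statistic (Welch's version, which does not assume equal variances) for two hypothetical criterion groups. The scores, group labels, and function name are invented. This is only one of several possible procedures, and a full analysis would also convert the statistic into a probability value, which the sketch does not attempt.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means.

    Each group needs at least two scores, and the scores must not all be
    identical in both groups (otherwise the standard error is zero).
    """
    # Standard error of the difference, built from each group's
    # sample variance divided by its size.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Hypothetical spelling-test scores for two first-language groups.
group_1 = [78, 85, 90, 72, 88, 81]
group_2 = [70, 75, 82, 68, 74, 79]
print(round(welch_t(group_1, group_2), 2))
```

The larger the absolute value of the statistic, the less plausible it is that the two group means differ only by chance.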
REFLECTION
Look at the three hypotheses above. Notice the differences in their
wording, compared to one another. Underline the key words that distinguish
the three hypotheses.
ACTION
Write the null hypothesis, the alternative hypothesis, and the alternative
directional hypothesis for the correlation design used to investigate the
relationship between German reading speed and German vocabulary.
Comparing two groups (or more)
Treatment for at least one group
Random assignment to groups
You may use the questions in this checklist as you design your own research.
You can also use them for reflecting on and analyzing research that you read.
SAMPLE STUDY
In this section, we will briefly summarize one of the earliest published classroom
research studies conducted by a teacher in our field. Although this study was
published many years ago, it dealt with an issue that still concerns language
teachers today: language learners' classroom participation.
Hearing her colleagues discuss the difficulty of getting Asian students to talk
in English classes at a university in the United States, a teacher decided to
compare in-class participation patterns of Asian and non-Asian learners of English.
(The researcher, Charlene Sato, was herself a Japanese-American ESL teacher
and was very interested in ethnic styles in classroom discourse.) To investigate
this perception, Sato (1982) decided to conduct some research to determine
whether "ethnic patterns of participation were observable, as reflected in aspects
of turn-taking" (p. 14).
To address these issues, Sato videotaped her own class during three fifty-
minute lessons while she was teaching. (The students were told that the
videotape process was a regular part of teacher training in that program, which
was true.) In another teacher's class, three lessons were tape-recorded as Sato
observed and took notes to record which students took speaking turns. In Sato's
class, there were fifteen Asian students and eight non-Asian students, while in
the other class, there were four Asians and four non-Asians. The two classes were
at the same level of English proficiency in the ESL program.
ACTION
REFLECTION
What do you think about the comparability of the data from the two
classes in Sato's study? In her own class, the data consisted of the
videotapes of three lessons. In the other teacher's class, Sato tape-recorded the
lesson and took observational notes.
Sato's (1982) findings are interesting and complex. We will summarize only
a few of them here. She found that, although the Asian students outnumbered
the non-Asian students in the two classes combined (nineteen Asians versus
twelve non-Asians), the non-Asian students took 63.5% of the total speaking
turns. Further scrutiny revealed that the Asian students only self-selected a third
of the time, while the non-Asian students self-selected two-thirds of the time. In
terms of teacher-allocated turns, the Asian students were selected for turns by
the teachers 40% of the time while the non-Asians were selected 60% of the
time. (All these differences were statistically significant.) Sato's comment about
her own turn distribution patterns is particularly insightful:
In this chapter we have dealt with key concepts in planning language classroom
research. By now, it is probably apparent that there are many payoffs and pitfalls
involved in conducting classroom research, and your choice of research design
can lead to both.
CONCLUSIONS
In this chapter, you read about hypotheses and hypothesis testing. We saw that
there are specific wordings as well as reasons why researchers pose null,
alternative, and alternative directional hypotheses. We noted the point that the research
questions and/or hypotheses determine what sort of variables and research
designs are involved in a study.
We looked at different sorts of threats to the validity and reliability of a
study and discussed the ex post facto class of designs: the correlation design and
the criterion groups design. These were then compared with the true
experimental designs and the intact groups designs, which had been introduced earlier.
The following questions and tasks, as well as the suggestions for further reading
on these topics, should help you consolidate your understanding of these
important issues in language classroom research.
4. Think of a way to replicate but improve upon Sato's (1982) research. You
could compare the speaking turns of Asian and non-Asian students as she
did. Or you could compare the speaking turns of male and female students.
In fact, you could combine these two issues and use a factorial criterion
groups design, using ethnicity as the independent variable and gender as the
moderator variable. Draw the box diagram for a factorial criterion groups
design study investigating the influence of ethnicity and gender on
students' speaking turns in a language classroom.
5. In the replication of Sato's study that you envision, are you planning a
direct replication, a systematic replication, or a conceptual replication? What
do you see as the value of replicating a previous study?
6. Perhaps you have noticed the tendency of two variables to co-occur, or to
run counter to each other. Brainstorm some research questions about these
situations. This sort of research calls for a correlation design. Choose a
research question you have posed and identify the X and Y variables.
Although this section and the section that follows overlap somewhat, we
see this particular section as a big-picture treatment. The section begins
with a chapter on experimental methods, which is one of the two "pure"
research paradigms (Grotjahn, 1987). The section also covers the other "pure"
research paradigm—ethnography. Other approaches that are dealt with in this
section include survey research, case study research, and action research.
For many people, the experimental method is synonymous with research, and
other approaches that draw on more naturalistic forms of inquiry are often seen
as ground-clearing operations designed to yield preliminary data and to set the
scene for experimental research. We don't see it that way. Case study research,
surveys, and action research all have their value and, depending on the research
questions and the overall intention of the research, can generate useful
information where experiments may well be inappropriate.
Chapter 5: Surveys
By the end of this chapter, readers will
• define survey research and explain its basic uses;
• differentiate between survey research and the experimental method;
• describe different kinds of sampling strategies for obtaining subjects;
• discuss the basic principles of questionnaire design;
• recognize potential problems in using various questionnaire item types;
• be familiar with various ethical issues in survey research.
Chapter 7: Ethnography
By the end of this chapter, readers will
• define ethnography and differentiate it from case study research;
• understand the distinction between emic and etic perspectives;
• articulate the principles that guide ethnography;
• describe four types of triangulation;
• discuss concerns about reliability and validity of ethnography.
Any time you use phrases like: "On average, I cycle about 100 miles a
week" or "We can expect a lot of rain at this time of year" or "The earlier
you start revising, the better you are likely to do in the exam" you are
making a statistical statement, even though you may have performed no
calculations. (Rowntree, 1981, p. 13)
In this chapter, we will look more closely at the experimental method, one of the
two 'pure' research paradigms (Grotjahn, 1987). In Grotjahn's terms, this
paradigm involves (1) experimental designs, (2) quantitative data, and (3) statistical
analyses. The experimental method is basically a collection of research designs,
guidelines for using them, principles and procedures for determining statistical
significance, and criteria for determining the quality of a study. The
experimental method is part of the psychometric tradition, and it is also referred to as the
scientific method. For some researchers, the experimental method is the premier
method, all others being 'ground clearing' operations, that is, preliminary data
collection and interpretation exercises to prepare for a formal experiment.
We will begin this chapter by adding to our earlier discussion of possible
confounding variables. Then we will add more research designs to those you
have read about in earlier chapters. We will systematize this discussion by
analyzing and exemplifying the research designs, dividing them into classes, and
explaining their relationships to one another. Then we will use an extended
example to look at the issue of extrapolating from samples to populations.
This extrapolation is based on the logic of the normal distribution, which will be
discussed as well.
As you read this material, keep in mind that different forms of research have
different cultures. The experimental method has one of the most strictly codified
sets of values and procedures of any of the main methods we will study. It also
involves a fair amount of jargon, which can sometimes be a bit intimidating. But
just imagine that you are learning new vocabulary, as you would when entering
any new culture.
REFLECTION
What do you picture when you read the phrase the experimental method?
What images does it evoke for you?
In this section, we will build on key concepts that were introduced earlier.
These concepts included samples, populations, variables, reliability, and validity.
As we saw earlier, experiments are generally conducted in order to test the
strength of relationships between variables. We also saw that when the
researcher is testing the influence of one variable on another, the variable doing
the influencing is called the independent variable, while the one being
influenced is called the dependent variable. For example, in a study of the effect of
two different methods for teaching grammar, the teaching method would be the
independent variable, and the students' performance on a test of grammar
knowledge would be the dependent variable.
In Chapter 3, we discussed confounding variables—those factors that might
negatively influence the interpretation of your results. In the experimental
method, one of the researcher's key goals is to control and systematically
manipulate variables in order to determine cause-and-effect relationships. This goal
has such a high value in the culture of the experimental method that people have
written extensively about the things that can go wrong. These types of
confounding variables are also called extraneous variables or threats to validity.
a group of students take an IQ test, and only the highest third and the
lowest third are selected for the experiment, eliminating the middle
third. Statistical processes would create a tendency for the scores on any
post-test measurement of the high IQ students to decrease toward the
mean, while the scores of the low IQ students would increase toward the
mean. Thus, the groups would differ less in the post-test results, even
without experiencing any experimental treatment. (p. 136)
Tuckman explains that this pattern happens because "chance factors are more
likely to contribute to extreme scores than to average scores, and such factors
are unlikely to reappear during a second testing" (ibid.). Using subjects who
represent an entire range of ability levels in your design is one way to avoid this
problem.
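A small simulation can make this pattern visible. The sketch below rests on a simplifying assumption of our own: each observed score is a stable "true" ability plus an independent chance component on each testing occasion. No treatment of any kind is simulated, and all the numbers and names are hypothetical.

```python
import random
from statistics import mean

def simulate_regression_to_mean(n=3000, seed=0):
    """Select extreme thirds on a pre-test and compare their post-test means."""
    rng = random.Random(seed)
    ability = [rng.gauss(100, 10) for _ in range(n)]
    # Chance factors differ on each occasion, so the noise is drawn
    # separately for the pre-test and the post-test.
    pre = [a + rng.gauss(0, 10) for a in ability]
    post = [a + rng.gauss(0, 10) for a in ability]
    # Rank subjects by pre-test score; keep the lowest and highest thirds.
    order = sorted(range(n), key=lambda i: pre[i])
    third = n // 3
    low, high = order[:third], order[-third:]
    return {
        "low_pre": mean(pre[i] for i in low),
        "low_post": mean(post[i] for i in low),
        "high_pre": mean(pre[i] for i in high),
        "high_post": mean(post[i] for i in high),
    }

results = simulate_regression_to_mean()
for label, value in results.items():
    print(f"{label}: {value:.1f}")
```

Under these assumptions, the high group's post-test mean falls below its pre-test mean and the low group's rises, even though nothing happened between the two tests.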
Experimental mortality (or just mortality) is the problem of losing subjects
from the study. It can be especially worrisome if the groups end up being of quite
different sizes because people dropped out. To deal with this threat, researchers
often try to recruit more people for a study than they may actually need.
Researchers must sometimes also try to locate subjects who took the pre-test and
experienced the treatment (or were in the control group) but then moved away
or were absent when the post-test was administered.
As the name suggests, the interactive combination of factors happens when
more than one threat is present in a study. For example, if you conduct a study of
REFLECTION
Think about the times that you have observed a class or have been ob
served when you were teaching a class. Did any reactive effects occur?
What were they? Who was affected? What might be done to counteract
such problems when an observer visits a language class?
The threat known as the interaction effects of selection bias occurs when the
sample in an experiment is not really representative of the population from
which it was drawn. This is a major issue for language classroom researchers
because it hinges upon first defining the population we wish to study and then upon
In order to deal with these threats, the experimental method includes many
different research designs to counteract the possible confounding variables that
could influence the internal and external validity of a study. The various designs
have different strengths and weaknesses. Anyone who chooses to do an
experiment must balance the focus of the research question against the time and
resources available for conducting the study in order to choose the best design.
In this section, we will review some research designs discussed in Chapters
1, 2, and 3 (the true experimental designs, the intact groups design, and two in
the ex post facto class—correlation and criterion groups designs). We will also
introduce some other designs that are used in the experimental method. To show
REFLECTION
When you have completed this study, what information will you have
about the students? What information will you lack?
This research design is called a one-shot case study. (This phrase does not
mean the same thing as the term case study does in naturalistic inquiry—a point
we will explore in Chapter 6.) The one-shot case study is a weak design because
of the problems inherent in the interpretation of the results. Since there is no
pre-test, we don't know how proficient the students were in English at the
beginning of the TOEFL preparation class. As a result, we can't really say, on solid
empirical grounds, whether the course helped them (though the students and
teacher may be sure that it did). And since only one set of data is available, no
comparisons are possible.
REFLECTION
What information will you have at the end of this study that you wouldn't
have had after conducting the one-shot case study described in Scenario 1?
REFLECTION
What steps could you take to refine this design so that you could
confidently say that the students' measured improvement was in fact due to the
TOEFL preparation course?
ACTION
REFLECTION
What are the weaknesses inherent in the intact groups design? (Think back
to our discussions of randomization in Chapters 2 and 3.)
The main problems with the intact groups design stem from the fact that the
subjects in the groups being compared were not randomly selected from the
population, nor were they randomly assigned to groups. (Hence, the name
"intact groups design.") Without randomization (and without a pre-test), we cannot
be certain that the groups being compared were identical (or at least quite
similar) to begin with. Perhaps the students in one group are more motivated than
those in the other group, or have greater language aptitude or higher language
proficiency to start with. As a result, we cannot be sure that any differences we
find are truly due to the treatment (the TOEFL preparation course). Therefore,
when we use an intact groups design we must be conservative when we report the
results.
These three designs (the one-shot case study, the one-group pre-test
post-test design, and the intact groups design) all belong to the pre-experimental class
of designs. They are pre-experimental in that they lack some of the defining
characteristics of the true experimental designs.
ACTION
Decide which of the three designs discussed above is being used in each of
the following research situations.
1. An Arabic teacher wanted to know whether pronunciation exercises improved
her students' pronunciation of difficult words. She gave the students
a pronunciation test before doing the pronunciation exercises, and
then she tested them again after the class.
2. A teacher used two different methods to teach vocabulary with her two
intermediate French classes. One group got a list of randomly ordered,
unrelated words to memorize. The second group got the same vocabulary
lists, but in addition, the teacher created jazz chants using the
words. At the end of the term, both classes were tested over the vocabulary
presented in the course.
REFLECTION
What element has been added to this scenario that was not present in Scenario 3,
which described the intact groups design?
ACTION
However, because the groups being compared were intact (i.e., they were
not randomly sampled or randomly assigned), there are still limitations on the
claims you can make. In fact, that is why we prefer the term comparison groups in
the design's name rather than control groups. By definition, a control group is one
made up of people randomly selected from the population and randomly assigned
to the groups in the study. In addition, in the strongest designs (those in
the true experimental class), which group serves as the control group is often
randomly determined, perhaps by the flip of a coin. This step is a further safeguard
to ensure that there are no known preexisting differences that might influence
the outcome of the study.
[Figure: scores (vertical axis, 12 to 17) plotted by week (horizontal axis: Week, 1 to 15)]
study harder. From the data above, it appears that that effort resulted in higher
vocabulary quiz scores.
In the time series design, the group under investigation serves as its own
control. That is, before the treatment, the students are functioning as a control
group, but after the treatment, they are analogous to an experimental group.
This situation is far from ideal, but it can be informative. We need to be careful,
though, about interpreting the results, as shown in Figure 4.2. In this figure,
T stands for time.
REFLECTION
Why does Tuckman (1999) assert that cases 3 and 4 suggest that the treatment
has not caused an effect?
These four sets of data all provide different information about the possible
effects of the treatment. The first line in Figure 4.2 tells us that the scores were
higher after the treatment than they were before, and that they remained consistently
high. The second line tells us that the scores improved after the treatment
but that the improvement tapered off later. The third line indicates that the
scores were higher after the treatment and continued to increase, but we cannot
say that the treatment caused this improvement because the scores had already
begun to increase before the treatment. It appears that this developmental trend
might have continued with or without the treatment being administered. Finally,
in the fourth line, the scores are somewhat erratic. They improve immediately
after the treatment and then drop back down again, but the scores were relatively
high at one point before the treatment as well.
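The reasoning behind these readings can be sketched in a few lines of code. The example below is only an illustration, not a procedure taken from Tuckman (1999): the weekly scores are invented, the treatment is assumed to fall after the fourth observation, and the helper names `mean_shift` and `pre_treatment_slope` are hypothetical. It compares the average score before and after the treatment and also checks whether the scores were already rising beforehand, which is the pattern that undermines a causal claim in the third line.

```python
from statistics import mean

def mean_shift(scores, treatment_index):
    """Difference between the post-treatment and pre-treatment means."""
    before = scores[:treatment_index]
    after = scores[treatment_index:]
    return mean(after) - mean(before)

def pre_treatment_slope(scores, treatment_index):
    """Least-squares slope of the scores observed before the treatment."""
    before = scores[:treatment_index]
    xs = list(range(len(before)))
    x_bar, y_bar = mean(xs), mean(before)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, before))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

# Invented data: eight weekly observations, treatment after the fourth.
step_change = [12, 12, 12, 12, 16, 16, 16, 16]      # flat, then higher and stable
already_rising = [10, 11, 12, 13, 14, 15, 16, 17]   # rising before the treatment too

for label, scores in [("step change", step_change),
                      ("already rising", already_rising)]:
    print(label, mean_shift(scores, 4), pre_treatment_slope(scores, 4))
```

Both invented series show the same four-point rise in the mean, but only the first has a flat pre-treatment trend. Only in that case is the treatment a plausible explanation for the gain.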
REFLECTION
What can you infer from the results depicted in Figure 4.3?
[Figure 4.3: quiz scores (vertical axis, 12 to 18) plotted by week (horizontal axis: Week)]
Once again, when we work with the equivalent time samples design, we say
that the group serves as its own control. There is no separate control or comparison
group, but we are able to compare the scores on the dependent variable (the
twenty-point vocabulary quiz) for the weeks when contextualization was provided
with the vocabulary scores for the weeks when it was not. In effect, the
contextualization is the treatment, and it is alternately provided and withheld
from the same group of students. This design is stronger than the time series design
because multiple comparisons are possible. That is, since the treatment has
been given and withheld several times, we have a better chance of detecting its
effect (if any).
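The comparison at the heart of this design can be sketched as follows. The weekly quiz scores and the alternating schedule below are invented for illustration (they are not the data plotted in Figure 4.3), and the variable names are hypothetical. The sketch simply separates the weeks when the treatment was provided from the weeks when it was withheld and compares the two averages.

```python
from statistics import mean

# Invented twenty-point quiz scores for ten weeks. Each entry is
# (week, contextualization provided?, class average on the quiz).
weekly_results = [
    (1, True, 16), (2, False, 13), (3, True, 17), (4, False, 14),
    (5, True, 16), (6, False, 13), (7, True, 17), (8, False, 14),
    (9, True, 16), (10, False, 13),
]

with_treatment = [score for _, treated, score in weekly_results if treated]
without_treatment = [score for _, treated, score in weekly_results if not treated]

# A consistent gap across several on/off cycles is more persuasive than a
# single before/after contrast, because the comparison is repeated.
difference = mean(with_treatment) - mean(without_treatment)
print(mean(with_treatment), mean(without_treatment), round(difference, 2))
```

Because the same group is measured under both conditions, this comparison still says nothing about students outside the group; it only shows whether scores tracked the presence of the treatment.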
These three designs (the nonequivalent comparison group design, the time
series design, and the equivalent time samples design) all belong to a class called
the quasi-experimental designs. This class is characterized by (1) the possibility of
making comparisons on the dependent variable, but also by (2) the lack of a
randomly sampled and randomly assigned control group.
REFLECTION
Think of a research question that you could address with a time series design
and one that could be addressed using the equivalent time samples
design. Remember that these are designs that can be used when no control
group (or even a comparison group) is available. Given that fact,
what could you confidently say about the possible outcomes of your two
studies?
ACTION
This research design is called the post-test only control group design. It is one of
the true experimental designs because of the presence of a control group and the
random selection and random assignment of subjects.
REFLECTION
Now what is the dependent variable? What is one possible threat to validity
inherent in this design?
The pre-test post-test control group design is one of the true experimental designs
because of the presence of a control group and the random selection and
random assignment of subjects. In one sense, it is stronger than the post-test
only control group design because you can measure improvement (through the
gain scores). However, it is also susceptible to the testing threat.
One-group pre-test post-test       No   Yes  No   No   Yes  No   No
Nonequivalent comparison groups    Yes  Yes  No   Yes  Yes  No   No
Correlation                        No   No   No   No   No   Yes  No
ACTION
REFLECTION
There are many possible influences that can affect the outcome of a study
such as this one. If different teachers are involved in teaching the different
groups, then it could be the teachers rather than the materials that make a difference.
If one teacher works with both groups, you will have controlled for teacher
style as a factor, but the teacher's enthusiasm for (or boredom with) one type of
materials could influence the results. Even the time of day at which a class is held
can affect learning outcomes. These issues weaken the internal validity of the
study because it is not possible to state categorically that the treatment brought
about any differences observed in the students' test scores.
While factors such as those mentioned above may impinge on research outcomes,
participant factors (such as the selection threat) are the most pervasive. For
example, you may have happened to select a group of fast-track or high-aptitude
students as the recipients of the experimental authentic materials, and a group of
slow learners that used the traditional materials. In order to guard against the
possibility that factors such as age, motivation, or aptitude might influence the
research outcomes, sound experimental design in the psychometric tradition suggests
that you assign subjects randomly to the control and experimental conditions.
Using randomization puts you in a better position to argue that any observed
differences on the end-of-course test are due to the innovative materials
because possible confounding variables that might have had an effect (such as
intelligence and aptitude) are presumably evenly distributed in the experimental
and control groups. You can also test both groups of students before the experiment
just to make sure that the groups really are the same, though this step
introduces the possibility of the testing threat. (Doing so would entail the use of
the pre-test post-test control group design.)
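Random assignment itself is a simple mechanical step. The sketch below is one possible way to do it, under stated assumptions: the student codes are invented, the function name `randomly_assign` is hypothetical, and the fixed seed is used only so that the example gives the same split each time it is run.

```python
import random

def randomly_assign(subject_ids, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Twenty hypothetical students, identified only by a code.
students = [f"S{n:02d}" for n in range(1, 21)]
control, experimental = randomly_assign(students, seed=42)

print("Control:     ", sorted(control))
print("Experimental:", sorted(experimental))
```

Because chance alone decides who goes where, characteristics such as motivation or aptitude have no systematic route into one group rather than the other, although with small groups they can still differ by accident.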
Unfortunately, in ongoing programs, it is not always practical to rearrange
students and randomly assign them into different groups or classes. In many
schools, if an experiment is to be conducted, it will have to be with classes to
which students have been preassigned. That is why the intact groups design and
the nonequivalent comparison groups design are so often used in language classroom
research: they involve groups that were already established (by
means other than randomization) without the researcher's control. In these
circumstances, while the internal validity of the experiment is weakened, the
study may still be worthwhile.
Imagine that you are able to carry out the experiment described above. You
randomly assign ten final-year secondary school students to the control
group and ten to the experimental group. A 100-point listening pre-test indicates
that the groups are at the same level of proficiency. You teach both
groups for a semester, using the innovative authentic materials with the
experimental group and the traditional materials with the control group.
At the end of the semester, the groups are retested with an alternate form
of the 100-point listening test. You calculate the averages:
                      Control Group    Experimental Group
Post-test average:    80               85
The experimental group has thus outscored the control group.
Are you entitled to claim that the innovative materials are superior to the
traditional materials? If so, why? If not, why not?
The answer to the question is "Not yet!" You have selected a sample, or subset,
of all the possible students in the final year of secondary school as your
experimental subjects. If you retested them again tomorrow, or if you selected a
different group of subjects and tested them, it's highly unlikely that you would
get exactly the same scores. The students might be more tired (or more energetic),
or the weather could affect their performance. In short, a whole range of
factors could be responsible for test score variation. What you need to decide is
whether the variation in scores between the control and experimental groups
might have happened by chance, or whether the differences were a result of the
experimental treatment. In order to do this, you must make inferences based on
statistical procedures.
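One way to get a feel for the phrase "by chance" is to simulate it. The sketch below uses a permutation test, which is our choice for illustration and not one of the procedures this book goes on to present. It takes the twenty post-test scores reported for this hypothetical study in Table 4.3, repeatedly reshuffles them into two arbitrary groups of ten, and counts how often a gap at least as large as the observed one turns up. The function name, the number of trials, and the seed are all assumptions.

```python
import random

control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

def average(scores):
    return sum(scores) / len(scores)

observed = average(experimental) - average(control)

def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Share of random regroupings whose mean difference is at least as
    large (in absolute value) as the difference actually observed."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    target = abs(average(group_b) - average(group_a))
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = average(pooled[size_a:]) - average(pooled[:size_a])
        if abs(diff) >= target:
            hits += 1
    return hits / trials

p_value = permutation_p_value(control, experimental)
print(f"observed difference: {observed}, approximate p-value: {p_value:.4f}")
```

A small proportion means that random regrouping rarely produces a gap as large as five points, which is the kind of evidence needed before attributing the difference to the treatment.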
The aim of this section is to introduce you to the logic of statistical inference.
The information presented here will probably not equip you to carry out your
own statistical analyses, but it should help you to understand and appreciate the
logic behind the statistical procedures that enable researchers to make claims
about an entire population based on a sample or subset of subjects from that
population.
In most research, it is not possible to collect data from the entire population
in which we are interested. Consider your investigation of the authentic materials
in secondary school EFL classes. Although not impossible, it would be extremely
time-consuming and cumbersome to obtain data on all the secondary
REFLECTION
Why might it have been unwise to assume, supposing you were a gunner,
that this claim would be true of attacks on gunners in general?
He goes on to say that this issue of matching samples and populations is a paradox
of sampling:
A sample is misleading unless it is representative of the population; but
how can we tell it is representative unless we already know what we need
to know about the population and therefore have no need of samples!
The paradox cannot be completely resolved; some uncertainty must remain.
Nevertheless, our statistical methodology enables us to collect
samples that are likely to be as representative as possible. This allows us
to exercise proper caution and avoid over-generalization. (ibid.)
To help protect against overgeneralization, researchers use several procedures
based on descriptive and inferential statistics. (Explaining all these procedures is
beyond the scope of this book, but we will deal with some in Chapter 13.)
Control Group            Experimental Group
Student ID   Scores      Student ID   Scores
C-1          80          E-1          85
C-2          82          E-2          87
C-3          78          E-3          83
C-4          77          E-4          82
C-5          83          E-5          88
C-6          80          E-6          85
C-7          76          E-7          81
C-8          84          E-8          89
C-9          75          E-9          84
C-10         85          E-10         86
Mean         80                       85
Descriptive Statistics
To understand the logic behind the procedures that enable extrapolation from
samples to populations, you need to be familiar with a number of statistical concepts.
The two most important of these are the mean and standard deviation.
These are two of the descriptive statistics, so labeled because they describe the
sample.
For experimental researchers, two particularly interesting features of numerical
data sets are the extent to which individual items in the data set are similar
and the extent to which they differ or are dispersed. The most important
measure of similarity is the numerical average, or mean (symbolized by a capital
X with a horizontal bar above it and called X-bar). The average is obtained by
adding the individual scores together and dividing the sum by the total number
of scores. To illustrate, let's look at the post-test scores from the students in the
control and experimental groups in the (hypothetical) study about innovative
listening materials.
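The calculation just described (add the scores, then divide by the number of scores) can be checked with a short sketch. The scores are the post-test scores of the two groups in the hypothetical study; the function name `mean_score` is simply our label.

```python
control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

def mean_score(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

print(mean_score(control))        # 80.0
print(mean_score(experimental))   # 85.0
```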
The scores presented in Table 4.3 are simply listed in order of the students'
identification codes. They are not organized in any particular fashion. It can be
quite useful to rank order these scores, in order to see existing patterns more
clearly. When we rank order these scores, we find the data in Table 4.4.
Control Group        Experimental Group
Rank     Score       Rank     Score
1        85          1        89
2        84          2        88
3        83          3        87
4        82          4        86
5.5      80          5.5      85
5.5      80          5.5      85
7        78          7        84
8        77          8        83
9        76          9        82
10       75          10       81
Average  80                   85
The lowest score in this entire data set is 75 and the highest score is 89. The
difference between the highest score and the lowest score is the range. This is
one of the descriptive statistics. (Ranking the scores in this way is useful because
it allows us to see the range quickly.) The range in the control group was 75 to
85 and the range in the experimental group was 81 to 89. So, just by looking at
the difference in the ranges, we can see that the experimental group did better.
However, range is also reported in terms of its absolute value. That is, we sometimes
subtract the lowest score from the highest score. Using this procedure, we
can say that the range for the control group is ten, and the range for the experimental
group is eight. We cannot tell which group did better; we can only tell
that there was more variability in the control group's scores than in the experimental
group's scores.
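Both ways of reporting the range can be produced at once. In the sketch below, the hypothetical helper `score_range` returns the lowest score, the highest score, and the difference between them for each group.

```python
control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

def score_range(scores):
    """Return (lowest score, highest score, highest minus lowest)."""
    lowest, highest = min(scores), max(scores)
    return lowest, highest, highest - lowest

print(score_range(control))        # (75, 85, 10)
print(score_range(experimental))   # (81, 89, 8)
```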
REFLECTION
In Table 4.4, in the columns that provide the ranking of the two groups'
scores, there are two places where the rank is given as 5.5, one for the
control group and one for the experimental group. Why do you think these
ranks are given as 5.5 instead of 5 and 6?
The answer to the question in the reflection box above has to do with principles
of decision making. Let's use the prize money in a golf tournament as a
metaphor. One golfer is the first place winner, and he receives a check for
$100,000. Two golfers are tied for second place. The prize money for second
place is $50,000, and the prize money for third place is $30,000. What is a fair
way to decide which golfer was second and which was third, since their scores
were tied? The answer is to add the prize money for second place and third place
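The same sharing principle is the usual convention for tied ranks: tied scores each receive the average of the rank positions they jointly occupy, which is how positions 5 and 6 become 5.5 apiece. A minimal sketch follows, using the control group's scores; the function name `average_ranks` is hypothetical.

```python
def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest. Tied scores share the
    average of the rank positions they occupy."""
    ordered = sorted(scores, reverse=True)
    positions = {}
    for position, score in enumerate(ordered, start=1):
        positions.setdefault(score, []).append(position)
    return {score: sum(p) / len(p) for score, p in positions.items()}

control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
ranks = average_ranks(control)

print(ranks[85])   # 1.0
print(ranks[80])   # 5.5
print(ranks[78])   # 7.0
```

Note that the rank after the tie is 7, not 6, because two positions have already been used up by the tied scores.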
Frequency Polygons
Another way to look at these test data is to see how many students obtained each
particular score. Let's start with the scores of all twenty students combined. We
can list the score values across the bottom axis of a chart and the number of people
who obtained each particular score on the vertical axis. The term frequency
here refers to how often each score was obtained. In Figure 4.4, each asterisk
shows how many people out of the twenty subjects in the study received each
possible score value.
If you were to draw bars down from each of these asterisks to the horizontal
axis, you would have a bar graph, or histogram. If you were to draw a line connecting
these asterisks, you would have what is called a frequency polygon, a chart
of the frequency with which each score value was obtained. Notice that no one
got a score of 79. At this point the line connecting the asterisks would drop down
to the horizontal axis, to indicate that no one received that score.
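The counts behind such a chart are easy to tabulate. The sketch below combines the twenty post-test scores from the hypothetical study and prints a rough text version of the chart, with one asterisk per student at each score value (so the layout is turned on its side compared with the figure).

```python
from collections import Counter

control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

# Count how many of the twenty students obtained each score.
frequency = Counter(control + experimental)

# Print every score value on the axis, including those nobody obtained.
for value in range(75, 91):
    print(f"{value:3d} | {'*' * frequency[value]}")
```

The empty rows at 79 and 90 correspond to the points where the frequency polygon touches the horizontal axis.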
The frequency polygon is a very important conceptual tool as well as an
informative visual aid. In fact, the frequency polygon is the basis for many of
the most important statistical procedures that we use in language classroom
research.
[Figure: frequency chart of scores; horizontal axis: Score (75 to 90)]
[Figure: frequency chart of scores; horizontal axis: Score (75 to 90)]
With very small samples like this, frequency polygons can be nearly flat. But
an interesting thing happens when large data sets are plotted on a frequency
polygon. Imagine for a moment that ninety-five students actually took this
100-point test. If we had that many scores to enter in the frequency polygon, it
might look something like this:
[Figure: frequency polygon for a large data set; horizontal axis: Score (75 to 90)]
REFLECTION
Why is the number zero directly under the line that represents the mean,
the median, and the mode in Figure 4.6? (This issue is a matter of logic and
definition rather than of mathematics.)
Calculating the standard deviations for the scores in Table 4.3 gives us the
values reported in the last row of Table 4.5. (We won't go into the formula for
calculating the standard deviation here. We will work with it in Chapter 13.)
We said earlier that the ranges for the scores of the control and experimental
groups were ten and eight, respectively. Can you see how the range is reflected
in the standard deviations reported in Table 4.5? Since standard deviation
is an index of how spread out the scores are in a given data set, it makes sense that
where there is a wider range there will be a bigger standard deviation.
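For readers who want to check the relationship for themselves, Python's standard library can compute standard deviations directly. The sketch below assumes nothing about which formula was used for Table 4.5 (the formula is deferred to Chapter 13), so it shows both the sample version (`stdev`, dividing by n - 1) and the population version (`pstdev`, dividing by n).

```python
from statistics import pstdev, stdev

control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]

# Sample standard deviation (divides by n - 1).
print(round(stdev(control), 2), round(stdev(experimental), 2))    # 3.46 2.58

# Population standard deviation (divides by n).
print(round(pstdev(control), 2), round(pstdev(experimental), 2))  # 3.29 2.45
```

With the scores as listed, the sample formula reproduces the 2.58 reported for the experimental group, but it gives about 3.46 for the control group, not 3.25. Under either formula the control group, which has the wider range, also has the larger standard deviation.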
C-1          80          E-1          85
C-2          82          E-2          87
C-3          78          E-3          83
C-4          77          E-4          82
C-5          83          E-5          88
C-6          80          E-6          85
C-7          76          E-7          81
C-8          84          E-8          89
C-9          75          E-9          84
C-10         85          E-10         86
Mean         80                       85
Range        10                       8
             (75-85)                  (81-89)
Standard     3.25                     2.58
deviation
Inferential Statistics
At this point, we want to remind you of some things that you already know about
percentages. We want to build on your existing knowledge and confidence to introduce
a new concept.
ACTION
Look at the pie charts below and decide what percentage of the area of
each circle is indicated by the various sections.
We are quite sure that you recognize the percentages represented by these
divisions. In Circle 1, the sections of the bisected pie chart represent 50% and
50%. In Circle 2, the pie chart is divided into four equal portions. Each segment
represents 25% of the whole circle. And in Circle 3, we have three equal
sections, each of which represents 33⅓% of the whole circle. You recognize these
percentage values because you have seen charts like this ever since you started
school.
We can use the image of the normal distribution to indicate percentages,
too. As noted above, when the bell curve is bisected by the vertical line representing
the mean, the median, and the mode, 50% of the scores fall above that
line and 50% fall below it. (Remember the frequency polygons where you
connected the asterisks. A bell curve is just a symmetrical frequency polygon.)
FIGURE 4.7 The areas under the normal curve, as indicated by
standard deviations (downloaded on August 13, 2006
from Wikipedia)
The image of the bell curve can also be divided in predictable, recognizable
ways.
When you see a chart of the normal distribution, it usually has numbers
written on the horizontal axis. These numbers range from negative four to positive
four, as mentioned above, to indicate the location of the standard deviations
in the diagram. But, for ease of interpretation, such diagrams also often have
vertical lines drawn that represent the standard deviations. These lines help us
see percentage divisions, as shown in Figure 4.7. (Just think of this as a different
shaped pie chart, one with which you may not be very familiar at this time.)
Here and elsewhere the Greek symbol σ (the lower-case sigma) stands for
standard deviation. The symbol μ (the Greek letter mu) represents the population
mean.
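The percentages marked off by the standard deviation lines do not have to be taken on trust. For a normal distribution, the proportion of scores within k standard deviations of the mean equals erf(k / √2), where erf is the error function available in Python's `math` module. The function name `area_within` below is our own label.

```python
import math

def area_within(k):
    """Proportion of a normal distribution lying within k standard
    deviations (k times sigma) of the mean (mu)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * area_within(k):.2f}%")
```

For one standard deviation the result is roughly 68.3%. The 68.2% quoted in this chapter presumably reflects rounding of the individual bands shown in the figure.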
REFLECTION
Look at Figure 4.7. Can you see why we said that 68.2% of the scores fall
within one standard deviation of the mean? (Another way to say this is that
68.2% of the scores fall in the area between one standard deviation below
the mean and one standard deviation above the mean.)
ACTION
REFLECTION
The mean, median, mode, standard deviation, range, and variance are the
descriptive statistics. What do you think is meant by inferential statistics?
A SAMPLE STUDY
FIGURE 4.8 The control and experimental groups in the sample study
ACTION
Answer the following questions based on what you know about this study.
1. What were the likely research questions in this study?
2. What is the independent variable and how many levels does it have?
3. What are the dependent variables? (Hint: There are two.)
4. What is one control variable?
5. What is the design of this study? (Hint: This is a trick question. Think
about the dependent variables.)
6. Can you anticipate any possible threats to validity in this situation?
Compare your ideas with those of a classmate or colleague.
REFLECTION
As the sample study illustrates, there are many pitfalls in using the experimental
method to conduct language classroom research. Many of these problems
stem from the difficulty of controlling all the possible confounding variables.
Indeed, classroom research "entails a very large number of human and institutional
factors that can affect research design and outcomes in many unforeseen and
unforeseeable ways. It is not for the timid" (Rounds and Schachter, 1996, p. 108).
There is also a more philosophical problem with the experimental method.
For some people, if a phenomenon is not measurable, it is not worth studying.
For them, the psychometric tradition, and the experimental method in particular,
may be the only valuable ways of conducting research. However, in an effort
to quantify phenomena of interest we may miss important issues. We may not
investigate key variables because they are not easily quantified, or we may focus
on trivial issues that are easy to quantify. Furthermore, the data collected in language
classroom research are often collected from people, but the need to quantify
and the widespread use of group averages sometimes make it seem that
individual learners are represented only by test scores. And the seeming dehumanization
of participants is also noticeable when researchers talk about the
sample in a study as "experimental subjects."
Another problem relates to the issue of objectivity and subjectivity. The experimental
method emphasizes objectivity in hopes of counteracting the threats
of researcher and subject expectancy. As a result, teachers (and learners) have
typically not been seen as potential collaborators in language classroom research
conducted with the experimental method. Teachers have had important roles,
but often primarily as deliverers of a particular treatment in an experiment.
However, for teachers, it is sometimes very difficult to be simply a treatment deliverer
when students' needs and desires run counter to the prescribed treatment.
(Remember the Suggestopedia teacher's comment, "I know, but what can I do?
The students want grammar rules and error correction.")
There are also several payoffs associated with the experimental method.
First, it is a well-documented, highly codified approach to conducting educational
research with well-developed quality control procedures. There are many
textbooks available and it is relatively easy to locate courses if you wish to get further
training.
Secondly, the experimental method is an internationally recognized way of
conducting research. If you conduct an experiment in Jakarta and publish your
findings in Prospect, the TESOL journal of Australia, readers in Germany, Iran,
India, Canada, and Brazil will all understand the report (provided they have been
trained in the language and culture of the experimental method).
CONCLUSIONS
The experimental method has been very important, historically, in all sorts of
research, in both the physical and social sciences. It has often been used in language
classroom research, but with varied success, since it is so difficult to control
all the possible confounding variables that can arise in research with real
people. It is, however, a valuable approach to understanding teaching and
learning, and it has influenced many other approaches, as we will see in future
chapters.
A. Pose the research question(s) you wish to ask. You could write a formal
hypothesis if you wish.
SUGGESTED READINGS
For more information on research designs, see Mitchell and Jolley (1988),
Shavelson (1981, 1996), and Tuckman (1999).
J. D. Brown (1988) provides a very good description of the normal distribution.
He also provides a clear introduction to the descriptive statistics (1988;
2001).
Surveys
You can't catch an elephant with a butterfly net. Then again, you can't
catch a butterfly with an elephant net. (K. M. Bailey on questionnaire design
to her research students)
to stimulate language production, such as pictures, diagrams, and even standardized
tests, as well as surveys, which collect data through questionnaires and interviews.
(Interview procedures are sometimes classified under the rubric of survey
research, but we will treat interview procedures in a separate chapter.)
In this chapter, we will look specifically at survey research using
questionnaires (written data elicitation devices). Questionnaires have often been
used in classroom research, but perhaps more in classroom-oriented studies than
in classroom-based studies. Questionnaires administered to teachers, parents,
and/or students can amplify and improve upon a classroom-based study.
For example, a very early and important published example of language
classroom research was a study by Seliger (1977). He wanted to investigate an
element of turn taking in the target language. He believed that some learners
could be characterized as "high input generators," that is, language learners
who participated in conversations in ways that generated input from their interlocutors
(the people they were talking with). He hypothesized that the high input
generators would outscore "low input generators" on a test of language achievement.
However, since he was studying a group of ESL learners in the United
States, he realized that they had opportunities for English input outside of class
as well. So, he included a questionnaire in his study called the "Language Contact
Profile." It asked students about their use of English outside the classroom.
When Seliger analyzed his results, he found that the high input generators
outscored the low input generators on both the measure of English achievement
and also on the Language Contact Profile. The data from this questionnaire provided
an added dimension to the research in terms of understanding how interaction
in the target language can improve students' learning.
REFLECTION
7. Identify analytical procedures: How will the data be assembled and analyzed?
8. Determine reporting procedure: How will the results be presented?
One of the most important questions a survey researcher must confront is the
following: What is the population I wish to learn about by way of the survey? Political
surveys, particularly those carried out in the run up to an election, generally
purport to reflect the entire population of eligible voters (although they also
report a margin of error, usually between 2 and 5%). It would not, of course, be
practical to obtain data from the entire population. In fact, that's what the election
itself is meant to do. A major task for the survey researcher, therefore, is to
select a representative sample from the population as a whole. The idea that the
sample represents the population is important if the predictions based on the
opinion poll are to be accurate.
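The margin of error mentioned above depends mainly on the size of the sample. The sketch below uses a common textbook approximation for a simple random sample at the 95% confidence level; it is offered as background, not as the method used by any particular polling organization, and the sample sizes are arbitrary examples.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion estimated from a
    simple random sample. proportion=0.5 gives the widest (most cautious)
    margin; z=1.96 corresponds to the 95% confidence level."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (400, 1000, 2400):
    print(f"n = {n}: about {100 * margin_of_error(n):.1f}%")
```

Under these assumptions, samples of a few hundred to a few thousand respondents produce margins in roughly the 2 to 5% band, and quadrupling the sample size only halves the margin.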
REFLECTION
If you wished to study voting intentions, one way to collect data for a large-scale
survey, which claims to represent the entire population of a voting district,
would simply be to go into the street and question people at random.
Why do you think this may not be a sound way of collecting the data?
Strategy Procedure
Sampling Strategies
In large-scale surveys, the major problem with simple random sampling is that it
may mask differences between underlying subgroups within the population. For
example, men and women often have different voting patterns and preferences in
elections. Therefore, it is important to be sure that men and women are represented
in the sample in proportion to the ratio of men and women in the voting
population. Table 5.2 shows some of the different sampling strategies available
to the researcher.
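Sampling in proportion to subgroup size can be sketched as follows. Everything specific here is invented for illustration: the population of 520 women and 480 men, the sample size of 100, and the function name `proportional_stratified_sample`. The idea is to sample randomly within each subgroup, taking from each a share of the sample equal to its share of the population.

```python
import random

def proportional_stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a random sample in which each stratum appears in proportion
    to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from sample_size
        # when the proportions do not divide evenly.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical voting population: 520 women ("W") and 480 men ("M").
voters = [("W", i) for i in range(520)] + [("M", i) for i in range(480)]
sample = proportional_stratified_sample(voters, lambda v: v[0], 100, seed=1)

women = sum(1 for v in sample if v[0] == "W")
print(women, len(sample) - women)   # 52 48
```

A simple random sample of 100 from the same population could easily contain 45 or 58 women by chance; fixing the quotas removes that source of distortion.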
REFLECTION
REFLECTION
Do you see a role for a questionnaire in any of the research topics that
interest you?
1 = never, 9 = very frequently:                 [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]
1 = not at all appealing, 9 = very appealing:   [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]
REFLECTION
Make a list of the pros and cons of using closed and open-ended questions.
Compare your list with that of a classmate or colleague.
There are several advantages of using closed items. These include practicality
in terms of the ease and speed with which people can respond to the questionnaire.
Also, providing the options for respondents to select or evaluate provides
the great benefit of comparability because it constrains the variation you can get
in the responses. This factor enhances the data analysis process greatly. All in all,
the benefits of questionnaires have made them very important tools in survey
research in psychology, sociology, sociolinguistics, general applied linguistics
research, and, to some extent, language classroom research.
A wide range of closed item formats can be used. Some of the question types
used in closed question surveys are presented in Table 5.3.
It is important to remember that the question format you choose will constrain
the type of data that you get. For example, in the grid format shown in
Table 5.3, a program director completing this grid would indicate in each cell
the number of people in that age range (e.g., there may be twelve students in
Level 1 who are between the ages of eighteen and twenty-four). But we cannot
tell, from that datum alone, how many of those twelve students are eighteen
years old, how many are nineteen years old, and so on. Likewise, this particular
grid does not permit the possibility that there are any students younger than
eighteen or older than fifty enrolled in the program.
[Grid row labels from Table 5.3: Level 1, Level 2, Level 3, Level 4]
For these reasons, you should think carefully about your research question in
designing your questionnaire. The research question determines the analysis you
wish to do with your data, and that will influence the decisions you make about
the format of the questions. Remember the elephant and the butterfly: you can
hunt for either one, but you have to have the proper net to capture your quarry.
Look at the item types in Table 5.3. Decide what sort of data—nominal,
ordinal, or interval—each question type will elicit from respondents.
There are two very important closed-item formats that have been used in
applied linguistics research. One is called the semantic differential scale and the
other is called a Likert scale (pronounced like "LICK-ert"). Both have important
potential applications in second language classroom research.
Likert scales are named after their originator, a psychologist named Rensis
Likert (1932), who developed the technique for measuring people's attitudes
about various social concerns (Busch, 1993). Likert scales are often used in applied
linguistics research (including language classroom research) "to investigate
how respondents feel about a series of statements" (J. D. Brown, 2001, p. 40). In
this format, "the respondents' attitudes are registered by having them circle or
check numbered categories (for instance, 1, 2, 3, 4, or 5), which have descriptors
above them" (ibid.).
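Once respondents have circled their numbered categories, the responses to a single item can be tallied and summarized. The sketch below is a minimal illustration: the ten responses are invented, and the descriptors attached to the numbers 1 to 5 are an assumption about how such a scale might be labeled.

```python
from collections import Counter
from statistics import mean, median

# Assumed descriptors for a five-point scale.
DESCRIPTORS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}

# Invented responses from ten respondents to one statement.
responses = [5, 4, 4, 3, 2, 4, 5, 1, 4, 3]

counts = Counter(responses)
for category in sorted(DESCRIPTORS):
    print(f"{category} ({DESCRIPTORS[category]}): {counts[category]}")

print("mean:", mean(responses))      # 3.5
print("median:", median(responses))  # 4.0
```

Whether an average is an appropriate summary depends on whether the categories are treated as ordinal or interval data, so reporting the frequency of each category alongside it is a cautious choice.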
Let us take as an example an issue that was introduced earlier in this chapter.
A Likert scale could be used to investigate people's attitudes about the desirability
of foreign languages being required subjects in secondary school. To gather
information about this issue, a researcher could ask the following question:
Or, if more precise information were needed about the respondents' attitudes,
the researcher could word the item as follows:
You may also see this format, in which SA represents "strongly agree," A stands
for "agree," and N represents "neutral." D and SD represent "disagree" and
"strongly disagree," respectively. Respondents circle their choice:
REFLECTION
Intelligent : : : : Unintelligent
Impolite : : : : Polite
Educated : : : : Uneducated
Please draw a single straight line (I) on the horizontal continuum (—) to
indicate your opinion.
The news story was
Very Easy __________ Very Difficult
I found the teaching activity
Not At All Helpful __________ Very Helpful
Thorpe made sure that all the lines were of equal length, both on all the
items for the questionnaire after a particular lesson and on the questionnaires
he gave the students after all the lessons. This strategy allowed him to measure
the distance where the students' marks were placed on each line. He was therefore
able to compare students' opinions, both to one another and over time.
(We will return to Thorpe's data in Chapter 8 when we study action research.)
Perhaps you noticed that in both examples above, sometimes the more positive
adjective is on the left and sometimes it is on the right. This placement is
intentional. Researchers use it to avoid what is called a response set, a habitual or
patterned way of responding to items that is independent of the items (Mitchell
and Jolley, 1988). For instance, a respondent may rush through a questionnaire
and simply mark all the positive options without really thinking about their content.
Switching the positive and negative adjectives breaks up this tendency to
some extent.
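When the positive adjective alternates sides, the marks have to be brought back to a common direction before they are compared or summed. The following is a minimal sketch of that step, assuming a five-position scale numbered from left to right; the item names, the scale length, and the responses are hypothetical and are not taken from any study cited here.

```python
SCALE_POINTS = 5  # assumed number of positions on each scale


def reverse_code(score, points=SCALE_POINTS):
    """Mirror a score so that position 1 becomes the highest value."""
    return points + 1 - score


def align_polarity(responses, positive_on_left):
    """Recode responses so that a higher number always means 'more positive'."""
    return {
        item: reverse_code(score) if item in positive_on_left else score
        for item, score in responses.items()
    }


def same_position_throughout(responses):
    """True if every item was marked in the same position (a possible response set)."""
    return len(set(responses.values())) == 1


# Hypothetical marks, counted from the left end of each scale.
raw = {"intelligent_unintelligent": 1, "impolite_polite": 5, "educated_uneducated": 2}
positive_on_left = {"intelligent_unintelligent", "educated_uneducated"}

aligned = align_polarity(raw, positive_on_left)
print(aligned)
print(same_position_throughout(raw))
```

A respondent who marks the same position on every item is not necessarily careless, so a check like `same_position_throughout` only flags questionnaires for a closer look.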
You will recall from Chapter 3 that we distinguished among nominal, ordinal,
and interval data. The differences among these types of data are important,
partly because the type of data you work with determines the types of analyses
you may use.
Tuckman makes this point because of the value and nature of true interval scales.
We said in Chapter 2 that interval scales are those on which the unit of measure
is a constant interval.
However, there is another important feature of true interval scales and that
is that their properties are known and widely used. For instance, an inch is an
inch whether you are using a ruler in Australia, Canada, Egypt, Kenya, or
Brazil. A kilometer is the same distance whether you are riding a bike in
France, Thailand, or China. The same cannot be said of Likert scales since we
do not know if "agree" means the same thing to each person using the scale. If
two students "strongly disagree" with a statement on a Likert scale, does that
mean that they are equal in the vehemence with which they disagree? We cannot
know because there is no standardized measure of agreement as there is
with inches and kilometers. (See Busch [1993] and Turner [1993] for further
discussion of this position.)
You may have noticed that in the example screen shot from Springer and
Bailey (2006) above, the Likert scale was nine points long. The researchers
used a nine-point scale in order to provide respondents with more choices
since the computer-delivered questionnaire would not allow them to mark
pluses or minuses, or to circle two numbers. In addition, "from a statistical
viewpoint, longer scale lengths of seven or more categories are more desirable
because of the grain in score variability" (Busch, 1993, p. 735). Hatch and
Lazaraton (1991) state that Likert scales become interval-like when the length
of the scale is increased.
Whether Likert scale data should be treated as interval data or ordinal
data hinges on many factors. If you are a graduate student trying to complete a
research project or meet graduation requirements, you should consult your professor
or research advisor. In any case, you need to show that you understand the
issue of interval versus ordinal data when you choose the statistical procedures to
use in your data analysis. (See Chapter 13.)
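The practical consequence of this choice can be illustrated with a small sketch. The responses below are invented; the sketch simply summarizes the same set of five-point Likert responses twice, once as ordinal data (frequencies and median) and once as interval data (mean), using Python's standard library.

```python
from collections import Counter
from statistics import mean, median

# Invented responses to one five-point Likert item (1 = strongly disagree).
responses = [1, 2, 2, 4, 5, 5, 5]

# Ordinal treatment: report how often each category was chosen and the median.
frequencies = Counter(responses)
ordinal_summary = median(responses)

# Interval treatment: report the mean, which assumes equal distances
# between adjacent categories.
interval_summary = mean(responses)

print(dict(sorted(frequencies.items())))
print(ordinal_summary, round(interval_summary, 2))
```

The two summaries need not agree. Here the median falls on "agree" while the mean sits between "neutral" and "agree," which is one reason the choice should be made deliberately and stated in the report.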
Notice that when you use these item formats, the space you provide indicates to
the respondents how long a response you are hoping to receive.
Dornyei adds a third type of open-ended question: sentence completion. In
this format, the questionnaire designer provides the opening clause of a sentence
and then provides space for the respondent to complete the idea. In this format,
the item is usually worded from the perspective of the respondent (i.e., it uses I,
my, and mine instead of you, your, and yours).
You are a student in a linguistics course. Your backpack is stolen and your
linguistics textbook was in it. You cannot afford to buy a new copy, but you
need a copy of the book over the weekend to prepare for the midterm
exam, which will be given on Monday. You decide to ask your professor if
you may borrow her copy. You go to her office and say to her,
You ask your linguistics professor if you can borrow her copy of the textbook.
Yours has been stolen and the library copy is checked out. You want
to use her textbook over the weekend because the linguistics midterm
exam is scheduled for Monday.
The professor says, "Very well, but I'll need it back Monday morning at
nine o'clock. Oh, and please don't mark in it."
You ask your linguistics professor if you can borrow her copy of the textbook.
Yours has been stolen and the library copy is checked out. You want
to use her textbook over the weekend because the linguistics midterm
exam is scheduled for Monday. You go to the professor's office and say,
A. "Hey, can I borrow your textbook?"
B. "Excuse me, Professor, may I borrow your textbook over the weekend?"
C. "My textbook was stolen. Can I use yours over the weekend?"
D. "Excuse me, Professor, but unfortunately my textbook was stolen and
the library copy has been checked out. May I please borrow your copy?"
Think of a positive change you have made in your own teaching. It could
be a change in content, in philosophy, or procedure. The important thing
is that it be a change for the better which you have made and which has remained
with you. I am interested in learning about changes that last in
your work as a language teacher—that is, I am trying to understand how
teachers bring about their own professional development. (p. 263)
Organizing Your Questionnaire
Once you have settled on the questions that you want to ask, and you have decided
how you want to frame them, the next step is to organize the questionnaire.
The ordering of the questions should make sense to the respondents, and
there should be an ordered progression from one question to the next. Table 5.4
provides guidelines for organizing a questionnaire.
Step 2 Begin with the most general questions and avoid those that might be
threatening or difficult to answer.
Step 3 Remember the initial portion sets the stage and influences the respondents'
expectations about what is to come.
Step 4 Be sure the body of the questionnaire flows smoothly from one issue to
the next.
Step 5 List items in the body in a sequence that is logical and meaningful to
respondents.
Step 6 Save the most sensitive issues and threatening questions for the concluding
portion when rapport is greatest.
Step 7 List demographic or biographical questions last so that if some respondents
decline, most of the data will still be usable.
Crafting Questionnaire Items
Getting the wording of the questions right is crucial in questionnaire design.
One of the potential problems with any type of elicitation device is that the responses
we get may well be artifacts of the elicitation devices themselves. To
guard against this problem, the researcher should not ask 'leading' questions that
reveal his or her own biases, such as, "Do you think that the concept of learner-
centeredness is impractical and unrealistic?" Dornyei (2003) gives examples of
leading questions that start with phrases such as, "Isn't it reasonable to suppose
that. . . ?" or "Don't you believe that. . ." (p. 54).
Furthermore, questions should not be complex and confusing, nor should they
ask more than one question at a time. Here is an example that fails on both counts:
ACTION
Look at the question quoted in the box above. How many constructs is it
attempting to measure? How could you rewrite this question so that it is
clearer?
REFLECTION
Complete this sentence: "In classes with laptops, I usually like to ____."
a. listen to the teacher and take notes
b. work on exercises or homework
c. discuss issues in groups or with the whole class
d. work on a research paper or a project
e. do many different things, like listen to lectures, do exercises, have discussions,
and do projects
Piloting Your Questionnaire
The concept of piloting a questionnaire (or any other data collection procedure)
is like a dress rehearsal in the theater. By administering the questionnaire before
the actual data collection, you can locate any unclear items, misnumbered items,
confusing instructions, and so on.
Piloting a questionnaire is at least a two-stage process. First, after you have
carefully organized and proofread your questionnaire, you should pre-pilot it
with a few colleagues, especially those who are familiar with the population you
wish to sample. You may need to revise the questionnaire somewhat based on
their feedback. Then you pilot the questionnaire by administering it to a small
number of people who are part of the population you wish to sample but who
will not be in the sample themselves. You should be physically present when you
pilot the questionnaire with this group so that you can ask them for feedback and
answer their questions. Doing so will give you valuable input about possible
problems in the questionnaire before you actually administer it to the sample
from whom you wish to collect the data.
Translation Issues
If your questionnaire is going to be completed by people who are not native-
speakers of the questionnaire language, you need to take special steps to make
REFLECTION
One solution that is sometimes used if the respondents are language learners
is to translate the questionnaire into the learners' native language. This solution
can work well if you are sampling from one or a few first-language groups,
but it becomes very unwieldy if many different first languages are represented in
your sample.
If you do decide to translate your questionnaire, there is an important procedure
for checking on the accuracy of the translation. This procedure is called
back translation and it works like this: First, the original questionnaire is drafted,
piloted, and revised. When the questionnaire is in its final form, it is translated
into the respondents' first language by a competent bilingual translator. Next,
another bilingual translator, working from the translation and without seeing the
original version of the questionnaire or speaking to the first translator, puts the
questionnaire back into the language it was originally written in. Then you and
the two translators (if you are not one of them) compare the original version of
the questionnaire with the back-translated version. If there are any differences in
wording, the translators try to resolve the ambiguity in the translated version
and help you clarify your intended meaning before you administer the questionnaire.
At this point, you are ready to pilot your questionnaire.
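The comparison step in back translation is a matter of human judgment, but a simple script can help locate the items that deserve discussion. The sketch below is only an aid under stated assumptions: the questionnaire items are invented, and the similarity figure comes from Python's standard `difflib` module, which measures surface wording and says nothing about meaning.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def compare_versions(original, back_translated):
    """Return (item_number, similarity) for each item whose wording differs."""
    flagged = []
    pairs = zip(original, back_translated)
    for number, (source, returned) in enumerate(pairs, start=1):
        a, b = normalize(source), normalize(returned)
        if a != b:
            flagged.append((number, SequenceMatcher(None, a, b).ratio()))
    return flagged


# Invented items, for illustration only.
original_items = [
    "Foreign languages should be required in secondary school.",
    "I enjoy learning English.",
]
back_translated_items = [
    "Foreign languages should be required in secondary school.",
    "I like studying English.",
]

flagged = compare_versions(original_items, back_translated_items)
for number, similarity in flagged:
    print(f"Item {number} differs (similarity {similarity:.2f})")
```

Whether a flagged difference matters (for instance, whether "enjoy learning" and "like studying" capture the same construct) is still something the researcher and the two translators must decide together.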
Doing a back translation is time-consuming and can be costly. However,
making sure you have a proper translation of your questionnaire from the outset
is an important professional step in enhancing the reliable and valid measurement
of the constructs you wish to capture. In the long run, back translation may
ACTION
Ethical Issues
Questionnaires elicit a kind of self-report data, that is, data in which the respondents
are providing information about themselves. As a result, in order to get
truthful data, it is important to promise your subjects confidentiality whenever
you can. Usually confidentiality is accomplished by treating the questionnaire
data anonymously. That is, the respondents are not identified in any way that
will indicate who gave what opinions, ideas, information, etc., in responding to
the questionnaire. Here is an example of how one group of researchers promised
confidentiality to the students who completed their questionnaire:
Your answers to any or all questions will be treated with the strictest
confidence. Although we ask for your name on the cover page, we do so
only because we must be able to associate your answers to this questionnaire
with those of other questionnaires which you will be asked to
answer. It is important for you to know, however, that before the questionnaires
are examined, your questionnaire will be numbered, the same
number will be put on the section containing your name, and then that
section will be removed. By following a similar procedure with the other
questionnaires, we will be able to match the questionnaires through
matching numbers and avoid having to associate your name directly
with the questionnaire. (Gliksman, Gardner, and Smythe, 1982, p. 637,
as cited in Dornyei, 2003, p. 23.)
The main points here are (1) you are likely to get better data if you promise your
respondents anonymity, but (2) if you can't promise complete anonymity, you
must guarantee confidentiality in the handling and reporting of the data.
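For questionnaire data held electronically, the numbering procedure described in the quotation above might be sketched as follows. This is an illustration under stated assumptions, not the procedure the cited researchers actually used: the record layout, the function name, and the respondents' names are all hypothetical.

```python
def detach_names(questionnaires, registry=None):
    """Replace each respondent's name with a number.

    The registry (name -> number) is kept apart from the answers, and is
    reused so the same person receives the same number on later questionnaires.
    """
    registry = {} if registry is None else registry
    anonymous = []
    for record in questionnaires:
        number = registry.setdefault(record["name"], len(registry) + 1)
        anonymous.append({"number": number, "answers": record["answers"]})
    return registry, anonymous


# Hypothetical respondents and answers, for illustration only.
first_round = [
    {"name": "Ana", "answers": [4, 5, 3]},
    {"name": "Ben", "answers": [2, 2, 4]},
]
second_round = [
    {"name": "Ben", "answers": [3, 1, 4]},
    {"name": "Ana", "answers": [5, 5, 2]},
]

registry, anon_first = detach_names(first_round)
registry, anon_second = detach_names(second_round, registry)
print(anon_first)
print(anon_second)
```

The analysis then works only with the numbered records. The registry linking names to numbers would need to be stored separately and securely, or destroyed once the questionnaires have been matched.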
ACTION
A SAMPLE STUDY
In this section, we present a sample study based on a survey. While surveys are
widely used in applied linguistics, they are less common in classroom research.
The study we have selected is classroom-oriented rather than classroom-based in
that the data were collected outside of the classroom although the study was
carried out in order to inform classroom action.
The survey was conducted by Nunan and Wong (2006) to investigate the
learning styles and strategies of good and poor language learners. The study
was carried out among undergraduate students who were native speakers of
Cantonese. The subjects were 110 undergraduate students drawn from all the
faculties at the University of Hong Kong.
The aim of the study was to explore whether there are identifiable differences
between good and poor learners. Language proficiency was defined in
terms of grades obtained on the Hong Kong Examinations Authority Use of
English Examination—a high-stakes English language examination that all
students have to take in order to graduate from high school. The aim of the
research was to investigate whether there were any common practices among
language learners who did well in English within the Hong Kong education
system as compared with those who did not. Ultimately, the research was
intended to provide practical guidelines for teachers to add a learning-how-to-
learn dimension to their teaching.
From what you know so far, what do you think the design of this study is?
What class of designs does it belong to?
Research Questions
Seven aspects of language learning and use were investigated in the study. The
following research questions were posed about the two groups of learners (those
who did well on the Use of English Examination and those who did not):
1. Are there any differences between the good and poor language learners in
terms of their overall learning style?
2. Are there any differences between the good and poor language learners in
terms of their individual learning strategy preferences?
3. Are there any differences between the good and poor language learners
in terms of their target language practice outside of class?
4. Are there any differences between the good and poor language learners in
terms of their areas of academic specialization?
5. Are there any differences between the good and poor language learners in
terms of their perceptions of the importance of English?
6. Are there any differences between the good and poor language learners
in terms of their perception of language ability?
7. Are there any differences between the good and poor language learners in
terms of their enjoyment of learning English?
REFLECTION
What is the independent variable in this study? How did the researchers
operationalize it? How many levels does it have and what are they?
Research Procedures
The independent variable was whether the students could be characterized as
being good or poor English learners. This construct was operationalized using
the grades the students obtained on the Hong Kong Exams Authority Use of
English Exam. There were two levels of the independent variable. 'Good' learners
were defined as those who obtained an A on the examination. 'Poor' learners
were those who obtained an E or F. The dependent variable consisted of the students'
responses to a questionnaire on strategy preferences, learning practices,
and attitudes.
REFLECTION
Implications
Nunan and Wong (2006) identified a number of implications, particularly for
teachers working with poorer language learners. The main implication was to
encourage such learners to see language as a tool for communicating rather than
as a body of content to be memorized. Developing independent learning strategies
and reducing students' dependence on the teacher were also recommended.
Learners should also be encouraged to develop a greater range of strategies and
to activate their language outside of the classroom. Following Christison (2003),
the researchers suggested that teachers audit their own classroom practices to
identify the strategies that they themselves favor.
CONCLUSIONS
REFLECTION
A case study is a detailed, often longitudinal, investigation of a single individual
or entity (or a few individuals or entities). In applied linguistics research, case
studies can best be classified as a type of naturalistic inquiry in that they typically
do not involve any sort of treatment. Instead, researchers working in the case
study tradition set out to learn what is happening, whether it is with a child
learning his first language, an adult developing literacy skills, a particular class of
preschool children, a novice teacher, or any other entity of interest.
In experimental research, case studies (or "one-shot case studies" as they are
often called) have traditionally been seen as having limited value. This is because
their lack of control over variables prohibits researchers from making strong causal
claims (a problem of internal validity). With only one or a few subjects, the argument
goes, we can never be sure the population is well represented, so generalizing
the findings of a case study to a population is a dubious undertaking (a problem of
external validity). Given these concerns, the perceived value of case studies in the
experimental research tradition is that they may generate hypotheses that can later
be tested in experiments. However, from the perspective of naturalistic inquiry, the
case study method is very important in other ways and in its own right.
In the one-shot case study design of the experimental tradition, the researcher
applies a treatment and observes its apparent effects in a post-test or
some other form of after-treatment measurement. There is no pre-test, nor is
there any comparison group. In naturalistic inquiry, however, "the researcher
usually does not provide experimental treatments or interventions that might
modify the process of change" (Duff, 2008, p. 41). Nor does the researcher exert
control over variables in a naturalistic case study:
rather, the data reflect natural changes in the learner's behavior and
knowledge, influenced by numerous possible factors, such as the envi
ronment, physical maturation, cognitive development, and schooling,
which the researcher must also take into account in order to arrive at
valid conclusions concerning learning processes and outcomes. (p. 41)
Indeed, when we compare case study research in naturalistic inquiry with the experimental
research tradition, we see that "the strengths of one approach tend to
be the weaknesses of the other" (ibid., p. 42).
Case study research assumed a great deal of importance in education in the
1970s, when it was embraced by a group of researchers and evaluators in
Cambridge, England. Three members of the Cambridge Action Research
Network (CARN), Adelman, Jenkins, and Kemmis (1976), produced an important
position paper in which they argued that case studies are not merely pre-
experimental and that case study is not a term for a standard methodological
package. The view that case studies are 'pre-experimental' (mere
ground-clearing operations for more 'rigorous' experimental research) was
challenged by Adelman et al.
Although case studies have often been used to sensitize researchers to
significant variables subsequently manipulated and controlled in an
experimental design, that is not their only role. The understandings
The Taped Scene: Briefly, what they saw was Lupita performing and
interacting outside of teacher awareness during free time. After having
finished the Spanish Tables instructional event task—she was the first to
finish the task—Lupita decided to work on a puzzle in the rug area. She
was soon joined by two other bilingual girls, each successful in placing a
few pieces back on the puzzle template. Then Marta, a bilingual child assessed
by the teacher as a very competent student, asked Lupita for help in
placing her first piece on the template. Lupita not only helped her but also
taught her how to work with it by taking Marta's hand and showing her
where and how it should be placed. Lupita continued to help her for a
short while and then returned to her own puzzle. A few moments later,
Lupita disengaged herself from her puzzle work and became interested in
what a boy, who had just entered the scene, was doing with a box of toys.
Lupita asked him if she could play with him when, suddenly, the boy was
interrupted by the classroom aide who asked him if he had finished his
work at the Spanish Tables, trying to convey to him that he shouldn't be
REFLECTION
If you were a classroom observer who watched Lupita and later described
her interactions during this lesson to a colleague who hadn't been there,
what could you say about her? What can you infer about Lupita from this
description?
Defining Case Studies
Different authors have defined case studies in different ways. Those who come
from a background of naturalistic inquiry see this method much differently from
those who come from the experimental tradition. Because it is so difficult (and
some would argue, not helpful) to try to control variables in naturally occurring
classroom settings, in this book, we will consider case studies from the perspective
of the naturalistic inquiry tradition. The definitions printed below reflect
this orientation.
Despite their differences, these definitions have two main commonalities. The
most important of these is the notion that a case is a 'bounded instance.' By
bounded we mean "defined" or "having boundaries"—whether those boundaries
are physical (a certain school site, a child), or temporal (as in a lesson, which has
a beginning and an end). You can think of the metaphor of a fenced-in area as
REFLECTION
Based on what you have read so far, and on your previous reading, what do
you see as the advantages of case study research? What might be some
disadvantages?
A single landmark can only provide the information that they are situ
ated somewhere along a line in a particular direction from that landmark.
With two landmarks, however, their exact position can be pinpointed by
taking bearings on both landmarks; they are at the point where the two
lines cross. (p. 198)
Particularity
The concept of particularity as a characteristic of case studies is related to the
boundedness of the case. In other words, "a single case or nonrandom sample is
selected precisely because the researcher wishes to understand the particular in
depth, not to find out what is generally true of the many" (Merriam, 1998, p. 208).
Here, the analogy of a camera that uses different lenses will be helpful. Survey
research, as described in Chapter 5, takes a wide angle view. It captures the
landscape: a panorama of mountains, streams, and trees. Case study research, in
contrast, uses a close-up lens. It examines the individual wildflower or provides a
detailed study of a leaf. Although the flower and the leaf are part of the landscape,
looking at the photo shot with the wide-angle lens will not allow us to see
the petals of the flower or the delicate veins of the leaf.
In second language classroom research, we may choose to focus on a particular
student, or a particular teacher, or perhaps one particular conversational exchange
among three pupils doing group work. It is the close examination of the
particular phenomenon that allows case study researchers to go into great detail
in terms of data collection and analysis.
Interpretation
To interpret something is to construe or attach meaning to it, that is, to understand
it. When we look at case study data, we analyze those data and that analysis
can be either qualitative or quantitative or both. Interpreting results in data
has to do with explaining what they mean. This comment is true of statistical
analyses as well as of more qualitative analyses, and case study research employs
interpretation in both contexts.
An example interpretation in language classroom research is found in
Ulichny's (1996) investigation of an interaction in an intermediate adult ESL
conversation class. Ulichny documented a particular classroom speech event,
which contained three different discourse activities. One student, Katherine, is
talking about why she decided not to continue with her volunteer work—a role
she undertook in order to practice her English. So, one discourse activity consisted
of the actual conversation among Katherine, another student, and the
teacher, with the rest of the class listening. But, as Katherine's story goes on,
the teacher soon "exits from the conversation to work on specific elements of the
Deciding whether any given investigation is or is not a case study is not always
easy or straightforward. As noted above, the term case study has been defined in
various ways, and it is probably easier to say what a case is not than what it is.
While it seems reasonably clear that the study of an individual learner or an individual
classroom is a case, what about an investigation of an entire school or
even a whole school district? Any of these could be the focus of a case study.
In addition to focusing on a variety of topics for possible investigation, case
studies can serve a range of purposes and display various characteristics. Stenhouse
(1983), one of the 'fathers' of case study research in education, developed
a typology of case studies. The first type he identified as the neo-ethnographic,
which is the in-depth investigation of a single case by a participant observer.
Next is the evaluative, which is a "single case or group of cases studied at such
depth as the evaluation of policy or practice will allow (usually condensed fieldwork)"
(p. 21). In contrast with these first two, the multi-site case study consists of
"condensed fieldwork undertaken by a team of workers on a number of sites and
possibly offering an alternative approach to research to that based on sampling
and statistical inference" (ibid.). Such research probably approaches ethnography
(see Chapter 7), particularly if it attempts to capture a wide range of issues
and questions. The final type consists of action case studies. These are 'school case
studies undertaken by teachers who use their participant status as a basis on
which to build skills of observation and analysis' (ibid.). A typology based on
Stenhouse is set out in Table 6.1.
REFLECTION
What do you see as the advantages of these different types of case studies?
What might be some disadvantages?
Adelman et al. (1976) argue that there are six principal advantages of case study
research in educational settings. In the first place, in contrast with some other
research methods, case studies are 'strong in reality,' and therefore likely to
appeal to classroom teachers who will be able to identify with the issues and
concerns raised. Secondly, they claim that one can generalize from an instance
to a class. (For example, you may recognize in R. L. Allwright's (1980) case
study of a conversation between Igor and his teacher a number of garrulous
students you have known.) A third strength of the case study is that it can represent
a multiplicity of viewpoints and can offer support to alternative interpretations.
Fourth, if they are properly presented, case studies can also provide a
database of materials that may be reinterpreted by future researchers. Fifth,
insights yielded by case studies can be put to immediate use for a variety of
purposes, including staff development, within-institution feedback, formative
evaluation, and educational policy making. Finally, case study reports are often
written in a more accessible style than are conventional experimental research
reports, and are, therefore, capable of serving multiple audiences. Because they
are 'user-friendly,' case studies can contribute to the democratization of decision
making in ways that studies based solely on quantitative data and statistical
analyses may not.
REFLECTION
Look back to the brief description of Lupita's interactions with her classmates.
Which of the six advantages of case studies identified by Adelman
et al. (1976) are discernable if we consider that excerpt to be a "mini" case
study, or some data from a longitudinal case study?
As noted above, the case selected for investigation may be one person or a few
people. For instance, Carless (1999) conducted a case study of three primary
school teachers in Hong Kong, and Harklau (2000) studied three ESL community
college students. Case studies have also been conducted about single or multiple
classrooms, one or a group of schools (see, e.g., Wang's [2003] case study of
English language teaching in China).
Reasons for selecting the particular case(s) to be studied are varied. Ideally, a
case can be chosen that embodies the phenomenon the researcher wishes to
investigate:
The individual case is usually selected for study on the basis of specific
psychological, biological, sociocultural, institutional, or linguistic
attributes, representing a particular age group, a combination of first
and second languages, ability level (e.g., basic or advanced), or a skill
area such as writing, a linguistic domain such as morphology and syntax,
or a mode or medium of learning, such as an online computer-mediated
environment. (Duff, 2008, pp. 32-33)
REFLECTION
Think about the reasons given above for selecting a particular entity as a
"case" for investigation. Given the research topics that interest you, what
would be a case that you could study? For what reason(s) would you select
that entity?
Read two or three case studies in books or professional journals from our
field. Do the authors explain how and why they selected the case(s)? If you
are working with a group, compare the reasons found in the case studies
you read with those found by your classmates or colleagues.
In relation to the internal validity of case study research, Yin (1984) claims
that this is a matter of concern only in
REFLECTION
The principal difference between case studies and other research studies
is that the focus of attention is the case, not the whole population of
cases. In most other studies, researchers search for an understanding
that ignores the uniqueness of individual cases and generalizes beyond
particular instances. They search for what is common, pervasive, and
lawful. In the case study, there may or may not be an ultimate interest in
the generalizable. For the time being, the search is for an understanding
of the particular case, in its idiosyncrasy, in its complexity. (p. 256)
In the past, case studies have often been accorded less status than more
rigorously controlled experimental or process-product studies because,
as the argument often goes, case studies are not generalizable. However,
this criticism is unwarranted. It is probably true that it is difficult to
generalize from an individual (or a group) to an entire population without
the presence of strict controls to account for environmental variables.
However, there is also a form of generalization that proceeds not from
an individual case to a population, but from lower-level constructs to
higher-level ones. Furthermore, in the practical world in which case
studies are conducted, particularization may be just as important—if not
more so—than generalization. (p. 198)
According to van Lier, particularization means that "insights from a case study
can inform, be adapted to, and provide comparative information to a wide
variety of other cases" (ibid.). However, readers and researchers must take contextual
differences into account when doing so.
This idea has also been discussed by Larsen-Freeman (1996, citing Clarke,
1995). She says that particularizability involves helping teachers find connections
between research results and the particulars of their own classroom realities. In
her view, those sorts of connections are more valuable than the statistical concept
of generalizing findings from studies using samples randomly drawn from a
defined population.
A related concept is the transferability (also called comparability) of
hypotheses, principles, and/or findings (Duff, 2008; Lincoln and Guba, 1985). In this
idea, it is up to the readers of case studies to decide for themselves "whether
there is a congruence, fit, or connection between one study context, in all its
complexity, and their own context, rather than have the original researchers
make that assumption for them" (p. 51).
Yin also argues that construct validity is especially problematic in case study
research. This problem is due to the frequent failure of case study researchers to
develop a sufficiently operational set of measures and because 'subjective'
judgments are used to collect their data. This point leads us to the issues of
subjectivity and objectivity.
How do you feel about the subjectivity issue in case study research? Can
you find research results convincing if they are not totally objective?
Which of the three positions listed above could be used to justify the
following study?
Janet Allbright is investigating the different stages of acquisition that
learners go through as they acquire English. She uses a six-stage model of
acquisition that she has come across in the literature. This model argues
that all the grammatical structures of English can be placed into six groups,
or stages, and that learners must pass through each of these stages in turn.
Her chosen methodology is case study. She records and analyzes the
conversations of an immigrant learner of English over a two-year period and
compares the learner's stages of language development from beginner- to
intermediate-level of proficiency.
A SAMPLE STUDY
Sometimes, case study research is criticized for being atheoretical, and it is true
that case studies are sometimes more data-driven than theory-driven (Duff,
2008). Nevertheless, "much case study research is embedded within a relevant
theoretical literature and is motivated by the researcher's interest in the case and
how it addresses existing knowledge or contributes new knowledge to current
debates or issues" (p. 57). In this section, we will summarize a case study that has
a very strong tie to theory.
Nassaji and Cumming (2000) analyzed the interactions of an ESL teacher
and a young Farsi speaker learning English in Canada. The child, whose
pseudonym was Ali, was six years old and had moved to Canada from Iran. The
researchers examined dialog journal exchanges between Ali and his teacher, Ellen
(also a pseudonym), over a period of ten months. It might be argued that this
study is more an example of classroom-oriented research than
classroom-based research (see Chapter 1) since it did not investigate classroom interaction
per se. However, the interactions between Ali and Ellen were part of their
natural ongoing relationship as learner and teacher; it's just that the interactions
under investigation were written instead of spoken.
The authors quote Peyton's (1990) definition of a dialog journal as "a written
ongoing interaction between individual students and their teacher in a bound
notebook" (p. 100). Ellen told her students to write about things that interested
One key point is that all of these dialog entries were written before either
the teacher or the student was aware that they would be used for research.
For this reason, the dialog journal data "represent naturalistic classroom data"
(Nassaji and Cumming, 2000, p. 101).
REFLECTION
Have you ever used the dialog journal procedure, either as a teacher or a
language learner? What do you think about this idea for encouraging
learners to communicate their ideas in writing in the target language?
What do you think about using dialog journal entries as data in language
classroom research?
As it turned out, only eleven of these categories were used in this study because
neither the teacher nor the student requested academic information (5) or
complained (12), and thanking (9) happened very rarely or was embedded as part of
other functions.
In addition to the function coding described above, the researchers divided
the dialog journal entries into T-units. A T-unit is a unit of syntactic complexity
defined by Hunt (1970) as an independent clause and any attached subordinate
clause(s). So, a subordinate clause alone is not a T-unit. A full sentence is a
T-unit, whether it is a simple sentence or a complex sentence. A compound
sentence is categorized as two (or more) T-units, as shown below:
1. It's raining. (One T-unit)
2. It's raining, which is unfortunate. (One T-unit)
3. It's raining and there is lots of lightning. (Two T-units)
4. It's raining, there is lots of lightning, and I hear thunder. (Three T-units)
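The counting rule in these examples can be sketched in a few lines of Python. This is only a minimal illustration, assuming that each sentence has already been segmented by hand into clauses marked as independent or subordinate; the names `Clause` and `count_t_units` are hypothetical and are not part of Hunt's definition or of the procedure used in the study.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    text: str
    independent: bool  # True for an independent clause, False for a subordinate one


def count_t_units(clauses):
    """Count T-units: one per independent clause; subordinate clauses attach to one."""
    return sum(1 for clause in clauses if clause.independent)


# The four example sentences, segmented by hand (an assumption of this sketch).
examples = {
    1: [Clause("It's raining", True)],
    2: [Clause("It's raining", True), Clause("which is unfortunate", False)],
    3: [Clause("It's raining", True), Clause("there is lots of lightning", True)],
    4: [
        Clause("It's raining", True),
        Clause("there is lots of lightning", True),
        Clause("I hear thunder", True),
    ],
}

for number, clauses in examples.items():
    print(number, count_t_units(clauses))
```

In practice the difficult step is the segmentation itself, which is a matter of analyst judgment; that is one reason coders check their agreement with one another.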
For both the function coding and the T-unit analysis, Nassaji and Cumming
(2000) checked their inter-coder agreement. They reported strong inter-coder
indices:
REFLECTION
Based on what you have read so far, what do you think were the research
questions that Nassaji and Cumming wished to address?
Sociocultural Theory
The first paragraph of Nassaji and Cumming's (2000) report begins with the
following questions (which, at first glance, may seem more like attention getters
and topic nominators than explicit research questions): "What does a ZPD (zone
Quantitative Results
The majority of Ali's entries (58%) involved reporting general facts. Twenty
percent were reporting personal facts, and 18% were reporting opinions. The other
eight functions coded were requesting personal information (0.2%), requesting
general information (2%), requesting opinions (0.2%), requesting clarification
(0.2%), evaluating (0.7%), predicting (0.7%), and apologizing (0.7%).
The functions coded in Ellen's entries were reporting general facts (13%),
personal information (23%), requesting general information (4%), requesting
opinions (10%), requesting clarification (2%), evaluating (10%), predicting
(9%), apologizing (0.7%), and giving directions (2%).
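Percentages like these are proportional frequencies: the number of units coded with a given function divided by the total number of coded units. The sketch below shows one way such a tally might be computed. It is an illustration only; the function name `function_percentages` is hypothetical, and the toy codes are invented for the example and are not the data from the study.

```python
from collections import Counter


def function_percentages(codes):
    """Return each function's share of all coded units, as a percentage (one decimal)."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {function: round(100 * n / total, 1) for function, n in counts.items()}


# Invented toy data for illustration only; these are NOT the codes from the study.
toy_codes = (
    ["reporting general facts"] * 6
    + ["reporting personal facts"] * 3
    + ["evaluating"] * 1
)

percentages = function_percentages(toy_codes)
for function, share in percentages.items():
    print(f"{function}: {share}%")
```

A tally of this kind shows how often each function occurs, but not how the two writers responded to each other, which is why frequencies alone do not settle the interpretation.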
Regarding these quantitative findings, Nassaji and Cumming (2000) say that
the variety and value of Ellen's language functions have to be
recognized, not simply as proportional frequencies, but for the ways in which
she pitched her discourse to match Ali's basic 'reporting' mode. We
assume Ellen was striving to scaffold their written interactions to prompt
Ali's potential for learning English in this context. (p. 104)
The numerous examples of dialog journal entries that the researchers provide
help the reader understand and verify this interpretation.
Qualitative Results
In repeatedly reviewing the dialog journal exchanges, the researchers found five
patterns that "display salient aspects of their mutual process of constructing a
ZPD" (Nassaji and Cumming, 2000, p. 106). These five patterns "of
complementary asymmetry" (p. 111) are discussed and some will be exemplified below:
1. Questioning: "In the early weeks of the journals, Ellen posed simple routine
questions seemingly to engage Ali in the discourse and to show Ali how to
interact" (ibid.).
Example:
Ali: Today is +14 A.M. May 1995 is23th. Yestbday ismy borday. I am 7yors
old. I love myMom and my Dad. TodayisTeusday. I lovemyTeacher. I love
ApplTREE.
Ellen: Did you have a birthday cake?
Ali: Yes.
2. Give and Take: "At times when Ali increased the frequency of language
functions he used, Ellen correspondingly decreased hers, seemingly to allow
Example:
Ali: Today isManday, Mayis 8th.The Yestoday is6th. I lave Mrs.[Ellen] My
frand shawN. May 1995.Today +6.
Ellen: Shawn and I love you too, Ali.
Ali: Me too.
The authors say the use of this typically spoken triadic speech structure in Ali's
writing may suggest that "Ali was learning new mediational means from a
variety of sources around him, such as classroom or conversational discourse" (ibid.,
p. 113). His English abilities were developing through the process of "extending
what is appropriate in one domain to another" (ibid.).
Earlier in this chapter, we cited Duff's (2008) characterization of case
studies as being interpretive. The comments from Nassaji and Cumming (2000), as
they discussed these five patterns in the qualitative analysis, illustrate the
interpretive nature of case studies.
You will recall from Chapter 3 that one threat to validity in experimental
research is mortality, the loss of subjects from the sample. In case study research,
this issue is usually called attrition (Duff, 2008). This is one of the biggest pitfalls
of conducting a case study on an individual. If you lose access to that person, you
can no longer collect data, a problem reminiscent of the familiar warning not
to put all your eggs in one basket. This problem has occurred when the family of
a child being studied has moved or when new job opportunities have taken adult
learners out of the researcher's reach. Sometimes, individuals decide they do not
want to be the subjects of an investigation and remove themselves from the study
entirely.
Another pitfall, perhaps especially for novice researchers, is that conducting
a longitudinal case study takes time, commitment, and a systematic approach. Months
or years may be needed to track changes and answer particular research
questions. Embarking on a longitudinal case study also requires the researcher
to be disciplined and dedicated in recording and managing the data. For
instance, if you choose to examine the development of English speaking fluency
among seven-year-old pupils over the course of a school year, you must observe
these children regularly. You will also need to carefully and consistently
label and store any audiotapes or field notes documenting the children's
interactions for an entire year. Finally, you will also have to do a substantial amount of
transcription.
In spite of these pitfalls, there are considerable payoffs in carrying out a
well-conducted case study. For example, for novice researchers, working with
one subject or one site may be much more manageable than trying to implement
"large N" research or to collect data at several sites. Looking closely at one
learner or a few learners may allow the researcher to notice and appreciate small
changes occurring over time that might not be noticeable in a cross-sectional
study of many subjects. Likewise, the varied types of data collection used in case
studies, including close and prolonged observation, may reveal important
changes that are not captured by language tests and other sorts of measurements
used in experimental research.
Ethnography
observed in situations, audio- or videotaping of interactions for close
analysis, collection of relevant or available documents and other materials
from the setting, and other techniques as required to answer researcher
questions posed by a given study. (p. 583)
In this chapter, we will first summarize the background and some key
characteristics of ethnography. Then we will consider four principles of ethnography
and the typical stages of an ethnographic study. Next, we will focus on the role
of the researcher within the research process, including participant and
nonparticipant observation, and developing emic and etic perspectives. We will then
grapple with several thorny issues related to quality control in ethnography, first
by examining ethnography in terms of the traditional criteria of reliability and
validity. However, we will see that ethnography can be more appropriately
evaluated by its own criteria.
As you read this chapter, keep in mind our metaphor about research
cultures. If your main background in research is the experimental method, you will
find that ethnography has a very different culture. Its values, goals, norms, and
vocabulary are not the same as those of the experimental method in the
psychometric approach.
REFLECTION
Based on your earlier readings, both in this book and elsewhere, what do
you already know about ethnography? If you see a book or article whose
title begins, "An Ethnography of . . . ," what do you expect to read about
the topic?
Background to Ethnography
Like many other research methods, ethnography entered applied linguistics
from another field, in this instance, anthropology. In fact, anthropology is one of
the parent disciplines of linguistics, and many eminent early linguists were
trained as anthropologists.
Anthropology is the study of cultures and societies. Anthropologists
typically spend long periods of time living and working among the people they are
studying. Originally, this immersion was among little known and so-called
'primitive' groups, such as the indigenous peoples of New Guinea, Africa, and
North and South America.
However, researchers came to see that they could apply anthropological
research techniques to the investigation of groups and subgroups within their own
cultures. For example, in a classic study carried out in the 1930s, Whyte (1981)
portrayed the street corner societies of the urban poor. Smith and Geoffrey
(1968) conducted a yearlong study of a secondary school classroom in the United
States using ethnographic procedures. In the field of applied linguistics, Heath
REFLECTION
Based on what you already know, what would you predict might be some of
the differences between experimental research and ethnography?
Heath's stance is one that appeals to many applied linguists, including language
classroom researchers. The growing interest in finding alternatives to formal
experiments has been fuelled by skepticism on the part of some leading researchers
over the ability of controlled experiments to "produce the definitive answers that
some researchers expect" (Ellis, 1990a, p. 67). As a result, in the past three decades,
more researchers have turned to the naturalistic inquiry tradition, and to ethnography
in particular, in order to understand language use, as well as the practices and
beliefs of people involved in language teaching and language learning.
Language classroom ethnographies include van Lier's (1996a) report of a
bilingual (Quechua and Spanish) program in Peru; Duff's (1995; 1996) research
on dual-language secondary school programs in Hungary; Harklau's (1994)
three-and-a-half-year study of Chinese immigrant children in California;
Canagarajah's (1993) account of university EFL students in Sri Lanka; Cleghorn
Characteristics of Ethnography
Ethnographies fall in the naturalistic inquiry tradition, but they differ from other
forms of qualitative research in three main ways: they are longitudinal, they are
comprehensive, and they view people's behaviors in cultural terms.
REFLECTION
PRINCIPLES OF ETHNOGRAPHY
Ethnography is Holistic
The second principle is that "ethnography is holistic; that is, any aspect of a
culture or behavior has to be described and explained in relation to the whole
system of which it is a part" (Watson-Gegeo, 1988, p. 577). Duff's comment
about the Cleghorn and Genesee (1984) ethnography of the immersion program
in Canada illustrates this principle. In summarizing their interpretation, Duff
(1995), herself a Canadian, says that their study
focused on interactions among anglophone and francophone teachers in
a Montreal school with both early French-immersion and regular
English-stream programs at a time of rather acute provincial political/
linguistic tensions and misgivings. Participant observation revealed
that, ironically—and indeed contrary to the publicized objective of
immersion to foster harmony, understanding and bilingualism across
Canada's largest linguistic communities—the teachers from the two
ethnolinguistic groups avoided contact with one another, resented each
other's presence and resorted to English, the dominant language of
the country but not of that province, in cross-group discussions.
(pp. 507-508)
Thus, the findings were couched in the context of the much broader political and
linguistic issues affecting the whole country at the time.
REFLECTION
ETHNOGRAPHERS' ROLES
Ethnographers can take a variety of roles within the given research context.
Originally, anthropologists were outsiders who went into a culture to study it.
They were clearly not members of the culture. As a result, two of their
challenges were to gain the trust of the members of that culture and to learn how to
participate in that society so that they could collect valid and reliable data. As
they gained entry into the field, anthropologists conducted observations along a
continuum of involvement. They were often engaged in the activities of the
group they were studying as participant observers. In other instances, they would
observe but not be actively engaged, in which case they were functioning as
nonparticipant observers.
REFLECTION
If you were a researcher trying to collect data and you found yourself
identifying too strongly with a subset of your research population, what steps
could you take to overcome this problem in your data collection phase?
What could you do in the data analysis phase?
[Figure 7.2. Recoverable labels: Participant Observation; Nonparticipant Observation; Covert Observation]
REFLECTION
Imagine a research situation for each of the four quadrants in Figure 7.2.
When would it be appropriate to conduct covert observations? How do you
feel about the ethics of observing people when they are unaware that data
collection is going on?
REFLECTION
List other criticisms that you think might be leveled against ethnographic
research in terms of reliability and validity. How might an ethnographer
respond to these criticisms?
ACTION
Turn back to Table 5.2, which refers to sampling strategies. Which of the
strategies listed there do you think ethnographers might use to identify
their observation sites and the people they wish to study?
What other sorts of details regarding the data collection procedures would
you want to see while reading an ethnography?
REFLECTION
Think of a study you have read or one that has been described in this book.
Did that study use high-inference or low-inference descriptors, or both?
REFLECTION
resources. The use of multiple sites and additional researchers will almost
certainly be very expensive. As we have already indicated, the predominant use
of low-inference descriptors is also problematic because behaviors that are not
directly observable are often very interesting. For example, some of the most
important studies in classroom research have looked at learner characteristics
such as motivation, interest, power, anxiety, authority, and control. (See, for
example, Canagarajah's (1993) study of students' motivation for English learning
in Sri Lanka.) These are all high-inference phenomena.
Table 7.2 summarizes some important questions about internal and
external reliability in ethnographic research. The strategies embodied in these
questions can be used to guide ethnographers in conducting research and writing
their reports. They can also be used by readers of ethnography to evaluate such
reports.
(In this context, LeCompte and Goetz are using the term internal validity
somewhat more broadly than it is normally used in experimental research.)
Some of the concerns about internal validity in experimental research arise
in doing ethnography. Because the ethnographer typically studies a group for a
long period of time, maturational changes or the attrition of informants
occurring during the course of the research might affect the outcomes. Normally, both
these concerns are addressed by the very nature of ethnographic inquiry. That is,
the longitudinal data collection procedures capture change as the daily life of the
group is documented. In addition, movement in and out of groups (through
birth, death, relocation, and so on) is a normal part of the cultural existence that
ethnographers strive to document.
Threats to internal validity in experimental research are sometimes related
to bias in the selection of subjects in a study, a threat that is normally dealt with
through randomization. In ethnographic research, informants are often chosen
precisely because they meet a certain criterion. For instance, if you wish to study
the identity construction of recent immigrants in secondary schools, you would
choose a group of recently arrived teenagers to observe. In writing an
ethnography, it is important to explain exactly why and how an observation site and group
were chosen, but it is not a requirement that they be randomly selected.
We saw in experimental research that the reactive effects of the testing and
the reactive effects of experimental arrangements can be matters of concern. The
parallel question in ethnography is whether the outcomes are due in part to the
presence of the researcher in the context. This issue, which we have already
labeled the observer's paradox, is also called reactivity. It can be dealt with by
building the trust of the participants, by prolonged immersion in the research context,
and by behaving ethically and appropriately in that context.
REFLECTION
You have now seen several contrasts between experimental research and
ethnography in terms of internal validity. Which of these differences did
you predict? Which, if any, do you find troublesome? In which situation
described above (if any) do you prefer the ethnographic stance over the
experimental stance, or vice versa?
Concerns arise about external validity as well. The results of a study cannot
be generalized if the phenomena being investigated are unique to a particular
group or site. Someone who wishes to generalize the findings of an ethnography
must ask if the historical experiences of a particular group make it so unique that
findings about that group cannot be legitimately extended to other groups. There
may also be a worry that constructs and terminology are not common to different
cultures and research sites. These concerns are summarized in Table 7.3.
[Table 7.3. Recoverable column headings: Type; Questions]
REFLECTION
[Figure. Recoverable labels: Quality; Adequacy; Value; Argument; Evidence; Internal Value; External Value]
(You will recall that transferability and other similar issues arose in Chapter 6
when we discussed quality control in case studies.)
In fact, research of all kinds raises quality control issues, and each tradition
demands that researchers show that they have taken the appropriate steps to
ensure both reliability and validity. As van Lier (1988) has noted, "The
blacksmith cannot criticize the carpenter for not heating a piece of wood over a fire.
However, the carpenter must demonstrate a principled control over the
materials" (p. 42).
ACTION
Fill in each blank below with one of these three sets of criteria for judging
ethnography. (Each criterion is followed by its experimental analog in
brackets.)
Dependability [reliability]
Transferability [external validity]
Credibility [internal validity] and Confirmability [objectivity]
1. ________ depends on evidence of long-term exposure to the
context being studied and the adequacy of data collected (use of
different methods, etc.).
2. ________ depends on a richness of description and
interpretation that makes a particular case interesting and relevant to those in
other situations.
3. ________ should be assessed in terms of the documentation
of research design, data, analysis, reflection, and so on, so that the
researcher's decisions are open to others.
It is clear that not every time the wind blows against a tree, that tree will
fall down. When we study the phenomenon, we must add qualifications
and amendments which are endless: the wind must blow hard enough,
the tree must be fragile enough, the roots must grip the soil
insufficiently, the soil must be loose enough, etc. In addition, we must take
into account the position of the tree among buildings, other trees, and
so on. It would probably be impossible to lay down all the conditions
which would ensure a guaranteed tree-falling-down event. So even if we
are able to say: "The wind caused that tree to fall down," we are still not
able to specify exactly what it will take for another tree, say the orange
tree in the back yard, to fall down. (p. 37)
This comparison concludes with the important caveat that "it is obvious that L2
learning is an event which is vastly more complex than a tree blowing down"
(ibid.).
How does the tree-falling-down analogy relate to this discussion? In
experimental research, internal validity has to do with creating a research situation (by
controlling and manipulating variables) in which the researcher can confidently
claim a causal relationship between the independent and dependent variables.
However, establishing causal relationships is not an issue if the researcher is
primarily concerned with providing a detailed description, analysis, and
interpretation of the chosen context and situation (as is the case with ethnography) rather
than with applying treatments and establishing causality. In fact, van Lier
(1990a) argues that
REFLECTION
Now that you have read about quality control issues in both ethnography
and experimental research, do you have a preference for one or the other?
These are the two "pure" research paradigms in Grotjahn's (1987)
framework. They are apparently juxtaposed to each other in (1) the research
design, (2) the nature of the data, and (3) the type of analysis used. Do you
feel more comfortable with one research culture or the other?
Ethnographies are strengthened and can be judged in part by the extent to which
they use a number of procedures that fall under the general heading of
triangulation. This concept is best known in ethnography, but in all forms of qualitative
research it provides a way for researchers working with nonquantified data to
check on their interpretations of those data. By incorporating multiple points of
view, researchers can check one perspective against another. If more than one
type or source of data leads to the same conclusion, researchers have more
confidence in those conclusions. (See van Lier, 1988, for further discussion of these
points.)
Triangulation is a geometric concept borrowed from navigation, astronomy,
and surveying. Hammersley and Atkinson (1983) state that if people wish to
locate their position on a map,
a single landmark can only provide the information that they are
situated somewhere along a line in a particular direction from that
landmark. With two landmarks, however, their exact position can be
pinpointed by taking bearings on both landmarks; they are at the point
where the two lines cross. (p. 198)
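The geometry behind this image can be made concrete with a short sketch. It assumes a flat map with (east, north) coordinates and compass bearings given in degrees clockwise from north, measured from the observer toward each landmark; the function name `locate` and these conventions are assumptions of the illustration, not something the quoted authors specify.

```python
import math


def locate(landmark_a, bearing_a, landmark_b, bearing_b):
    """Find the observer's (east, north) position from bearings to two landmarks.

    Bearings are in degrees clockwise from north, measured from the observer
    toward each landmark (an assumed convention for this sketch).
    """
    ax, ay = landmark_a
    bx, by = landmark_b
    # Unit direction vectors pointing from the observer toward each landmark.
    dax, day = math.sin(math.radians(bearing_a)), math.cos(math.radians(bearing_a))
    dbx, dby = math.sin(math.radians(bearing_b)), math.cos(math.radians(bearing_b))
    # The observer lies on both lines: A - ta*da = B - tb*db. Solve for ta.
    det = dbx * day - dax * dby
    if abs(det) < 1e-12:
        raise ValueError("Bearings are parallel; the two lines do not cross at one point.")
    ta = (dbx * (ay - by) - dby * (ax - bx)) / det
    return ax - ta * dax, ay - ta * day


# One landmark due north of the observer and one due east.
east, north = locate((2.0, 8.0), 0.0, (6.0, 3.0), 90.0)
print(round(east, 6), round(north, 6))
```

With a single landmark there is only one line and therefore no unique position, which is the point of the analogy: one source of data constrains an interpretation, while a second, independent source can pin it down.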
The final project for the course consisted of a magazine that the students
produced through site visits, interviews, and a variety of computer tools.
In conducting this study, Springer became interested in scaffolding and
contingent language use. In Chapter 6, we defined scaffolding as "progressive help
provided by the more knowledgeable to the less knowledgeable" (Nassaji and
Cumming, 2000, p. 98). In naturally occurring conversations, contingency is the
tendency of one speaker's turn to be influenced by a preceding speaker's turn.
That is, a subsequent utterance is contingent upon one or more preceding
utterances. (See van Lier, 1989.) Springer's (2003) literature review led her to the
following descriptions of these concepts:
For Aljaafreh and Lantolf (1994), three significant characteristics of
scaffolding [italics added] are that it be graduated, dialogic and contingent.
A SAMPLE STUDY
Duff (1996) chose history lessons for her data collection because history "is a
compulsory subject in the Hungarian curriculum, is enjoyed by most students,
and is rich in both interaction and in linguistic and cognitive structures (e.g.,
narrative, description, cause-effect)" (p. 413). She observed history lessons in
both dual-language and traditional schools to allow for comparison across the
two types of programs.
Data Collection
In order to answer these questions, Duff videotaped approximately fifty history
lessons, which were taught by three teachers working at four different secondary
schools located in different parts of Hungary. She usually observed a given
teacher's class for a period of two weeks in order to see the development and
review of themes across a unit of instruction.
The videotaped data were transcribed. Any data that originally occurred in
Hungarian were translated into English. In addition to making her observational
field notes, Duff also interviewed students, teachers, and administrators. She
included discussions with consultants and copies of students' essays in her data
base. The reports of her research are replete with excerpts from the transcripts
so readers can get a strong sense of the interaction among students and teachers,
both in dual-language and traditional classrooms.
The processes of transcription and translation were facilitated by the
students themselves. Duff (1995) says, "Nearly a dozen ... students volunteered to
be my part-time research assistants, helping with transcription, translation,
verification, and interpretation, especially during their summer vacation; in this
way, we became quite familiar over time" (p. 513). Involving the students in this
way helped her to counteract the observer's paradox to some extent.
Key Findings
One of the differences between the traditional programs and the dual-language
programs was the less frequent use of the feleles in the dual-language programs.
In fact, Duff (1995) observed that, to a large extent, "the daily recitation period
had been rejected by most of the history teachers in the dual-language
programs" (p. 517).
The speech events that replaced the feleles consisted of brief prepared lectures
by students, often delivered from written notes, as well as question-and-answer
Richards notes that even though it may be "legitimate to use methods
characteristic of ethnography, these do not in themselves mean that you are working
within this tradition" (ibid.). For this reason, it is important to distinguish
between true ethnographies and research that might more properly be called
qualitative.
Because of the prolonged immersion of ethnographers in the field, the
sheer quantity of data can be overwhelming. For novice researchers in
particular, it is important to take care in organizing, labeling, dating, and storing all
field notes, audio tapes, videotapes, archival documents, photographs, and
maps. We will return to these issues in Chapter 14 when we discuss the analysis
of qualitative data.
Another concern is that ethnography as a research method and grounded
theory as an orientation may not suit some researchers' personalities and
cognitive styles. As van Lier (1990a) points out, "the heuristic quality of ethnography
makes it an inherently insecure pursuit, since there are no firm external rules and
guidelines for proper scientific conduct" (p. 41). For this reason, it seems
essential to have a high tolerance for ambiguity before undertaking ethnographic
fieldwork. In addition, "the worker in the field is essentially alone, and inevitably
learns as much from opportunities missed, false leads too strenuously pursued,
and insights by-passed in inexplicable ways, as from routine description and
categorization" (ibid.). For these reasons, doing ethnography requires a substantial
commitment of time and personal fortitude.
Finally, researchers interested in learning more about ethnography may
encounter difficulty in finding training courses. As van Lier (1990a) notes, without
such training it is difficult to have a clear idea as to what constitutes 'good' or
'bad' ethnography. He adds that
We arrive in Tiyana just after eight in the morning. The school, consisting
of three classrooms (adobe walls and corrugated tin roof) and a
storeroom (almacen), which the director (the principal) has converted
Many students have written canipu or canipo instead of campo, every time
they have copied the word. This is a very puzzling error, until I take a
good look at the blackboard and notice a small spot on it, just above the
third leg of the letter m in campo. Since canipu/-o is not a word in
Spanish, this is clear evidence of how mechanical the students' copying
activities are. (p. 372)
Given the depth and detail of van Lier's writings, the reader is left with a vivid
picture of the educational context as well as a sense of compassion and clarity
about his understanding of the teachers and children in the altiplano of Peru.
In this chapter, we have considered the background and some key
characteristics of ethnography, which came into applied linguistics from the parent
discipline of anthropology. We summarized four principles of ethnography and the
typical stages of ethnographic research, exemplifying each point with excerpts
from classroom ethnographies. We looked at two of the important roles and
responsibilities ethnographers take on, including participant and nonparticipant
observation, and the goal of developing both emic and etic perspectives. Several
important issues were raised regarding quality control in ethnography, but we
found that, rather than focusing narrowly on the traditional criteria of
reliability and validity, ethnography can be more appropriately evaluated by its own
criteria.
Ethnography represents a very different research culture from experiments
and surveys. A rigorously researched, well-written ethnography can be a
powerful tool for informing readers about a particular cultural group (whether it is a
remote tribal culture, a neighborhood street gang, the weekend bowling league,
or a language class), and for relating the daily life of that group to the broader
cultural context within which it is situated.
1. Look at the following topics from language classroom research. First, list
two or three behaviors associated with each topic. Then categorize the
nature of each behavior as being either low-inference (clearly definable in
operational terms; easy to verify) or high-inference (less easily defined or
verified).
Error treatment
Fluent speech
Teaching the past tense
Self-initiated turns
Social climate
Teacher praise for students
Teaching style
Use of the first language
2. From the list above, choose one of the high-inference topics and/or
behaviors that interests you. Try to write a clear, operational definition for the
term. Check some key references on the topic to see how other researchers
have defined the construct. Do you see any ways that some elements could
be more low-inference in nature?
3. Read an ethnographic study you have located or one cited in this chapter
or listed below in "Suggestions for Further Reading." Does the study
utilize high-inference or low-inference categories or both in its data
collection and/or data analysis procedures? Were the categories clearly defined
and explicitly stated so that you could replicate the study if you wished?
We recommend all the ethnographies cited in this chapter. If you would like to
read more about the bilingual Spanish-Quechua program in Peru, please see van
Lier's (1996a) article. There is also a booklength account of this context by
Hornberger (1988). For further information about the classroom ethnography
in Sri Lanka, you can read Canagarajah's (1993) original article and an
interesting response to it by Braine (1994), who is from Sri Lanka himself.
We have cited Watson-Gegeo's (1988) paper extensively. A slightly different
treatment of ethnographic principles is provided by van Lier (1990a), who
stresses that two main principles of ethnography are (1) an emic point of view,
and (2) a holistic concern for context. See also van Lier's (1988) book.
Freeman (1992) investigated the construction of shared understandings in
a high school French class. This report provides an excellent example of a
researcher developing the emic perspective, and of the way research questions
emerge and evolve as an ethnographic study progresses.
Defining Action Research
Action research is becoming increasingly prominent in the research
methodology literature in our field. (See, for example, Burns, 1999; 2004; Edge, 2001;
Nunan, 1990; 1993; Wallace, 1998; van Lier, 1994a.) As an approach to research,
it has been around since the 1940s, when it first appeared in the social science
literature (Lewin, 1946; 1948). In the 1980s, it was adapted by educators such as
Carr and Kemmis (1986), who described it as follows:
Action research is simply a form of self-reflective enquiry undertaken by
participants in order to improve the rationality and justice of their own
practices, their understanding of those practices and the situations in
which the practices are carried out. (p. 162)
This description is widely cited and it highlights the practitioner-driven nature
of action research as well as the social justice bias bequeathed to the concept by
Lewin, a left-wing sociologist. However, it is rather too broad to work as a
definition for a form of research, being basically a statement about reflective
teaching. (See Richards and Lockhart, 1994.)
For us, there are key differences between reflective teaching and action
research. One is that reflective teaching can be a solitary and private practice, but
in action research, the results of the process—the outcomes or products—should
be published. We are using publish here in its original sense: to make publicly
available to others for critical scrutiny. Another difference is that reflective
teaching could conceivably occur at one point in time, after a particular lesson,
whereas action research is cyclic and iterative.
We define action research as a systematic, iterative process of (1) identifying
an issue, problem, or puzzle we wish to investigate in our own context;
(2) thinking and planning an appropriate action to address that concern;
(3) carrying out the action; (4) observing the apparent outcomes of the action;
(5) reflecting on the outcomes and on other possibilities; and (6) repeating
these steps. To our minds, the cycle described above must be carried out
at least twice (and typically more often) for the investigation to qualify as
action research.
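Readers who keep their project records electronically may find it helpful to see the definition above expressed as a small data structure. The following is only a minimal sketch, not a prescribed procedure: the names `Cycle`, `STEPS`, and `qualifies_as_action_research` are hypothetical labels of our own, and the notes are placeholders. It simply records one note per step and applies the threshold stated above, namely that the cycle must be completed at least twice.

```python
from dataclasses import dataclass, field

# Steps (1)-(5) of the definition. Step (6), repeating the steps, is
# modeled by recording another Cycle rather than by a separate entry.
STEPS = ("identify", "plan", "act", "observe", "reflect")


@dataclass
class Cycle:
    """One pass through steps 1-5, stored as free-text notes keyed by step."""
    notes: dict = field(default_factory=dict)

    def is_complete(self) -> bool:
        # A cycle counts only if every step has a non-empty note.
        return all(self.notes.get(step, "").strip() for step in STEPS)


def qualifies_as_action_research(cycles) -> bool:
    """Apply the threshold in the text: at least two completed cycles."""
    completed = sum(1 for cycle in cycles if cycle.is_complete())
    return completed >= 2


# Hypothetical records: two full cycles and one that stopped after step 1.
first = Cycle({step: "hypothetical note" for step in STEPS})
second = Cycle({step: "hypothetical note" for step in STEPS})
partial = Cycle({"identify": "hypothetical puzzle"})

print(qualifies_as_action_research([first]))          # False
print(qualifies_as_action_research([first, second]))  # True
```

A record that stops after the first step, such as `partial`, does not count toward the two completed cycles.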
A more philosophical definition of action research is provided by Kemmis
and McTaggart (1982), who suggest that
[t]he linking of the terms 'action' and 'research' highlights the essential
feature of the method: trying out ideas in practice as a means of
improvement and as a means of increasing knowledge about the
curriculum, teaching and learning. The result is improvement in what happens
in the classroom and school, and better articulation and justification of
the educational rationale of what goes on. Action research provides a
way of working which links theory and practice into the one whole:
ideas-in-action. (p. 5)
In this quote, the authors highlight connections between theory and practice.
They also point out that action research entails more than simply providing
descriptive and interpretive accounts of the classroom, no matter how rich these
might be. Action research is meant to lead to change and improvement in what
happens in the classroom. But, in contrast to experimental research, as Kemmis
and McTaggart (1988) note,
[a] distinctive feature of action research is that those affected by planned
changes have the primary responsibility for deciding on courses of
critically informed action which seems likely to lead to improvement, and
for evaluating the results of strategies tried out in practice. Action
research is a group activity. (p. 6)
REFLECTION
Based on your previous knowledge and what you have read so far, what
would you say are the key characteristics of action research that distinguish
it from both naturalistic inquiry and experimental research in the
psychometric paradigm?
REFLECTION
Based on what you have read so far, how would you explain action research
to a colleague who wanted to know more about it?
does not do. Kemmis and Henry (1989) made these statements about what
action research is not:
1. It is NOT the usual thing teachers do when they think about their
teaching. It is systematic and involves collecting evidence on which to base
rigorous reflection.
2. It is NOT (just) problem solving: it involves problem posing, too. ... It is
motivated by a quest to improve and understand the world by changing it
and learning how to improve it from the effects of the changes being made.
3. It is NOT research on other people. Action research is research by
particular people on their own work, to help them improve what they do,
including how they work with and for others.
4. It is NOT the "scientific method" applied to teaching.... It adopts a view of
social science which is distinct from a view based on the natural sciences (in
which the objects of research may legitimately be treated as "things"); action
research also concerns the "subject" (the action researcher) him- or herself.
Given these statements, we can see that action research is unlike both
experimental research and naturalistic inquiry. Action research differs from
experimental research in that the former, like naturalistic inquiry, works with
naturally occurring groups and does not impose artificial control over variables.
However, unlike researchers in the naturalistic inquiry tradition, action
researchers do seek to intervene and bring about change (as do psychometric
researchers conducting experiments). In fact, the "action" in action research is
parallel to the treatment in experimental studies, but external researchers are not
applying the treatment to subjects. Instead, in action research, the participants
themselves decide what to do to bring about positive change.
Like naturalistic inquiry but unlike experimental research, the research
questions may evolve as action research proceeds. And, because it is concerned
with a particular situation, action research "tends to be rather messy and
unpredictable" (van Lier, 1994b, p. 7), so the data collection and analysis procedures
may also change. We turn now to discussion of the action research cycle with an
example of how this evolution may come about.
Cycle 1
Step 1: Problem/puzzle identification: "Student motivation is declining over the
course of the semester."
Cycle 2
Step 7: Identify follow-up puzzle: "How can I ensure more involvement and
commitment by learners to their own learning process?"
Step 8: Second hypothesis: "Developing a reflective learning attitude on the part
of learners will enhance involvement and motivation to learn."
Step 9: Take action and observe outcomes: "At the end of each unit of work,
learners complete a self-evaluation of learning progress and attainment of goals."
Step 10: Reflect on outcomes: "Self-evaluations show not all learners feel they are
improving, even though I think they are."
REFLECTION
If you are currently teaching, think about an action research study you
could conduct to address an issue in your own classes. What data
collection tools and procedures could you use to elicit the students' input in that
context?
SAMPLE STUDY
The sample study presented here incorporates many of these quality control
issues and illustrates a process for eliciting students' ideas. The teacher, John
Thorpe, conducted this action research project in his own ESL class, but he
collaborated with a colleague who was teaching the same group of students in a
different class. The colleague focused on vocabulary learning in his action research
project while Thorpe wanted to explore options in teaching listening
comprehension through the use of television news broadcasts.
This focus came about because Thorpe was working with three Korean adult
learners of English who were enrolled in a two-semester program sponsored by
their government. Upon their return to Korea, the students would be facing a
battery of English proficiency examinations. These students were worried
about a listening comprehension test which consisted almost entirely of news
broadcasts, so Thorpe decided to include television news broadcasts as part of
their regular twice-weekly two-hour lessons. Thorpe (2004) described his study as a
"two-phase, eight-cycle action research project" (p. 1). He gathered data by
videotaping lessons, keeping a teaching journal, and having his students
complete a questionnaire each week. He also discussed the videotaped lessons with
his colleague.
Thorpe began his action research project with a literature review that
informed his teaching as well as his research focus. Thorpe was an experienced
teacher, but he was open to learning new things. He wrote,
Although I have over twelve years of classroom experience, I have only
used TV news broadcasts sparingly. A review of the relevant literature...
uncovered only scant information pertaining specifically to the
pedagogical use of TV news broadcasts.... Although more recent resource
books (e.g., Larimer & Schleicher, 1999) do include some useful ideas,
in light of the popularity of TV news, I feel additional research using
TV news broadcasts in the second or foreign language classroom is
needed. This project aimed, then, to seek possible answers... to the
following research question: How can I best teach listening comprehension
using TV news broadcasts? (pp. 1-2)
REFLECTION
Imagine yourself in Thorpe's position. How could you collect baseline data
about your students' listening comprehension? What actions might you
want to implement to address the questions you would pose?
Given this grounding, Thorpe initially planned the project as five research
cycles to be conducted over five weeks during one term of instruction, but the
project continued into a second phase the following semester. The interventions
that he tried as the course progressed included giving the students their choice
of which news story to listen to, providing transcripts of the story, letting the
students choose when to view the transcript, providing written comprehension
questions about the broadcast, having the students themselves write the
questions, and varying whether the topic of the news story was familiar or unfamiliar
to the learners. The two phases consisted of eight cycles, which are summarized
in Table 8.2.
The data collection processes Thorpe used were varied and systematic.
They gave him a clear picture of the students' responses to these interventions.
To collect data, he videotaped all the listening activities in his class for a month.
These included brief broadcasts from CNN Student News, which had not been
simplified for language learners. Thorpe also used the transcripts of these
broadcasts, which were available on CNN's Web site.
[Table 8.2: summary of the research cycles in Phase 1 and Phase 2]
Thorpe had two lessons per week with these students, so he planned each
week to comprise one action research cycle. During the Tuesday class, he
would implement the actions he had planned. Then each Thursday, after they
had completed the listening activity, the students filled out a short
questionnaire eliciting their ideas about both the news story and the classroom
procedures used with it. The teacher himself also completed this questionnaire. The
questionnaire format was based on an idea from Christison and Bassano (1995)
in which the learners simply placed a vertical mark on a nine-centimeter line
to indicate their opinion. This procedure allowed Thorpe to compare the
Thorpe compiled all the students' responses, as shown in Figure 8.6. The
numbers refer to the five research cycles.
Thorpe (2004) measured how far from the left end each mark was and
created line graphs using these data. This procedure allowed Thorpe to compare
the students' various responses to the questions and to find patterns occurring
over the five-week investigation. He could also compare his reactions to those of
the participants.
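For readers who would like to try this kind of comparison with their own questionnaire data, the measuring-and-comparing procedure just described might be sketched as follows. This is an illustration under stated assumptions rather than a reconstruction of Thorpe's analysis: the function names are hypothetical, and the distances in `example_marks` are invented for the example and are not Thorpe's data. The labels M, K, C, and J simply stand for three students and the teacher.

```python
from statistics import mean

LINE_LENGTH_CM = 9.0  # length of the response line described in the text


def check_mark(distance_cm):
    """Return the distance of a mark from the left end, rejecting impossible values."""
    if not 0.0 <= distance_cm <= LINE_LENGTH_CM:
        raise ValueError(f"mark at {distance_cm} cm is off the {LINE_LENGTH_CM} cm line")
    return float(distance_cm)


def teacher_student_gaps(marks, teacher="J"):
    """For each cycle, subtract the students' mean mark from the teacher's mark.

    `marks` maps a participant label to a list of distances (in cm),
    one per research cycle.
    """
    students = [label for label in marks if label != teacher]
    gaps = []
    for cycle_index, teacher_mark in enumerate(marks[teacher]):
        student_mean = mean(check_mark(marks[s][cycle_index]) for s in students)
        gaps.append(round(check_mark(teacher_mark) - student_mean, 2))
    return gaps


# Hypothetical distances for two cycles (NOT data from Thorpe's study).
example_marks = {
    "M": [6.0, 5.0],
    "K": [7.0, 6.0],
    "C": [8.0, 4.0],
    "J": [4.0, 5.0],
}

print(teacher_student_gaps(example_marks))  # [-3.0, 0.0]
```

A gap near zero suggests that the teacher's impression matched the students' average rating in that cycle, while a large negative or positive gap points to a perceptual mismatch worth discussing with the class.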
Figures 8.7 and 8.8, respectively, show the data from the teacher (J) and the
students regarding the difficulty of the news stories and the helpfulness of the
teaching activities. The nine-point scale derives from the fact that the students
marked their impressions on the nine-centimeter line. This process gave Thorpe
a clear visual way to represent and compare his students' opinions.
[Figure: line graph of ratings on the nine-point scale (vertical axis) by cycle
(horizontal axis, Cycles 1-5), with separate lines for M, K, C, and J]
REFLECTION
What patterns do you notice in Figures 8.7 and 8.8? If you were the
teacher, how would you interpret these data?
These data were compiled after the fifth cycle (at the end of the first course
Thorpe [2004] had with these students and at the end of the first phase of the
action research project). Here's part of what the teacher had to say about his
interpretation of these data:
Thorpe wondered if interest in the news stories might also be a factor in the
students' listening comprehension. He and his colleague discussed this issue and
decided he should let the students themselves choose the particular news story
based on the brief "teasers" that CNN uses at the start of the broadcast to tell the
listeners what stories will be covered. Of this decision he wrote,
On Tuesday, the first day of the new cycle, Thorpe played the preview of the
news broadcast and let the students choose the story they would listen to. They
chose one about household germs. The teacher wrote,
At the Thursday class, continuing with this particular intervention, Thorpe had
the students select a story from the news broadcast. They chose a report about a
Each participant was given a packet of the strips and asked to arrange
them in order. In reviewing their results, I was pleased not only with
their ability to accomplish the task but in their reasoning as well. For
example, C explained his arrangement by saying that he realized that the
newscaster would probably speak first and last and that the reporter's
speech would be interspersed with eyewitness testimony or some other
corroborative statement. (p. 16)
After the students predicted the structure of the news story and possible
alternatives, the teacher played the videotape for them. When he played it the second
time, he stopped at the end of each segment so the students could ask questions.
At the next class meeting, on Thursday, he repeated this process. He wrote,
This time, the participants chose a story about a mine rescue in Russia.
One additional insight participants gained from this activity was the
use, very common in TV news, of word plays and puns. In this case,
the newscaster referred to the rescue as a "miner miracle." What the
participants realized was that the outcome of the story dictated the use of
these plays on words. If, for example, all or most of the miners had died,
the pun would not have been appropriate. This insight would be helpful
in anticipating the outcomes of subsequent TV news stories and
potential reasons why, for example, the newscaster smiled without apparent
reason. I have found that these word plays and puns are often very
difficult for learners to understand, even at advanced levels. (p. 16)
REFLECTION
At this point, Thorpe had finished the five-cycle action research project he
had originally planned to conduct. What might he do next, if he decided to
continue the project through further action research? Brainstorm a list of
possible options for further investigation.
These procedures continued for five weeks. After completing these five
cycles, Thorpe decided to extend the project for an additional three cycles. During
this second phase, the data collection procedures were similar but not identical
to those of the first phase. In the data collection for the second phase, Thorpe
continued to videotape his classes, write in his teaching journal, and have his
students complete the questionnaire about each lesson's activities. However, a
schedule change meant that Thorpe and his colleague could now observe each
other's teaching instead of being limited to viewing videotapes of one another's
classes. Thorpe also made a change in his source materials from the minute-long
CNN Student News reports:
Thorpe posted a link to The News Hour's Web site on his course Web site since it
contained complete transcripts of the broadcasts as well as links to the audio and
video clips. This resource enabled the students to review the news stories if they
wished.
It seemed that the regular feedback Thorpe got from the students helped him
to predict their needs and select materials and tasks they would find useful as
well as to judge the kinds of news broadcasts that would be appropriate. He
wrote,
Thorpe found that the perceptual mismatches between his view and the students'
were less pronounced in phase two than in phase one. He attributed this differ
ence to the systematic practice of collecting their feedback regularly in each
cycle. He wrote,
I think that including the participants in the project also increased the
likelihood they would get involved in their own learning as well. Involvement
in action research, both in this study and my collaborative
role in [my colleague's] project as well, can "make our work more
purposeful, interesting, and valuable, and as such it tends to have an
energizing and revitalizing effect" (van Lier, 1994a, p. 33). Although action
research does not aim at generalizability, I have discovered both
personal and professional insights that definitely can be applied to many
other teaching situations. (pp. 31-32)
This summary shows how Thorpe used the action research cycles to
systematically vary the changes he made in his teaching. He collaborated with a colleague
Possible Solutions
We have experimented with a number of solutions to these problems. Chances
of success for any given project will be maximized if there is someone to 'own'
the project. Likewise, it is advisable that one or more advisors with training in
tend to be directive    1   14   10
give directions         4   16    5
that might influence the outcomes, so we cannot unequivocally say that the
planned interventions caused the observed results. Instead, in conducting action
research, teachers seek out options that seem to them to be convincing solutions
to problems or classroom puzzles.
Likewise, there can be no external validity in action research because the
conditions in any given setting could never be duplicated in another. Indeed,
generalizability is not a goal of action research. Rather, action researchers seek
local understanding and wish to improve their own practice.
However, like case studies and ethnographies, action research reports are
often thought-provoking and comparable. We will cite just one example here. In
Chapter 3, we reported on Sato's (1982) observational research on turn taking by
Asian students in English classes. This concern is one that many teachers have
faced. In fact, Tsui (1996) wrote a report on action research projects conducted
by thirty-eight teachers in Hong Kong about this very issue.
In Tsui's report, the teachers first collected baseline data on the interaction
in their classrooms. The teachers themselves identified five possible reasons to
account for the students' apparent reluctance to speak in class. These were
(1) the students' low English proficiency, (2) their fear of making mistakes and
being derided for doing so, (3) the teachers' intolerance of silence, (4) the
teachers' uneven allocation of turns, and (5) the incomprehensible input the students
experienced in class, often in the teachers' speech.
CONCLUSION
We end this chapter where it began, with a quote from Mingucci (1999), who
wrote about the role of action research in professional development. She said
that for teachers
Chapter 11: Elicitation Procedures
By the end of this chapter, readers will
• describe and exemplify a range of elicitation procedures, including
interviews, production tasks, role plays, questionnaires, and tests;
• identify five different types of interviews and explain the differences among
them;
• describe the relationship between production tasks and learner discourse;
• discuss the advantages and disadvantages of using elicitation procedures to
gather data.
Classroom Observation
research still holds true, whether teachers and learners are interacting in a
physical space or an electronic lesson:
Classroom-centered research is just that: research centered on the
classroom, as distinct from . . . research that concentrates on the inputs
to the classroom (the syllabus, the teaching materials) or the outputs
from the classroom (learner achievement scores). It does not ignore in
any way or try to devalue the importance of such inputs and outputs. It
simply tries to investigate what happens inside the classroom when
learners and teachers come together. (p. 191)
That is, indeed, a key purpose of observation: "to investigate what happens
inside the classroom when learners and teachers come together" (ibid.).
REFLECTION
Given the information above about classrooms, how would you define
classroom observation?
[Figure: Collecting Information During Observations. Manually: field notes and
observation schedules. Other sources shown: video recordings; synchronous and
asynchronous chat records.]
REFLECTION
Study the data in Table 9.1. This record was produced by a classroom
observer who used an observation scheme and made a tally mark every time a
particular behavior occurred. How much can you tell about the lesson in
which the observation took place? Can you make inferences about the
following, for example?
• what the size of the class is
• whether the students are children or adults
• whether it is an EFL or ESL class
• what the focus or objective of the lesson is
• how long the interaction lasts
• when the interaction took place (at the beginning, middle, or end of the
lesson)
REFLECTION
Given the tallies in Table 9.1, try to imagine the discussion that produced
these data. As a hint, we will tell you that the interaction takes place in a
classroom at the beginning of a lesson after lunch.
A coding system like the one depicted in Table 9.1 can either be used while
the lesson is proceeding (in "real time") or later with videotapes or audiotapes of
the lesson. If the system is used by an observer during the actual lesson, the tally
marks are either made at regular time intervals (e.g., every three seconds) or
every time there is a category change. Long (1980) describes and names these
two approaches:
When each event is coded each time it occurs, we are dealing with a true
category system. When each event is recorded only once during a fixed
period of time, regardless of how frequently it occurs during that
period, we have a sign system. (p. 6)
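The difference between the two kinds of system becomes concrete when the same stretch of interaction is tallied both ways. The sketch below is our own illustration of Long's distinction, not part of any published observation scheme: the function names are hypothetical, the three-second interval echoes the example mentioned above, and the timed events are invented.

```python
from collections import Counter


def category_system(events):
    """Tally every occurrence: one mark each time a behavior is observed."""
    return Counter(code for _, code in events)


def sign_system(events, interval_s):
    """Tally each behavior at most once within each fixed time interval."""
    # Collapse repeated occurrences of a behavior inside the same interval.
    seen = {(int(time_s // interval_s), code) for time_s, code in events}
    return Counter(code for _, code in seen)


# Hypothetical observations: (seconds from start of lesson, behavior code).
events = [
    (1.0, "teacher question"),
    (2.0, "teacher question"),
    (4.0, "teacher question"),
    (5.0, "teacher praises"),
]

print(category_system(events)["teacher question"])   # 3
print(sign_system(events, 3.0)["teacher question"])  # 2
```

The two questions asked within the first three seconds count twice in the category system but only once in the sign system, which is why tallies from the two approaches should not be compared directly.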
8. Teacher praises        |    1
9. Teacher criticizes          0
ACTION
Compare the tally data in Table 9.1 with the narrative data in Figure 9.2.
List three to five differences in the sorts of information these data provide.
Observation System
Advantages:
• May seem objective
• Good for observer to use while watching class
• Good for self-analysis by teacher
• Easy to compare different interactional categories
• Easy to focus on specific elements
• Orients one's mind set as observer
• Visual presentation: easy to overview
Disadvantages:
• Likely to distort actuality
• Does not show the human element
• Very abstract
• Focuses on quantity not quality
• Does not indicate success or failure
• Does not indicate sequences of interaction
• Open to misinterpretation
• Categories create straitjacket
• Categories may be biased toward teacher
• Does not indicate length of interaction
Ethnographic Narrative
Advantages:
• Displays significant paralanguage
• Reflects rapport between teacher and students
• Gives overall effect of interaction
• Can be used to carry out subsequent tally-sheet analysis
• Shows real nature of questions asked
• Context given to support the language
Disadvantages:
• Difficult to use for clinical purposes
• Time-consuming to write
• Allows distraction by focusing on unimportant details
• Open to emotive reporting
• Inadequate on its own
• Cannot be done in real time
• Requires high-quality recording equipment and/or note-taking skills
The data reported in the ethnographic narrative in Figure 9.2 are presented
below as a transcript (adapted from Nunan, 1989, pp. 80-81). It has been
analyzed using Bowers' (1980) categories.
T Of course I had lunch ... not enough ... why? Why? (sociating)
Well, like I say, I want to give you something to read (organizing)
--so what you do is, you have to imagine what comes in between, that's all
... (organizing)
... Bring, er, bring your chairs a little closer, you're too far away er, ha, not
that close (organizing)
S Quiss? (eliciting)
Category     Description
Responding   Any act directly sought by the utterance of another speaker, such as
             answering a question.
Sociating    Any act not contributing directly to the teaching/learning task, but
             rather to the establishment or maintenance of interpersonal
             relationships.
Organizing   Any act that serves to structure the learning task or environment
             without contributing to the teaching/learning task itself.
Directing    Any act encouraging nonverbal activity as an integral part of the
             teaching/learning process.
Presenting   Any act presenting information of direct relevance to the learning task.
Evaluating   Any act that rates another verbal act positively or negatively.
Eliciting    Any act designed to produce a verbal response from another person.
T Pardon? (responding)
S It will be quiss? It will be quiss? Quiss? (eliciting)
Ss Quiss ... quiss (eliciting)
T Ahm, sorry... try again (eliciting)
S I ask you ... (eliciting)
T Yes?
S You give us another quiss? (eliciting)
T Oh, quiz, oh! No, no, not today... It's not going to be a quiss today...
sorry ... but, um, what's today, Tuesday, is it? (eliciting)
S Yes (responding)
T I think on Thursday, if you like . . . same as before . . . only I'll think up
some new questions—the other ones were too easy . . . um, okay, er I'll
take some questions from, er, from newspapers over the last few weeks,
right? so—means you've got to watch the news and read the newspaper
and remember what's going on ... if you do, you'll win ... if not, well,
that's life (organizing)
S Will be better from TV (sociating)
[laughter]
T From the TV? ... What, er, what programmes ... (eliciting)
Ss News, news (responding)
T Did you say... ? Oh, oh, we'll have, er, it'll be the s ..., it'll be the same ...
there'll be different... ? er, there'll be different... ? Different? Different?
The questions will be on different... what? Different? (eliciting)
S Talks (responding)
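Once a transcript has been coded in this way, the codes can be tallied by speaker role. The following sketch is one possible way of doing so and is offered only as an illustration: the variable and function names are hypothetical, the list reproduces the category labels attached to the turns in the transcript above (the uncoded turn "T Yes?" is omitted, and the first label is read as "sociating"), and individual students (S) and students in chorus (Ss) are merged into a single "learner" role, which is our own simplification.

```python
from collections import Counter

# (speaker, category) for each coded turn in the transcript, in order.
CODED_TURNS = [
    ("T", "sociating"), ("T", "organizing"), ("T", "organizing"),
    ("T", "organizing"), ("S", "eliciting"), ("T", "responding"),
    ("S", "eliciting"), ("Ss", "eliciting"), ("T", "eliciting"),
    ("S", "eliciting"), ("S", "eliciting"), ("T", "eliciting"),
    ("S", "responding"), ("T", "organizing"), ("S", "sociating"),
    ("T", "eliciting"), ("Ss", "responding"), ("T", "eliciting"),
    ("S", "responding"),
]


def tally_by_role(coded_turns):
    """Count coded acts per (role, category), merging S and Ss as 'learner'."""
    tallies = Counter()
    for speaker, category in coded_turns:
        role = "teacher" if speaker == "T" else "learner"
        tallies[(role, category)] += 1
    return tallies


tallies = tally_by_role(CODED_TURNS)
for (role, category), count in sorted(tallies.items()):
    print(f"{role:8} {category:11} {count}")
```

Tallies of this kind condense the transcript, but, as the comparison of observation systems and narratives above suggests, they no longer show the sequence of the interaction or the language that was actually used.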
ACTION
1. Does the scheme require the observer to check a behavior (such as the
teacher asking a question) every time the behavior occurs, or is it necessary
to make a check at regular intervals?
2. Does the scheme deploy high- or low-inference categories? (A high-inference
category requires the observers to interpret the behavior they observe. For
example: "Students are on task," or "Students are interested in the lesson.")
ACTION
Evaluate the observation system in Table 9.3 against these six questions.
Keep these questions in mind as you read the next sections of this chapter.
We will discuss two observation systems that have been influential in second
language classroom research and teacher education.
Ten years later, Spada and Fröhlich (1995), in their booklength manual on the
COLT system, stated that three major 'themes' in the L2 teaching and
learning literature influenced the design of the COLT scheme. These were (1) the
widespread introduction and acceptance of communicative approaches to L2
learning; (2) the need for more and better research on the relationship between
teaching and learning; and (3) the need to develop psycholinguistically valid
categories for classroom observation schemes.
The COLT consists of two parts. Part A focuses on the description of classroom
activities and contains five subsections: the activity type, the participant
organization, the content, the student modality, and the materials. Part B is
Feature Questions
We will address discourse analysis more fully in Chapter 12. Here we just wish
to raise some issues related to the use of observation to collect data that can be
subjected to some discourse analytic process. Some forms of observation result
in records of classroom data that can be analyzed using the varied procedures of
discourse analysis, while others do not.
The advantage of observation schemes is that they serve to condense data
and facilitate the process of identifying patterns. However, unless such schemes
include the collection of actual direct quotes, they typically obscure or lose the
very thing that is of most interest to language teachers and researchers: the
language itself. Observation systems in which the observer codes (i.e., interprets
and assigns to categories) interaction as it occurs result in data for which readers
only see the tallies, not the interaction that led to those tallies. For this reason,
transcripts that can be subjected to discourse analysis are extremely valuable
sources of information about classroom interaction.
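To make this contrast concrete, the following minimal Python sketch tallies the category codes for a few turns from the "quiss" transcript. The tuple layout and the function name tally_categories are our own illustrative assumptions, and only a subset of the coded turns is included. Once the tallies are computed, the speakers' actual words cannot be recovered from them.

```python
from collections import Counter

# A subset of coded turns from the "quiss" transcript.
# Each entry is (speaker, utterance, category); this layout is an assumption
# made for illustration, not part of any published observation scheme.
coded_turns = [
    ("Ss", "Quiss ... quiss", "eliciting"),
    ("T", "Ahm, sorry... try again", "eliciting"),
    ("S", "I ask you ...", "eliciting"),
    ("S", "You give us another quiss?", "eliciting"),
    ("S", "Yes", "responding"),
    ("S", "Will be better from TV", "sociating"),
    ("Ss", "News, news", "responding"),
    ("S", "Talks", "responding"),
]


def tally_categories(turns):
    """Count how often each category was assigned, discarding the words."""
    return Counter(category for _speaker, _utterance, category in turns)


tallies = tally_categories(coded_turns)
for category, count in sorted(tallies.items()):
    print(f"{category}: {count}")
```

A reader who is given only the tallies sees how often each category occurred, but not what was said. A transcript such as coded_turns keeps both.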
Two of the first linguists to develop a system for the analysis of classroom
discourse were Sinclair and Coulthard (1975). They showed that much classroom
interaction consisted of a recurring pattern of teacher initiation, student
response, and teacher feedback evaluating the response (the so-called "IRF
pattern"). Here is an example of such a pattern:
Teacher: The questions will be on different subjects, so, er, well, one will
be about, er, well, some of the questions will be about politics, and some of
them will be about, er ... what? (Initiation)
Student: History. (Response)
Teacher: History. Yes, politics and history. (Feedback/evaluation)
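Once a transcript has been coded move by move, sequences like this one can be located mechanically. The sketch below is illustrative only: the single-letter move codes (I, R, F), the speaker labels (T, S), and the function find_irf_exchanges are assumptions made for this example, not part of Sinclair and Coulthard's system.

```python
# The three-move sequence we are looking for: teacher initiation,
# student response, teacher feedback. The codes are hypothetical labels.
IRF_PATTERN = [("T", "I"), ("S", "R"), ("T", "F")]


def find_irf_exchanges(moves):
    """Return the start index of every teacher-I, student-R, teacher-F run."""
    hits = []
    width = len(IRF_PATTERN)
    for start in range(len(moves) - width + 1):
        window = [(speaker, move)
                  for speaker, move, _text in moves[start:start + width]]
        if window == IRF_PATTERN:
            hits.append(start)
    return hits


# The example exchange, coded as (speaker, move, utterance).
coded_moves = [
    ("T", "I", "some of them will be about, er ... what?"),
    ("S", "R", "History."),
    ("T", "F", "History. Yes, politics and history."),
]

exchanges = find_irf_exchanges(coded_moves)
print(exchanges)
```

Because the utterances are stored alongside the codes, each match can be traced back to the language that produced it.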
REFLECTION
Based on your experience and what you have read so far, as well as your
particular research interests, do you have a preference for real-time coding,
or would you prefer to categorize behavior using videotapes, audiotapes, or
transcriptions?
[Seating chart figure not reproduced: students' desks, with labels such as F4 and M2]
important influence on the learning that goes on in the classroom and the kind
of language that is generated.
SCORE Data
REFLECTION
Using this key, write a description of the interaction shown in the SCORE
chart in Figure 9.4. Or, with a colleague or classmate, describe the interaction
orally.
M or F = Male or female
FIGURE 9.5 Key to SCORE Data Symbols (from Bailey, 2006, p. 108)
Classroom Maps
While visual representations like seating charts and maps do not preserve the
actual discourse that occurred in the classroom, they do enable researchers to
record the extent to which interactions occur. They also often allow patterns to
emerge that are not immediately apparent in transcripts or other forms of
documentation. For example, Bailey (2006) drew a map of a class taught by a
non-native speaking teaching assistant in mathematics, whose class she observed on
three different occasions. The map shows that there are forty-two desks in the
classroom, but at the three different observations, eighteen, three, and seven
students were present. At the scheduled time for one observation, no students
attended the class. Bailey was puzzled as to why such a small class had been
scheduled in such a large room, but when she checked with the mathematics
ACTION
If you can observe a class, compile a SCORE chart such as the one in
Figure 9.4 for a lesson or part of a lesson. As an alternative, you could
complete the following procedure adapted from Freeman (1998, pp. 203-204).
1. Outline a bird's-eye view of the classroom space that shows the walls and
other structures as if you were looking at them from above.
2. Identify everything you can see; be as detailed as you reasonably can.
Include yourself in the picture. Scale is less important than accurately
including as much as possible.
3. Use the same symbol for a category (e.g., circles for students and squares
for desks). You can create ways of showing differences within a category
(e.g., green circle is a bilingual student; red square is a teacher's desk).
4. To record students' movements, draw a line from where the student
starts to where he or she ends the movement. If the student makes the
same movement more than once—perhaps he or she goes back and
forth to the teacher's desk for help—you can put a check on the basic
trace line to keep track of the number of steps.
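If the map is later transferred to a computer, the checks on each trace line in step 4 can be kept as a simple tally. The sketch below assumes a hypothetical record of (student, start, end) movements; the student labels and locations are invented for illustration and do not come from Freeman's procedure.

```python
from collections import Counter


def tally_movements(traces):
    """Count how many times each (student, start, end) movement was seen."""
    return Counter(traces)


# Hypothetical movements noted during one lesson.
observed_traces = [
    ("S3", "own desk", "teacher's desk"),
    ("S3", "teacher's desk", "own desk"),
    ("S3", "own desk", "teacher's desk"),
    ("S7", "own desk", "bookshelf"),
]

movement_counts = tally_movements(observed_traces)
for (student, start, end), count in sorted(movement_counts.items()):
    print(f"{student}: {start} -> {end} x{count}")
```

Each distinct trace corresponds to one line on the map, and its count corresponds to the number of checks on that line.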
that were associated with successful second language classroom acquisition. The
following key factors are adapted and summarized from Ellis:
1. Quantity of 'intake': The amount of the target language that learners
attend to is significant; quantity alone is insufficient (i.e., the quantity of
language produced by the teacher as input).
2. A need to communicate: This need can be provided if the target
language serves as the medium as well as the target of instruction.
3. Independent control of the propositional content: Learners have a
choice over what is said, and part of this should be content known to the
learner but not the teacher.
4. Adherence to the 'here and now' principle: In the early stages at least,
encoding and decoding are facilitated if the things being talked about are
present in the learning environment.
5. The performance of a range of speech acts: The learner should be
encouraged to use a range of language functions and to perform a variety
of roles within the classroom discourse (for example, initiating as well as
responding to discourse).
6. An input rich in directives: Particularly in the early stages of learning,
directives occur in familiar and frequently occurring contexts, they refer to
the 'here and now,' they are morphosyntactically simple, and, as they
require a nonverbal result, they are more likely to count as successful
communication than interactions requiring a verbal response.
ACTION
Using these eight categories from Ellis (1988), analyze the transcript
above, in which the students ask about the possible "quiss."
1. If you are going to take notes, always carry paper and pens or pencils and
something firm to write on, like a clipboard. (p. 4)
2. If you are going to tape record, make sure you have access to a good tape
recorder and that you know how to operate it correctly. (p. 4)
3. Always investigate the classroom where you will be observing before the
actual observation begins. (p. 4)
4. If you are observing regularly scheduled classes, always leave room in your
plan to reschedule an observation as needed, in case a class is cancelled, or
the teacher gets sick, or you miss your bus. (p. 5)
5. Always plan free time immediately after an observation so you can write
your field notes if you aren't able to record them during the actual
observation for any reason. (p. 22)
6. Always arrange a big enough subject pool that you can re-sample from
among the possible subjects if your presence seems to affect someone's
behavior noticeably, and always allow time for your subjects to become
comfortable with your presence before you try to collect data on their behavior.
(p. 22)
7. Always carry extra batteries (if you are using a battery-powered recording
device). (p. 22)
8. Never allow yourself to be entrapped in an unwanted discourse act with a
subject during an observation. (p. 22)
9. Always use, or consider using, multiple data collection procedures. (p. 22)
10. Always do a pilot study. (p. 22)
Some of the points made above may seem obvious, but we know of many
projects, including our own and those of our graduate students, that have been
negatively affected by breakdowns caused by very simple problems that had huge
consequences.
Our advice, based on years of both successes and frustrations, is that if
possible, you should collect data that are detailed and in-depth rather than relying
entirely on data coded live in "real time," that is, during the observations. Data that
are audio- or videotaped can later be coded or transcribed, as needed. But data
that are simply coded cannot be reconverted to direct quotes or descriptions.
We also strongly encourage you to carry out a pilot study to try out and
refine your observational procedures before you attempt to collect the data you
wish to analyze. Doing a brief pilot study can reveal problems in coding
categories, thoroughly familiarize observers with data collection procedures, and
acquaint observers with the local conditions if they are not insiders to the school or
program.
Observation can be used with other data collection procedures to provide
methods triangulation. You will recall from Chapter 7 that methods triangulation
A SAMPLE STUDY
As the sample study in this chapter, we will discuss just a brief segment from an
interesting article by Lynch and Maclean (2000), two teachers in a course entitled
English for Medical Conferences, which "caters for health professionals who
want to improve their ability to present papers in English at international
meetings and conferences" (p. 226). These authors wanted to study the outcomes of
building repetition into a task in which the learners explained a poster based on a
research article to people who visited the poster exhibit (their classmates).
By repetition Lynch and Maclean do not mean the sorts of pattern drills
where students repeat after a teacher. Rather they are referring to what they call
recycling or retrials, "where the basic communication goal remains the same, but
with variations of content and emphasis depending on the visitor's questions"
(p. 227). The procedure that was used, both for teaching and for data collection,
was this:
1. Participants are paired up and each pair is given a different research article.
They have one hour to make a poster based on that article.
2. The posters are displayed around a large room. From each pair, one
participant (A)—the 'host'—stands beside their poster, waiting to receive
'visitors' asking questions. The B participants visit the posters, one by one,
clockwise. Their task is to ask questions about each poster. The host is
instructed not to present, but to respond to questions. They are allowed only
limited time (approximately 3 minutes) at each poster.
3. When the B participants arrive back at base, they stay by their poster and
the A participants go visiting.
4. Once the second round is completed, there is plenary discussion of the
merits of the posters (by the participants) and the teachers provide
feedback on general language points. (p. 227)
So the "poster carousel," as these authors call this teaching-learning activity,
provides built-in opportunities for reiterating and rephrasing core content
repeatedly in a brief period of time to a series of interlocutors.
We recorded all six interactions between each host and visitor by
placing an audiocassette recorder near each of the seven posters. This sort
of recording is a routine part of the course and so the participants were
used to being recorded by the time they did the poster carousel. All 14
sets of six interactions were transcribed. (p. 229)
I wanted to use phrases I have learned during the course and I worked
at it. I tried to find out if different explanations were accepted [by the
visitors]. I felt I was quite relaxed all the time. I got to know the
vocabulary better during the time. (p. 231)
The article provides further data from the two learners' self-report statements as
well as numerous illustrations from their transcribed conversations with the
visitors to their posters. These data provide a good example of methods
triangulation. What we find particularly compelling about this study is how the two
teachers conducted practical research about a teaching activity and incorporated
There are certainly some pitfalls to be avoided when you collect data through
observational processes, and several of these have been alluded to above. To
recap the main points, having an observer in the classroom, perhaps especially
one with a video camera, can be disruptive. Students and teachers may act in
ways that are not typical of their usual classroom behavior, an example of the
Observer's Paradox (Labov, 1972). To counteract this problem, it is important
that observers spend enough time in a site so that the participants in that context
become familiar with the visitors and accept their presence in the classroom. It
also helps if the students are familiar with the data collection devices, as they
were in the study by Lynch and Maclean (2000).
Another major pitfall is the worry that what is observed can be influenced
very strongly by the observer's own experiences and preconceptions. If a
database consists solely of observational field notes or real-time coding done by a
single observer, it is difficult to demonstrate the reliability of the coding or the
validity of the observations. For these reasons, we recommend methods
triangulation, particularly approaches combining observational field notes with electronic
records. Video- or audio-recordings permit the researcher to transcribe
interactions, and the resulting transcripts provide opportunities for more precise
analyses than does real-time coding.
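One practical benefit of recorded data is that a second person can code the same transcript independently, so that agreement between coders can be checked. The chapter does not prescribe a particular statistic; the sketch below uses Cohen's kappa, a widely used agreement index that we have chosen for illustration. The function name, the category labels, and the two short code sequences are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for chance agreement."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of turns.")
    n = len(codes_a)
    # Proportion of turns on which the two coders chose the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two observers to the same four turns.
coder_1 = ["eliciting", "eliciting", "responding", "responding"]
coder_2 = ["eliciting", "responding", "responding", "responding"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

In this invented example the coders agree on three of four turns (0.75), chance agreement is 0.5, and kappa is therefore 0.5. A single observer coding in real time offers no second set of codes against which such a check could be run.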
In addition, having electronically recorded data, such as audiotapes,
video-recordings, or chat transcripts, is very useful in studies employing stimulated
recall to get the participants' interpretations of events. Having been an observer
in the lessons where such data were collected can give the researcher a
firsthand vantage point for asking key questions about the interactions that
occurred.
Transcription itself, while valuable and informative, can be a daunting
process. Unless you have good quality recordings, transcribing language
learners' speech can be terribly time-consuming and frustrating. This problem is
substantially reduced in studies involving typed chats or online forum
discussions, in which a transcript of the interaction is automatically produced by the
program.
In spite of these problems, the benefits of using an observation component
in classroom research are too numerous to overlook. Without observational
data, we are limited almost entirely to product studies, and even if their
outcomes are interesting and provocative, we cannot say with much confidence
what elements of classroom interaction and instruction led to any significant
gains or differences that may emerge. Some form of observation is essential in
any process study or process-product study.
10
Introspective Methods
of Data Collection
Defining Introspective Data Collection
As a research procedure, introspection is the process of observing and reporting on
one's own thoughts, feelings, motives, reasoning processes, and mental states,
often with a view to determining the ways in which these processes and states
shape behavior. The tradition has been imported into applied linguistics from
cognitive psychology, where it has aroused considerable controversy (Ericcson
and Simon, 1987). Particularly contentious is the assumption that the verbal
reports resulting from introspection accurately reflect the underlying cognitive
processes giving rise to behavior. Critics of the approach argue that there might
be a discontinuity between what the subjects believed they were doing and could
articulate, and what they were actually doing.
In this chapter, we use the term introspection to cover techniques in which
data collection happens at the same time as or very shortly after the events being
investigated. We will also use it as a general rubric to cover research contexts in
which the data are collected retrospectively, that is, some time after the events
themselves have taken place. One challenge with this approach is the fact that
the length of time elapsing between the mental event and the reporting of that
event may distort what is actually reported. (In a sense, all the techniques
reported here are retrospective because there will always be a gap, however
fleeting, between the event and the report.)
Cohen and Hosenfeld (1981) distinguished three types of introspective data
collection. The types are defined by the timing of the introspecting relative to
the timing of the event being investigated. First, concurrent introspection (during
the event) represents a particular point in time. Concurrent introspection occurs
simultaneously with the event being investigated. In contrast, immediate
retrospection occurs right after the event, and delayed retrospection occurs hours or more
following the event. Thus, immediate retrospection and delayed retrospection
represent spans of time instead of particular moments. The more general cover
term, introspection, entails all three zones, as depicted in Figure 10.1 (adapted
from Bailey, 1991, p. 64).
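The three-way distinction can be expressed as a simple rule based on the delay between the event and the report. The sketch below is ours, not Cohen and Hosenfeld's: the text says only that immediate retrospection occurs "right after" the event and delayed retrospection "hours or more" later, so the one-hour cutoff (IMMEDIATE_LIMIT) is an assumption chosen purely for illustration, as is the function name classify_report.

```python
from datetime import timedelta

# Assumed boundary between "right after" and "hours or more";
# the source does not specify an exact cutoff.
IMMEDIATE_LIMIT = timedelta(hours=1)


def classify_report(delay):
    """Label a verbal report by the time elapsed since the event."""
    if delay < timedelta(0):
        raise ValueError("A report cannot precede the event it describes.")
    if delay == timedelta(0):
        return "concurrent introspection"
    if delay < IMMEDIATE_LIMIT:
        return "immediate retrospection"
    return "delayed retrospection"


print(classify_report(timedelta(0)))
print(classify_report(timedelta(minutes=5)))
print(classify_report(timedelta(days=2)))
```

The rule mirrors the point made above: concurrent introspection is a single point in time, while the two kinds of retrospection cover spans of time.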
ACTION
What are the internal mental differences you experience as you try each of
these tasks?
Think-Aloud Protocols
In think-aloud techniques, subjects complete a task or solve a problem and
verbalize their thought processes as they do so. The researcher records the
verbalization and then analyzes the thought processes the subjects report. In
this procedure, the gap between the mental process and the reporting is smaller
than with other techniques, such as stimulated recall and diaries. However, we
may still question whether the verbalization accurately reflects the mental
1. The first level is simply reporting, that is, "the vocalization of covert
articulatory or oral encodings, as required in the tasks" [being done]. "At this
level, there are no intermediate processes and the subject needs expend no
special effort to communicate his thoughts" (p. 79).
2. The second level involves some description and "explication of the thought
content" (ibid.).
3. The third level "requires the subject to explain his thought processes or
thoughts" (ibid.).
Ericcson and Simon caution that at the third level "an explanation of thoughts,
ideas, or hypotheses or their motives is not simply a recoding of information
already present in short-term memory, but requires linking this information to
earlier thoughts and information attended to previously" (ibid.). They stress that
verbalization at the second level "does not encompass such additional
interpretative processes" (ibid.).
It is important to be aware of the sorts of mental processing we are asking
participants in a research project to do for two reasons. First, we want to be sure
to obtain the type of data that will address our research questions or hypotheses.
Secondly, we don't want to impose additional mental tasks on the participants
that would unduly influence the actual mental processes or emotional states that
we are investigating.
"In this experiment we are interested in what you think about when
you find answers to some questions that I am going to ask you to
answer. In order to do this I am going to ask you to THINK ALOUD
as you work on the problem given. What I mean by think aloud is that
I want you to tell me EVERYTHING you are thinking from the time
you first see the question until you give an answer. I would like you to
talk aloud CONSTANTLY from the time I present each problem until
you have given your final answer to the questions. I don't want you to
try to plan out what you say or try to explain to me what you are
saying. Just act as if you are alone in the room speaking to yourself. It is
most important that you keep talking. If you are silent for any long
period of time, I will ask you to talk. Do you understand what I want
you to do?" (p. 378)
How would you change this text to make it more easily understood by
someone for whom English is a second language? Think about an intermediate
ESL/EFL learner as the person who would be getting the instructions. You
can either translate some version of this text into the learner's mother tongue
or modify the English instructions above.
The aim of this study was to investigate teachers' interactive decision making,
that is, the decisions they made 'online' as they taught. (In the context of
research on teacher cognition, online means during a lesson, that is, a decision in
real time.) As it was clearly not feasible for the researcher to interrupt the teachers in
the middle of their lessons, he recorded the lessons and replayed the recordings
for the teachers immediately afterwards.
REFLECTION
Given what you know so far, think of a research question that you'd like to
address that could incorporate the stimulated recall procedure. What
sort(s) of data would you use to prompt your participants' memories of the
events being investigated?
REFLECTION
What would you have asked the teacher about what was going on in this
piece of interaction if you were using this transcript to stimulate the
teacher's recall?
Diary Studies
Since the late 1970s, entries recorded in teachers' and learners' journals, or
diaries, have been used as data in studies of second language acquisition and
teaching. Language learning diaries have been kept in both formal and informal
learning contexts, in foreign language and second language situations. Teaching
journals have been kept by both novice and experienced teachers.
What is a diary study and how does it differ from a diary or journal? (We will
use these two terms interchangeably, as they have been used in the existing
literature for the past three decades.) According to Bailey and Ochsner (1983),
A diary study in second language learning, acquisition, or teaching is an
account of a second language experience as recorded in a first-person
journal. The diarist may be a language teacher or a language learner—
but the central characteristic of the diary studies is that they are
introspective: The diarist studies his own teaching or learning. Thus he
can report on affective factors, language learning strategies, and his
own perceptions—facets of the language learning experience which are
normally hidden or largely inaccessible to an external observer. (p. 189)
The learner's or teacher's experiences are "documented through regular, candid
entries in a personal journal and then analyzed for recurring patterns or salient
events" (Bailey, 1990, p. 215).
You will recall from Chapter 1 that Grotjahn (1987) characterized research
paradigms in terms of (1) research design (nonexperimental, pre-experimental,
quasi-experimental, and experimental designs); (2) the type of data collected
(qualitative or quantitative); and (3) the type of analysis conducted (interpretive
or statistical, i.e., qualitative or quantitative). In Grotjahn's terms, diary studies
are typically pre-experimental or nonexperimental. They are based primarily on
qualitative data (the written or tape-recorded diary entries), and they are usually
analyzed interpretively (though some have been analyzed quantitatively as well).
Diaries can be kept by teachers or by learners. However, undertaking a diary
study requires discipline and application because if the entries are not made
REFLECTION
ACTION
REFLECTION
After you have tried making entries in a learning or teaching diary,
consider the following issues:
1. Should a diarist read other language learning or teaching diary studies
while keeping a diary, or does this lead to "contamination" of the
reported experience?
2. Should a diarist read about and comment on language learning theories
while keeping a diary, or does this mold the diarist's recollections and
insights to fit the theories?
3. Should a diarist try to take notes during the actual language learning or
teaching experience (e.g., during an ongoing language lesson), so the
impressions are more concurrent with the event? Or would that process be
so distracting that it would interfere with language learning or teaching?
4. To what extent does the process of keeping a diary (for instance, of
examining one's own language learning experience) influence the experience?
Compare your responses to those of your classmates or colleagues.
Author(s) and Date                              Language (to Be) Taught   Pre-/In-service Teacher(s)  Analysis
Appel (1995)                                    English                   In-service                  Diarist
Bailey (1990)                                   Various Languages         Pre-service                 Others
Bailey (2001b)                                  English                   In-service                  Diarist
Block (1996)                                    English                   In-service                  Other
Brinton & Holten (1989)                         English                   Pre-service
Brock, Yu, & Wong (1992)                        English                   In-service                  Diarists
Cole, Raffier, Rogan, & Schleicher (1998)       English                   Pre-service                 Diarists
Delaney & Bailey (2000)                         English                   Pre-service                 Diarist & Other
Grandcolas & Soule-Susbielles (1986)            French                    Pre-service                 Diarists
Ho & Richards (1993)                            English                   In-service                  Others
Jarvis (1992)                                   English                   In-service                  Other
Lee & Lew (2001)                                English                   Pre-service                 Others
Matsuda & Matsuda (2001)                        English                   In-service                  Diarists
McDonough (1994)                                English                   In-service                  Diarist & Other
Numrich (1996)                                  English                   Pre-service                 Other
Palmer, C.H. (1992)                             English                   In-service                  Other
Palmer, G.M. (1992)                             English                   In-service                  Other
Polio & Wilson-Duffy (1998)                     English                   Pre-service                 Others
Pennington & Richards (1997)                    English                   In-service                  Others
Porter, Goldstein, Leatherman, & Conrad (1990)  Various Languages         Pre-service & In-service    Diarists & Others
Ruso (2007)                                     English                   In-service                  Diarist
Santana-Williamson (2001)                       English                   Pre-service & In-service    Other
Tsang (2003)                                    English                   Pre-service                 Other
Verity (2000)                                   English                   In-service                  Diarist
Winer (1992)                                    English                   Pre-service                 Other
Yahya (2000)                                    English                   In-service                  Diarist
REFLECTION
Write three insights or observations that you get from Choi's account
(e.g., learning a language can have a powerful impact on personal identity). If
possible, share these insights with other people. Did you and your
colleagues find similar or different issues in her comments?
One difficulty with having language learners report on their learning is that not
all learning processes may be available for introspection. Some processes may
happen outside awareness. There are also conscious learning processes of which
we are aware, and those are available for introspection. Some subset of those
learning processes will be written about by the second language learner, as
shown in Figure 10.2 below:
REFLECTION
A SAMPLE STUDY
The sample study we have selected for this chapter is based on an EFL teacher's
journal that was kept for two academic semesters (Bailey, 2001b). The author
had many years of experience teaching ESL and working as a teacher educator in
the United States, but then she taught at a university in Hong Kong for a year.
(See Verity, 2000, for an account of a similar situation in Japan.) Upon returning
to her regular graduate teacher training position a year later, Bailey noticed that
her teaching (e.g., in statistics courses, language assessment seminars, and so on)
had changed. She then decided to read through her Hong Kong diary entries to
see if she could determine how those changes had come about.
The database for this report consisted of fifty-two single-spaced pages for
fall semester and fifty-eight single-spaced pages for spring semester. These
entries were word processed and saved electronically. Paper printouts of the diary
entries were filed along with copies of the class handouts and lesson plans. The
data were qualitatively analyzed and in the process, a number of teaching
strategies related to scaffolding emerged. Scaffolding was defined by Bruner
(1983, p. 60) as "a process of 'setting up' the situation to make the child's entry
easy and successful, and then gradually pulling back and handing the role to the
child as he becomes skilled enough to manage it." (Here the learners were young
adults rather than children, but the concept is still useful.) The following
scaffolding procedures were identified in the teaching diary (Bailey, 2001b):
1. the teacher's use of multiple channels for presenting information or
instructions to the learners;
2. having the students "feed me back the task" (i.e., paraphrasing the
instructions to check their understanding of the assignment);
3. having the students compare their ideas with a classmate privately before
giving a public response to a solicit;
4. the teacher building in a recognition step (so the learners could identify the
linguistic item of focus in the input prior to having to produce it
themselves); and
5. the teacher using schema activators to prepare the learners for reading and
listening tasks. (pp. 15-25)
CONCLUSION
Introspective data collection procedures can be characterized in terms of when
the data are generated relative to the event being investigated: concurrently
with the event, immediately afterwards, some time afterwards, or long
afterwards. In addition, the data collection can be very brief, as with think-aloud
protocols and some stimulated recall procedures, or much longer as in the diary
studies. Some autobiographical and biographical research spans years of data
collection, or recollection.
While there are potential problems with these data collection procedures,
introspective procedures help us identify and understand issues that are not
readily accessible through other means. When used with other data collection
procedures over time in studies that employ careful triangulation, they can
provide helpful insights in research on language teaching and learning.
December 8
I've been living in Hong Kong now, and speak very little Cantonese.
This is something I'm embarrassed and ashamed of, being a language
teacher myself. Not that it's unusual. I have colleagues who have lived
here twice as long as I have who speak even less Cantonese than I do.
So, what are the reasons? Firstly, it's a very difficult language. The
commitment of time to make a decent fist of learning the language is
enormous. Like most expats in Hong Kong, I lead a very busy life. In fact, I
spend approximately half of my life traveling out of Hong Kong.
Outside Hong Kong, the language is of little utility. If I'm going to learn a
Chinese language it might as well be Putonghua.
Secondly, despite constant laments about the poor standard of
English, most of the local Chinese that I interact with have a reasonable
level of English. I would have to study Cantonese for many years for
my Cantonese to be comparable to or superior to their English. From
a practical, communicative perspective, there is therefore no need to
December 9
I've set myself the goal of learning 1,000 words and phrases by June—
only around five a day, but I'm having trouble remembering even that.
The CD that L. got me is much better than tapes I have, but there's not
enough repetition, and the phrases are presented out of context.
'Maai' rhymes with 'buy' so that should be easy enough to remember.
I'm confused by the particles 'ma' and 'ah'. As far as I can figure out,
'ma' is a question particle, and 'ah' functions more like a question tag in
English. So 'Mohng ma?' = 'Are you busy?' and 'Mohng ah?' = 'You're
busy, are you?' The 'ah' form seems much more prevalent than the 'ma'
form.
I tried creating little dialogues from the phrases I was trying to learn in
order to give them some context, but didn't get very far. I don't even
know how to say 'yes'. From what I know of other Asian languages, I
guess there won't be a single word as there is in English.
Really frustrating! I have no idea how to give affirmative responses. The
book I'm working with teaches 'Gei hou ma?' (How are you?) but not
how to respond. To respond in the affirmative, I did what I'd do if
speaking Thai—repeat the phrase. I have no idea if it's right or wrong.
In response to 'mohng ma?' 'Are you busy?' I made up 'Haila' because
it's what I hear people in the office saying all the time. I'm sure it's
wrong, but I have no other resources to use.
I checked with L. who suggested 'Gei hou' (quite good) or simply 'hou'
and 'hou mohng ah!' (very busy) or simply 'hou mohng' for answers.
At lunch, I was really pleased when I called for the check "Maaihdaan,
mgoi" and the woman understood instantly.
December 15
Opportunities to get out and actually practice the language are close to
zero. My Chinese friends and acquaintances are all totally bilingual,
and even most of the cab drivers on Hong Kong side are much better at
English than I'll ever be at Cantonese. This is MUCH more like learning
as a foreign than a second language. It is very demotivating.
December 28
Conversation practice with L. (40 minutes). We started practicing more
freely today, and I tried to use whatever resources I had to communicate.
It was hard work, but fun, and motivating when L. understood
what I was trying to say. It's so motivating to have a sympathetic native
speaker to reassure me that I AM making progress.
At one point I wanted to ask L. if she liked tea, so I said, "Nei jungyi
yum cha." She corrected me to the like/not like form "Nei jung-mh-jungyi
yum cha a?" I asked her why she didn't use the full form of the
verb to like 'jungyi', but just part of it 'jung' in the first part of the
negative/positive, i.e., why she didn't say "Nei jungyi-mh-jungyi yum
cha a?" She looked mystified for a minute, not understanding what I was
trying to say, then laughed, because she'd never noticed that the full
form of the verb isn't used with this question form.
January 7
It's time to push ahead. I am constantly tempted to 'consolidate' rather
than to work on new language.
This morning, L. asked me, "Nei yau mou tung nei go leiu gong dinwa
a?" I knew instantly what she had asked, "Did you talk to your daughter
(who just went back to England) on the phone?" I did this by recognizing
the phrase "leiu gong dinwa" ("daughter speak phone"),
although I wasn't able to respond appropriately. Eventually I came up
with 'yau' (yes). L. then asked "Geisi a?" Using the context, I guessed
that this must mean 'when'—also drew on the fact that 'Geido' means
'how much/how many' and made the assumption that 'Gei' combines
with other particles to form 'wh-' questions. L. confirmed this: 'Si'
means 'hour'.
SUGGESTED READINGS
Ericsson and Simon (1993) provide a thorough overview of the use of verbal
reports and protocol analysis in psychology research. If you plan to work with
think-aloud protocols, this book may be useful. For a chapter on the use of the
think-aloud procedure in second language research, see Jourdenais (2001).
Færch and Kasper (1987) edited a collection of articles about introspection
in second language research. This volume is the source of Grotjahn's (1987)
category system, which we have cited throughout this book, as well as Ericsson
and Simon (1987).
Gass and Mackey's (2000) book is an excellent source of information about
stimulated recall in second language acquisition research.
Any of the diary studies listed in Tables 10.1 and 10.2 above will give you
samples of how this introspective method has been used. Many are illustrations
of classroom research, while others document language acquisition in naturalistic
contexts.
For interesting critiques of the diary studies, see Fry (1988), Matsumoto
(1987), and Seliger (1983b).
11
Elicitation Procedures
REFLECTION
INTERVIEWS
Structured Interviews
The structured interview is like a questionnaire that is administered orally rather
than in writing. The researcher normally works with one person at a time, asking
him or her questions and recording the person's answers. The interview follows
a pre-set list of questions, and the researcher is careful to elicit answers to
the same questions from all of the respondents. Conducting a structured interview
demands training and discipline on the part of the researcher to stick
closely to the predetermined agenda. The advantage of using structured interviews
is that they provide detailed data that are comparable across informants.
Semi-Structured Interviews
In a semi-structured interview, the researcher will have a general idea of how he
or she wants the interview to unfold and may even have a set of prepared questions.
However, he or she will use these questions as a point of departure for the
interview and will not be constrained by them. As the interview unfolds, topics
and issues rather than pre-set questions will determine the direction that the
interview takes. The main difference between a semi-structured interview and an
Unstructured Interviews
An unstructured interview will develop according to the agenda of the interviewee
rather than the agenda of the interviewer. While there will be a general
theme underpinning the interview, it can take off in unexpected directions,
which the interviewer will follow, picking up on issues and themes suggested by
the interviewee. For example, the interviewer might begin by asking a teacher
about how he or she modifies and adapts course books and other commercial
materials into his or her lessons but then segue into the role of technology in language
teaching.
Ethnographic Interviews
As we saw in Chapter 7, ethnography seeks to document both the emic (insider's)
and the etic (outsider's) point of view. Ethnographers doing field research often
use interviews to discover and develop the emic perspective. According to
Spradley (1979), these interviews are like "a series of friendly conversations"
between the researcher and the members of a culture (p. 58). Ethnographic
interviews occur in the natural course of the longitudinal, ongoing relationships
the ethnographer builds with the cohort in the study. They are characterized by
(1) "a specific request to hold the interview (resulting from the research question)"
(Flick, 1998, p. 93); (2) ethnographic explanations given in everyday
language, in which the ethnographer tells the informant explicitly what he is
seeking and why; and (3) specific question types that elicit information about
how the participants construct meaning and organize their society.
REFLECTION
Can you think of any disadvantages of using focus group interviews when
second language learners are the interviewees?
The choice to interview (and of the type of interview you might use) is
directly related to your research question(s). Clearly, if you wish to investigate
the perspectives of participants in a course or members of a culture, interviews
are one way to gather in-depth data. The types of interviews described above are
summarized in Table 11.1.
REFLECTION
Based on your previous reading and experience, as well as what you have
read so far in this book, what do you see as the advantages of interviews?
Do you have a preference for any of the types of interviews described
above? If so, why?
REFLECTION
ACTION
QUESTIONNAIRES
[Figure 11.2 diagram not reproduced. Recoverable labels: More Experienced Teachers (N = 20): N = 12, N = 8; Less Experienced Teachers (N = 20): N = 7, N = 13; N = 2, N = 2.]
FIGURE 11.2 The "two-phase" or "raised" design
REFLECTION
REFLECTION
Role Plays
Some researchers have used role play scenarios to elicit language learners' speech
samples and ideas. For instance, as early as 1980, Fraser, Rintell, and Walters
used two role plays about awkward situations. In one, they asked respondents to
imagine themselves at a parking meter with no change, having to borrow some
coins from an older stranger as the meter maid draws closer and closer. In the
second situation, they asked the respondents to put themselves in a situation
where they were late to a lunch appointment with an older business acquaintance
whom they do not know well. Here are the instructions that they gave:
ACTION
Write instructions for a role play to elicit data in a study that would inter
est you, but imagine that your respondents are intermediate learners or
false beginners of the target language. Make sure the instructions are both
clear and positive in tone.
Role plays have been used in language assessment as well as in data collection,
but some researchers have voiced concerns about whether personality or acting
ability may influence the outcomes (see, e.g., van Lier, 1989). Others have noted
that students' abilities to play a role may be related to their experience. Bailey
(1998b) gives an example of a native speaker of English doing two different role
plays—one in English and one in Spanish, her second language. The speaker
reported that the English role play was much more difficult because she couldn't
imagine the situation, while the Spanish role play was fun and plausible.
REFLECTION
Can you see any problems with this kind of picture elicitation device?
REFLECTION
Which is the trigger and which the modified utterance in the following
piece of interaction (Martyn, 2001, p. 33)?
A: She's a loner.
B: Sorry?
A: She stay away from others.
REFLECTION
Why do you think that Martyn chose to carry out her investigations in
intact classrooms?
Martyn used five production tasks: (1) jigsaw, (2) information exchange,
(3) problem solving, (4) decision making, and (5) opinion exchange. From her
literature review, she also isolated the following four cognitive demand features
of tasks:
Martyn then mapped these cognitive demand features onto the five production
task types. This procedure resulted in the following matrix (see Table 11.2),
which she used to investigate interactional modifications. Martyn found that
tasks with the highest cognitive demand, such as the opinion exchange task,
generated the most interactional modifications, while jigsaw tasks, with relatively
low cognitive demand, generated the fewest modifications.
Martyn's study is valuable, not only because she worked in actual classrooms,
but also because the kinds of tasks she chose to investigate are those which
language teachers often use.
A SAMPLE STUDY
For the sample study in this chapter, we decided to focus on some research by
Snow, Hyland, Kamhi-Stein, and Yu (1996). They investigated the ideas of language
minority junior high school students in Los Angeles, using oral interviews
in both Spanish and English. Their research questions were the following:
Thus, the forced-choice activity provided comparable data across all the students
while the original ideas they contributed were open-ended and creative. The
ranking of both the selected and the constructed "ingredients" provided the
researchers with information they could not have gotten from either the provided
categories or the students' own ideas alone.
The second part of the interview was conducted in Spanish. This was a
problem-posing task involving a role play in which students were asked to explain
how they would orient a new student in their school. The students were
supposed to say how they would tell the new arrival what he or she would need
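En tu clase hay un alumno nuevo que habla muy poco inglés. Tú eres el
consejero de ese alumno y debes ayudarlo. ¿Qué es lo que ese alumno tiene
que hacer para convertirse en un buen alumno? ¿Cómo tiene que estudiar
para un examen? Recuerda que tú debes aconsejar a tu nuevo compañero.
¡Su éxito depende de ti!
[In your class there is a new student who speaks very little English. You are
that student's adviser and must help him. What does that student have to do
to become a good student? How does he have to study for an exam? Remember
that you must advise your new classmate. His success depends on you!]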
Snow et al. (1996) provide the following commentary about this data elicitation
procedure:
The students, in general, responded readily to the task. They quickly assumed
the role of the experienced student, offering advice to the newcomer.
In some cases, the interviewers had to repeat parts of the
question and prompt the students to respond to all parts of the situation.
Five of the sixty-six students said that they could not perform the task in
Spanish. They responded in English; however, their responses were not
included in the analysis, since one objective of the task was to see if the
students could communicate their meta-notions of student role efficacy
in Spanish. (p. 308)
In addition to the qualitative analysis, the students' preferences in the card
sort activity were analyzed quantitatively using a statistic called the chi-square
analysis (see Chapter 13). Statistically significant preferences for the students'
views of the ideal class emerged for the following descriptors in the card sort
activity:
A class where the teacher uses cooperative learning.
A class where I write journals in English or Spanish.
A class where I participate a lot.
A class where I am expected to take notes.
A class where I help my classmates edit what they write before they write
the final version of the assignment.
A class where I learn from my classmates. (pp. 308-309)
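To make the logic of such a test concrete, here is a minimal sketch in Python of a chi-square calculation for a single forced-choice pair. The counts are invented for illustration and are not Snow et al.'s data. We also assume that an even split between the two statements is the expected pattern, which may not be how the original analysis was set up, and the function names are our own.

```python
import math


def chi_square_goodness_of_fit(observed):
    """Chi-square statistic against equal expected frequencies (an assumption)."""
    total = sum(observed)
    expected = total / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: 45 of 66 students choose the first statement of a pair.
observed = [45, 21]
chi2 = chi_square_goodness_of_fit(observed)
p = p_value_df1(chi2)

print(f"chi-square = {chi2:.2f}, df = 1, p = {p:.4f}")
# 3.841 is the critical value for df = 1 at the .05 level.
print("significant at .05" if chi2 > 3.841 else "not significant at .05")
```

With two response options there is one degree of freedom, so the statistic is compared with the critical value of 3.841 at the .05 level. A pair with a near-even split would yield a statistic close to zero and would not reach significance.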
There were no statistically significant differences in the students' choices for the
paired statements "A class where I can speak English if I want to," and "A class
Some students indicated that they could learn English better if they had
to use it. Others said that it would be rude or unfair to speak Spanish
since some of their classmates did not understand Spanish. Some also
said that it would be impolite to speak Spanish in a class where the
teacher did not know what they were saying. One student indicated that
he would speak Spanish only if he wanted to say something he did not
want the teacher to hear. (p. 309)
The students' views about what constitutes an ideal class were also diverse.
The greatest number of student-generated responses had to do with the
teacher's role. Students offered the following ideas about their ideal class:
A class where the teacher could help you more often: the teacher can give
you more quality time and work with you until you get it.
A class where the teacher can be your friend.
A class where the teacher is nice.
A class where teachers have positive attitudes.
A class where the teacher is patient.
A class where the teacher cares, where the teacher is on your back whenever
you fool around.
A class where the teachers are more open and more fun.
A class where the teacher makes the class fun.
A class where the teacher helps students when they need help.
A class where I can get extra help on something I may find difficult.
A class where the shy people are encouraged to participate.
A class where everyone is encouraged to try.
A class where teachers explain the assignments well.
A class where the teacher is clear about assignments and deadlines.
A class where the teacher doesn't give too much work but gives it correctly
with enough information.
A class where the teachers don't have to refer to the textbook so much.
A disciplined class where the teacher teaches well and understands.
A class where things work well; teachers help students improve their
education.
A class where the teacher assigns a lot of work.
A class where teachers show you tests ahead of time to help you get a good
grade.
A class where the teacher asks questions before a test so that everybody is
forced to study.
In their conclusion, these authors comment about their data elicitation procedures.
They say the interviews reveal the students' "insights into effective instruction
and their perceptions of strategies for successful academic behavior" (p. 316).
The problem-posing role play task about helping the new student showed the
researchers that the students were "developing metacognitive awareness of
appropriate learning strategies ... which contribute to academic success" (ibid.).
REFLECTION
The data collection techniques we have grouped together under the rubric
of 'elicitation' have some obvious advantages. Because the techniques are so diverse,
they can result in data that are incredibly rich, as Dowsett (1986), among
others, has pointed out. Most can also be used in combination. For example,
Benson and Nunan (2005) used both questionnaires and interviews in their
investigations into language learning histories. This mixing and matching helps in
methods triangulation. (See Chapter 7 for a detailed discussion of triangulation.)
Another advantage of eliciting data, and one that has already been touched
on, is that elicitation can be a great time-saver, providing the researcher with
large amounts of data in a much shorter time than would be required to collect
such data through naturalistic observation. In fact, desired data may never be
forthcoming if we simply sit and wait for it.
A third advantage of elicitation (which we will see when we discuss pitfalls
below) is that elicitation enables the researcher to collect data that could simply
not be obtained in any other way. For instance, the classroom researcher who
wants to obtain insights into why the teacher made certain spontaneous decisions
to depart from his or her lesson plan while the class was in progress could
make certain inferences by sitting in on the teacher's class, or by viewing a video,
but will never really know for sure without interviewing the teacher.
Brainstorm a list of the possible pitfalls that might beset research using one
or more of the elicitation devices discussed in this chapter.
There are, of course, also pitfalls involved in using the various elicitation devices
described in this chapter. Use of elicitation devices rather than naturalistic
observation has been criticized on a number of grounds. In the first place, the
researcher determines in advance what is to be investigated. There are at least two
possible threats to the validity of such investigations:
The first is that by determining in advance what is going to be
considered relevant, other potentially relevant phenomena might be
overlooked. The other danger, and one which needs to be considered
when evaluating research utilizing such [elicitation] devices, is the extent
to which the results obtained are an artifact of the elicitation devices
employed (see, e.g., Nunan [1987] for a discussion on the dangers
of deriving implications for second language acquisition from standardized
test data). One needs to be particularly cautious in making claims
about acquisition orders based on elicited data, as Ellis (1985) has
pointed out. [In at least one study] it seems clear that the so-called order
of acquisition is the creation of the elicitation device and the statistical
procedures used to analyze the data. (Nunan, 1992, pp. 138-139)
Regardless of these kinds of problems, however, elicitation procedures provide
effective ways of gathering data that might otherwise be unobtainable.
CONCLUSION
1. Think of a research question that interests you in which the use of discourse
completion tasks would be an appropriate form of data collection.
Write three to five discourse completion tasks designed to elicit the target
In this final section, we do two things. First, we revisit and extend into
the realm of data analysis several discussions that were initiated earlier in
the book. Secondly, we draw together the themes that have emerged in the
course of the book thus far. Because this volume contextualized the research
process in terms of the classroom, the initial chapter in this section takes a
somewhat detailed look at the analysis of classroom interaction. The chapters
that follow deal, respectively, with methods for quantitative and qualitative data
analysis. The section ends with the final chapter in the book, which pulls together
themes and issues and revisits practical suggestions for getting started on
designing and conducting your own studies as well as ways to publish them.
Chapter 13: Quantitative Data Analysis
By the end of this chapter, readers will
• be able to explain the measures of central tendency and measures of
dispersion;
• understand the concept of degrees of freedom;
• understand the concept of statistical significance;
• know how to calculate and interpret the chi-square test;
• know how to interpret correlation coefficients, t-tests, and analysis of
variance;
• identify several concerns about statistical analyses.
12
In another study that involved FLint for real-time coding, trained observers
watched eleven foreign language teachers who had been identified as outstanding
by their former students, as well as eleven typical foreign language teachers.
The observers did not know that some teachers had been identified as outstanding
and others as typical. Each teacher was observed teaching four different
lessons: (1) grammar, (2) reading skills, (3) some sort of new material, and (4) a
lesson based entirely on the teacher's choice.
When Moskowitz (1976) compared the results, she found eighty-five statistically
significant differences in the coded behavior of the outstanding and typical
teachers. Several contrasts were observed in three out of the four lessons coded.
These included the fact that the outstanding teachers and their students used more
of the target language than did the typical teachers and their students. There was
also less off-task talk in the lessons taught by the outstanding foreign language
teachers. There were more personalized questions as well as more praise and
joking in the outstanding teachers' classes. In addition, the coding of nonverbal
behaviors showed that the teachers identified as outstanding walked around more
and looked at most of their students more often than did the typical teachers.
These findings were intriguing and the research methods Moskowitz used
were appropriate at the time. But while these sorts of investigations were informative,
they were problematic in the sense that the results of real-time coding
could never be checked against the original classroom interaction data, nor could
the actual utterances be analyzed. And unless there were two observers present
during the lessons, it was not possible to compute inter-coder agreement.
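When two observers have coded the same stretch of interaction, inter-coder agreement can be computed in more than one way. The sketch below, in Python, shows two common options: simple percent agreement and Cohen's kappa, which adjusts for agreement expected by chance. Kappa is our choice for illustration; the studies discussed here do not say which index they used. The codes and category labels are invented, and the function names are our own.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of coded units on which the two observers chose the same category."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1:  # both observers used a single identical category
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes assigned by two observers to the same ten acts.
observer_1 = ["Eliciting", "Responding", "Evaluating", "Eliciting", "Responding",
              "Evaluating", "Presenting", "Eliciting", "Responding", "Directing"]
observer_2 = ["Eliciting", "Responding", "Evaluating", "Eliciting", "Responding",
              "Presenting", "Presenting", "Eliciting", "Responding", "Organizing"]

print(f"percent agreement = {percent_agreement(observer_1, observer_2):.2f}")
print(f"Cohen's kappa     = {cohens_kappa(observer_1, observer_2):.2f}")
```

Kappa is normally lower than raw percent agreement, because some matches would occur even if both observers were coding at random.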
Recently, with the advent of accessible, transportable, and affordable recording
devices, as well as advances in discourse analysis, we have focused more on
the analysis of tape- and video-recorded data. Such electronic recordings have
many advantages over real-time coding since they can be replayed as often as
necessary for transcription and/or coding. As a result of advances in both technology
and research methodology in the past two decades, we have come to
understand a great deal about classroom discourse and how it shapes students'
opportunities to learn.
CLASSROOM DISCOURSE
Category     Description
Responding   Any act directly sought by the utterance of another speaker, such
             as answering a question.
Sociating    Any act not contributing directly to the teaching/learning task, but
             rather to the establishment or maintenance of interpersonal
             relationships.
Organizing   Any act that serves to structure the learning task or environment
             without contributing to the teaching/learning task itself.
Directing    Any act encouraging nonverbal activity as an integral part of the
             teaching/learning process.
Presenting   Any act presenting information of direct relevance to the learning
             task.
Evaluating   Any act that rates another verbal act positively or negatively.
Eliciting    Any act designed to produce a verbal response from another person.
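Once a lesson has been coded with a scheme like this one, a first analytic step is often to see how the acts are distributed across the categories. The following Python sketch shows one way this might be done. The coded sequence is invented for illustration, and the names Act and category_proportions are our own rather than part of any published instrument.

```python
from collections import Counter
from enum import Enum


class Act(Enum):
    """The seven categories from the table above."""
    RESPONDING = "Responding"
    SOCIATING = "Sociating"
    ORGANIZING = "Organizing"
    DIRECTING = "Directing"
    PRESENTING = "Presenting"
    EVALUATING = "Evaluating"
    ELICITING = "Eliciting"


def category_proportions(coded_acts):
    """Return the share of all coded acts that falls in each category."""
    counts = Counter(coded_acts)
    total = len(coded_acts)
    return {act.value: counts[act] / total for act in Act}


# Hypothetical coding of eight consecutive acts (invented for illustration).
coded_lesson = [
    Act.ORGANIZING, Act.ELICITING, Act.RESPONDING, Act.EVALUATING,
    Act.ELICITING, Act.RESPONDING, Act.EVALUATING, Act.PRESENTING,
]

for name, share in category_proportions(coded_lesson).items():
    print(f"{name:<11}{share:.1%}")
```

Categories that never occur are reported as zero rather than omitted, so that profiles from different lessons can be compared line by line.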
Extract 3:
T: (Holding up a picture.) What's the name of this? What's the name?
Not in Chinese.
S: Van. Van.
S: Drier?
T: The driver.
S: The driver.
T: The milkman.
S: Millman.
T: Milkman.
Ss: Milkman.
T: Where are they?
Ss: Where are they?
T: Where are they? Inside? Outside?
S: Department.
T: Department?
S: Department store.
T: Mmm. Supermarket. (Nunan, 1988, pp. 84-85)
Extract 4:
T: The questions will be on different subjects, so, er, well, one will be
about, er, well, some of the questions will be about politics, and some
of them will be about, er . . . what?
S: History.
T: History. Yes, politics and history, and, um, and . . . ?
S: Grammar.
T: Grammar's good, yes, . . . but the grammar questions were too easy.
T: ... today, er, we're going to do something where we, er, listen to a
conversation, er, in fact, we're not going to listen to one conversation.
How many conversations're we going to listen to?
S: Three. (Nunan, 1989, pp. 41-42)
According to McCarthy and Walsh, all these questions are relevant to language
teaching. From their research, they have identified four basic discourse
patterns, or modes as they call them, in classroom interaction. These are managerial
mode, materials mode, skills and systems mode, and classroom context
mode.
Managerial mode occurs when the teacher is setting up a lesson or lesson
phase, or transitioning from one phase to another. Not surprisingly, it occurs
most often at the beginning of a lesson. In materials mode, the discourse is driven
REFLECTION
Look at two extracts below and decide if they represent managerial, materials,
skills and systems, or classroom context mode. In many transcripts,
'XXX' stands for unintelligible speech, but here it means that the teacher
is reading out a blank filling exercise, so the XXXs represent the blank
spaces that the students have to fill in the text. The equal signs (=) indicate
latching, when one turn follows another without a pause, and brackets ([])
indicate overlapping turns.
Extract 5:
Teacher: nil nil (reading) and it remained the same after 30 minutes OF?
Student C: extra time
Teacher: extra time, very good, Emerson. (McCarthy and Walsh, 2003,
p. 180)
Extract 6:
Teacher: he went to what do we call these things the shoes with wheels=
Student 1: = ah skates =
REFLECTION
Extract 7:
Teacher: Ok, we're going to look today at ways to improve your writing
and at ways which can be more effective for you and if you
look at the writing which I gave you back you will see that I've
marked any little mistakes and eh I've also marked places
where I think the writing is good and I haven't corrected your
mistakes because the best way in writing is for you to correct
your mistakes so what I have done I have put little circles and
inside the circles there is something which tells you what kind
of mistake it is so Miguel would you like to tell me one of the
mistakes that you made? (McCarthy and Walsh, 2003, p. 179)
Student 1: =ahh nah the one thing that happens when a person dies my
mother used to work with old people and when they died . . .
the last thing that went out was the hearing about this person=
Teacher: =aha
Student 1: so I mean even if you are unconscious or on drugs or something
I mean it's probably still perhaps can hear what's happened
Student 2: but it gets =
Students: but it gets/there are=
Student 1: =I mean you have seen so many operation and so you can
imagine and when you are hearing the sounds of what happens
I think you can get a pretty clear picture of what's really
going on there =
Student 3: =yeah= (ibid., pp. 181-182)
Once you have tape- or video-recorded some classroom interaction, you must
decide whether to transcribe some or all of the interaction. This decision should
be guided by your research question. In some cases, it may not be necessary to
Extract 9:
Teacher: Let's check exercise four. How do you feel about exercise four,
was that strange? Number four, ah, page one hundred fifty-four.
Was it difficult? How did you feel?
Student 1: It was easy! We did that!
Teacher: Ah, no, people, Pat turned off the tape recorder by pushing
the stop button. We didn't do that. No, we didn't.
Teacher: How about that kind of pattern, how do you feel about that?
Have you used that one before?
Students: No.
Teacher: Does it look easy?
Student 2: I use the wrong way. I. . .
Teacher: You used the wrong way? How was it?
Student 2: I—how is it? I got the meaning. With reading, I get the
meaning of this word with reading the dictionary, I got the
meaning of this word. (Nunan & Lamb, 1996, p. 277)
Transcription Conventions
1. Participants: T = teacher; S = student; Ss = two students; SSS =
many students. Initials are used for students identifiable by name (e.g.,
M, SZ, J) rather than S.
2. Left bracket [: Indicates the beginning of overlapping speech, shown
for both speakers; second speaker's bracket occurs at the beginning of
the line of the next turn rather than in alignment with previous
speaker's bracket.
3. Equal sign =: Indicates speech which comes immediately after another
person's, shown for both speakers (i.e., latched utterances).
4. (#): Marks the length of a pause in seconds.
5. (Words): The words in parentheses () were not clearly heard; (x) = unclear
word; (xx) = two unclear words; (xxx) = three or more unclear words.
6. Underlined words: Words spoken with emphasis.
7. CAPITAL LETTERS: Loud speech.
8. ((Double parentheses)): Comments and relevant details pertaining to
interaction.
9. Colon: Sound or syllable is unusually lengthened (e.g., rea::lly lo:ng)
10. Period: Terminal falling intonation.
11. Comma: Rising, continuing intonation.
12. Question mark: High rising intonation, not necessarily at the end of
a sentence.
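Because conventions like these are applied consistently, some of them can be picked out of a typed transcript automatically. The Python sketch below is a rough illustration covering only a subset of the list above: latching (=), overlap ([), timed pauses, unclear words, and double-parenthesis comments. It assumes each turn sits on one line in the form "speaker: speech", and both the sample turn and the function name describe_turn are invented for illustration.

```python
import re

# Patterns for a few of the conventions listed above (an illustrative subset).
COMMENT = re.compile(r"\(\(.*?\)\)")         # ((...)): transcriber's comment
PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")   # (#): pause length in seconds
UNCLEAR = re.compile(r"\((x{1,3})\)")        # (x), (xx), (xxx): unclear words


def describe_turn(line):
    """Summarize the transcription marks found in one 'speaker: speech' line."""
    speaker, _, speech = line.partition(":")
    comments = COMMENT.findall(speech)
    # Remove comments first so their contents are not mistaken for speech marks.
    cleaned = COMMENT.sub("", speech).strip()
    return {
        "speaker": speaker.strip(),
        "latched_start": cleaned.startswith("="),
        "latched_end": cleaned.endswith("="),
        "overlap": "[" in cleaned,
        "pauses": [float(p) for p in PAUSE.findall(cleaned)],
        "unclear": len(UNCLEAR.findall(cleaned)),
        "comments": comments,
    }


# Hypothetical turn, invented to exercise several conventions at once.
summary = describe_turn("S1: =ah skates (0.4) (xx) ((laughs)) =")
for key, value in summary.items():
    print(f"{key}: {value}")
```

A summary of this kind can help a researcher locate turns of interest quickly, for instance all latched turns or all pauses above a given length, but it does not replace a close reading of the transcript.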
In their book on analyzing learner language, Ellis and Barkhuizen (2005) identify
three research paradigms in second language acquisition. These are the normative,
the interpretive, and the critical. These paradigms approach the analysis of
learner language in very different ways that reflect the assumptions and purposes
behind the paradigms. The purpose of normative approaches is to test a theoretically
motivated hypothesis. The purpose of the interpretive paradigm is to
describe and understand L2 acquisition through the intensive, and usually longitudinal,
study of a limited number of cases. The critical paradigm investigates
language acquisition in its sociocultural context. Table 12.2 sets out the kinds of
analysis of learner language carried out within the three paradigms.
Classroom data come in many shapes and forms. Learner data can be
classified into three different types: nonlinguistic performance data, samples of
learner language, and verbal reports from learners about their own learning
(Ellis and Barkhuizen, 2005). Nonlinguistic performance data include reaction
times to linguistic stimuli, nonverbal measures of comprehension, and grammaticality
judgments. Learner language samples can be elicited or naturally
occurring. Verbal reports include self-reports, interview data, questionnaire
responses, stimulated recall, think-aloud protocols, and self-assessments.
REFLECTION
Study the following extract. To what extent does it follow the IRF pattern
identified by Sinclair and Coulthard? What variations are evident? What
other comments would you make on the teacher talk?
Extract 10:
T: OK, let's try number one. Bin, why don't you start and Tomo will
follow. Go ahead try it... Number one.
Bin: It warm this evening.
Tomo: Yes, the evenings are getting warmer.
Bin: I think it get warmer this evening.
T: OK... How could we change that a little bit?
Bin: Getting warmest?
CONVERSATIONAL ANALYSIS
022 (0.3)
023 L10: foo-
024 L9: no it is not a food it is (.) like a stone you know?
025 L10: oh I see I see I see I see I see I know I know hh I see h a
whit- (0.4) a
026 kind of a (0.2) white stone h [very beautiful]
027 L9: [yeah yeah] very big yeah
028 [sometimes very beautiful and] sometimes when the ship moves
029 L10: [I see I see ok] (Markee, 2005, pp. 359-360)
Extract 12:
((L9, L10, and L11 are all looking down at their class materials, reading an
article on global warming. L9, who is facing the camera, is leaning her
head on her left hand. L10 has her back turned to the camera and is facing
L9. L11 is in profile but her hair hides her face))
001 L10: coral, what is corals
002 (1.3)
003 L9: ((L9 moves her head slightly to her right to
004 look at the right-hand page of her materials.))
005 (1.3)
006 L?: hshhh
007 (1.3)
008 L9: [X_ ((L9 looks up at L10, holding
009 [hh her chin in her left hand in
010 a thinking pose))
011 (1.3)
012 L9:
013 L9: do you know the under the sea::, ((L9 leans
014 forward and
015 drops her left
016 hand to her lap))
017 L9: ((L9 looks down at L10's article))
018 L9: under the sea::,
(Markee, 2005, p. 362)
Teacher Talk
Teacher talk is a crucial element in the classroom. In second language contexts,
researchers have investigated how teachers speak to minority language children
(Strong, 1986). In many EFL contexts, teacher talk represents the only 'live' target
language input that learners receive. Some of the questions that have been
investigated in classroom research include the following: How much talking do
REFLECTION
In Extract 13, the teacher claims that there is no explanation for the grammati
cal issue that the student raises. This statement, of course, is incorrect, and the
teacher quickly (and wisely) admits that he does not know the answer.
REFLECTION
How would you answer the question above? Why don't we say "three
bedrooms house"? Have you ever told your students (or been told by a
teacher), "There is no reason; that's just the way we say it"?
Inexperienced
Teachers
What makes an explanation effective? Think about times you have been
teaching or times you have taken a language course. How can you tell if
your students have understood and benefited from the explanation? When
you have been a student, how could your teachers have known if their
explanations were successful?
T: Do you think that, um, was it exciting that night? Mm? Do you think
that it was very exciting? Right. Chi-ming. What do you think? It was,
it was—
S: It was very exciting.
T: It was very exciting. Right. Sit down.
The teacher appears to be asking for the student's opinion of the story,
hence a 'referential question.' However, when we look at the clue that the
teacher provides—It was, it was—and her evaluation of the student
response as correct, we know that in fact the teacher expected the student
to say It was exciting, which is the concluding sentence in the story. It is
therefore a 'display' question. (Tsui, 1995, p. 29)
Leibscher and Dailey-O'Cain underscore the fact that the overall category is
called "repair rather than correction, the latter being the particular subtype of
repair that occurs when a notable language error is corrected" (ibid.).
Some of the questions that researchers have addressed in relation to error
treatment include the following: (1) When should errors be treated? (2) How
should they be treated? (3) Who should treat errors? (4) How effective is self-
and peer-correction? (5) What errors should be treated? (6) To what extent do
learners take up teachers' responses to their errors (i.e., how effective is teacher
treatment of learners' errors)?
REFLECTION
Choose one of the questions listed above. What sort of study could you
design to address the question of your choice?
Student-Student Interaction
The other major area of interest, naturally enough, is that of student talk. As you
might imagine, the range of research issues and questions is enormous, although
most seek to establish some type of relationship between input or environmental
factors and learner acquisition. For example, researchers have asked, "What is
the relationship between input and uptake?" That is, do items made available for
learning subsequently appear in learner output? (This is one of the questions
addressed by the sample study in the next section.) Other key questions include
(1) What is the relationship between participation structures (group size and
composition, etc.) and learner output? (2) What is the relationship between task
type and learner language? and (3) What patterns of interaction typify student
talk in language classrooms?
REFLECTION
Look back at Extracts 11 and 15. In each one, find an example of a learner
scaffolding another learner. Compare your ideas with a classmate or
colleague.
A SAMPLE STUDY
For the sample study in this chapter, we have selected a report about some
teacher research. Storch (2002) carried out an investigation into patterns of
interaction in pair work. Her classroom-based study was conducted in a
one-semester, credit-bearing ESL course offered at a university in Australia. The
purpose of the course was "to develop learners' academic listening, reading,
speaking and writing skills" (p. 123). Storch collected her data in the writing
classes, which included a focus on grammatical accuracy. In this particular
report, she addressed the following research questions:
1. What patterns of dyadic interaction can be found in an ESL university-
level class?
2. Does task or passage of time affect the pattern of dyadic interaction?
3. Do differences in the nature of the dyadic interaction result in different
outcomes in terms of second language development? (p. 123)
Here, due to space constraints, we will only summarize her work on the first
research question above.
ACTION
Think about Storch's first research question. How would you go about
collecting data to address this question? List the specific steps you would
take.
There were thirty-three students in the study. Storch (ibid.) describes them
as follows:
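[Figure: four-quadrant model of dyadic interaction, with mutuality running from high (top) to low (bottom). Quadrant 1: Collaborative; Quadrant 2: Dominant/Dominant; Quadrant 3: Dominant/Passive; Quadrant 4: Expert/Novice.]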
Extract 16:
1 C: this (reads instructions) . . . what is this?
2 M: from the chart
3 C: this chart about
[
4 M: the data
5 C: with percentage and eh . . .
6 M: describe describe the percentage of
7 C: English language fluency
[
8 M: English language fluency between two countries yeah?
9 Vietnam and Laos
10 C: yes and the compare before they came here and now
11 M: yes . . .
12 C: you can separate it here
13 M: yeah . . . first we . . . mm the
14 C: perhaps you should write
15 M: yeah I write yeah from the information of the chart yeah
16 ... ((writing)) information of the chart
17 C: no from figure 3
[
18 M: ah figure . . . figure 3? From figure 3
figure 3 ah
19 C: show the information
20 M: show the information ... it it's
21 C: yeah it's ok it shows
22 M: it shows the . . . the data or the percentage?
23 C: should be the percentage (Storch, 2002, p. 131)
Storch notes that resolutions to problems are often attained "via a process of
pooling resources. For example, in line 20 Mai notes a problem with subject-verb
agreement and Charley suggests the appropriate correction. Thus, the talk
shows a pattern of interaction that is high on equality and mutuality" (ibid.).
Storch (ibid.) provides additional extracts that show dominant/dominant,
dominant/passive, and expert/novice patterns. The study concluded that the
collaborative interaction pattern was the most common in the pair work data. She
notes that the patterns of interaction were "fairly stable. Once they were
established early in the semester . . . they remained so regardless of the passage of
time" (pp. 144-145). Although not all the students worked collaboratively, the
collaborative pattern was the most common and one dyad became more
collaborative as time went by. She concluded that the learners did indeed scaffold one
another's learning, and that that scaffolding was more likely in the collaborative
or expert/novice pairs.
We find this report to be intriguing for a number of reasons. First, the
research was conducted by a teacher in her own class. The data collection
procedures were based on activities normally used during lessons. The literature
review is current and wide-ranging, and the analysis of the interactions is clear
and convincing. Storch also reports that 33% of the data were analyzed by a
second researcher using the four descriptive categories, and that this process yielded
90% agreement.
The ways of analyzing interaction during lessons have evolved over the years—
partly because technological developments have improved our ability to record
and transcribe and partly because advances in discourse analysis have enhanced
the possible means of investigating classroom interaction. Transcribing and
analyzing classroom interaction are powerful tools in understanding how talk
during lessons promotes (or impedes) language learning.
Still there are some pitfalls associated with these kinds of analyses. As noted
in earlier chapters, using electronic recording devices can be a bit disruptive. You
will probably need to record classroom interactions often enough that the
learners grow accustomed to having a digital recorder or a video camera operating
during lessons.
If you do make tape- or video-recordings, you must decide whether the
research question demands that you transcribe the data. In some instances, it may
be possible to code data directly from the recordings, but for many research
CONCLUSION
and analysis methods have been used along with quantitative methods. In
addition, quantitative data have been used in classroom studies in both the action
research and naturalistic inquiry approaches to research.
Quantitative data can be analyzed and displayed in many different ways.
Some familiar ways of reporting numerical data include percentages and
proportions. Such data can also be provided in helpful graphic displays such
as bar graphs and pie charts. For example, Scales, Wennerstrom, Richard, and
Wu (2006) used a pie chart to show the percentages of English language
learners in their sample who correctly identified an American English accent.
Twenty-nine percent of the listeners correctly attributed a speaker's accent to
the United States, but the pie chart clearly demonstrates the remaining percent
of listeners who attributed the accent to eleven other countries or regions.
These same authors use bar graphs to good advantage to display other
comparisons in their data.
In language classroom research, authors often use statistics to express their
findings. What do we mean by statistics? The term has at least three meanings.
First, it is the label given to a group of procedures by which researchers make
decisions about accepting or rejecting hypotheses in the experimental approach.
Second, statistics can refer to the actual mathematical formulae by which those
procedures are carried out. Third, statistics refers to the results of those
mathematical procedures, that is, the numerical findings of a study. These ideas may sound
very esoteric, but our purposes in this chapter are twofold: We want to help you
understand both how to interpret some of the statistics that are commonly used
in language classroom research and also how to use some of them with your own
data.
To accomplish those goals, we will first discuss descriptive statistics and then
discuss the specialized meaning of significant (as in significant differences or
significant correlations) as the term relates to probability. We will also explore the
notion of degrees of freedom and explore one commonly used statistic (the
chi-squared test) in some detail before reading about other inferential statistics,
including some tests of significant differences and some correlation procedures.
Finally, we will briefly consider some technological advances for working with
quantitative data.
You will recall from Chapter 4 that when we think about representing a
group of people through quantitative data, we can use the measures of central
tendency (the mean, mode, and median) and the measures of dispersion about
the mean (range, standard deviation, and variance). These measures are called
descriptive statistics because they describe the group in terms of the variables that
have been measured or counted.
There is also an important group of statistical procedures called inferential
statistics—so named because they help us make inferences about the population
based on what is known about the sample. In general, inferential statistics are
used to reach conclusions about significant differences between or among
groups, or about significant relationships between variables. We will consider
both types in turn after we first explore descriptive statistics.
In Chapter 4, we learned about frequency polygons. You will recall that these are
figures that depict the measurement of the trait being investigated on the
horizontal axis (also called the abscissa) and the frequency on the vertical axis
(sometimes referred to as the ordinate). (Look back at Figure 4.8 for a reminder.) In
language classroom research, the measurement on the horizontal axis often
involves a range of possible scores (for instance, on a language test). The frequency
on the vertical axis often represents the number of people who received each
particular score in the possible score distribution (see Figures 4.7, 4.8, and 4.9).
When we have a large number of scores or measurements represented in a
data set, the shape of the frequency polygon may resemble a bell. This image is
referred to as the normal distribution or the bell-shaped curve (or just the bell curve).
Keep this visual image in mind as you read about measures of central tendency
and measures of dispersion. Central tendency is the propensity of scores to group
around the middle of a data set. The effects of central tendency are visible in the
large central hump of the bell-shaped curve. Dispersion refers to the propensity of
scores to spread out away from the mean. This tendency is reflected in the
tapering tails of the bell curve. (See Figures 4.8 and 4.9 for a visual image of these
ideas.)
$\bar{X} = \Sigma X / n$
The capital X with the line above it (X-bar) stands for the mean, the
mathematical average. The capital Greek letter sigma (Σ) means "sum up the
following" and the capital X represents "score(s)." The slash (/) indicates division.
So, this formula says that to get the mean, we sum the scores and divide by the
number (n).
To find the mode, you simply arrange the scores from highest to lowest and
look to see which score was obtained most often. So, for example, the control
group obtained the following scores: 85, 84, 83, 82, 80, 80, 78, 77, 76, and 75.
We can see that the score of 80 points was obtained by two students in the
control group, and every other score was obtained just once. Therefore, in this small
data set, the mode is 80.
Table 13.1
Control   Score   Experimental   Score
C-1       80      E-1            85
C-2       82      E-2            87
C-3       78      E-3            83
C-4       77      E-4            82
C-5       83      E-5            88
C-6       80      E-6            85
C-7       76      E-7            81
C-8       84      E-8            89
C-9       75      E-9            84
C-10      85      E-10           86
Mean      80      Mean           85
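The two calculations just described can be checked in a few lines of code. What follows is only a minimal sketch in Python, using the control group scores quoted in the text; the variable names are our own and are not part of the original study.

```python
from statistics import mode

# Control group scores as listed in the text (Table 13.1).
control_scores = [85, 84, 83, 82, 80, 80, 78, 77, 76, 75]

# Mean: sum the scores and divide by the number of scores (n).
control_mean = sum(control_scores) / len(control_scores)

# Mode: the score that was obtained most often.
control_mode = mode(control_scores)

print(control_mean)  # 80.0
print(control_mode)  # 80
```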
REFLECTION
What is the mode for the experimental group data in Table 13.1?
It is not unusual to find two modes in a data set. When that happens, it is
called a bimodal distribution. Sometimes there is no mode in a data set because no
particular score is obtained by more people than any other score. This situation
often occurs with small data sets.
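A short sketch can make the bimodal case and the no-mode case concrete. The scores below are invented for illustration only, and the variable names are hypothetical. The sketch relies on `multimode` from Python's standard `statistics` module, which returns every value tied for most frequent.

```python
from statistics import multimode

# Hypothetical scores, invented for illustration only.
bimodal_scores = [70, 72, 72, 75, 78, 78, 81]
unique_scores = [70, 72, 75, 78, 81]

# Two scores (72 and 78) are tied for most frequent: a bimodal distribution.
modes = multimode(bimodal_scores)

# When every score occurs exactly once, all of them "tie",
# which is another way of saying that the data set has no mode.
tied = multimode(unique_scores)
has_no_mode = len(tied) == len(unique_scores)

print(modes)        # [72, 78]
print(has_no_mode)  # True
```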
The median is "the score which is at the center of the distribution" (Hatch
and Lazaraton, 1991, p. 161). (Think of the median strip that divides a highway.)
Another way to understand the median is to say that "50% of the scores fall at
or below [the median] and 50% of the scores fall above that value" (Jaeger, 1993,
p. 37).
If you have an odd number of scores in your data set, the median will be the
score that is right in the middle. If you have an even number of scores in the data
set, the median is the "midpoint between the two middle scores" (ibid.).
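The odd and even cases can be illustrated with a brief sketch. The two small score sets here are hypothetical, chosen only so that the middle values are easy to see; `median` is the function from Python's standard `statistics` module.

```python
from statistics import median

# Hypothetical score sets, invented for illustration only.
odd_set = [70, 74, 76]        # odd number of scores
even_set = [70, 74, 76, 80]   # even number of scores

# Odd n: the median is the score right in the middle.
odd_median = median(odd_set)

# Even n: the median is the midpoint between the two middle scores (74 and 76).
even_median = median(even_set)

print(odd_median)   # 74
print(even_median)  # 75.0
```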
Look at the control group data in Table 13.1 for a moment. There are ten
people in the control group. Therefore, the median score is found between the
fifth and sixth scores (the two middle scores): 85, 84, 83, 82, 80, 80, 78, 77, 76,
REFLECTION
What is the median for the scores of the experimental group in Table 13.1?
Here are the ten scores arranged in order from the highest to the lowest
score.
89, 88, 87, 86, 85, 85, 84, 83, 82, and 81
What is the median for this slightly larger data set?
89, 88, 87, 86, 85, 85, 84, 83, 82, 81, 80, and 79
The median "is often used as the measure of central tendency when the
number of scores is small and/or when the data are obtained by ordinal
measurement" (Hatch and Farhady, 1982, p. 4). The median is sometimes used to divide
a group into two groups through a procedure called the median split. Imagine, for
instance, that you want to investigate the effects of a particular teaching method
on students who have a high or a low aptitude for language learning. You might
operationally define high and low aptitude by administering a language learning
aptitude test at the beginning of the experiment, finding the median score, and
using that point to divide the subjects into two equal (or very nearly equal)
groups. This would be an application of the median split.
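As a rough sketch of how a median split might be carried out, consider the following. The student labels and aptitude scores are entirely hypothetical, and the rule that a score exactly at the median goes into the low group is our own assumption; a researcher would need to state such a rule explicitly.

```python
from statistics import median

# Hypothetical aptitude scores keyed by hypothetical student labels.
aptitude = {"S1": 48, "S2": 55, "S3": 59, "S4": 61,
            "S5": 65, "S6": 72, "S7": 80, "S8": 91}

# The median of the eight scores is the midpoint of 61 and 65.
cut_point = median(aptitude.values())

# Assumption: scores at or below the cut point count as "low aptitude".
low_group = sorted(s for s, score in aptitude.items() if score <= cut_point)
high_group = sorted(s for s, score in aptitude.items() if score > cut_point)

print(cut_point)   # 63.0
print(low_group)   # ['S1', 'S2', 'S3', 'S4']
print(high_group)  # ['S5', 'S6', 'S7', 'S8']
```

Notice that S4 and S5 land in different groups even though their scores are only four points apart.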
REFLECTION
Can you think of any possible problems with the median split as a
technique for defining groups?
Imagine two cases where a researcher used the median split technique
to operationally define high and low aptitude students in an experiment.
On a language aptitude test with a possible score of 100 points, the mean
is 63 and the median is 60.
Scenario 1: The highest student in the low aptitude group scores 55. The
lowest student in the high aptitude group scores 65.
Scenario 2: The highest student in the low aptitude group scores 59. The
lowest student in the high aptitude group scores 61.
ACTION
Determine the inclusive range and the exclusive range for the
experimental group data in Table 13.1.
$\frac{\Sigma (X - \bar{X})^2}{N - 1}$
Start your calculations from within the parentheses. (This starting point is
a mathematical convention.) The X-minus-X-bar component inside the
parentheses is telling us that we subtract the mean from each individual score. We do
this step to find the distance between each individual score and the mean of that
group of scores.
The superscript (2) to the right of the parentheses tells us to square each of
the differences we find by subtracting. The purpose of squaring is to get rid of
the minus signs that will appear when we subtract the mean from each
individual score. We do this step because minus signs are inconvenient to work with,
especially if you are doing calculations by hand. (There will always be minus
signs in this step because, logically, some scores are higher than the mean and
some are lower than the mean.)
After we have done all the subtraction and the squaring, the capital sigma
(Σ) in the numerator of the standard deviation formula tells us to sum those
amounts. So the numerator in this equation says, "First, subtract the mean from
each score. Square those results and add them all up." The denominator tells us
to subtract one from the number of data points (the scores here) that contributed
to the mean.
So far, so good. But remember that we squared all the differences to get
rid of the minus signs. So now we must undo that step by finding the square
root of the results. When we include the square root sign we have the complete
formula:
$\sqrt{\frac{\Sigma (X - \bar{X})^2}{N - 1}}$
To apply this formula to any data set (e.g., the data for the control group
in Table 13.1), it is convenient to create a table with column headings that
represent the steps in the equation. When we do so, we get the set-up shown in
Table 13.2.
The last step is to sum the values in the right column. When we do that
we get 108 as the value in the numerator of the equation. We then divide it by
N - 1, which is 9 in this case (10 - 1 = 9):
108/9 = 12
Table 13.2
Student   Score (X)   (X - X-bar) squared
C-1       80          0
C-2       82          4
C-3       78          4
C-4       77          9
C-5       83          9
C-6       80          0
C-7       76          16
C-8       84          16
C-9       75          25
C-10      85          25
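The steps in the standard deviation formula can be traced in code as well as in a table. The following is a minimal sketch using the control group scores from Table 13.1; the variable names are our own. It reproduces the numerator of 108 and the quotient of 12 reported above, and then takes the square root as the complete formula requires.

```python
from math import sqrt

# Control group scores from Table 13.1.
control_scores = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]

n = len(control_scores)
mean_score = sum(control_scores) / n  # 80.0

# Subtract the mean from each score, then square each difference.
squared_deviations = [(x - mean_score) ** 2 for x in control_scores]

# Sum the squared differences: the numerator of the formula.
numerator = sum(squared_deviations)

# Divide by N - 1, then undo the squaring with the square root.
variance = numerator / (n - 1)
standard_deviation = sqrt(variance)

print(numerator)                     # 108.0
print(variance)                      # 12.0
print(round(standard_deviation, 2))  # 3.46
```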
ACTION
REFLECTION
Degrees of Freedom
Notice that the denominator of the standard deviation formula says N - 1. Why
is 1 subtracted from the N in this formula? The answer is based on a concept
called degrees of freedom. This is an abstract notion that is easier to illustrate than
to define. Think about a simple algebraic equation, like this:
3 + 4 + X = 12
You can determine that X in this formula represents the whole number 5. You do
this by adding 3 + 4, which gives you 7. You then subtract 7 from 12 and get 5.
There is nothing else that X can be in this formula. In other words, once we
know that the sum of the three values in the equation is 12, and that two of the
values are 3 and 4, then the other value must be 5. That one particular value is
not free to vary. It is predetermined by the other values in the formula. So, once
you know all but one of the values, the last one is accounted for. The
mathematical way of saying this is N - 1.
Here's another way to think about this concept. Imagine that you are teach
ing a class with twenty students. There are twenty desks in the room. Seventeen
students are present and there are three empty chairs when the lesson begins. A
student enters the room and selects one of the three desks, so two remain empty.
Another student enters the room and chooses one of the remaining chairs.
When the twentieth student enters the room, how many chairs are available?
How many choices does he have?
If you said one chair, you are correct. But we hope you also said that he had
no choice! Since only one desk remained, he had no choice but to sit there
(unless he wished to sit on the floor, which takes him outside of the equation).
The point is that when all the quantities except one of the quantities are known
and the total is known, then the last quantity can have only one particular value,
and that value is not free to vary. Many statistical formulae include "N - 1" to
account for this fact.
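The same idea can be seen in the deviations used for the standard deviation. Because deviations from the mean always sum to zero, once all but one of them are known, the last one is fixed. The sketch below, with variable names of our own choosing, illustrates this with the control group scores from Table 13.1 and with the simple equation 3 + 4 + X = 12.

```python
# The algebra example from the text: 3 + 4 + X = 12.
x = 12 - (3 + 4)

# Control group scores from Table 13.1.
scores = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]
mean_score = sum(scores) / len(scores)
deviations = [s - mean_score for s in scores]

# Suppose we know only the first nine deviations.
known_deviations = deviations[:-1]

# Deviations from the mean sum to zero, so the tenth is not free to vary.
implied_last = -sum(known_deviations)

print(x)               # 5
print(implied_last)    # 5.0
print(deviations[-1])  # 5.0
```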
In summary, we have seen that the three main measures of dispersion are the
range, the standard deviation, and the variance, while the three main measures of
central tendency are the mean, the mode, and the median. These descriptive
statistics are extremely important in language classroom research. They are
regularly used in experimental research, but they are frequently provided in
reports of action research and naturalistic inquiry as well. In addition, the
descriptive statistics underpin the inferential statistics and the concept of statistical
significance—the topic of our next section.
In the discussion that follows, we will repeatedly use the term significant, and you
will see it used in many research reports that employ quantitative analyses. What
does this mean? Very briefly, a significant difference or a significant relationship
is one that is too substantial to have occurred by chance. By convention, in our
field the outcomes of statistical analyses are typically considered to be significant
if there are fewer than five chances out of a hundred that the results have
occurred by chance. (In some instances, the standard is set more stringently: for
instance, at one chance out of a hundred.) This standard is set at the beginning
of a study, and it is called alpha, the first letter of the Greek alphabet, which is
in fact the symbol that is used (α). You may see this symbol, which looks like a fat
little fish swimming to the left, in research reports. It is part of the "code" of
quantitative analyses.
When we do quantitative analyses in studies seeking either significant
differences or significant relationships, we check for statistical significance, that is, the
confidence you can have that the finding is stable or trustworthy. In reading a research
report, you may come across a note that says "p < .05." Here the lower-case "p"
stands for probability. It represents the likelihood, or probability, that the findings
are erroneous or fluky or atypical. The little caret lying on its side (<) means "is
less than." (When it faces the opposite direction [>], it means "is greater than.")
So, if a value is given and the probability is represented as "p < .05," it means
that if we repeated the study 100 times, we would only get substantially different
results fewer than 5 times out of 100. In other words, if "p < .05," it is likely that
the result is very stable.
Statistics books include tables called the critical values tables. These days, the
information in those tables is also contained in statistical software packages.
Over the ages, statisticians have determined what the critical values are for the
various statistical procedures. These tables help us decide whether the results of
our own quantitative analyses are likely to have occurred by chance. The
outcomes of the inferential statistics formulae that you use with your own data are
called the observed values. In general, if the observed value from your data equals
or exceeds the predetermined critical value printed in the table, then you can say
that your results are statistically significant.
However, this much symmetry rarely occurs in real life! We are more likely
to get different numbers in the various language groups, but at what point could
we say that there is a significant difference in the number of students who chose
French, Spanish, or German? The data set shown in Table 13.4 would probably
not suggest a remarkable difference in the students' choices:
And what about the data in Table 13.6? When are the differences in group
size big enough for us to say that they are statistically significant?
df    .10     .05     .025    .01     .001
1     2.706   3.841   5.024   6.635   10.828
(The actual chi-square critical values table is much longer. It usually goes all the
way to df = 100. Here we have just reproduced a small portion of the table.)
To determine whether our χ² observed value is statistically significant, we
compare it to the χ² critical value in the appropriate row (in this case, where df = 2) and
under the probability level selected at the beginning of the study (in this case,
.05). If the observed value is equal to or greater than the critical value at that
point, we can say that the results are statistically significant.
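To see the whole one-way procedure in one place, here is a minimal sketch. The enrollment counts are hypothetical (invented for illustration and not taken from any table in this chapter), and the expected frequencies assume that students would be spread evenly across the three languages if there were no preference. The critical value of 5.991 is the standard chi-square critical value for df = 2 at the .05 level.

```python
# Hypothetical observed enrollments, invented for illustration only.
observed = {"French": 230, "Spanish": 200, "German": 170}

# Assumption: with no preference, students are spread evenly across languages.
total = sum(observed.values())
expected = total / len(observed)  # 200.0 per language

# Sum (O - E) squared, divided by E, across the three categories.
chi_square_observed = sum((o - expected) ** 2 / expected
                          for o in observed.values())

# One-way design: df is the number of categories minus one.
df = len(observed) - 1

# Standard critical value for df = 2 at the .05 probability level.
chi_square_critical = 5.991

is_significant = chi_square_observed >= chi_square_critical

print(chi_square_observed)  # 9.0
print(df)                   # 2
print(is_significant)       # True
```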
ACTION
Use Table 13.8 to determine whether the results of our chi-square analysis
are statistically significant. All the information you need is given above.
REFLECTION
What do you notice as you read down each column of numbers in the table
of values of chi-square critical (Table 13.8)?
For example, in the column headed by .10 (meaning the 10%
probability level), the critical values are 2.706, 4.605, 6.251, 7.779, and 9.236.
Does the same pattern hold in the remaining columns in the table?
What does this mean, in practical terms, about the results needed in your
χ² observed in order to get statistically significant results?
You will recall that probability is the likelihood that our results are due to
chance. By convention in our field, we usually set that level (called the alpha level
because it is determined at the beginning of an investigation) at .05—meaning
that we are only willing to be wrong five times out of 100.
REFLECTION
Skim the remaining rows. Does the same pattern hold true? What does
this mean, in practical terms, about the χ² observed you need to find in order
to get statistically significant results?
Once again, we can analyze these data to see if there are significant differ
ences in this enrollment pattern from what we would expect purely by chance—
that is, if there were no connection between students' gender and their tendency
to choose French, Spanish, or German as the foreign language they would study.
We can see from the data above that there are 325 female students and 275
male students. Their choices of what language to study constitute the observed
frequencies in this data set. The first step in calculating chi-square then is to
determine what the expected frequencies would be if there were no particular
difference between male and female students in their choice of what language to
study. The row and column totals in Table 13.9 help us to do this. Here's how.
The values in the row and the column labeled "Totals" are called the
marginal frequencies, or just the marginals, because they appear in the margins of the
table. (Note that the cell in the lower right-hand corner of this table should show
a number equal to the total number of students in the study. We get that value,
600 in this case, by adding up the numbers in the "Totals" column. We should
get the same number when we add the figures in the "Totals" row as well.)
We use the marginal frequencies to get the expected frequencies with the
following formula:
$E_{ij} = \frac{n_i n_j}{N}$
What does this mean? Well, the E just represents the expected frequency—the
thing we need to determine before we can calculate the chi-square value. The
uppercase N represents the number in the study, the one in the lower right
corner of the table (in this case, 600). The lower-case n represents the smaller
numbers (the row and column totals), and the subscripts i and j are mathematical
symbols that refer to "whatever row and whatever column" in the table, where i
represents row data and j represents column data.
Let's look at an example. The first cell in Table 13.9 tells us that 100 males
registered for French. If we read down that column, we will see that the column
total is 300 Ss. If we read to the far right of that row, we see that the row total is
275 Ss. The numerator in the equation for calculating expected frequencies in a
two-way chi-square study tells us to multiply the value for the row total (n_i) times
Female 162.50
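The expected frequency calculation is simple enough to express as a small function. This is only a sketch: the function name and parameter names are our own, and the figures are the marginal totals quoted in the text (275 male students, 325 female students, 300 students enrolled in French, and 600 students in all).

```python
def expected_frequency(row_total, column_total, grand_total):
    """Multiply the row total by the column total and divide by N."""
    return row_total * column_total / grand_total

# Marginal totals quoted in the text for the enrollment data.
male_total, female_total = 275, 325
french_total = 300
grand_total = 600

male_french_expected = expected_frequency(male_total, french_total, grand_total)
female_french_expected = expected_frequency(female_total, french_total, grand_total)

print(male_french_expected)    # 137.5
print(female_french_expected)  # 162.5
```

The same function can be applied to the remaining cells once the column totals for the other languages are read from the table.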
ACTION
Using the data from Table 13.9 and the examples in Table 13.10 above,
calculate the expected frequencies for the number of male students
studying German and the number of female students studying Spanish and
German.
Given the observed and expected frequencies, we can now do the
subtraction to calculate the difference between the observed frequencies in the data
set and the expected frequencies (O - E), just as we did in the one-way
chi-square calculations above. Once again, this process will yield some negative
numbers, so for convenience, the statistic tells us to square those values so we
can get rid of the minus signs. We then divide the resulting values by the value
of E (the expected frequencies). Finally, we sum those values to get our
chi-square observed. Each of these steps is represented in the column headings of
Table 13.11.
In order to determine whether the value of this chi-square observed is
statistically significant, we go back to the table of critical values (Table 13.8 above). To
use the table, we need to know the degrees of freedom. In a two-way chi-square
analysis, the degrees of freedom is equal to the number of columns minus one
times the number of rows minus one. In our enrollment data, we were working
with three columns (enrollment in French or Spanish or German) and two rows
(male and female students). So we calculate degrees of freedom like this:
(3 - 1)(2 - 1) = 2 × 1 = 2.
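The full two-way procedure (marginal totals, expected frequencies, the sum of squared differences divided by E, and the degrees of freedom) can be sketched as follows. The 2 x 3 table of counts is hypothetical, invented for illustration, and deliberately different from the enrollment data in this chapter; the variable names are our own.

```python
# Hypothetical observed frequencies: 2 rows by 3 columns.
observed = [
    [30, 20, 10],
    [20, 30, 40],
]

# Marginal frequencies: row totals, column totals, and the grand total N.
row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

chi_square_observed = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected frequency: row total times column total, divided by N.
        e = row_totals[i] * column_totals[j] / grand_total
        # Square the difference, divide by E, and add it to the running sum.
        chi_square_observed += (o - e) ** 2 / e

# Degrees of freedom: (columns - 1) times (rows - 1).
df = (len(observed[0]) - 1) * (len(observed) - 1)

print(round(chi_square_observed, 2))  # 16.67
print(df)                             # 2
```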
ACTION
Compare the value of chi-square observed (which you got when you
totaled the last column in Table 13.11) with the values of chi-square
critical given in Table 13.8. Assume that alpha was set at .05. Are your findings
statistically significant? How do you interpret the result?
What does all this calculating have to do with classroom research issues that
concern teachers and students? Let's revisit the sample study that was
summarized at the end of Chapter 3. This was the investigation by Sato (1982), the
ESL teacher who wanted to investigate the perception that Asian students did
not participate as much in ESL classes as non-Asian students. Sato used the
1. The 19 Asian students took 107 (36.4%) of the turns. The 12 non-
Asian students took 186 (63.5%) of the turns. Sato reported that the
chi-square observed for these data was 75.78, df = 1, p < .001. (p. 17)
2. The 19 Asian students self-selected for turns 52 times (33.99% of
the time) while the 12 non-Asian students self-selected 101 times
(66.01% of the time). The reported chi-square observed was 48.89,
df = 1, p < .001. (p. 18)
3. The two teachers (including the researcher, an Asian-American
herself) allocated 37 turns to the 19 Asian students (39.66% of the turns)
and 57 turns (60.44%) to the 12 non-Asian students. The reported
chi-square observed was 19.04, df = 1, p < .001. (p. 18)
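The first of these results can be approximated with the one-way procedure described earlier. The sketch below rests on an assumption of ours that is not stated in the summary above: that the expected number of turns for each group is proportional to the number of students in that group (19 of 31 and 12 of 31). Under that assumption, the calculation comes out very close to the reported value for the first finding.

```python
# Figures from the first finding: students per group and turns taken.
asian_students, non_asian_students = 19, 12
asian_turns, non_asian_turns = 107, 186

total_students = asian_students + non_asian_students  # 31
total_turns = asian_turns + non_asian_turns           # 293

# Assumption: expected turns are proportional to group size.
expected_asian = total_turns * asian_students / total_students
expected_non_asian = total_turns * non_asian_students / total_students

chi_square_observed = (
    (asian_turns - expected_asian) ** 2 / expected_asian
    + (non_asian_turns - expected_non_asian) ** 2 / expected_non_asian
)

# Two categories, so df = 2 - 1 = 1.
df = 2 - 1

print(round(chi_square_observed, 2))  # 75.78
```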
REFLECTION
Given what you now know about the chi-square statistic, how do you
interpret Sato's findings? What do these results say to us as language
teachers?
$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{SD_{(\bar{X}_1 - \bar{X}_2)}}$
In this formula, the subscripts 1 and 2 refer to the first and second group.
The subscripts e and c are also used sometimes—referring to the experimental
and control groups in an experiment.
ACTION
Based on what you know already, interpret the formula above. What steps
are done and in what sequence? Talk through the steps with a classmate or
colleague.
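One way to talk through the steps is to trace them in code. The sketch below applies the formula to the two groups in Table 13.1, treating the experimental group as group 1 and the control group as group 2. The denominator is computed here as the square root of the sum of each group's variance divided by its group size; that choice is our assumption, since the formula above does not spell it out, although with equal group sizes it gives the same result as the pooled-variance version. The function and variable names are our own.

```python
from math import sqrt

# Scores from Table 13.1.
experimental = [85, 87, 83, 82, 88, 85, 81, 89, 84, 86]
control = [80, 82, 78, 77, 83, 80, 76, 84, 75, 85]

def mean(scores):
    return sum(scores) / len(scores)

def variance(scores):
    """Sum of squared deviations from the mean, divided by N - 1."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores) / (len(scores) - 1)

# Numerator: the difference between the two group means.
mean_difference = mean(experimental) - mean(control)

# Denominator (assumed form): standard error of the difference between means.
standard_error = sqrt(variance(experimental) / len(experimental)
                      + variance(control) / len(control))

t_observed = mean_difference / standard_error

# Degrees of freedom for two independent groups: n1 + n2 - 2.
df = len(experimental) + len(control) - 2

print(mean_difference)       # 5.0
print(round(t_observed, 2))  # 3.66
print(df)                    # 18
```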
When there is just one group contributing two sets of data, as in the one-
group pre-test post-test design, the post-test scores cannot be said to be inde
pendent of the pre-test scores since the same individuals are providing both sets
of data. In that case, we use a slightly different formula for the t-test, which is
called the dependent samples t-test. (You may also see the labels matched-pairs t-test
or correlated t-test, but we will not use those terms here as they are not so
common in our field.)
The dependent samples t-test is also used in another context. That is when
there are two groups, but the members of the two groups are intentionally
matched on some criterion. For example, adults in an experiment might be
matched in terms of their scores on a language learning aptitude test
administered at the beginning of the study. This step would allow us to check that the
REFLECTION
Here is an example from a study that two teachers conducted (Bailey and
Saunders, 1998). The data are from university students who were lower-
intermediate EFL learners in Hong Kong. Over the course of an academic year,
the two teachers taught six different sections of a speaking and listening course.
They wanted to know if their students had made substantial improvement in
their listening skills, so they used a dependent samples t-test to see if there were
significant differences in the students' scores on a video-based listening test
before and after the fifteen-week course. The pre-test and post-test data are
displayed in Table 13.12.
Pre-Test Post-Test
REFLECTION
What do the means in Table 13.12 tell you about the students' pre-test
scores and their post-test scores? What do the two standard deviations say?
Compare your interpretations with those of a classmate or colleague.
These teachers used a software package called SPSS, the Statistical Package
for the Social Sciences, to calculate the dependent samples t-test in order to
compare the pre-test and post-test means. The results showed that the
students' mean post-test scores were indeed statistically significantly higher than
their mean pre-test scores (p < .001).
REFLECTION
How do you interpret the finding that the difference between the pre-test
and post-test means was statistically significant (p < .001)?
There are a few things to keep in mind here. First of all, as these authors
noted in their report, this was a one-group pre-test post-test design (see
Chapter 4). There was no control group for comparison. So, the teachers cannot claim
for certain that the English course was what made the difference in the students'
listening test scores. Secondly, there was only one form of the listening test
available, so the results might have been influenced by the practice effect.
(Fortunately, the students didn't know at the beginning of the course that they'd be
retested with the same instrument at the end of the semester.) Also, because
the t-test compares the means of two sets of scores, we cannot infer that every
single student made significant progress. We can only conclude that—overall—
the students' post-test scores were significantly higher than their pre-test scores.
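The logic of the dependent samples t-test can be sketched as follows. The
scores below are hypothetical (they are not the data in Table 13.12), and the
function name is illustrative. The test works on each student's difference
score, which is why it reports on the group as a whole rather than on any
individual.

```python
from math import sqrt
from statistics import mean, stdev

def t_dependent(pre, post):
    """Observed t and degrees of freedom for paired pre-test/post-test scores."""
    # One difference score per student (post-test minus pre-test)
    diffs = [after - before for before, after in zip(pre, post)]
    # Standard error of the mean difference
    se = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / se, len(diffs) - 1

# Hypothetical listening scores for five students
pre_test = [10, 12, 9, 14, 11]
post_test = [13, 15, 10, 18, 14]

t_obs, df = t_dependent(pre_test, post_test)
print(f"t observed = {t_obs:.2f}, df = {df}")
```

In this made-up data set every student gained at least one point, but a
significant t would be possible even if some students had not improved.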
In studies that use the one-way ANOVA to compare three or more groups,
the authors will sometimes report on what are called post hoc comparisons. These
are statistics that systematically calculate all the possible comparisons of the
group means in the design. For example, if the one-way ANOVA used in this
                   Sheltered         Regular           AP
                   Social Studies    Social Studies    Social Studies
                   Course            Course            Course
                   (n = 15)          (n = 12)          (n = 10)
Female Students    n = 6             n = 7             n = 6
Male Students      n = 9             n = 5             n = 4
The principal knows she cannot use a t-test because of the moderator
variable as well as the three levels of the independent variable. So, she uses an
adaptation of the ANOVA for factorial studies. This procedure is called a two-way
ANOVA because the different groups are being compared in terms of two
variables (the particular social studies course they took and their gender). And in
cases where one or more statistically significant differences are detected by
the two-way ANOVA, a post-hoc comparison can be used to pinpoint those
differences.
Using the two-way ANOVA with factorial designs not only allows us to test
for significant differences in the levels of the independent variable (here, the type
[Figure: two-way layout of the Sheltered, Regular, and AP Social Studies Courses
by student gender; legend: interaction effects (A × B)]
When you read a research report that uses two-way ANOVA, there will
often be a table that shows the "main effects for A" and the "main effects for B"
and any possible interaction effects that may have been found. Being able to test
for possible interaction effects using the two-way ANOVA is important because
[Figure 13.4: scatterplot]
There can also be negative correlations. In that case, as scores on one variable
increase, the scores on the other variable decrease. Suppose you noticed that
your language students who had broad vocabulary knowledge always seemed to
finish in-class readings faster than students whose vocabulary knowledge wasn't
quite as strong. You decide to investigate this phenomenon, so you correlate
your students' scores on a 100-point vocabulary test with the speed with which
they can read a passage in the target language. The scatterplot for a group of
thirty-five students might look like the one in Figure 13.5.
In Figures 13.4 and 13.5, the relationship between the two variables under
consideration is depicted by the points on the scatterplot. However, we often need a
more concise way of representing this relationship. When the two variables are
both measured on interval scales, we use Pearson's correlation statistic to
calculate a numerical index that represents the relationship. The outcome of that
statistic is called the correlation coefficient. It is obtained by using this formula
(the raw score formula) with our data:
$$r = \frac{N(\sum XY) - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$$
This formula may seem like a monster, but if you look closely, you'll see that
you have the knowledge to figure it out. You already know that N stands for
number and Σ tells us to sum what follows. Here X represents the score on the
X-variable and Y stands for the score on the Y-variable. Remember that
parentheses tell you where to start. The square brackets work like parentheses.
[Figure 13.5: scatterplot]
In order to solve this formula by hand, it is convenient to set up a table, just
as we did for the chi-square test, which shows us the steps. We will use the
following (hypothetical) data to illustrate the process.
X Variable: Vocabulary test scores out of 30 points possible
Y Variable: Reading speed in seconds
Table 13.13 shows the column headings and uses a small data set to illustrate:
5 8 64 70 4,900 560
$$r = \frac{5(3{,}310) - (75)(250)}{\sqrt{[5(1{,}333) - (75)^2][5(13{,}500) - (250)^2]}}$$
And continuing with the calculations, we get the following values:
$$r = \frac{16{,}550 - 18{,}750}{\sqrt{[6{,}665 - 5{,}625][67{,}500 - 62{,}500]}} = \frac{-2{,}200}{\sqrt{[1{,}040][5{,}000]}} = \frac{-2{,}200}{\sqrt{5{,}200{,}000}} = \frac{-2{,}200}{2{,}280.35} = -.9648$$
So our correlation coefficient is -.9648, but what does it mean? The
minus sign indicates that we have a negative correlation, and the magnitude
(.9648) indicates that it is a very strong correlation. When we check a table
of critical values for Pearson's correlation coefficient (e.g., J. D. Brown, 1988),
we see that when alpha is set at .05 and n = 5, the r critical is .8783, so we can say
that this r observed is statistically significant. In other words, we can have
confidence (p < .05) that our findings are stable. As vocabulary scores increase,
reading speed in seconds (the time needed to read the passage) decreases.
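The raw score formula can also be checked with a few lines of code. The
sketch below plugs in the sums from the worked example (N = 5, ΣX = 75,
ΣY = 250, ΣX² = 1,333, ΣY² = 13,500, ΣXY = 3,310) and the critical value of
.8783 cited above; the function and variable names are illustrative.

```python
from math import sqrt

def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r computed with the raw score formula."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Sums taken from the worked example in the text
r = pearson_r_from_sums(n=5, sum_x=75, sum_y=250,
                        sum_x2=1333, sum_y2=13500, sum_xy=3310)

# Critical value for alpha = .05 and n = 5, as cited in the text
r_critical = 0.8783
significant = abs(r) > r_critical

print(f"r = {r:.4f}, significant at .05: {significant}")
```

Because a correlation coefficient must fall between -1 and +1, a result
outside that range is a signal to recheck the arithmetic.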
REFLECTION
REFLECTION
There are several issues to keep in mind as we calculate and interpret
quantitative analyses. The concern that is first and foremost is whether the right
statistical procedure has been chosen. Table 13.14 below provides a summary of the
issues that influence a decision about what sort of statistic to use when you are
looking for significant differences between or among groups. For instance,
having posed a hypothesis or research question about significant differences, the
next concern is what type of data is being compared. We have simplified this
initial presentation to involve just interval-scale measurements and frequency
counts. (There are other statistics that determine, for instance, significant
differences between groups where the dependent variable is based on ordinal data.)
These are just a few of the most basic statistical tests of significant
differences. A good statistics course will teach you much more about how to analyze
quantitative data to test various kinds of hypotheses and answer research
questions. However, the procedures discussed here are often used in language
classroom research and you are likely to come across them in your reading of the
research literature in our field.
The choice of the right statistic to use is just as important in correlation
studies. The characteristics of the three correlation statistics we have studied are
summarized in Table 13.15. Once again, the choice of formula depends on the
type of data involved.
REFLECTION
This chapter has described just a few of the inferential statistics that are
commonly used in language classroom research, and for even these few, we have
just presented introductory concepts. If you would like to do some substantial
quantitative analyses, we recommend that you take an introductory course, but
there are also several good books that you can consult to help you make the right
choices. (See the Suggestions for Further Reading at the end of this chapter.)
A SAMPLE STUDY
As you can see from Table 13.16, all three groups improved between the
pre-test and the post-test. (Unfortunately, the researcher did not report how
many points there were on the total test, so we cannot tell what these mean
scores indicate in any absolute sense, only what they suggest in comparison to
one another.)
We can see from the descriptive statistics in Table 13.16 that the students in
the whole-class group started out the highest (their mean pre-test scores) and
ended up the highest (their mean post-test scores). But did they make the most
progress? Answering that question shows the benefit of a pre-test post-test
design like the one Bejarano used. Having pre- and post-test data allows us to
calculate the groups' gain scores—the difference between their performances on
the pre-test and the post-test.
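The point that the highest-scoring group is not necessarily the group that
gained the most can be illustrated with a small sketch. The group labels and
means below are hypothetical; they are not Bejarano's data from Table 13.16.

```python
# Hypothetical (pre-test mean, post-test mean) for three groups
means = {
    "Group A": (40.0, 46.5),
    "Group B": (44.0, 48.0),
    "Group C": (38.0, 45.0),
}

# Gain score = post-test mean minus pre-test mean
gains = {group: post - pre for group, (pre, post) in means.items()}

highest_post = max(means, key=lambda group: means[group][1])
greatest_gain = max(gains, key=gains.get)

print(f"gain scores: {gains}")
print(f"highest post-test mean: {highest_post}")
print(f"greatest gain: {greatest_gain}")
```

In this made-up example, Group B starts and ends highest, but Group C makes
the most progress.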
REFLECTION
Bejarano's report also provided some data about how well the students did
on the various subtests. One set of data that is particularly interesting consists of
the three groups' listening subtest scores. These are shown in Table 13.18. (The
researcher reports that there was a range of 0 to 57 points in these scores, but she
does not tell us the range for each group or the total points possible.)
REFLECTION
ACTION
Now calculate the gain scores on the listening subtest for these three
groups. Fill in the blanks in the row labeled "Gain Scores" in Table 13.18.
Which group(s) made the greatest improvement?
Bejarano (1987, p. 492) reported the following findings regarding the data
presented in Table 13.16 and Table 13.18, respectively:
Pupils in the discussion group had greater gains than those in the whole-
class situation on the total test: F (1, 465) = 4.23 (p < .05).
Pupils in the discussion group classes had greater gains than those in
the whole-class situation on the listening subtest: F (1, 465) = 11.99
(p < .001).
If you were reading Bejarano's report, how would you interpret these
statements? They are written in a kind of code or shorthand that is familiar to
ACTION
There are several payoffs associated with analyzing quantitative data. The first is
that this approach to investigation is widely used around the world and is
understood by researchers trained in the psychometric tradition. Once you are
familiar with the various statistics and how to interpret them, you will find that you
can understand (well-written) quantitative research articles in professional books
and journals.
The second is that, for better or for worse, people often find statistical
results convincing. (This fact is unfortunate because statistics can be misused.
Newspaper reports are notorious for providing partial data and/or the results of
inappropriate statistical procedures. They can get away with such shoddy
practices because interpreting statistical evidence is so foreign to most people.)
Another benefit of quantitative analyses is that they are compact. Reporting
the mean and the standard deviation for a group of learners speaks volumes to
those who can interpret descriptive statistics, so a great deal of information can
be conveyed in a small space, an important economic consideration for the
publishers of books and print-based journals. For example, look back at the sample
study in Chapter 9, where the authors (Lynch and Maclean, 2000) characterized two
students, Alicia and Daniela, very succinctly in terms of their scores on the
TOEFL, the IELTS, and a dictation.
In addition, regardless of the approach you choose (psychometric,
naturalistic, or action research), numerical results are informative. We will return to
this concept in Chapter 15, where we consider mixed methods studies, those that
involve both quantitative and qualitative data analyses.
As usual, the pitfalls are related to the payoffs. The first is that while people
trained in interpreting statistics can understand quantitative analyses, people
1. Research has been conducted that found no other toothpaste was more
effective than Shiny-White, because Shiny-White performed the same as
the other brands.
2. In the research that was conducted, Shiny-White scored lower than the
others but the difference was not statistically significant.
3. No research has been conducted comparing Shiny-White to other brands
of toothpaste—hence, no other toothpaste has been shown to be more
effective.
REFLECTION
Watch for examples in the popular press when statistics are used to
support a certain point of view. Do you find the examples credible? Why or
why not?
Another issue is that almost all of the inferential statistics are based on
assumptions about the data, assumptions that may not be met in language
classroom research. For example, as we noted above, the independent samples t-test
can only be used to compare the means of two groups on an interval scale when
the two groups' scores are independent of (not influenced by) each other.
Another example is found with the chi-square test. In cases where df = 1 (as in
Sato's [1982] data), a variation of the chi-square statistic called Yates' correction for
continuity, or Yates' correction factor, must be used, though some researchers have
skipped this step. Likewise, in working with Pearson's correlation coefficient, the
measures on the two variables being correlated must meet the following
assumptions (Hatch and Lazaraton, 1991, pp. 549-550):
Pupils in the student teams group had greater gains than those in
the whole-class situation on the listening subtest: F (1, 434) = 8.60
(p < .005).
14
What Is/Are Qualitative Data?
Both quantitative and qualitative data are important in language classroom
research, but when it comes to making sense of research, qualitative data come
first. In saying this, we mean that while qualitative data can be quantified, all
quantitative research must ultimately be referenced against the qualitative
sources that gave rise to them in the first place. For example, a researcher
investigating the 'good language learner' might collect test scores from a sample of
secondary school language learners. 'Good learners' might be operationally
defined as those students who scored better than two standard deviations above the
mean of the sample as a whole. In order for the study to have any value, the
researcher would then need to identify what it was that gave rise to the superior
scores in the first place. (Did these students devote more time to language study?
Did they attempt to activate their language out of class? Did they use a greater
range of learning strategies?) Answers to these questions can only be determined
through the analysis of qualitative data: learner diaries, focused interviews,
responses to open-ended items on questionnaires, and so on.
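The operational definition mentioned above can be made concrete with a short
sketch. The learner codes and test scores are hypothetical, and the variable
names are illustrative. The code only identifies who counts as a 'good
learner' under the definition; it cannot say why those learners scored well,
which is the job of the qualitative data.

```python
from statistics import mean, stdev

# Hypothetical test scores for twelve secondary school learners
scores = {
    "L01": 50, "L02": 52, "L03": 48, "L04": 51, "L05": 49, "L06": 50,
    "L07": 53, "L08": 47, "L09": 50, "L10": 51, "L11": 49, "L12": 90,
}

values = list(scores.values())
# Cut-off: two standard deviations above the sample mean
cutoff = mean(values) + 2 * stdev(values)

good_learners = [learner for learner, score in scores.items() if score > cutoff]

print(f"cut-off score = {cutoff:.2f}")
print(f"good learners under this definition: {good_learners}")
```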
In the subheading to this section of the chapter, we use both the singular and
plural form of the verb to be separated by a slash. We do so to reflect a
useful distinction drawn by Holliday (2002), who suggests that quantitative research
consists of counting occurrences across large populations:
[Figure: dimensions of qualitative data. Data sources: observation/field notes,
interviews, documentary analysis, stimulated recall (videotape), classroom
discourse language data (audiotape), and survey data/questionnaire. Dimensions:
time of collection ("real" versus ex post facto); relation of researcher to data;
What is the researcher's relationship to the study?; How are data linked to
analysis?; How is the interpretation of the data arrived at and by whom? Values
shown include emic, etic, participatory, collaborative, self-generated,
documentary, declaratory, linear/iterative, grounded, a priori, negotiated, and
guided.]
REFLECTION
What are some of the things that could be counted in the following data
sets? Why might a researcher want to count those particular things?
• A narrative account of a typical school day by a teacher or a student
• Observers' notes on a lesson
• Lesson plans and teachers' notes
Meaning Condensation
Working with qualitative data is a different matter from using the kinds of
quantitative analytical procedures described in Chapter 13. When we are analyzing
numerical data, we can use mathematical procedures to calculate the mean, the
standard deviation, and the other descriptive statistics that concisely characterize
the sample in a study. The parallel activity in analyzing qualitative data is a major
challenge: reducing large amounts of text (whether spoken, written, or graphic)
to manageable proportions that allow for patterns in the data to emerge.
One way to accomplish this data reduction is through a technique known as
meaning condensation, which involves abridging free-form questionnaire
responses, interview transcripts, observers' field notes, and so on into shorter
formulations. Long statements are compressed into briefer statements in which the
main sense of what is said is rephrased in a few words. Meaning condensation
thus involves a reduction of large quantities of text into briefer, more succinct
formulations. This process results in condensed statements that are then
subjected to further analysis.
Here is an example of a condensed narrative. It was constructed from a life-
history interview with a second language learner called 'Gloria' (a pseudonym).
In this study, Gloria was identified as a "good language learner" because she had
received a grade of A on the Use of English exam at the end of high school. Her
retrospective story was part of an investigation into the lifelong language
learning experiences of fifty language learners (Nunan, 2007b). The interview ran to
thousands of words. The condensed narrative is just over a thousand words.
I had no contact with English at all out of class, unless you consider doing
English homework as contact. Extra-curricular activities after school were
mainly sports. There was nothing in English.
When I got to years 5 and 6, I still didn't think that English was very important.
We prepared for the Academic Aptitude Test, but the emphasis was on Chinese
and mathematics. We didn't have any special preparation for English or extra
homework, so I didn't think that it was important. I remember that the focus in
class was on grammar—memorizing tenses and that sort of thing.
After primary school, I went to an English-medium secondary school. In the
beginning, what that meant was that for many subjects the textbook was in
English. In class, the teachers spoke Chinese because their job was to make sure
we understood, and the best way to do that was through Chinese.
Although we had a School English Society, my friends and I never thought of
joining it on our own initiative. We thought more about what sports we would
play when we joined the Sports Club. English wasn't an activity that you could
use or have fun with, it was a subject that you had to study and learn.
When I started in high school, I had more contact with English because it was an
English-medium school and the teacher more-or-less had to speak English.
Then my view of English began to change. I began to see that in addition to
being a subject to be studied, it could also be used as a tool to study other
subjects. For example, I studied history, and classes were conducted in English, so
English became more important. In most classes the teachers used a mixture of
Cantonese and English, probably fifty-fifty. There was a lot of switching
between languages. Some people say this is bad, but the main thing is that the
teachers use language that we can understand. What's the point of teaching a
perfect lesson in English if we can't understand? So Chinese played an important
part, even in English class.
In senior high school, the most important influences were the public
examinations and preparing for them. English was now more important than other
subjects because I needed it to learn the other subjects. Also, the English exams were
different. In the past, you only had to know grammar and vocabulary, but now
you needed a much deeper understanding because you were tested on listening
and speaking. The public exams completely dominated my life because my
future depended on getting good results, and getting good results required good
English. Everything we did was based on the exams. What it tested, we learned!
But I also started to see the importance of English out of class. I realized that I
needed the language if I wanted to communicate with other people. When I was
REFLECTION
A 'Grounded' Approach to Data Analysis
Lincoln and Guba (1985) called their pioneering approach to qualitative
research grounded because the analytical categories emerge from the data rather
than being imposed on them. Analysts working within this tradition use
inductive reasoning processes, in contrast with deductive approaches. Deductive
reasoning begins with a theory and looks for data to confirm or disconfirm that
theory (as in experimental research). In contrast, inductive reasoning begins with
data and ends up with a theory:
[In the] grounded theory approach, the researcher begins with the data
and through analysis (searching for salient themes or categories and
arranging these to form explanatory patterns) arrives at an understanding
of the phenomenon under investigation. These themes and patterns do
not simply jump out at the researcher; discovering them requires a
systematic approach to analysis based on familiarity with related literature
and research experience. (Ellis and Barkhuizen, 2005, pp. 254-255)
Ellis and Barkhuizen go on to point out that inductive and deductive approaches
should be seen as either end of a continuum, rather than as a pair of binary
opposites. Qualitative research (or any other kind of research, for that matter)
cannot be entirely based on one or the other.
ACTION
The following statements are some data David Nunan collected in a
workshop he ran for English teachers in Hong Kong. (The teachers were asked
to describe three beliefs they have about language development that
influence the way they teach.) Do a key word analysis and assign the data to
categories. (Hint: In the original study, these statements were organized into
six categories.)
Here is the categorization of these comments from the original study (Nunan,
1993):
IMMERSION
Children need to be immersed in all types of writing/reading literature.
All children benefit from immersion of [sic] the written print.
LEARNING BY DOING/EXPERIENTIAL LEARNING
Children's language develops through experiences so in order for the children to
gain the most out of any given lesson, many experiences should be given.
Children learn by using the language.
LANGUAGE ACROSS THE CURRICULUM
It occurs across the curriculum and therefore should not be seen as a separate
subject.
Language develops through all curriculum areas.
GRAMMAR, STRUCTURE, CORRECTNESS
A child needs to be aware of basic grammatical structures.
I believe grammar, spelling and reading are the basis for language development.
ORAL/WRITTEN LANGUAGE RELATIONSHIPS
Spoken language should be mastered before written.
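Once a set of categories and key words has been settled on, the assignment
step of a key word analysis can be sketched in code. The key word stems below
are hypothetical and were chosen here purely for illustration. In a grounded
analysis they would come from repeated reading of the data rather than being
fixed in advance, so this sketch shows only the mechanical part of the
procedure.

```python
# Hypothetical key word stems for five of the categories listed above
CATEGORY_KEY_WORDS = {
    "IMMERSION": ["immers"],
    "LEARNING BY DOING/EXPERIENTIAL LEARNING": ["experienc", "using the language"],
    "LANGUAGE ACROSS THE CURRICULUM": ["curriculum"],
    "GRAMMAR, STRUCTURE, CORRECTNESS": ["gramma", "spelling"],
    "ORAL/WRITTEN LANGUAGE RELATIONSHIPS": ["spoken"],
}

def categorize(statement):
    """Return every category whose key word stems appear in the statement."""
    text = statement.lower()
    return [category
            for category, stems in CATEGORY_KEY_WORDS.items()
            if any(stem in text for stem in stems)]

# Statements taken from the categorized list above
statements = [
    "Children need to be immersed in all types of writing/reading literature.",
    "Children learn by using the language.",
    "Language develops through all curriculum areas.",
    "A child needs to be aware of basic grammatical structures.",
    "Spoken language should be mastered before written.",
]

for statement in statements:
    print(categorize(statement), "<-", statement)
```

A statement that matches no stems, or more than one category, is a useful
prompt to go back to the data and reconsider the categories.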
REFLECTION
Compare the categories you developed with those listed above. How
similar were they? Did they overlap or diverge? In cases where there was
divergence, what explains the differences? If you are working with classmates or
colleagues, compare your categories to theirs.
REFLECTION
Think about a research question that has interested you as you think
about conducting your own classroom-based or classroom-oriented studies.
How might you use the card sort technique to work with the data in that
investigation?
Data Tagging
Data tagging is a procedure in which information about a piece of text is tagged and
embedded into the text. When the text is analyzed, the computer can be instructed
to find and extract items of interest. For example, particular grammatical features
The tags can then be retrieved using a procedure similar to that used in
concordancing (see below). The search lists all of the errors bearing a certain
code, along with the immediate linguistic environment in which they
appear. Here are some lines from the output of a search for errors bearing the
code XNPR.
1. He couldn't turn the water on. And he got badly burned. It hap
pened in Mar
2. anyone in the neighborhood who got broken into recently? I
know
3. any extra precautions since the car got broken into last time?
Er well, I
4. he jilted her at the altar. So she got brought up by her
grandmother
5. she's been a bit nervous ever since we got burgled and dark nights
6. you know of? They got burgled. They got burgled once. Yeah.
That was a while
7. by crime that you know of. They they got burgled. They got bur
gled once.
8. done that so I suppose I could have got caned. Yeah. And as
you've gone
9. fool for being honest. You know he got called an idiot for being
honest
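A retrieval procedure of this kind can be sketched in a few lines. The inline
tag syntax used here (`<XNPR>...</XNPR>`) is a hypothetical convention adopted
only for this illustration, since the tagging scheme behind the output above
is not shown; the function name and the sample text are also illustrative.

```python
import re

def find_tagged(text, code, width=30):
    """Return (left context, tagged item, right context) for each tagged item."""
    # Hypothetical tag format: <CODE>item</CODE>
    pattern = re.compile(r"<%s>(.*?)</%s>" % (re.escape(code), re.escape(code)))
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(1), right))
    return hits

# Sample text with two items tagged by hand for this sketch
sample = ("He couldn't turn the water on. And he <XNPR>got badly burned</XNPR>. "
          "They <XNPR>got burgled</XNPR> once.")

hits = find_tagged(sample, "XNPR")
for left, item, right in hits:
    print(f"{left:>30} [{item}] {right}")
```

Each line of output shows a tagged item together with a window of surrounding
text, which is the same idea as the concordance-style listing above.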
Coder Agreement Indices
One concern about analyzing qualitative data that relates to reliability is the issue
of subjectivity. How much of what we find has come out of the data and how
much of the interpretation has been inserted by the researcher? Would different
researchers find the same thing in a data set? One way to sort out this problem is
to determine intercoder agreement, an index of the consistency with which
different people categorize the same data. (This construct is analogous to interrater
reliability in quantitative research.) A simple percentage is calculated by dividing
the number of items upon which coders agree by the total number of items that
were coded. The general rule of thumb is that intercoder agreement should be at
least 85% for readers to have confidence in the reported findings (Allwright and
Bailey, 1991).
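The percentage described above is simple to compute. In this minimal sketch,
the category labels and the two coders' decisions are hypothetical, and the
function name is illustrative.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items that two coders placed in the same category."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same items")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * agreements / len(codes_a)

# Hypothetical codes assigned to the same ten items by two coders
coder_1 = ["question", "feedback", "question", "explanation", "feedback",
           "question", "explanation", "feedback", "question", "explanation"]
coder_2 = ["question", "feedback", "question", "explanation", "question",
           "question", "explanation", "feedback", "question", "explanation"]

agreement = percent_agreement(coder_1, coder_2)
# Rule of thumb cited in the text: at least 85% agreement
meets_rule_of_thumb = agreement >= 85

print(f"intercoder agreement = {agreement:.1f}%")
print(f"meets the 85% rule of thumb: {meets_rule_of_thumb}")
```

Looking at the items on which the coders disagreed is often as informative as
the percentage itself, because those items point to ambiguous categories.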
Using these steps to establish indices of coder agreement can help you locate
holes or ambiguities in your coding categories. In addition, acceptably high
inter- or intracoder agreement indices can give your readers confidence in the
categories you use to analyze your data.
You can use the card sort technique as a quality control mechanism.
When you have used the card sort technique to establish categories, you can
have a colleague sort the same cards and see if the same categories emerge. As a
different way of using the card sort to check your categories, you can provide your
colleague with the descriptors for the categories you wish to use and have that
person distribute all the cards into those categories. Where there are cards that do
not fit well in an existing category, discuss these items with your colleague. You
may find that you need to adjust the descriptors and/or add new ones.
Member Validation
One procedure for checking the validity of qualitative data analyses is called
member validation or member checking. It is used to determine whether qualitative
data are convincing as evidence. (Some researchers [e.g., Dörnyei, 2007] also use
the terms respondent feedback or respondent checking.) This step involves asking
people (the members) in the culture under investigation (whether it is a school,
a department, or a classroom) to review the data and the interpretation thereof
to provide the researcher with feedback. So, for example, when Nunan (1993)
had Gloria and the other good language learners in his study verify his
condensations of their lengthy narratives, he was using a member checking strategy.
According to Richards, such validation "involves more than simply asking
A SAMPLE STUDY
REFLECTION
Answer the following questions about this study, based on what you know
so far:
1. What is the design of the study? (This is a trick question. Even though
the data were qualitatively collected and analyzed, the investigation
used one of the research designs from the psychometric approach.)
2. What is the independent variable and how many levels did it have?
3. What are some control variables in this study?
4. Based on your reading and your own experiences as a teacher and a
student, what do you think the NNS TAs' communication problems
might have been?
[Figure: observational data grouped by subject (Math, Physics) and course level
(Basic through Advanced), with totals for each subject, leading to a profile of
each TA (e.g., Profile of TA 1)]
In the process, five clear types of teaching styles emerged among these
twenty-four teaching assistants. As mentioned above, these were the (1) inspir
ing cheerleaders, (2) entertaining allies, (3) knowledgeable helpers and casual
friends, (4) mechanical problem solvers, and (5) active but unintelligible TAs. In
the research report, each type is described and then illustrated by a profile of
one of the TAs who represented that type. This process of condensation and
REFLECTION
Based on the five category labels above, which types of TAs do you predict
would be seen as successful and which would be seen as less successful in
the opinion of the undergraduate students whom these TAs taught?
As noted at the outset of this chapter, qualitative data are powerful. They are
more accessible to readers without statistical training than are the kinds of
sophisticated quantitative analyses that appear in many published journal articles.
They can be used to explain important concepts to teachers, administrators,
journalists, and parents in human terms, while quantitative data sometimes seem
too abstract and detached or—conversely—too concrete and impersonal.
There are many pitfalls in choosing to collect and then in analyzing
qualitative data. The first, especially for novice researchers, is the possibility of getting
overwhelmed by the sheer volume of data. Detailed field notes about an hour-
long lesson can run to thousands of words.
If you are generating handwritten observational field notes during a lesson,
you may choose to word process the data before you analyze them. (We recommend
that you do so.) While word processing is initially time-consuming, having the
data stored electronically can save you time in the long run and will make the
analysis easier.
The length of qualitative data sometimes creates problems when researchers
try to publish their findings. Most journals and many anthologies impose page
restrictions on manuscripts submitted for publication. It is very difficult to
provide a convincing analysis of reams and reams of data in a short article.
Sometimes researchers choose to focus on very particular issues in reporting their
findings. In some cases, lengthy studies have been published as journal articles or
book chapters. For example, Schmidt and Frota's (1986) analysis of Schmidt's
learning of Portuguese in Brazil is eighty-nine pages long.
Sometimes qualitative data analyses don't pan out. For example, Bailey
originally coded her observational field notes on math and science teaching
assistants' lessons using a category system designed to analyze classroom
discourse (Sinclair and Coulthard, 1975). The categories had seemed promising in
terms of addressing the research question, but they did not really reveal the core
issues separating the successful and less successful TAs. It was not until she tried
the summarizing process described above that Bailey was able to arrive at a
3. Think about a study you would like to conduct. Focus on the research
question(s). What kinds of qualitative data would you want to collect? How
would you want to analyze those data? Sketch out your ideas and share
them with a classmate or colleague.
4. Ask a language learner you know to tape-record or write his or her
language learning history. Then create a condensed narrative from that
narrative.
5. Analyze some of your own qualitative data using the card sort technique. If
possible, compare your results with another reader. What similarities and
differences are there between the two analyses?
6. If you are teaching or taking a language class, try to write a report of a typical
day. (For an example, turn to Chapter 7 for van Lier's [1996a] description
of a typical day in a bilingual school in Peru.) As you do so, be aware
of the mental processes you use in deciding what information to include
and what to exclude.
15
In this concluding chapter, we build upon some of the main themes that have
been presented earlier in the book. Looking back, you will recall that after providing
an overview of language classroom research in Chapters 1, 2, and 3, we
turned to issues related to planning and implementing research (Chapters 4
through 8). The focus then shifted to data collection (in Chapters 9, 10, and 11)
and data analysis (Chapters 12, 13, and 14). We realize that we have covered
quite a bit of material in the foregoing pages, so the intent in this chapter is to
help you put it all together.
To that end, in this chapter we review some issues raised earlier and discuss
ways of combining qualitative and quantitative procedures in your own proposals
and subsequent investigations. We also look at the practicalities of doing research
and suggest some steps that you can take to make the enterprise more
successful and more satisfying. Finally, we address issues related to reporting on
your research findings, whether in formal or informal contexts, both in writing
and in oral presentations.
Research as it is presented in scholarly books, journal articles, and formal
monographs is in many ways a misrepresentation. These publications seem to
suggest that research is neat and tidy, and that it flows logically and irrevocably
from abstract to conclusion. But such reports are products. Some of the
processes by which the products were arrived at are reported, but some can only
be inferred. The missteps, blind alleys, false starts, and frustrations that the researchers
encountered in the process of arriving at the final products are rarely
discussed (see Gass and Schachter, 1996). Dingwall (1984) contrasts the messiness
of the research process with the resulting product (be it a book, paper, or
conference presentation) as follows:
REFLECTION
Are you familiar with any mixed methods studies? If so, think about one
that interests you. What combination of qualitative and quantitative procedures
did it involve?
Some studies that are predominantly naturalistic in their approach have nevertheless
used quantified data to support the authors' claims and provide information
about the participants. For example, in the sample study by Lynch and
Qualitative
    Data collection: Field notes written during and after observations in the
    regularly scheduled classes of 24 TAs (12 NSs and 12 NNSs)
    Data analysis: Field notes summarized and profiles generated for each TA;
    24 profiles compared in order to identify types of TAs
Quantitative
    Data collection: Students' end-of-course evaluations of the TAs collected
    using the university's regular numerical rating scales
    Data analysis: Mean scores computed for each TA type; testing for statistically
    significant differences across the ratings of the various types of TAs
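The quantitative strand summarized above (mean scores per TA type, followed by a test for differences across types) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the original study: the ratings are invented, the group labels `type_a` to `type_c` are hypothetical placeholders, and a one-way analysis of variance is assumed here as one common way to compare several group means.

```python
from statistics import mean

# Hypothetical ratings on a 1-5 scale, grouped by TA type.
# These numbers are invented for illustration; they are NOT data from the study.
ratings_by_type = {
    "type_a": [4.5, 4.7, 4.6],
    "type_b": [3.9, 4.1, 4.0],
    "type_c": [3.0, 3.2, 3.1],
}


def group_means(groups):
    """Return the mean rating for each group."""
    return {name: mean(scores) for name, scores in groups.items()}


def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA across the groups."""
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = mean(all_scores)
    k = len(groups)       # number of groups
    n = len(all_scores)   # total number of ratings
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Variation of individual ratings around their own group mean.
    ss_within = sum(
        (s - mean(scores)) ** 2
        for scores in groups.values()
        for s in scores
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


if __name__ == "__main__":
    print(group_means(ratings_by_type))
    print(one_way_anova_f(ratings_by_type))
```

With these invented numbers the group means are about 4.6, 4.0, and 3.1, and the F statistic is approximately 171. The F statistic alone does not give a significance level; if SciPy is available, `scipy.stats.f_oneway` accepts the groups' score lists as separate arguments and returns both the statistic and a p-value.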
REFLECTION
What pattern do you notice in Table 15.2 when you examine the mean
scores from the students' evaluations across the five TA groups in terms of
their overall effectiveness and outside helpfulness?
The stable pattern of change across the mean scores in the student evaluation
data shows that the students did indeed evaluate teaching assistants who
used some teaching styles in the typology more favorably than they did others.
That is, the inspiring cheerleaders were rated more highly than the entertaining
If you are completing a master's degree or doctoral studies, you will probably
have to create a research plan for your proposed study and get it approved by a
committee or your research supervisor before you proceed. If you are a researcher
seeking either internal or external funding, you will have to convince
the funding agency that you have a clear plan as well as the knowledge and skills
to carry it out. Even if you are working completely independently and do not
need approval from any other individual or group to carry out your investigation,
we encourage you to develop a research plan before you begin. Doing so
will save you time and trouble later.
A formal plan for research is called a proposal because the author proposes
the study to a group or an advisor by submitting a document that clearly articulates
the plan. In graduate programs, the proposal must typically be approved by
a faculty committee before the researcher begins the study. In other contexts,
Look back at the research on learning styles by Nunan and Wong (2006),
which served as the sample study in Chapter 5. What research question(s)
did the authors pose? Did they use qualitative or quantitative data collection
and analysis procedures, or both?
ACTION
Look back at the sample studies that conclude the chapters of this book.
Try to categorize two or three of them using Grotjahn's classification
system for identifying research paradigms (see Table 15.3).
Paradigm Number and Label    Research Design    Data Collection    Data Analysis
ACTION
Think of a study that you would like to conduct. Draft the statement about
the research that you would give potential participants in order to obtain
their informed consent to be involved in the study.
Data Analysis Procedures
Does the research entail statistical analyses, interpretive analyses, or both?
Given the research questions and the type(s) of data, what statistical and/or
analytical tools are appropriate?
Is it necessary or desirable to quantify qualitative data? If so, how will this
quantification be done?
Do you, as a researcher, have the necessary skills to carry out the statistical
analysis? If not, is consulting help available?
Results
What are the actual outcomes of the research?
Does the investigation answer the research question(s) originally posed?
Does it answer other questions as well (or instead)?
Are the results consistent with the findings of similar studies?
Are there any contradictory findings? If so, how can these be accounted for?
What are the implications of the findings for practice?
What additional questions and suggestions for further research are prompted by
the research?
Presentation
How can the research best be presented?
Who is/are the appropriate audience(s) for this study?
What form(s) will the research report take—monograph, thesis, journal article,
conference presentation?
Our research students have found it useful to get evaluative feedback from
fellow students at different stages in the research process. Peer feedback can be
extremely valuable, particularly if it is from students who are a stage or two
ahead in the research process. In their guide to writing qualitative research proposals,
Marshall and Rossman (1995) state,
The experiences of our graduate students suggest that the support of
peers is crucial for the personal and emotional sustenance that students
find so valuable in negotiating among faculty whose requests and
demands may be in conflict with one another. Graduate seminars or
advanced courses in qualitative methods provide excellent structures
for formal discussions as students deal with issues arising from role
REFLECTION
SAMPLE STUDY
REFLECTION
CONCLUSION
In this final chapter, we have revisited some of the themes, issues, and concerns
that run through this book. We have considered combining quantitative and
qualitative data collection and data analysis procedures to produce mixed methods
studies. We have suggested a sequence of steps to take in developing a research
plan and reviewed the steps involved in planning and conducting
language classroom research. Our hope is that this chapter has both introduced
new ideas and reminded you of concepts introduced earlier in order to help you
put it all together in designing and carrying out your own language classroom
research.
I. Background
General area
Title
The problem or issue
Purpose of the study
Likely importance of the study
Background to the study
The aims and justification of the study
Limitations of the study
What the study proposes to do
II. Literature review
A plan for the literature review including headings and subheadings
Resources (books, journals, Web sites, etc.) to be consulted
III. Research design
The research questions or hypotheses
Definitions of terms
Sampling strategies
Subjects, participants, or informants
Data collection methods
Data analysis procedures
IV. Presentation and dissemination
Statement on how the results will be presented
Suggestions as to which conferences and/or publications would
be appropriate
In an earlier chapter, we suggested you read the book Second Language Classroom
Research: Issues and Opportunities, edited by Schachter and Gass (1996). We remind
you of that book here because the authors in that volume intentionally discussed
the problems they encountered in their studies in hopes that other
researchers would benefit from a candid discussion of the sorts of difficulties that
arise. In particular, the chapters by Markee (1996) and Polio (1996) are useful.
For helpful advice on writing research proposals and reports, we recommend
Dornyei (2007), Galvan (1999), and Hatch and Lazaraton (1991). For
guidance on planning qualitative research, see Marshall and Rossman (1995;
1999; and 2006).
Some ethical considerations in proposing and conducting language classroom
research are discussed by McKay (2006).
Applewhite, A., Evans III, W., & Frothingham, A. (2003). And I quote: The definitive collection of quotes, sayings, and jokes for the contemporary speechmaker. New York: Thomas Dunne Books.
Asher, J. J., Kusudo, J. A., & de la Torre, R. (1993). Learning a second language through commands: The second field test. In J. W. Oller, Jr. (Ed.), Methods that work: Ideas for literacy and language teachers (2nd ed., pp. 13-21). Boston: Heinle & Heinle.
Auerbach, E. (1994). Participatory action research. In A. Cumming, Alternatives in TESOL research: Descriptive, interpretive and ideological orientations. TESOL Quarterly, 28(4), 673-703.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests. Oxford: Oxford University Press.
Bailey, K. M. (1981). An introspective analysis of an individual's language learning experience. In S. Krashen & R. Scarcella (Eds.), Issues in second language acquisition: Selected papers of the Los Angeles Second Language Research Forum (pp. 58-65). Rowley, MA: Newbury House.
Bailey, K. M. (1982). Teaching in a second language: The communicative competence of non-native speaking teaching assistants. Ph.D. dissertation in Applied Linguistics, University of California, Los Angeles.
Bailey, K. M. (1983a). Some illustrations of Murphy's Law from classroom centered research on language use. TESOL Newsletter, 17(4), August. (Reprinted in Selected articles from the TESOL Newsletter, 1966-1983, pp. 81-85, by J. F. Haskell, Ed., Washington, DC: TESOL).
Bailey, K. M. (1983b). Competitiveness and anxiety in adult second language learning: Looking at and through the diary studies. In H. W. Seliger & M. H. Long (Eds.), Classroom oriented research in second language acquisition (pp. 67-102). Rowley, MA: Newbury House.
Bailey, K. M. (1984). A typology of teaching assistants. In K. M. Bailey, F. Pialorsi, & J. Zukowski/Faust (Eds.), Foreign teaching assistants in U.S. universities (pp. 110-125). Washington, DC: NAFSA.
Bailey, K. M. (1990). The use of diary studies in teacher education programs. In J. C. Richards & D. Nunan (Eds.), Second language teacher education (pp. 215-226). Cambridge: Cambridge University Press.
Bailey, K. M. (1991). Diary studies of classroom language learning: The doubting game and the believing game. In E. Sadtono (Ed.), Language acquisition and the second/foreign language classroom (Anthology Series 28), (pp. 60-102). Singapore: SEAMEO Regional Language Center.
Bailey, K. M. (1992). The processes of innovation in language teacher development: What, why and how teachers change. In J. Flowerdew, M. Brock, & S. Hsia (Eds.), Perspectives on second language teacher education (pp. 253-282). Hong Kong: City Polytechnic of Hong Kong.
Bailey, K. M. (1996). The best laid plans: Teachers' in-class decisions to depart from their lesson plans. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 15-40). New York: Cambridge University Press.
Bailey, K. M. (1998a). Approaches to empirical research in instructional language settings. In H. Byrnes (Ed.), Learning foreign and second languages: Perspectives in research and scholarship (pp. 75-104). New York: Modern Language Association of America.
Bailey, K. M. (1998b). Learning about language assessment: Dilemmas, decisions and directions. Boston: Heinle & Heinle.
Bailey, K. M. (2001a). Action research, teacher research, and classroom research in language teaching. In M. Celce-Murcia (Ed.), Teaching English as a second or foreign language (3rd ed., pp. 489-498). Boston: Heinle & Heinle.
Bailey, K. M. (2001b). What my EFL students taught me. The PAC Journal, 7, 7-31.
Bailey, K. M. (2005). Looking back down the road: A recent history of language classroom research. Review of Applied Linguistics in China: Issues in Language Learning and Teaching, 1, 6-47.
Bailey, K. M. (2006). Language teacher supervision: A case-based approach. New York: Cambridge University Press.
Bailey, K. M., Bergthold, B., Braunstein, B., Fleischman, N. J., Holbrook, M. P., Tuman, J., Waissbluth, X., & Zambo, L. (1996). The language learner's autobiography: Examining the "apprenticeship of observation." In D. Freeman & J. C. Richards (Eds.), Teacher learning in language teaching (pp. 11-29). New York: Cambridge University Press.
Bailey, K. M., Curtis, A., & Nunan, D. (2001). Pursuing professional development: The self as source. Boston: Heinle & Heinle.
Bailey, K. M., & Nunan, D. (Eds.) (1996). Voices from the language classroom: Qualitative research on language education. New York: Cambridge University Press.
Bailey, K. M., & Ochsner, R. (1983). A methodological review of the diary studies: Windmill tilting or social science? In K. M. Bailey, M. H. Long, & S. Peck (Eds.), Second language acquisition studies (pp. 188-198). Rowley, MA: Newbury House.
Bailey, K. M., & Saunders, S. (1998, March). Relationships among course objectives, self-assessment, and achievement in a learning strategies course. Paper presented at the Language Testing Research Colloquium, Monterey, California.
Barlow, M. (2005). Computer-based analyses of learner language. In R. Ellis & G. Barkhuizen, Analyzing learner language (pp. 335-357). Oxford: Oxford University Press.
Bateson, G. (1972). Steps to an ecology of mind. New York: Ballantine.
Beatty, K. (2003). Teaching and researching computer-assisted language learning. London: Longman.
Beaumont, M., & O'Brien, T. (2000). Collaborative research in second language education. Stoke-on-Trent, England: Trentham Books.
Bejarano, Y. (1987). A cooperative small-group methodology in the language classroom. TESOL Quarterly, 21(3), 483-504.
Bell, J. (1987). Doing your research project. Milton Keynes, England: Open University Press.
Benson, P., & Nunan, D. (2005). Learners' stories: Difference and diversity in language learning. New York: Cambridge University Press.
Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge: Cambridge University Press.
Birch, G. J. (1992). Language learning case study approach to second language teacher education. In J. Flowerdew, M. N. Brock, & S. Hsia (Eds.), Perspectives on second language teacher education (pp. 283-294). Hong Kong: City Polytechnic of Hong Kong.
Bley-Vroman, R., & Chaudron, C. (1994). Elicited imitation as a measure of second-language competence. In E. E. Tarone, S. M. Gass, & A. D. Cohen (Eds.), Research methodology in second-language acquisition (pp. 245-262). Hillsdale, NJ: Lawrence Erlbaum Associates.
Block, D. (1996). A window on the classroom: Classroom events viewed from different angles. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 168-194). New York: Cambridge University Press.
Block, D. (1998). Tale of a language learner. Language Teaching Research, 2(2), 148-176.
Borg, S., & Farrell, T. S. C. (Eds.). (2007). Language teacher research in Europe. Alexandria, VA: TESOL.
Bowers, R. (1980). Verbal behaviour in the language teaching classroom. Unpublished doctoral dissertation, University of Reading, England.
Braine, G. (1994). Comments on A. Suresh Canagarajah's "Critical ethnography of a Sri Lankan classroom." TESOL Quarterly, 28(3), 609-623.
Brinton, D., & Holten, C. (1989). What novice teachers focus on: The practicum in TESL. TESOL Quarterly, 23, 343-350.
Brinton, D. M., Snow, M. A., & Wesche, M. (2003). Content-based second language instruction. Ann Arbor: University of Michigan Press.
Brock, C. (1986). The effect of referential questions on ESL classroom discourse. TESOL Quarterly, 20(1), 47-58.
Brock, M. N., Yu, B., & Wong, M. (1992). "Journaling" together: Collaborative diary-keeping and teacher development. In J. Flowerdew, M. N. Brock, & S. Hsia (Eds.), Perspectives on second language teacher development (pp. 295-307). Hong Kong: City University of Hong Kong.
Brown, C. (1985a). Two windows on the classroom world: Diary studies and participant observation differences. In P. Larson, E. L. Judd, & D. S. Messerschmitt (Eds.), On TESOL '84: Brave new world for TESOL (pp. 121-134). Washington, DC: TESOL.
Brown, C. (1985b). Requests for specific language input: Differences between older and younger adult language learners. In S. Gass & C. Madden (Eds.), Input in second language acquisition (pp. 272-284). Rowley, MA: Newbury House.
Brown, H. D. (2004). Language assessment: Principles and classroom practices. White Plains, NY: Longman.
Brown, J. D. (1988). Understanding research in second language learning: A teacher's guide to statistics and research design. Cambridge: Cambridge University Press.
Brown, J. D. (2001). Using surveys in language programs. Cambridge: Cambridge University Press.
Brown, J. D. (2005). Testing in language programs: A comprehensive guide to English language assessment. New York: McGraw-Hill.
Brown, J. D., & Rodgers, T. S. (2002). Doing second language research. Oxford: Oxford University Press.
Brown, R. (1973). A first language: The early stages. London: Allen and Unwin.
Brumfit, C., & Mitchell, R. (1990a). The language classroom as a focus for research. In C. Brumfit & R. Mitchell (Eds.), Research in the language classroom: ELT Documents, 133 (pp. 3-15). London: Modern English Publications and British Council.
Brumfit, C., & Mitchell, R. (Eds.). (1990b). Research in the language classroom: ELT Documents, 133. London: Modern English Publications and British Council.
Bruner, J. (1983). Child's talk: Learning to use language. New York: Norton.
Burling, R. (1978). Language development of a Garo and English-speaking child. In E. M. Hatch (Ed.), Second language acquisition: A book of readings (pp. 54-75). Rowley, MA: Newbury House.
Burnaford, G., Fischer, J., & Hobson, D. (1996). Teachers doing research: Practical possibilities. Mahwah, NJ: Lawrence Erlbaum Associates.
Burns, A. (1995). Teacher-researchers: Perspectives on teacher action research and curriculum renewal. In A. Burns & S. Hood (Eds.), Teachers' voices: Exploring course design in a changing curriculum (pp. 3-29). Sydney: NCELTR, Macquarie University.
Burns, A. (1997). Valuing diversity: Action researching disparate learner groups. TESOL Journal, 7(1), 6-9.
Burns, A. (1999). Collaborative action research for English language teachers. Cambridge: Cambridge University Press.
Burns, A. (2000). Facilitating collaborative action research: Some insights from the AMEP. Prospect: A Journal of Australian TESOL, 15(3), 23-34.
Burns, A. (2004). Action research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 241-256). Mahwah, NJ: Lawrence Erlbaum Associates.
Burns, A., & Burton, J. (Eds.). (2008). Language teacher research in Australia and New Zealand. Alexandria, VA: TESOL.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1995). Exploring course design in a changing curriculum: Teachers' voices 1. Sydney: National Centre for English Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1997). Teaching disparate learner groups: Teachers' voices 2. Sydney: National Centre for English Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1999a). Teaching critical literacies: Teachers' voices 3. Sydney: National Centre for English Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1999b). Staying learner-centred in a competency-based curriculum: Teachers' voices 4. Sydney: National Centre for English Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (1999c). Teaching casual conversation: Teachers' voices 6. Sydney: National Centre for English Language Teaching and Research, Macquarie University.
Burns, A., de Silva Joyce, H., & Hood, S. (Eds.). (2000). A new look at reading practices: Teachers' voices 5. Sydney: National Centre for English Language Teaching and Research, Macquarie University.
Burns, A., & Richards, J. C. (in press). Cambridge guide to language teacher education. Cambridge: Cambridge University Press.
Burton, J., & Burns, A. (Eds.). (2008). Language teacher research in Australia and New Zealand. Alexandria, VA: TESOL.
Busch, M. (1993). Using Likert scales in L2 research. TESOL Quarterly, 24(4), 733-736.
Campbell, C. C. (1996). Socializing with the teachers and prior language learning experience: A diary study. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 201-223). New York: Cambridge University Press.
Campbell, D. T., & Stanley, J. C. (1963). Experimental and non-experimental designs for research. Washington, DC: American Educational Research Association.
Canagarajah, A. S. (1993). Critical ethnography of a Sri Lankan classroom: Ambiguities in student opposition to reproduction through ESOL. TESOL Quarterly, 27(4), 601-625.
Carless, D. (1999). Catering for individual learner differences: Primary school teachers' voices. Hong Kong Journal of Applied Linguistics, 4(2), 15-40.
Carr, W., & Kemmis, S. (1986). Becoming critical: Education, knowledge and action research. London: Falmer.
Carrasco, R. L. (1981). Expanded awareness of student performance: A case study in applied ethnographic monitoring of a bilingual classroom. In H. T. Trueba, G. P. Guthrie, & H. P. Au (Eds.), Culture and the bilingual classroom: Studies in classroom ethnography (pp. 153-177). Rowley, MA: Newbury House.
Carroll, M. (1994). Journal writing as a learning and research tool in the adult classroom. TESOL Journal, 4, 19-22.
Carson, J. G., & Longhini, A. (2002). Focusing on learning styles and strategies: A diary study in an immersion setting. Language Learning, 52(2), 401-438.
Carter, R., Goddard, A., Reah, D., Sanger, K., & Bowering, M. (2001). Working with texts: A core introduction to language analysis (2nd ed.). London: Routledge.
Carter, R., & Nunan, D. (Eds.). (2001). The Cambridge guide to teaching English to speakers of other languages. Cambridge: Cambridge University Press.
Celce-Murcia, M. (1978). The simultaneous acquisition of English and French in a two-year-old child. In E. M. Hatch (Ed.), Second language acquisition: A book of readings (pp. 38-53). Rowley, MA: Newbury House.
Chamot, A. U. (1995). The teacher's voice: Research in your classroom. ERIC/CLL News Bulletin, 19(2), 1, 5.
Chan, Y. H. (1996). Action research as professional development for ELT practitioners. Working Papers in ELT and Applied Linguistics, 2(1), 17-28. Hong Kong: Hong Kong Polytechnic University.
Chaudron, C. (1986). The interaction of quantitative and qualitative approaches to research: A view of the second language classroom. TESOL Quarterly, 20(4), 709-717.
Chaudron, C. (1988). Second language classrooms: Research on teaching and learning. Cambridge: Cambridge University Press.
Choi, J. (2006). A narrative analysis of second language acquisition and identity formation. Unpublished master's of science dissertation, Anaheim University, Anaheim, California.
Christison, M. A. (2003). Learning styles and strategies. In D. Nunan (Ed.), Practical English language teaching (pp. 267-288). New York: McGraw-Hill.
Christison, M. A., & Bassano, S. (1995). Action research: Techniques for collecting data through surveys and interviews. The CATESOL Journal, 8(1), 89-103.
Christison, M. A., & Nunan, D. (2001, March). Pedagogical functions in synchronous e-learning interaction: The online conversation class. Paper presented at the international TESOL Convention, St. Louis, Missouri.
Clark, J. L. D. (1969). The Pennsylvania Project and the audiolingual vs. traditional question. Modern Language Journal, 53, 388-396.
Clarke, M. A. (1995, March). Ideology, method, style: The importance of particularizability. Paper presented at the international TESOL Convention, Long Beach, California.
Cleghorn, A., & Genesee, F. (1984). Languages in contact: An ethnographic study of interaction in an immersion school. TESOL Quarterly, 18(4), 595-625.
Coffey, A., & Atkinson, P. (1996). Making sense of qualitative data. Thousand Oaks, CA: Sage.
Cohen, A. D., & Hosenfeld, C. (1981). Some uses of mentalistic data in second language research. Language Learning, 31(2), 285-313.
Cohen, J. M., & Cohen, M. J. (1980). The Penguin dictionary of modern quotations (2nd ed.). London: Penguin.
Cohen, L., & Manion, L. (1985). Research methods in education. London: Croom Helm.
Cole, R., Raffier, L. M., Rogan, P., & Schleicher, L. (1998). Interactive group journals: Learning as a dialogue among learners. TESOL Quarterly, 32(3), 556-568.
Cooley, L., & Lewkowicz, J. (2003). Dissertation writing in practice: Turning ideas into text. Hong Kong: Hong Kong University Press.
Coombe, C., & Barlow, L. (Eds.). (2007). Language teacher research in the Middle East. Alexandria, VA: TESOL.
Crago, M. B. (1992). Communicative interaction and second language acquisition: An Inuit example. TESOL Quarterly, 26(3), 487-505.
Crookes, G. (1993). Action research for second language teaching: Going beyond teacher research. Applied Linguistics, 14, 130-142.
Curtis, A. (1999). Use of action research in exploring the use of spoken English in Hong Kong classrooms. In Y. M. Cheah & S. M. Ng (Eds.), IDAC Monograph: Language instructional issues in the Asian classroom (pp. 78-88). Newark, DE: International Reading Association.
Curtis, A., & Bailey, K. M. (In press). Diary studies. On CUE Journal.
Dagneaux, E., Denness, S., & Granger, S. (1998). Computer-aided error analysis. System, 26, 163-174.
Damon, W., & Phelps, E. (1989). Critical distinctions among three approaches to peer education. International Journal of Educational Research, 58, 9-19.
Danielson, D. (1981, March). Views of language learning from an "older learner" (Part II). CATESOL Newsletter, 12(5), 6 and 16.
Davidson, F. (1998). The ordinal-interval distinction reconsidered. Language Testing Update, 23, 56-64.
Davis, K. A., & Lazaraton, A. (Eds.). (1995). Qualitative research in ESOL. TESOL Quarterly, 29(3).
Delaney, A. E., & Bailey, K. M. (2000, March). Teaching journals: Writing for professional development. ESL Magazine, 16-18.
Denzin, N. K. (1978). The research act: A theoretical introduction to sociological methods (2nd ed.). New York: McGraw-Hill.
Donato, R. (2000). Sociocultural contributions to understanding the foreign and second language classroom. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 27-50). Oxford: Oxford University Press.
Donato, R., & Adair-Hauck, B. (1992). Discourse perspectives on formal instruction. Language Awareness, 1(2), 73-89.
Donato, R., Antonek, J. L., & Tucker, G. R. (1994). A multiple perspectives analysis of a Japanese FLES program. Foreign Language Annals, 27(3), 365-378.
Dornyei, Z. (2003). Questionnaires in second language research: Construction, administration, and processing. Mahwah, NJ: Lawrence Erlbaum Associates.
Dornyei, Z. (2007). Research methods in applied linguistics. Oxford: Oxford University Press.
Dornyei, Z., & Murphey, T. (2003). Group dynamics in the language classroom. Cambridge: Cambridge University Press.
Doughty, C., & Pica, T. (1986). "Information gap" tasks: Do they facilitate second language acquisition? TESOL Quarterly, 20(2), 305-325.
Dowsett, G. (1986). Interaction in the semi-structured interview. In M. Emery (Ed.), Qualitative research (pp. 50-56). Canberra: Australian Association of Adult Education.
Duff, P. A. (1990). Developments in the case study approach to SLA research. In T. Hayes & K. Yoshioka (Eds.), Proceedings of the 1st Conference on Second Language Acquisition and Teaching (pp. 34-87). Tokyo: International University of Japan.
Duff, P. A. (1991a). Innovations in foreign language education: An evaluation of three Hungarian-English dual-language programs. Journal of Multilingual and Multicultural Development, 12, 459-476.
Duff, P. A. (1991b). The efficacy of dual-language education in Hungary: An investigation of three Hungarian-English programs. (Final Report for Year-Two [1990-91] of the project). Los Angeles, CA: University of California, Los Angeles, The Language Resource Program.
Duff, P. A. (1995). An ethnography of communication in immersion classrooms in Hungary. TESOL Quarterly, 29(3), 505-537.
Duff, P. A. (1996). Different languages, different practices: Socialization of discourse competence in dual-language school classrooms in Hungary. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 407-443). New York: Cambridge University Press.
Duff, P. A. (2008). Case study research in applied linguistics. New York: Lawrence Erlbaum Associates/Taylor & Francis.
Dulay, H., & Burt, M. (1973). Should we teach children syntax? Language Learning, 23, 245-258.
Duterte, A. (2000). A teacher's investigation of her own teaching. Applied Language Learning, 11(1), 99-122.
Edge, J. (Ed.). (2001). Action research. Alexandria, VA: TESOL.
Edge, J., & Richards, K. (Eds.). (1993). Teachers develop teachers research: Papers on classroom research and teacher development. Oxford: Heinemann International.
Edwards, J. A., & Lampert, M. D. (Eds.). (1993). Talking data: Transcription and coding in discourse research. Hillsdale, NJ: Lawrence Erlbaum Associates.
Ehlich, K. (1993). HIAT: A transcription system for discourse data. In J. A. Edwards & M. D. Lampert (Eds.), Talking data: Transcription and coding in discourse research (pp. 123-148). Hillsdale, NJ: Lawrence Erlbaum Associates.
Ellis, R. (1984). Second language classroom development. Oxford: Pergamon.
Ellis, R. (1985). Understanding second language acquisition. Oxford: Oxford University Press.
Ellis, R. (1988). Classroom second language development. London: Prentice Hall.
Ellis, R. (1989). Classroom learning styles and their effect on second language acquisition: A study of two learners. System, 17, 249-262.
Ellis, R. (1990a). Researching classroom language learning. In C. Brumfit & R. Mitchell (Eds.), Research in the language classroom (pp. 54-70). London: Modern English Publications.
Ellis, R. (1990b). Instructed second language acquisition: Learning in the classroom. Oxford: Basil Blackwell.
Ellis, R., & Barkhuizen, G. (2005). Analysing learner language. Oxford: Oxford University Press.
Ericsson, K. A., & Simon, H. A. (1984). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
Ericsson, K. A., & Simon, H. A. (1987). Verbal reports on thinking. In C. Faerch & G. Kasper (Eds.), Introspection in second language research (pp. 24-53). Clevedon, England: Multilingual Matters.
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (Rev. ed.). Cambridge, MA: MIT Press.
Faerch, C., & Kasper, G. (Eds.). (1987). Introspection in second language research. Clevedon, England: Multilingual Matters.
Fanselow, J. F. (1977). Beyond Rashomon—conceptualizing and describing the teaching act. TESOL Quarterly, 11(1), 17-39.
Fanselow, J. F. (1987). Breaking rules—generating and exploring alternatives in language teaching. White Plains, NY: Longman.
Fanselow, J. F., & Barnard, R. (2006). Take 1, take 2, take 3: A suggested three-stage approach to exploratory practice. In S. Gieve & I. K. Miller (Eds.), Understanding the language classroom (pp. 175-199). Basingstoke, Hampshire, England: Palgrave Macmillan.
Farrell, T. S. C. (Ed.). (2006). Language teacher research in Asia. Alexandria, VA: TESOL.
Fetterman, D. M. (1989). Ethnography: Step by step. Newbury Park, CA: Sage.
Flanders, N. (1970). Analyzing teaching behavior. Reading, MA: Addison-Wesley.
Flick, U. (1998). An introduction to qualitative research. London: Sage.
Fowler, F. (1988). Survey research methods. Newbury Park, CA: Sage.
Fox, K. (2004). Watching the English: The hidden rules of English behaviour. London: Hodder and Stoughton.
Fraser, B., Rintell, E., & Walters, J. (1980). An approach to conducting research on the acquisition of pragmatic competence in a second language. In D. Larsen-Freeman (Ed.), Discourse analysis in second language research (pp. 75-91). Rowley, MA: Newbury House.
Freeman, D. (1989). Teacher training, development, and decision making: A model of teaching and related strategies for language teacher education. TESOL Quarterly, 23(1), 27-45.
Freeman, D. (1992). Collaboration: Constructing shared understandings in a second language classroom. In D. Nunan (Ed.), Collaborative language learning and teaching (pp. 56-80). Cambridge: Cambridge University Press.
Freeman, D. (1996a). Redefining the relationship between research and what teachers know. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research
on language education (pp. 88-115). New York: Cambridge University Press.
Freeman, D. (1996b). The "unstudied problem": Research on teacher learning in language teaching. In D. Freeman & J. C. Richards (Eds.), Teacher learning in language teaching (pp. 351-378). Cambridge: Cambridge University Press.
Freeman, D. (1998). Doing teacher research: From inquiry to understanding. Boston: Heinle & Heinle.
Fry, J. (1988). Diary studies in classroom SLA research: Problems and prospects. JALT Journal, 9, 158-167.
Gaies, S. J. (1980, July). Classroom centered research: Some consumer guidelines. Paper presented at the TESOL Summer Meeting, Albuquerque, NM.
Gaies, S. J. (1983). The investigation of language classroom processes. TESOL Quarterly, 17(2), 205-217.
Galda, D. (in press). "My words is big problem": The life and learning experiences of three elderly Eastern European refugees studying ESL at a community college. In K. M. Bailey & M. G. Santos (Eds.), Research on English as a second language in U.S. community colleges: People, programs and potential. Ann Arbor, MI: University of Michigan Press.
Galvan, J. L. (1999). Writing literature reviews: A guide for students of the social and behavioral sciences. Los Angeles: Pyrczak Publishing.
Garner, H. (2006). The rules of engagement: Paul Greengrass's United 93. The Monthly Online. Retrieved January 25, 2008, from http://www.themonthly.com.au/tm/node/271
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum Associates.
Gass, S. M., & Mackey, A. (2000). Stimulated recall methodology in second language research. Mahwah, NJ: Lawrence Erlbaum Associates.
Gass, S. M., & Mackey, A. (2007). Data elicitation for second and foreign language research. Mahwah, NJ: Lawrence Erlbaum Associates.
Gass, S. M., & Schachter, J. (Eds.). (1996). Second language classroom research: Issues and opportunities. Mahwah, NJ: Lawrence Erlbaum Associates.
Gieve, S., & Miller, I. K. (Eds.). (2006). Understanding the language classroom. Basingstoke, Hampshire, England: Palgrave Macmillan.
Giroux, H. (1983). Theory and resistance in education: A pedagogy for the opposition. South Hadley, MA: Bergin and Garvey.
Gliksman, L., Gardner, R. C., & Smythe, P. C. (1982). The role of the integrative motive on students' participation in the French classroom. Canadian Modern Language Review, 38, 625-647.
Grandcolas, B., & Soule-Susbielles, N. (1986). The analysis of the foreign language classroom. Studies in Second Language Acquisition, 8, 293-308.
Green, J., & Wallat, C. (Eds.). (1981). Ethnography and language in educational settings. Norwood, NJ: Ablex Publishing Corporation.
Grotjahn, R. (1987). On the methodological basis of introspective methods. In C. Faerch & G. Kasper (Eds.), Introspection in second language research (pp. 54-81). Clevedon, England: Multilingual Matters.
Gu, P. Y., & Wen, Q. (2005). How often is often? Reference ambiguities of the Likert scale in language learning strategy research. Review of Applied Linguistics in China: Issues in Language Learning and Teaching, 1, 61-80.
Halbach, A. (2000). Finding out about students' learning strategies by looking at their diaries: A case study. System, 28, 85-96.
Halliday, M. A. K. (1975). Learning how to mean: Explorations in the development of language. London: Edward Arnold.
Hammersley, M., & Atkinson, P. (1983). Ethnography: Principles in practice. London: Tavistock Publications.
Harklau, L. (1994). ESL versus mainstream classes: Contrasting L2 learning environments. TESOL Quarterly, 28(2), 241-272.
Harklau, L. (2000). From the "good kids" to the "worst": Representations of English language learners across educational settings. TESOL Quarterly, 34(1), 35-67.
Harrison, I. (1996). Look who's talking now: Listening to voices in curriculum renewal. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on second language education (pp. 283-303). New York: Cambridge University Press.
Hatch, E. M. (Ed.). (1978). Second language acquisition: A book of readings. Rowley, MA: Newbury House.
Hatch, E. M., & Farhady, H. (1982). Research design and statistics for applied linguistics. Rowley, MA: Newbury House.
Hatch, E. M., & Lazaraton, A. (1991). The research manual: Design and statistics for applied linguistics. New York: Newbury House.
Heath, S. B. (1983). Ways with words. Cambridge: Cambridge University Press.
Henze, R. C. (1995). Guides for the novice qualitative researcher. TESOL Quarterly, 29(3), 595-599.
Hilleson, M. (1996). "I want to talk to them but I don't want them to hear": An introspective study of second-language anxiety in an English-medium school. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research in second language education (pp. 248-275). Cambridge: Cambridge University Press.
Hinkel, E. (Ed.). (2005). Handbook of research in second language teaching and learning. Mahwah, NJ: Lawrence Erlbaum Associates.
Ho, B., & Richards, J. (1993). Reflective thinking through teacher journal writing: Myths and realities. Prospect: A Journal of Australian TESOL, 8(3), 7-24.
Holliday, A. (2002). Doing and writing qualitative research. London: Sage.
Holmes, O. W. (1906). The professor at the breakfast table. London: J. M. Dent & Co.
Hornberger, N. (1988). Bilingual education and language maintenance: A Southern Peruvian case study. Dordrecht: Foris Publications.
Huang, J. (2005). A diary study of difficulties and constraints in EFL learning. System, 33, 609-621.
Hughes, A. (1989). Testing for language teachers. Cambridge: Cambridge University Press.
Hunt, K. W. (1970). Syntactic maturity in school children and adults. Chicago, IL: University of Chicago Press.
Jaeger, R. (1988). Survey research methods in education. In R. M. Jaeger (Ed.), Complementary methods for research in education (pp. 303-338). Washington, DC: American Educational Research Association.
Jaeger, R. M. (1993). Statistics—a spectator sport (2nd ed.). Newbury Park, CA: Sage.
Jarvis, J. (1992). Using diaries for reflection on in-service courses. English Language Teaching Journal, 46(2), 133-143.
Jepson, K. (2005). Conversations—and negotiated interaction—in text and voice chat rooms. Language Learning and Technology, 9(3), 79-98.
Johnson, D. (1992). Approaches to research in second language learning. White Plains, NY: Longman.
Johnson, K. E. (1992a). Learning to teach: Instructional actions and
decisions of preservice ESL teachers. TESOL Quarterly, 26(3), 507-535.
Johnson, K. E. (1992b). The instructional decisions of pre-service English as a second language teachers: New directions for teacher preparation programs. In J. Flowerdew, M. Brock, & S. Hsia (Eds.), Perspectives on second language teacher development (pp. 115-134). Hong Kong: City University of Hong Kong.
Johnson, K. E. (1995). Understanding communication in second language classrooms. Cambridge: Cambridge University Press.
Johnson, K. E. (1996). The vision versus the reality: The tensions of the TESOL practicum. In D. Freeman & J. C. Richards (Eds.), Teacher learning in language teaching (pp. 30-49). Cambridge: Cambridge University Press.
Johnson, K. E. (1998). Teachers understanding teaching. Boston: Heinle & Heinle.
Johnson, K. E. (1999). Understanding language teaching: Reasoning in action. Boston: Heinle & Heinle.
Jones, F. R. (1994). The lone language learner: A diary study. System, 22(4), 441-454.
Jones, F. R. (1995). Learning an alien lexicon: A teach-yourself case study. Second Language Research, 11(2), 95-111.
Jourdenais, R. (2001). Cognition, instruction, and protocol analysis. In P. Robinson (Ed.), Cognition and second language learning (pp. 354-375). Cambridge: Cambridge University Press.
Katz, A. (1996). Teaching style: A way to understand instruction in language classrooms. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 57-87). New York: Cambridge University Press.
Kebir, C. (1994). An action research look at the communication strategies of adult learners. TESOL Journal, 4(1), 28-31.
Kemmis, S., & Henry, C. (1989). Action research. IATEFL Newsletter, 102, 2-3.
Kemmis, S., & McTaggart, R. (1982). The action research planner. Victoria: Deakin University.
Kemmis, S., & McTaggart, R. (1988). The action research planner (3rd ed.). Victoria: Deakin University.
Kennedy, M. (1990). Policy issues in teacher education. East Lansing, MI: National Center for Research on Teaching.
Kincheloe, J. L. (1991). Teachers as researchers: Qualitative inquiry as a path to empowerment. London: The Falmer Press.
Knezedvoc, B. (2001). Action research. IATEFL Teacher Development SIG Newsletter, 1(1), 10-12.
Knowles, T. (1990). Action research: A way to make our ideas matter. The Language Teacher, 14(7).
Kramsch, C. (2000). Social discursive constructions of self in L2 learning. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 133-153). Oxford: Oxford University Press.
Krishnan, L. A., & Hoon, L. H. (2002). Diaries: Listening to 'voices' from the multicultural classroom. English Language Teaching Journal, 56(3), 227-239.
Kuhn, T. (1996). The structure of scientific revolutions (3rd ed.). Chicago: University of Chicago Press.
Kumaravadivelu, B. (1994). The post-method condition: (E)merging strategies for second/foreign language teaching. TESOL Quarterly, 28(1), 28-49.
Kumaravadivelu, B. (2003). Beyond methods: Macrostrategies for language teaching. New Haven, CT: Yale University Press.
Kwan, T. Y. L. (1993). Contexts for action research development: The case for Hong Kong. Curriculum Forum, 3(3), 11-23.
Labov, W. (1972). Some principles of linguistics methodology. Language in Society, 1, 97-120.
Lankshear, C., & Knobel, M. (2004). A handbook for teacher research: From design to implementation. New York: Open University Press.
Lantolf, J. P. (Ed.). (2000). Sociocultural theory and second language learning. Oxford: Oxford University Press.
Larimer, R. E., & Schleicher, L. (Eds.). (1999). New ways in using authentic
Education Conference, University of Hong Kong, Hong Kong.
Leopold, W. F. (1978). A child's learning of two languages. In E. M. Hatch (Ed.), Second language acquisition: A book of readings (pp. 23-32). Rowley, MA: Newbury House.
Levine, H., Gallimore, R., Weisner, T. S., & Turner, J. L. (1980). Teaching participant observation research methods: A skills-building approach. Anthropology and Education Quarterly, 11(1), 38-45.
Lewin, K. (1946). Action research and minority problems. Journal of Social Issues, 2, 34-46.
pp. 3-35, by H. W. Seliger and M. H. Long, Eds., 1983, Rowley, MA: Newbury House).
Long, M. H. (1983a). Does second language instruction make a difference? TESOL Quarterly, 17(3), 359-382.
Long, M. H. (1983b). Linguistic and conversational adjustments to non-native speakers. Studies in Second Language Acquisition, 5, 177-193.
Long, M. H. (1984). Process and product in ESL program evaluation. TESOL Quarterly, 18(3), 409-425.
Mackey, A., & Gass, S. M. (2005). Second language research: Methodology and design. Mahwah, NJ: Lawrence Erlbaum Associates.
Markee, N. (1996). Making second language classroom research work. In J. Schachter & S. Gass (Eds.), Second language classroom research: Issues and opportunities (pp. 117-155). Mahwah, NJ: Lawrence Erlbaum Associates.
Markee, N. (2000). Conversational analysis. Mahwah, NJ: Lawrence Erlbaum Associates.
English-speaking teachers. CATESOL Journal, 13(1), 109-121.
Matsumoto, K. (1987). Diary studies of second language acquisition: A critical overview. JALT Journal, 9, 17-34.
Matsumoto, K. (1989). An analysis of a Japanese ESL learner's diary: Factors involved in the L2 learning process. JALT Journal, 11, 167-192.
Maxwell, J. A. (2005). Qualitative research design: An interactive approach (2nd ed.). Thousand Oaks, CA: Sage.
McCarthy, M., & Walsh, S. (2003). Discourse. In D. Nunan (Ed.), Practical English language teaching (pp. 173-195). New York: McGraw-Hill.
McKay, S. L. (2006). Researching second language classrooms. Mahwah, NJ: Lawrence Erlbaum Associates.
McDonough, J. (1994). A teacher looks at teachers' diaries. English Language Teaching Journal, 48(1), 57-65.
McGarrell, H. M. (Ed.). (2007). Language teacher research in the Americas. Alexandria, VA: TESOL.
McPherson, P. (1997). Action research: Exploring learner diversity. Prospect: A Journal of Australian TESOL, 12(1), 50-62.
Merriam, S. B. (1988). Case study research in education: A qualitative approach. San Francisco: Jossey-Bass.
Merriam, S. B. (1998). Qualitative research and case study applications in education (2nd ed.). San Francisco: Jossey-Bass.
Michonska-Stadnik, A., & Szulc-Kurpaska, M. (Eds.). (1997). Action research in the lower Silesia cluster colleges: Developing learner independence. Orbis Linguarum, 2. Legnica, Poland: Nauczycielskie Kolegium Języków Obcych and the British Council.
Miles, M. B., & Huberman, A. (1984). Qualitative data analysis: A sourcebook of new methods. Beverly Hills, CA: Sage.
Miles, M. B., & Huberman, A. (1994). Qualitative data analysis: An expanded sourcebook. Thousand Oaks, CA: Sage.
Mingucci, M. (1999). Action research in ESL staff development. TESOL Matters, 9(2), 16.
Mitchell, M., & Jolley, J. (1988). Research design explained. New York: Holt Rinehart Winston.
Mok, A. (1997). Student empowerment in an English language enrichment programme: An action research project in Hong Kong. Educational Action Research, 5(2), 305-320.
Moore, T. (1977). An experimental language handicap: Personal account. Bulletin of the British Psychological Society, 30, 107-110.
Morgan, D. (1997). Focus groups as qualitative research (2nd ed.). Thousand Oaks, CA: Sage.
Moskowitz, G. (1968). The effects of training foreign language teachers in Interaction Analysis. Foreign Language Annals, 1(3), 218-235.
Moskowitz, G. (1971). Interaction Analysis—a new modern language for supervisors. Foreign Language Annals, 5, 211-221.
Moskowitz, G. (1976). The classroom interaction of outstanding foreign language teachers. Foreign Language Annals, 9, 125-143 and 146-157.
Nassaji, H., & Cumming, A. (2000). What's in a ZPD? A case study of a young ESL student and teacher interacting through dialogue journals. Language Teaching Research, 4(2), 95-121.
Nisbett, R., & Wilson, T. (1977). Telling more than we can know: Verbal reports on mental process. Psychological Review, 84, 231-259.
Numrich, C. (1996). On becoming a language teacher: Insights from diary studies. TESOL Quarterly, 30(1), 131-151.
Nunan, D. (1988). The learner-centred curriculum: A study in second language
teaching. Cambridge: Cambridge University Press.
Nunan, D. (1989). Understanding language classrooms: A guide for teacher-initiated action. New York: Prentice Hall.
Nunan, D. (1990). Action research in the language classroom. In J. C. Richards & D. Nunan (Eds.), Second language teacher education (pp. 62-81). Cambridge: Cambridge University Press.
Nunan, D. (1991a). Methods in second language classroom-oriented research: A critical review. Studies in Second Language Acquisition, 13(2), 249-274.
Nunan, D. (1991b). Second language acquisition research in the language classroom. In E. Sadtono (Ed.), Language acquisition and the second/foreign language classroom (Anthology Series 28, pp. 1-24). Singapore: SEAMEO Regional Language Center.
Nunan, D. (1991c). Language teaching methodology. London: Prentice Hall.
Nunan, D. (1992). Research methods in language learning. Cambridge: Cambridge University Press.
Nunan, D. (1993a). Action research in language education. In J. Edge & K. Richards (Eds.), Teachers develop teachers research: Papers on classroom research and teacher development (pp. 39-50). Oxford: Heinemann International.
Nunan, D. (1993b). Teachers' interactive decision-making. Sydney: National Centre for English Language Teaching and Research.
Nunan, D. (1996). Hidden voices: Insiders' perspectives on classroom interaction. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 41-56). New York: Cambridge University Press.
Nunan, D. (1997a). Developing standards for teacher-research in TESOL. TESOL Quarterly, 31(2), 365-367.
Nunan, D. (1997b). Research, the teacher and classrooms of tomorrow. In G. M. Jacobs (Ed.), Language classrooms of tomorrow: Issues and responses (pp. 183-194). Singapore: SEAMEO Regional Language Center.
Nunan, D. (1999). Second language teaching and learning. Boston: Heinle & Heinle.
Nunan, D. (Ed.). (2003). Practical English language teaching. New York: McGraw-Hill.
Nunan, D. (2004). Task-based language teaching. Cambridge: Cambridge University Press.
Nunan, D. (2005). Classroom research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 225-240). Mahwah, NJ: Lawrence Erlbaum Associates.
Nunan, D. (2007a). What is this thing called language? London: Palgrave Macmillan.
Nunan, D. (2007b, September). Diverse voices: What we can learn from listening to our learners? Plenary presentation at the English Australia Conference, Sydney, Australia.
Nunan, D., & Lamb, C. (1996). The self-directed teacher: Managing the learning process. Cambridge: Cambridge University Press.
Nunan, D., & Wong, L. (2003, December). Learning styles and strategies: An empirical investigation. Paper presented at the Chulalongkorn University Language Institute International Conference, Bangkok, Thailand.
Nunan, D., & Wong, L. (2006). The good language learner: An empirical investigation. Unpublished manuscript, University of Hong Kong, The English Centre.
O'Farrell, A. (2003). The language of nonproliferation studies. Unpublished manuscript, Monterey Institute of International Studies, Monterey, California.
Ohta, A. S. (2000). Rethinking interaction in SLA: Developmentally appropriate assistance in the zone of proximal development and the acquisition of L2 grammar. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 51-78). Oxford: Oxford University Press.
Otto, F. M. (1969). The teacher in the Pennsylvania project. Modern Language Journal, 53, 411-420.
Oxford, R. (2001). Language learning strategies. In R. Carter & D. Nunan (Eds.), The Cambridge guide to teaching English to speakers of other languages (pp. 166-172). Cambridge: Cambridge University Press.
Oxford, R. (2002). Language learning styles and strategies. In M. Celce-Murcia (Ed.), Teaching English as a second or foreign language (3rd ed., pp. 359-383). Boston: Heinle & Heinle.
Palmer, C. H. (1992). Diaries for self-assessment and INSET programme evaluation. European Journal of Teacher Education, 15(3), 227-238.
Palmer, G. M. (1992). The practical feasibility of diary studies for INSET. European Journal of Teacher Education, 15(3), 239-254.
Parkinson, B., & Howell-Richardson, C. (1990). Learner diaries. In C. Brumfit & R. Mitchell (Eds.), Research in the language classroom: ELT Documents, 133 (pp. 128-140). London: English Publications and the British Council.
Peck, S. (1980). Language play in child second language acquisition. In D. Larsen-Freeman (Ed.), Discourse analysis in second language research (pp. 154-164). Rowley, MA: Newbury House.
Peck, S. (1996). Language learning diaries as mirrors of students' cultural sensitivity. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research in second language education (pp. 236-247). Cambridge: Cambridge University Press.
Pennington, M. C., & Richards, J. C. (1997). Reorienting the teaching universe: The experience of five first-year English teachers in Hong Kong. Language Teaching Research, 1(2), 149-178.
Perecman, E., & Curran, S. (2006). A handbook for social science field research. Thousand Oaks, CA: Sage.
Perry, F. L. (2005). Research in applied linguistics: Becoming a discerning consumer. Mahwah, NJ: Lawrence Erlbaum Associates.
Peyton, J. K. (1990). Beginning at the beginning: First-grade ESL students learning to write. In A. Padilla, H. H. Fairchild, & C. M. Valadez (Eds.), Bilingual education: Issues and strategies (pp. 195-218). Newbury Park, CA: Sage.
Pica, T. (1994). Research on negotiation: What does it reveal about second language learning conditions, processes and outcomes. Language Learning, 44, 493-527.
Pica, T. (1997). Second language teaching and research relationships: A North American view. Language Teaching Research, 1(1), 48-72.
Pica, T., & Doughty, C. (1985a). The role of group work in classroom second language acquisition. Studies in Second Language Acquisition, 7(2), 233-248.
Pica, T., & Doughty, C. (1985b). Input and interaction in the communicative language classroom: A comparison of teacher-fronted and group activities. In S. Gass & C. Madden (Eds.), Input in second language acquisition
(pp. 115-132). Rowley, MA: Newbury House.
Pike, K. L. (1964). Language in relation to a unified theory of structures of human behavior. The Hague: Mouton.
Piore, M. (2006). Combining qualitative and quantitative tools: Qualitative research—does it fit in economics? In E. Perecman & S. Curran (Eds.), A handbook for social science field research (pp. 143-157). London: Sage.
Plummer, K. (1983). Documents of life: An introduction to the problems and literature of a humanistic method. London: George Allen and Unwin.
Polio, C. (1996). Issues and problems in reporting classroom research. In J. Schachter & S. Gass (Eds.), Second language classroom research: Issues and opportunities (pp. 61-79). Mahwah, NJ: Lawrence Erlbaum Associates.
Polio, C., & Wilson-Duffy, C. (1998). Teaching ESL in an unfamiliar context: International students in a North American TESOL practicum. TESOL Journal, 7(4), 24-29.
Popper, K. (1968). The logic of scientific discovery. London: Hutchinson.
Popper, K. (1972). Objective knowledge. Oxford: Oxford University Press.
Porter, P. A., Goldstein, L. M., Leatherman, J., & Conrad, S. (1990). An ongoing dialogue: Learner logs for teacher preparation. In J. C. Richards & D. Nunan (Eds.), Second language teacher education (pp. 227-240). Cambridge: Cambridge University Press.
Porto, M. (2007). Learning diaries in the English as a foreign language classroom: A tool for accessing learners' perceptions of lessons and developing learner autonomy and reflection. Foreign Language Annals, 40(4), 672-696.
Quirke, P. (2001). Hearing voices: A robust and flexible framework for gathering and using student feedback. In J. Edge (Ed.), Action research (pp. 81-91). Alexandria, VA: TESOL.
Radecki, W. (2002). Student and teacher preferences in the high-tech classroom. Unpublished manuscript, Zayed University, Dubai, United Arab Emirates.
Reid, J. (1990). The dirty laundry of ESL survey research. TESOL Quarterly, 24(2), 323-338.
Richards, J. C., & Lockhart, C. (1994). Reflective teaching in second language classrooms. Cambridge: Cambridge University Press.
Richards, K. (1992). Pepys into a TEFL course. English Language Teaching Journal, 46(2), 144-152.
Richards, K. (2003). Qualitative inquiry in TESOL. Houndsmill, UK: Palgrave Macmillan.
Rivers, W. (1979). Learning a sixth language: An adult learner's diary. Canadian Modern Language Review, 36(1), 67-82.
Rivers, W. (1983). Communicating naturally in a second language: Theory and practice in language teaching. Cambridge: Cambridge University Press.
Roebuck, R. (2000). Subjects speak out: How learners position themselves in a psycholinguistic task. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 79-95). Oxford: Oxford University Press.
Rounds, P. L. (1987). Characterizing successful classroom discourse for NNS teaching assistant training. TESOL Quarterly, 21(4), 643-671.
Rounds, P. L., & Schachter, J. (1996). The balancing act: Theoretical, acquisitional and pedagogical issues in second language research. In J. Schachter & S. Gass (Eds.), Second language classroom research: Issues and opportunities (pp. 99-116). Mahwah, NJ: Lawrence Erlbaum Associates.
Rowntree, D. (1981). Statistics without tears: A primer for non-mathematicians. London: Longman.
Rowsell, L. V., & Libben, G. (1994). The sound of one hand clapping: How to succeed in independent language learning. Canadian Modern Language Review, 50(4), 668-687.
Rubin, J., & Henze, R. (1981, February). The foreign language requirement: A suggestion to enhance its educational role in teacher training. TESOL Newsletter, 15, 17, 19, 24.
Ruiz de Gauna, P., Diaz, C., Gonzalez, V., & Garaizar, I. (1995). Teachers' professional development as a process of critical action research. Educational Action Research, 3(2), 183-194.
Ruso, N. (2007). The influence of task based learning on EFL classrooms. Asian EFL Journal, 18. Retrieved July 11, 2007, from http://www.asian-efl-journal.com/pta_February_2007_nr.php
Santana-Williamson, E. (2001). Early reflections: Journaling a way into teaching. In J. Edge (Ed.), Action research (pp. 33-44). Alexandria, VA: TESOL.
Sato, C. (1982). Ethnic styles in classroom discourse. In M. Hines & W. Rutherford (Eds.), On TESOL '81 (pp. 11-24). Washington, DC: TESOL.
Scales, J., Wennerstrom, A., Richard, D., & Wu, S. H. (2006). Language learners' perceptions of accent. TESOL Quarterly, 40(4), 715-737.
Schachter, J., & Gass, S. (Eds.). (1996). Second language classroom research: Issues and opportunities. Mahwah, NJ: Lawrence Erlbaum Associates.
Scherer, G. A. C., & Wertheimer, M. (1964). A psycholinguistic experiment in foreign language teaching. New York: McGraw-Hill.
Schmidt, R. W. (1983). Interaction, acculturation, and the acquisition of communicative competence: A case study of an adult. In N. Wolfson & E. Judd (Eds.), Sociolinguistics and language acquisition (pp. 137-174). Rowley, MA: Newbury House.
Schmidt, R. W. (1984). The strengths and limitations of acquisition: A case study of an untutored language learner. Language Learning and Communication, 3(1), 1-16.
Schmidt, R. W., & Frota, S. N. (1986). Developing basic conversational ability in a second language: A case study of an adult learner of Portuguese. In R. R. Day (Ed.), Talking to learn: Conversation in second language acquisition (pp. 237-326). Rowley, MA: Newbury House.
Schrank, A. (2006). Bringing it all back home: Personal reflections on friends, findings and fieldwork. In E. Perecman & S. Curran (Eds.), A handbook for social science field research (pp. 217-225). Thousand Oaks, CA: Sage.
Schumann, F. (1980). Diary of a language learner: A further analysis. In S. Krashen & R. Scarcella (Eds.), Research in second language acquisition: Selected papers of the Los Angeles Second Language Research Forum (pp. 51-57). Rowley, MA: Newbury House.
Schumann, F. E., & Schumann, J. H. (1977). Diary of a language learner: An introspective study of second language learning. In H. D. Brown, R. H. Crymes, & C. A. Yorio (Eds.), On TESOL '77: Teaching and learning English as a second language—trends in research and practice (pp. 241-249). Washington, DC: TESOL.
Schumann, J. (1978a). The Pidginization Hypothesis: A model for second language acquisition. Rowley, MA: Newbury House.
Schumann, J. (1978b). Second language acquisition: The Pidginization Hypothesis. In E. M. Hatch (Ed.), Second language acquisition: A book of readings (pp. 256-271). Rowley, MA: Newbury House.
Seliger, H., & Shohamy, E. (1989). Second language research methods. Oxford: Oxford University Press.
Seliger, H. W. (1977). Does practice make perfect? A study of interaction patterns and L2 competence. Language Learning, 27, 263-278.
Seliger, H. W. (1983a). Classroom-centered research in language teaching: Two articles on the state of the art. TESOL Quarterly, 17(2), 189-190.
Seliger, H. W. (1983b). The language learner as linguist: Of metaphors and realities. Applied Linguistics, 4, 179-191.
Shamim, F. (1996). In or out of the action zone: Location as a feature of interaction in large ESL classes in Pakistan. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 123-144). New York: Cambridge University Press.
Shavelson, R. J. (1981). Statistical reasoning for the behavioral sciences. Boston: Allyn and Bacon.
Shavelson, R. J. (1996). Statistical reasoning for the behavioral sciences (3rd ed.). Boston: Allyn and Bacon.
Shaw, P. A. (1983). A sociolinguistic analysis of spoken discourse in undergraduate engineering classes. Unpublished doctoral dissertation, University of Southern California, Los Angeles.
Shaw, P. A. (1996). Voices for improved learning: The ethnographer as co-agent of pedagogic change. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 318-337). New York: Cambridge University Press.
Shaw, P. A. (1997). With one stone: Models of instruction and their curricular implications in an advanced content-based foreign language program. In S. B. Stryker & B. L. Leaver (Eds.), Content-based instruction in foreign language education: Models and methods (pp. 259-281). Washington, DC: Georgetown University Press.
Shuy, R. W. (1993). Using language functions to discover a teacher's implicit theory of communicating with students. In J. K. Peyton & J. Staton (Eds.), Dialogue journals in the multilingual classroom: Building language fluency and writing skills through written interaction (pp. 127-154). Norwood, NJ: Ablex.
Simard, D. (2004). Using diaries to promote metalinguistic reflection among elementary school students. Language Awareness, 13(1), 34-48.
Sinclair, J., & Coulthard, M. (1975). Toward an analysis of discourse. London: Oxford University Press.
Smith, L. M., & Geoffrey, W. (1968). The complexities of an urban classroom: An analysis toward a general theory of teaching. New York: Holt, Rinehart and Winston.
Smith, J. K., & Heshusius, L. (1986). Closing down the conversation: The end of the quantitative-qualitative debate among educational inquirers. Educational Researcher, 15(1), 4-12.
Smith, P. D. (1970). A comparison of the cognitive and audiolingual approaches to foreign language instruction: The Pennsylvania foreign language project. Philadelphia: Center for Curriculum Development.
Snow, M. A., Hyland, J., Kamhi-Stein, L., & Yu, J. H. (1996). U.S. language minority students: Voices from the junior high classroom. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 304-317). New York: Cambridge University Press.
Spada, N. (1990). Observing classroom behaviours and learning outcomes in different second language classrooms. In J. C. Richards & D. Nunan (Eds.), Second language teacher education (pp. 293-310). Cambridge: Cambridge University Press.
Spada, N., & Frohlich, M. (1995). Communicative Orientation of Language Teaching observation scheme: Coding conventions and applications. Sydney: NCELTR Macquarie University.
Spada, N., Ranta, L., & Lightbown, P. M. (1996). Working with teachers in second language acquisition research. In J. Schachter & S. Gass (Eds.), Second language classroom research: Issues and opportunities (pp. 61-79). Mahwah, NJ: Lawrence Erlbaum Associates.
Spradley, J. P. (1979). The ethnographic interview. New York: Holt, Rinehart and Winston.
Spradley, J. P. (1980). Participant observation. New York: Holt, Rinehart and Winston.
Springer, S. E. (2003). Contingent language use and scaffolding in a project-based ESL course. Unpublished manuscript, Monterey Institute of International Studies, Monterey, California.
Springer, S. E., & Bailey, K. M. (2006, April). Diversity in reflective teaching practices: International survey results. Paper presented at the CATESOL State Conference, San Francisco, California.
Sreedharan, N. (2006). Sura's quotations of wit and wisdom. Chennai, India: Sura Books.
Stake, R. E. (1988). Case study methods in educational research: Seeking sweet water. In R. M. Jaeger (Ed.), Complementary methods for research in education (pp. 253-266). Washington, DC: American Educational Research Association.
Stake, R. E. (1995). The art of case research. Newbury Park, CA: Sage.
Stenhouse, L. (1983). Case study in educational research and evaluation. In L. Bartlett, S. Kemmis, & G. Gillard (Eds.), Case study: An overview (pp. 11-54). Geelong, Australia: Deakin University Press.
Stewart, D., & Shamdasani, P. (1990). Focus groups: Theory and practice. Newbury Park, CA: Sage.
Stewart, T. (2006). Teacher-researcher collaboration or teachers' research? TESOL Quarterly, 40(2), 421-430.
Storch, N. (2002). Patterns of interaction in ESL pair work. Language Learning, 52(1), 119-158.
Strauss, A. (1988). Teaching qualitative research methods: A conversation with Andrew Strauss. Qualitative Studies in Education, 1(1), 91-99.
Strong, M. (1986). Teachers' language to limited English speakers and submersion classes. In R. R. Day (Ed.), Talking to learn: Conversation in second language acquisition (pp. 53-63). Rowley, MA: Newbury House.
Sullivan, P. N. (2000). Playfulness as mediation in communicative language teaching in a Vietnamese classroom. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 115-131). Oxford: Oxford University Press.
Swaffar, J., Arens, K., & Morgan, M. (1982). Teacher classroom practices: Redefining method as task hierarchy. Modern Language Journal, 66(1), 24-33.
Swain, M. (2000). The output hypothesis and beyond: Mediating acquisition through collaborative dialogue. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 97-114). Oxford: Oxford University Press.
Szostek, C. (1994). Assessing the effects of cooperative learning in an honors foreign language classroom. Foreign Language Annals, 27, 252-261.
Thorpe, J. (2004). Coal miners, dirty sponges, and the search for Santa: Exploring options in teaching listening comprehension through TV news broadcasts. Unpublished manuscript, Monterey Institute of International Studies, Monterey, California.
Tinker Sachs, G. (2000). Teacher and researcher autonomy in action research.
Prospect: A Journal of Australian TESOL, 15(3), 35-51.
Tinker Sachs, G. (2002). Learning Cantonese: Reflections of an EFL teacher educator. In D. C. S. Li (Ed.), Discourses in search of members: In honor of Ron Scollon (pp. 509-540). Lanham, MD: University Press of America.
Trueba, G., Guthrie, P., & Au, K. H. P. (Eds.). (1981). Culture and the bilingual classroom: Studies in classroom ethnography. Rowley, MA: Newbury House.
Tsang, W. K. (2003). Journaling from internship to practice teaching. Reflective Practice, 4(2), 221-240.
Tsui, A. B. M. (1995). An introduction to classroom interaction. London: Penguin.
Tsui, A. B. M. (1996). Reticence and anxiety in second language learning. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 145-167). New York: Cambridge University Press.
Tsui, A. B. M. (2003). Understanding expertise in teaching: Case studies of second language teachers. Cambridge: Cambridge University Press.
Tuckman, B. (1999). Conducting educational research (5th ed.). Ft. Worth, TX: Harcourt Brace College Publishers.
Turner, J. (1993). Another researcher comments. TESOL Quarterly, 24(4), 736-739.
Tyacke, M., & Mendelsohn, D. (1986). Student needs: Cognitive as well as communicative. TESL Canada Journal (Special Issue 1), 171-183.
Ulichny, P. (1996). Performed conversations in an ESL classroom. TESOL Quarterly, 30(4), 739-764.
van Lier, L. (1988). The classroom and the language learner: Ethnography and second language classroom research. London: Longman.
van Lier, L. (1989). Reeling, writhing, fainting and stretching in coils: Oral proficiency interviews as conversation. TESOL Quarterly, 23(3), 489-508.
van Lier, L. (1990a). Ethnography: Bandaid, bandwagon, or contraband? In C. Brumfit & R. Mitchell (Eds.), Research in the language classroom: ELT Documents, 133 (pp. 33-53). London: Modern English Publications and British Council.
van Lier, L. (1990b). Classroom research in second language acquisition. Annual Review of Applied Linguistics, 10, 173-186.
van Lier, L. (1992). Not the nine o'clock linguistics class: Investigating contingency grammar. Language Awareness, 1(2), 91-108.
van Lier, L. (1994a). Action research. Sintagma, 6, 31-37.
van Lier, L. (1994b). Some features of a theory of practice. TESOL Journal, 4(1), 6-10.
van Lier, L. (1996a). Conflicting voices: Language, classrooms, and bilingual education in Puno. In K. M. Bailey & D. Nunan (Eds.), Voices from the language classroom: Qualitative research on language education (pp. 363-387). New York: Cambridge University Press.
van Lier, L. (1996b). Interaction in the language curriculum: Awareness, autonomy, and authenticity. New York: Longman.
van Lier, L. (1998). Constraints and resources in classroom talk: Issues of equality and symmetry. In H. Byrnes (Ed.), Learning foreign and second languages: Perspectives in research and scholarship (pp. 157-182). New York: Modern Language Association of America.
van Lier, L. (2000). From input to affordance: Social-interactive learning from an ecological perspective. In J. P. Lantolf (Ed.), Sociocultural theory and second language (pp. 245-259). Oxford: Oxford University Press.
van Lier, L. (2001). Language awareness. In R. Carter & D. Nunan (Eds.),
The Cambridge guide to teaching English to speakers of other languages (pp. 160-165). Cambridge: Cambridge University Press.
van Lier, L. (2005). Case study. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 195-208). Mahwah, NJ: Lawrence Erlbaum Associates.
Verity, D. P. (2000). Side affects: The strategic development of professional satisfaction. In J. P. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 179-197). Oxford: Oxford University Press.
Vygotsky, L. S. (1978). Mind in society. Cambridge, MA: Harvard University Press.
Vygotsky, L. S. (1986). Thought and language. Cambridge, MA: Massachusetts Institute of Technology.
Wajnryb, R. (1992). Classroom observation tasks: A resource book for language teachers and trainers. Cambridge: Cambridge University Press.
Wallace, M. J. (1998). Action research for language teachers. Cambridge: Cambridge University Press.
Walsh, S. (2006). Investigating classroom discourse. London: Routledge Taylor Francis Group.
Wang, J. (2003). Students' needs and teachers' dilemmas: A case of one university. Hong Kong Journal of Applied Linguistics, 8(1), 33-50.
Warden, M., Swain, M., Lapkin, S., & Hart, D. (1995). Adolescent language learners on a three-month exchange: Insights from their diaries. Foreign Language Annals, 28(4), 537-550.
Watson-Gegeo, K. A. (1988). Ethnography in ESL: Defining the essentials. TESOL Quarterly, 22(4), 575-592.
Weitzman, E. A., & Miles, M. B. (1995). A software sourcebook: Computer programs for qualitative data analysis. Thousand Oaks, CA: Sage.
Wesche, M. B. (1983). Communicative testing in a second language. Modern Language Journal, 67(1), 41-55.
Whyte, W. F. (1981). Street corner society: The social structure of an Italian slum (3rd ed.). Chicago: University of Chicago Press.
Wiersma, W. (1986). Research methods in education. Boston: Allyn and Bacon.
Willing, K. (1988). Learning styles in adult migrant education. Sydney: National Centre for English Language Teaching and Research.
Willing, K. (1990). Teaching how to learn. Sydney: National Centre for English Language Teaching and Research.
Winer, L. (1992). "Spinach to chocolate": Changing awareness and attitudes in ESL writing teachers. TESOL Quarterly, 26(1), 57-79.
Woodfield, H., & Lazarus, E. (1998). Diaries: A reflective tool on an INSET language course. English Language Teaching Journal, 52(4), 315-322.
Woods, D. (1989). Studying ESL teachers' decision-making: Rationale, methodological issues and initial results. Carleton Papers in Applied Language Studies, 6, 107-123.
Woods, D. (1996). Teacher cognition in language teaching: Beliefs, decision-making and classroom practice. Cambridge: Cambridge University Press.
Yahya, N. (2000). Keeping a critical eye on one's own teaching practice. EFL teachers' use of reflective teaching journals. Asian Journal of English Language Teaching, 10, 1-18.
Yin, R. (1984). Case study research: Design and methods. Beverly Hills, CA: Sage.
Yin, R. (2003). Case study research: Design and methods (3rd ed.). Thousand Oaks, CA: Sage Publishing.
Youngman, M. B. (1986). Analyzing questionnaires. Nottingham, UK: University of Nottingham, School of Education.
INDEX
Abstract 251, 338, 438, 451, 460 Audio-recording 11, 16, 169, 182, 193, 200,
Acheson, K. A. 272 259-260, 263, 267, 271, 277, 281,
Achievement 16, 29, 31, 65, 125, 236, 258, 287-289, 306, 339, 413, 415, 417, 435,
320, 390, 393 455, 457
Act 265, 271, 341, 347 Authenticity 325, 347, 391
ACTFL 324 Autobiographical research 284, 297, 300,
Action research 1, 5-7, 17-20, 23-25, 36, 307-308
44, 47, 55, 63, 67, 77, 81-82, 120, 127, Bachman, L. F. 325, 335
129, 135, 158, 226-237, 239-241, Back translation 124, 146
243-253, 372, 379, 408, 428, 438, 445, Bacon, R. 371
452-453 Bailey, K. M. 9-11, 18, 19, 25, 32, 79, 116,
Adair-Hauck, B. 162, 184 118, 124, 130, 136, 140, 143, 154,
Adelman, C. 158, 159, 166, 184, 185 195-198, 258, 263, 266, 272-273, 277,
Affective factors 29 286, 289, 292-296, 299, 300, 302-306,
Aljaafreh, A. 212 322, 335, 363, 369, 391, 401, 416-417,
Allen, P. J. 14, 268 428, 430-435, 440-442, 456, 458-459
Allison, D. 294 Barkhuizen, G. 349-350, 370, 421
Allwright, D. (R. L.) 48, 79, 166, 169, 202, Barlow, L. 462
253, 257, 263, 266, 275, 279, 283, 369, Barlow, M. 428, 436
428, 435 Barnard, R. 253
Alreck, P. 140, 142, 155 Bassano, S. 237
Alternative directional hypothesis 58, 59, 73 Bateson, G. 16
Alternative hypothesis 57, 58-59, 73, 114 Beany, K. 425
Amanpour, C. 186, 219 Beaumont, M. 462
American Psychological Association (APA) 34 Bejarano, Y. 404-408, 410
Analysis of variance (ANOVA) 338, 378, Bell curve 109, 112-113, 373
392-395, 401-402, 407-408 Bell, J. 132
Analytical nomological paradigm 11, 444-445 Benchley, R. 3
Andrews, R. 371 Benson, P. 199, 297, 332
Annotated bibliography 2, 34-35 Bergthold, B. 299
Anthropology 8, 187-188, 211-212, 222, 297 Biber, D. 436
Antonek, J. L. 11, 163 Bilingual education 31, 125, 435, 440, 454
Anxiety 29, 41, 204, 247, 306, 416-417 Bilingual Syntax Measure (BSM) 325-326
Appel, J. 295, 296 Bimodal distribution 374
Applewhite, A. 26 Biographical research 255, 284, 297, 300,
Applied linguistics 3, 39, 125, 133, 148, 307-308
157-158, 167, 183, 185, 187, 222, 285, Birch, G. J. 294, 295
305, 411, 426, 436 Black box 13-14, 275
Arens, K. 13 Bley-Vroman, R. 334
Asher, J. J. 45 Blind review process 451
Asian Journal of English Language Teaching Block, D. 169, 253, 279, 296, 302, 418
11, 460 Borg, S. 462
Atkinson, P. 163, 211, 430 Bounded 161-164, 183, 190
Attrition 182, 205-206 Bowering, M. 427
Au, K. H. P. 189 Bowers, R. 263, 264, 265, 267, 341-343, 369
Auerbach, E. 236 Braine, G. 224
Audiolingualism 12-14 Braunstein, B. 299
Brinton, D. M. 296, 461 Classroom observation 3, 33, 49, 255,
Brock, C. 296, 359 257-259, 263, 268, 275, 282, 334, 439
Brock, M. N. 296 Classroom-oriented research 9, 17, 55, 155,
Brown, C. 294, 300 175, 315, 332, 425
Brown, H. D. 335 Cleghorn, A. 191
Brown, J. D. 33, 63, 79, 87, 123, 126, 133, 137, Code-mixing and code-switching 28
156, 335, 384, 399, 401, 411 Coding 62-63, 76, 87, 176-178, 184, 255,
Brown, R. 167 260, 263, 266, 268, 271, 276, 278,
Brumfit, C. 24-25 281-282, 287, 339-340, 346, 362-363,
Bruner, J. 304 366, 369-370, 424, 426, 428-429, 434,
Burling, R. 169 436, 440
Burnaford, G. 462 Coffey, A. 430
Burns, A. 226, 253, 446, 462 Cognitive code learning 13-14
Burt, M. 325 Cohen, A. D. 285
Burton, J. 462 Cohen, J. M. 257, 284
Busch, M. 133, 136, 156 Cohen, L. 10, 125, 128, 170, 228
Campbell, C. C. 294-295, 302, 417, 443 Cohen, M. J. 257, 284
Campbell, D. T. 102 Cohort 3lk 316, 319,441,445
Canagarajah, A. S. 188, 191, 202-204, 206, Cole, R. 296
219, 224 Collaboration 18-19, 178, 228, 229, 366,
Card sort activity 329-330, 333, 424-425, 427, 368, 415, 451
429, 423, 435 Communication feature 14
Carless, D. 168 Communicative language teaching 14-15, 268,
Carr, W. 17, 18, 226, 253 362, 414
Carrasco, R. L. 159 Communicative Orientation to Language
Carroll, M. 294 Teaching (COLT) 14-15, 268-270, 283
Carson, J. G. 294, 295 Comparability 76, 131, 172
Carter, R. 362, 427, 446 Comprehension checks 22, 43, 326
Case study 8-9, 47-48, 53, 55, 63, 81-82, 90, Comprehensive stage 192
92, 94, 100-102, 157-185, 190, 198, Conceptual research 35
200-201, 209, 217, 219, 229, 234, 250, Concordancing 425-428, 437
297, 439 Conferences 3, 279, 338, 353, 438, 450-452,
Category system 260, 261, 311, 341, 363, 434, 459-461
440-441 Confirmability 209
Causality 44, 47, 61, 67, 73, 162, 198, 210, Confirmation checks 22, 43
439, 449 Confounding (extraneous) variables 60-61, 63,
Celce-Murcia, M. 169 66, 6^, 75, 78, 83-84, 89,103, 118, 120,
Chamot, A. U. 253 123, 393
Chan, Y. H. 253 Conrad, S. 296, 436
Chaudron, C. 14, 25, 54, 194, 266, 334, 360, Construct 2, 39-42, 44, 53, 56, 60, 64-65,
429, 442 75, 126, 142-143, 421, 428, 445-446, 449,
Chi square 34, 330, 338, 372, 381-389, 398, 454
402, 409-410 Content-based instruction 31, 461
Choi, J. 297-299, 311 Contingency 212, 214
Christison, M. A. 21, 152, 237 Control 60-61, 68, 75, 78, 91, 93, 117, 123,
Clarification requests 22, 43, 326 160, 162, 358, 366, 428, 431, 438
Clark, J. L. D. 13 Control group 44, 46-47, 53, 70, 85-86,
Clarke, M. A. 172 93-100, 103-104, 106-107, 111, 114-115,
Classroom, definition of 15-16 117, 324, 374, 376-378, 380, 390-392,
Classroom context mode 343-344, 404, 445
346, 369 Conversational analysis 339, 347, 352-356,
Classroom management 249, 269 359, 363, 369-370, 412, 423
Classroom-based research 4, 17, 55, 312, 334, Conversational replay 165, 356
335, 361, 364, 425, 445, 453, 456 Cooley, L. 33, 53, 443
Classroom-centered research 4, 16, 258 Coombe, C. 462
Corpora 425, 427-428, 436 Distance learning 15-16
Correction move 165, 356 Donato, R. 11-12, 162-163, 184, 363
Correlation 2, 56, 70-74, 77-78, 89, 100-101, Dornyei, Z. 126-127, 135, 138, 141-142,
114, 120, 338, 372, 382, 396-401, 439, 147-148, 154, 156, 370, 411, 429-430,
447, 449 436, 439, 444, 461
Coulthard, M. 270-271, 340-341, 343, 350, Doughty, C. J. 327, 446
434, 440 Dowsett, G. 314
Crago, M. B. 189 Duff, P. A. 158, 161-162, 165, 167-168, 170,
Credibility 201, 209, 224, 248 172, 173, 175, 181-182, 185, 188, 189,
Criterion groups design 2, 70, 71, 73, 77, 78, 191, 214-217, 223, 349
89, 100, 150, 319, 324, 381 Dulay, H. 325
Critical paradigm 350 Duterte, A. 253
Critical values 380, 384-385, 388, 399-400, Eckes, M. 335
407-408, 410 Edge, J. 226, 253, 462
Crookes, G. 253 Edwards, J. A. 370
Culture 9, 23, 27-28, 31-32, 52, 73, 77, 84, Ehlich, K. 347
100, 116, 119, 121, 143, 168-169, 174, Elicitation, eliciting 41, 87, 124-126, 142,
187-192, 194-197, 206-208, 211, 217, 154, 232-237, 255-256, 265-266,
222, 224-225, 298, 303, 312, 314-315, 307-308, 312-313, 315, 317-318, 320,
328, 371, 410, 418, 429, 451, 454 323, 325-326, 330, 332-335, 341, 350,
Cumming, A. 175-181, 184, 212 359, 423
Curran, S. 334 Ellis, R. 188, 275-277, 294, 302, 333,
Curtis, A. 197, 253, 293, 303 348-350, 370, 421
Dagneaux, E. 426 Empirical (data-based) research 1, 3-6, 9-10,
Dailey-O'Cain, J. 359-360, 370 32-33, 35, 57, 90, 155, 161, 205, 223,
Damon, W. 366 276, 353
Danielson, D. 294, 295 English Language Teaching Journal (ELT
Data tagging 425-427 Journal) 460
Davidson, F. 156 Equality 366, 368
Davis, K. A. 225, 436 Equivalent time samples design 96-97, 100
Degrees of freedom (df) 372, 379, 384-385, Ericsson, K. A. 285, 287-288, 305, 311
388-389, 407-408, 410 Error treatment 116, 198, 222, 346, 357,
de la Torre, R. 45 359-361, 363, 370, 456
de Silva Joyce, H. 253 Ethical issues 51, 82, 124, 141, 147-148, 240,
Delaney, A. E. 296 338, 447-448, 452, 457, 461
Denness, S. 426 Ethnography 8-9, 28, 81-82, 164-165,
Denzin, N. K. 211, 234, 279, 446 173, 186-195, 198-200, 202-211,
Dependability 209, 224 214, 217-219, 222-223, 244, 349,
Dependent variable 59-61, 63-64, 68-69, 71, 439-440
73, 75, 78, 84, 86, 88, 91, 93, 97, 99, 103, Ex post facto designs 56, 70-71, 77, 89, 100,
117, 126, 149, 210, 319, 324, 381, 389, 150, 381, 396
402, 404 Exchange 271, 340, 343, 345-346,
Dialog journals 65, 175-179, 181, 184 352-353, 355
Diary study 169, 234, 255, 279, 284, 286, Expectancy threat (researcher or subject
292-299, 301-308, 310-311, 416-417, expectancy) 85, 119, 141, 153
439, 443 Experience bias factors 84-85
Diaz, C. 253 Experimental group 2, 47, 61, 68-69, 71, 95,
Dingwall, S. 438 100, 104, 106-107, 109, 111, 114-115,
Directing 265, 341, 343 117-118, 373-376, 378-379, 391, 404
Directional hypothesis 58, 59, 73 Experimental research 2, 6, 18, 44, 46-47, 49,
Directionality 72, 400 57, 67, 70, 81-82, 100-101, 106, 126, 158,
Discourse analysis 21, 270, 340, 368, 370, 412, 162, 166, 172-173, 182-183, 186, 188,
423, 434 194, 196, 198, 200, 204-208, 210-211,
Discourse completion task 138-139, 227-231, 249, 252, 297, 371, 379, 421,
321-322, 333 438-439, 454-456
Exploratory interpretive paradigm 11, Gliksman, L. 147
444-445 Goddard, A. 427
Exploratory practice 253 Goetz, J. 200-201, 203-207, 223
Evans in, W. 26 Goldstein, L. M. 296
F ratio 393, 407 Gonzalez, V. 253
Factorial studies 61, 78, 100, 319, 394-396 Graffiti board 233
Færch, C. 311 Grandcolas, B. 294-296
Falsification 57, 174, 194 Granger, S. 426
Fanselow, J. F. 253, 267-268, 283 Green, J. 189
Farhady, H. 375 Grotjahn, R. 11, 81, 83, 186, 292, 311,
Farrell, T. S. C. 462 444-445, 460
Fetterman, D. M. 334 Grounded theory 194, 218, 338, 412, 414-415,
Field notes 10, 182, 190, 201, 203, 212-213, 421-423, 440, 451
216, 218, 258-259, 274, 277-279, Gu, P. Y. 156
281-282, 289, 317, 401, 415, 417-418, Guba, E. G. 172, 209, 421, 424
432-434, 440-441, 453, 455-457, 461 Guthrie, P. 189
Fischer, J. 462 Halbach, A. 294
Flanders, N. 266, 339 Halliday, M. A. K. 167
Fleischman, N. J. 299 Hammersley, M. 163, 211
Flick, U. 314, 334 Harklau, L. 168, 188
Focus on Communications Used in Contexts Harrison, I. 418
(FOCUS) 267-268, 283 Hart, D. 294
Foreign language contexts 48 Hatch, E. M. 16, 27, 110, 156, 185, 374-375,
Foreign language in the elementary school 384, 401, 409, 411, 461
(FLES) 11-12, 164 Hawthorne effect 88, 123
Foreign Language Interaction Analysis (FLint) Heath, S. B. 187, 188, 196, 200
266, 339-340 Hedgcock, J. 42, 43
Fowler, F. 129 Henry, C. 230, 253
Fowler, G. 26 Henze, R. C. 9, 255, 294-295
Fox, K. 195 Heshusius, L. 462
Fraser, B. 322-323 High-inference categories 202
Freeman, D. 53, 224, 274, 284, 414-415, 453 Hilleson, M. 294, 302, 417
Frequency distribution 81 Hinkel, E. 446
Frohlich, M. 14, 268, 283 Histogram 108, 123
Frota, S. N. 294-295, 302, 417, 434 History threat 85
Frothingham, A. 26 Ho, B. 296
Fry, J. 311 Hobson, D. 462
Gaies, S. J. 24, 339 Holbrook, M. P. 299
Gain scores 91, 94, 99, 404-406 Holliday, A. 413, 436
Galda, D. 315-318 Holmes, O. W. 284
Gall, M. D. 272 Holten, C. 296
Gallimore, R. 219 Hood, S. 253
Galvan, J. L. 53, 461 Hoon, L. H. 294
Garaizar, I. 253 Hornberger, N. 224
Gardner, R. C. 147 Hosenfeld, C. 285
Garner, H. 412 Howell-Richardson, C. 294
Gass, S. M. 54, 137-138, 153-154, 162, 259, Huang, J. 294
311, 335, 362, 438, 456, 461 Huberman, A. 161, 458
Generalizability 64-65, 67, 69, 87, 123, Hughes, A. 335
170, 172, 174, 184, 207, 245, Hunt, K. W. 177
250-251, 458 Hyland, J. 328
Genesee, F. 189, 191 Hypothesis 2, 16, 18, 27-28, 42, 56-58, 69-70,
Geoffrey, W. 187 77, 91, 93, 99, 120-122, 126, 168,
Gieve, S. 462 174-175, 193-194, 229, 231, 298, 350,
Giroux, H. 192 390, 402, 443, 459
Hypothesis-oriented stage 193 Jepson, K. 21, 22, 25, 34, 42-43, 51, 63,
Hypothesis testing 2, 18, 28, 56, 77, 174, 229, 78, 326
372, 381, 402, 409-410 Johnson, D. 125, 161
IELTS 280, 324, 408, 440 Johnson, K. E. 185, 283, 352, 363,
ILR Oral Proficiency Interview 324 369-370, 462
Incorporation 22, 43, 269, 326 Jolley, J. 63, 123, 135-136
Independent variable 59-61, 70, 73, 75, 78, 84, Jones, F. R. 294-295
91, 93, 99, 103, 117, 123, 149, 319, 324, Jourdenais, R. 311
380, 394-395, 431 Kamhi-Stein, L. 328
Individualized instruction 41 Kasper, G. 311
Insight 7, 19, 23, 67, 75, 155, 162-173, Katz, A. 454-459
184, 203, 207, 218, 238, 241-242, 245, Kebir, C. 253
247, 284, 289, 295, 297, 299, 303, 307, Kemmis, S. 17-18, 158, 226, 228-230, 253
311, 316-318, 329, 332, 420-421, 427, Kennedy, M. 300
436, 439 Kincheloe, J. L. 462
Instability of measures 87 Knezedvoc, B. 253
Institutional talk 353 Knobel, M. 462
Instructional episode 162 Knowles, T. 253
Instructions 42, 145, 155, 221, 261, 288, 301, Kramsch, C. 363
304, 307-308, 321-323, 333-334, 367, Krishnan, L. A. 294
369, 403 Kuhn, T. 438
Instrumentation threat 84-85, 87, 122 Kumaravadivelu, B. 236, 253
Intact groups design 46, 73, 77, 89, 91-94, Kusudo, J. A. 45
100, 102-103, 381 Kwan, T. Y. L. 253
Interaction 4, 6, 14, 16, 21-22, 28-29, 34, Labov, W. 196, 281, 305, 447
87, 125, 159-160, 163-164, 166, 169, Lamb, C. 348
175-176, 180, 182, 186-191, 200, 203, Lampert, M. D. 370
208, 215-216, 249-250, 255, 259-266, Language Contact Profile 125
268, 270-274, 276, 280-283, 289, 291, Language Learning 11, 25, 34, 79
303, 306, 312, 314, 317-318, 321, Language Teaching Research 11, 460
326-328, 337, 339-341, 343, 345-347, Lankshear, C. 462
349, 353, 355-357, 359-370, 413, 417, Lantolf, J. P. 212, 271, 362
440, 443, 454-456 Lapkin, S. 294
Interaction analysis 266, 339, 412, 423 Larimer, R. E. 235
Interaction effect 87-88, 395-396 Larsen-Freeman, D. 67, 157, 170, 172
Interaction effects of selection bias 87-88 Law, B. 335
Interactive combination of factors 86 Lazaraton, A. 16, 27, 110, 136, 156, 225, 356,
Inter-coder agreement 63, 177, 277, 340, 370, 374, 384, 401, 409, 411, 436, 461
363, 370 Lazarus, E. 294
Interlocutor 21, 43, 125, 138, 269, 279, Leatherman, J. 296
326, 360 Learning strategies 29, 152, 292, 310, 317,
Interpretive paradigm 350, 444-445 332, 413, 416, 443
Interval data 2, 38-39, 40, 59, 114-115, LeCompte, M. 200-201, 203-205, 207, 223
135-136, 381, 389-390, 393, 396-397, Lee, E. 296
399-403, 409 Legutke, M. 21
Interviews 47, 62, 186, 192-193, 205, 221, Lenzuen, R. 253
223, 312-321, 324, 328-330, 332, 334, Leopold, W. F. 163
350, 353, 413-416, 418, 421, 430, Lesson plan 29, 234, 249, 289, 291, 304,
444-445, 454-455 319-320, 332, 413, 415, 436
Introspection 255, 259, 284-311, 439 Levine, H. 219
IRF (or IRE or QAC) pattern 270, 271, 340, Lew, L. 296
345, 350 Lewin, K. 226
Jaeger, R. M. 126, 161, 374, 410-411 Lewin, L. 226
Jarvis, J. 296 Lewkowicz, J. 33, 53, 443
Jenkins, D. 158 Libben, G. 294
Lieblich, A. 297, 423 Member checking (also member validation)
Liebscher, G. 359-360, 370 363, 429-430, 461
Lightbown, P. M. 445 Mendelsohn, D. 294
Likert, R. 133 Merriam, S. B. 35, 161, 163
Likert scale 131, 133-134, 136, 156 Michonska-Stadnik, A. 253
Lin, A. M. Y. 189 Microanalysis 356, 370
Lin, Y. 42, 43 Micro-ethnography 208, 217
Lincoln, Y. S. 172, 209, 421, 424, 446 Miles, M. B. 161, 426, 436, 458
Literature review 2, 27, 32-36, 42, 44, 52-53, Miller, I. K. 462
55, 59-60, 73, 127, 212, 235, 327, 368, Mingucci, M. 226, 251, 253
421, 431, 439, 443, 445-446, 459 Mitchell, M. 63, 123, 135, 136
Lockhart, C. 227 Mitchell, R. 24, 25
Logic 438 Mixed methods research 11, 371, 408,
Long, M. H. 4, 13-14, 24, 42, 260, 275, 439-440, 444-445, 453-458
326, 362, 446 Mode 110, 111, 112, 114, 168, 372-374, 379
Longhini, A. 294, 295 Moderator variable 60-61, 67, 70, 89, 100,
Longitudinal research 8, 69, 86, 158, 163, 121-123, 319, 324, 381, 385-389
166-167, 173, 178, 180, 182, 189-190, Modern Language Journal 11, 460
193, 198, 205, 208, 217-218, 224, 297, Mok, A. 253
314, 316, 350, 417, 449 Moore, T. 294
Lowe, T. 294-295 Morgan, D. 315
Low-inference categories 202, 204, 222, 266 Morgan, M. 13
Lozanov, G. 116 Mortality threat 69, 85-86, 182
Lynch, T. 279, 281, 408, 439 Moskowitz, G. 266, 339-340
Mackey, A. 137-138, 153-154, 162, 259, Motivation 17, 29, 37, 41, 60, 103, 192, 202,
311, 335 204, 231, 297, 300, 443, 445-446
Maclean, J. 279, 181, 408, 440 Move 22, 34, 42-43, 63, 271-272, 274,
Main effects 395 356, 370
Managerial mode 343, 346, 369 Multiple perspectives analysis 11-12, 350
Manion, L. 18, 125, 128, 170, 228 Multiple treatment interaction 89
Marginal frequencies 386 Murphey, T. 370
Markee, N. 340, 352-356, 370, 461 Mutuality 366, 368
Marshall, C. 225, 450-451, 453, 461 Nassaji, H. 175, 176-181, 184, 212, 226, 328
Martyn, E. 326-328 Naturalistic inquiry (or naturalistic research)
MasonJ. 334 1, 7-j>, 11, 14,18,20, 23-24, 36,44,
Materials mode 343, 345, 369 47-48, 63, 67, 77, 90, 120, 127, 155, 158,
Matsuda, A. 296 160, 164, 167, 170, 173, 183, 186,
Matsuda, P. 296 188-190, 199, 202, 228, 230-231, 334,
Matsumoto, K. 294, 309, 311 363, 372, 379, 423, 428, 438-439,
Maturation threat 85, 86, 87, 206 444-445, 453
Maxwell, J. A. 363 Negative feedback 22, 42-43, 233
McCarthy, M. 341, 343-346, 369 Negotiated interaction (negotiation of
McDonough, J. 296 meaning) 21-22, 34, 42-43, 326-327, 362
McGarrell, H. M. 462 Nisbett, R. 288
McKay, S. L. 25, 461 Nominal data 37-39, 381, 383, 389, 400, 403
McPherson, P. 253 Nonequivalent comparison groups design
McTaggart, R. 227, 229, 253, 288 93-94, 100, 102-103
Mean 86, 106, 110-115, 118, 121, 123, Nonparticipant observer 193-195, 197, 213,
129, 372-375, 377-379, 389-390, 223-224, 259
392-394, 404-406, 408-409, 413, 418, Nonverbal behavior 250, 340-341, 350, 355,
440-442, 447 362-363
Meaning condensation 412, 418-421, 440 Normal curve (normal distribution) 83, 110,
Measures of central tendency 110, 372-375 112-113, 123, 129, 373
Measures of dispersion 110, 372-373, 376-379 Normative paradigm 349-350
Median 110-112, 114, 372-375, 379 Null hypothesis 57-59, 73, 114
Numrich, C. 296 Participant observer 165, 193-195, 197, 203,
Nunan, D. 9, 16-17, 21, 25, 27, 66, 79, 124, 213, 219, 223-224, 259, 279
148, 150, 152, 154, 161, 194, 197, 199, Particularization 171-172
201, 204, 206, 228, 231, 250, 253, 264, Peck, S. 168, 294
269, 276, 282, 289, 291, 297, 308, 326, Pennington, M. C. 296
328, 332-333, 342-343, 348, 361-363, Pennsylvania Project 25
369, 416, 418, 420-423, 427, 429, 442, Perecman, E. 334
444, 446, 448-450, 452, 459, 462 Perry, F. L. 411
O'Brien, T. 462 Peyton, J. K. 175
Observation 3-4, 7, 10-11, 14-15, 33, 35, 49, Phelps, E. 366
50, 57, 76-77, 87-88, 102, 161, 163, 165, Pica, T. 24, 327, 362
170, 173-174, 182, 186-187, 189-199, Pike, K. L. 197
201, 205, 211-213, 221-223, 228, 234, Piloting 87, 124, 140-141, 145
255, 257-261, 263-264, 266-270, Piore, M. 437
272-273, 275-279, 281-283, 289, Plummer, K. 297, 428
297-299, 305, 321, 325, 332-334, 365, Polio, C. 295, 296, 461
415, 432-434, 439-440, 447, 453, Popper, K. 57, 174
455-457 Population 2, 44, 46-48, 53, 63-67, 70-71,
Observation schedule (or scheme or system) 4, 83-84, 86, 88-89, 92, 94, 98, 104-106,
11, 15, 87, 164, 212, 234, 255, 258-261, 110, 113-115, 124-125, 127-129,
264, 266-268, 270, 279, 339 144-145, 150, 152-153, 158, 170-172,
Observer's paradox (reactivity) 196, 205, 216, 183, 196, 199, 207, 372, 381, 393,
280-281, 306, 212, 234, 258, 447 413, 456
O'Farrell, A. 424 Porter, P. A. 296
Ochsner, R. 292 Porto, 294
Ohta, A. S. 363 Post-test only control group design 98-99,
One-group pre-test post-test design 90, 92, 94, 100, 380, 391
100, 365, 381, 392 Practice effect 85, 87, 123, 192
One-shot case study 90, 92, 94, 100-102, 158, Pre-experimental designs 100
183, 229 Presenting 265, 341
Open-ended items 136-137, 140-141, 153, Pre-test post-test control group design
155, 413, 416, 444 99-100, 103, 380
Operational definitions 2, 36, 41-42, 56, 63, Probability 372, 380, 384-385, 400, 410
120, 326, 358, 363, 370, 375, 404, 413, Process studies 1, 13-14, 117, 257, 275
443, 445-446, 449, 454 Production tasks 256, 312, 321, 326-327, 333
Ordinal data 38-39, 108, 136, 156, 375, 381, Process-product studies 1, 14, 117, 172, 257,
399, 402-403 275, 404, 456
Ordinary conversation 353 Product studies 1, 14, 117, 172, 257,
Organizing 264-265, 341, 343 275, 281
Otto, F. M. 25 Profiles 10, 432-433, 440, 455
Outcome 7, 13-15, 19, 45-46, 56, 58, 60-62, Prompt 140, 179-180, 263, 290, 321-323, 325,
66, 68-69, 85, 88, 94-95, 97, 103, 117, 330, 345
119-120, 158, 167, 171, 186, 203-206, Proof 57, 67
227-229, 231, 242, 247-250, 275, 279, Proposal 18, 130, 225, 249, 338, 437, 440,
281, 302, 323, 359, 364, 370, 380, 391, 432-443, 447, 450-451, 453, 461
397, 400, 418, 440, 443, 450, 454, 457 Prospect: A Journal of Australian TESOL 11,
Oxford, R. 243, 362 119, 460
Palmer, A. S. 325, 335 Psychometric research 6, 7, 11, 23, 46, 67, 73,
Palmer, C. H. 295 202, 230, 408, 428, 431, 438, 444
Palmer, G. M. 295 Publishing 11, 13, 16, 27, 33, 53, 75, 120, 227,
Paradigm 11, 12, 81, 83, 170, 207, 211, 228, 337, 364, 434-435, 447-452, 459-460
292, 349-350, 438, 442, 444-445, 460 Qualitative data analysis 10-11, 19, 153, 156,
Paraphrase 22, 43, 407 164, 179, 190, 198-199, 249, 292-293,
Parkinson, B. 294 304, 330, 337, 359, 408, 412-436, 442,
Participant bias factor 85 444, 447, 456-458
Qualitative data collection 10-11, 19, 63, 75, Reactive effects of experimental arrangements
129-130, 163, 178,190, 198-199, 211, 87-8^, 123, 196, 205
249, 292, 356, 371, 401, 435, 442, 444, Reah, D. 427
449, 453-454, 456-458 Reference list 20, 35, 461
Qualitative research 5, 7, 9-11, 14-15, 19, 161, Reid, J. 156
170, 173, 178-179, 186, 189, 200-202, Reliability 1-2, 62-63, 65, 67, 75, 77, 79, 82,
207-209, 211, 218-219, 224-225, 314, 84, 87, 137, 170, 183, 187, 198-204,
350, 412-437, 439-442, 446, 450, 453, 207-209, 222, 224, 277, 281-282, 288,
457-458, 461 305-306, 324, 409, 428, 449, 453
Quality control 2, 42, 62, 65, 67, 84, 87, Repair 22, 34, 42-43, 63, 353, 359-360,
119, 124, 141, 154, 163, 170, 173, 368, 370
183-184, 187, 198, 207, 209, 211, Replicability, replication 34, 41-42, 63, 65, 78,
222, 224, 226, 234-235, 255, 277, 300, 120, 155, 170, 199-201, 204, 222, 428
339, 363, 402-403, 412, 428-430, Reppen, R. 436
453-454 Research gaps 33, 178, 443
Quantitative research 1, 5-7, 9-11, 14-15, 19, Research question 2, 5, 11,12, 22,24, 27-30,
34,48, 51, 56, 71, 73, 83, 126, 129, 141, 32-3J1,42,44-45,47,49, 52-53, 55-56,
153, 164, 166,178-179, 184, 190, 202, 58-59, 69-72, 75, 77-78, 81, 89,91, 93,
218-219, 236, 292, 318, 330, 337-338, 97,99, 117,121-122, 126, 129-130, 132,
350,408,413,428,437,439-442, 139,145, 149, 151, 155, 167, 177,
444-445,453-458,462 182-183, 191, 193-194, 196, 213, 216,
Quasi-experimental designs 11, 97-98, 100, 224, 239, 231-232, 234-235, 252, 260,
210,292,444-445 266, 275, 280, 282-283, 287, 290, 298,
Questionnaire 11, 16, 33,41-42,49, 55, 82, 307-308, 310, 314-315, 318, 328,
87-88, 102, 116, 124-156, 164,203,212, 333-334, 356, 360-361, 346, 363-364,
234-235, 237-238, 241, 243-246, 252, 368-369, 381,402,425,430,434-435,
256, 279-280, 302, 312-314, 316, 439, 442-445, 449-450, 452-454,
318-321, 332-334, 350, 413, 415-416, 458-461
418, 428, 443-444, 458 Respondents 126, 130-138, 140-144, 148,
Questions 29, 31, 37, 43, 62, 76, 100, 126, 151, 153-156, 313, 317-320, 322-323,
130-133, 136-138, 140-145, 147, 151, 429, 445
153-154,162, 178-179, 202, 204, 206, Responding 341, 356, 359
215-216, 232-233, 236-237, 239, Response set 135
242-245, 251, 261-262, 264-266, Retrospection 213, 255, 284-286, 288-289,
269-270, 274, 279, 282, 291-292, 299, 297-299,305-307,311,418
304, 309-310, 313-314, 316-319, Richard, D. 372
325-326, 330-331, 340-343, 352-353, Richards, J. C. 227, 296, 446
358-360,403,413-414,416,443, Richards, K. 9, 194-195,198, 207-209, 218,
455-456 225, 294-295, 430-431,436,462
Quirke, P. 232-234 Rintell, E. 322
Radecki, W. 144 Rivers, W. 295
Raffier, L. M. 296 Rodgers, T. S. 401
Raised design (or two-phase design) 320, 324 Roebuck, R. 363
Randomization 44,46, 53, 71, 85-86, 92, Rogan, P. 296
94, 97-101, 103-104, 116, 120, 127-128, Role play 64, 98, 256, 269, 312, 322-323, 329,
164, 172, 199, 205, 210, 236, 381, 391, 332-334
404,445,449 Rossman, G. B. 225,450-451,453,461
Range 106-107, 110-111, 114, 372-375, 376, Rounds, P. L. 119,442
379,405,410 Rowntree, D. 83, 105
Ranking 6, 39-41, 59,106-108, 132, 329, Rowsell, L. V. 294
393, 403 Rubin, J. 294-295
Ranta, L. 445 Ruiz de Gauna, P. 253
Rater reliability 62, 363, 428 Ruso, N. 294, 296
Rating 6, 59, 62,150-151, 232, 239-240, 324, Sample 2, 5, 10,44, 46-49, 53, 64-65, 67,
363, 403,440-442 69-7C1, 74, 83-84, 88-89, 101, 104-106,
109-110,115,117,119,124-125, Social Sciences and Humanities Research
127-129,145-146, 150, 152, 154-155, Council of Canada 460
162,164,172,174,177, 182-183, 207, Sociating 341, 343
319-320, 334, 372, 381, 388-390, 402, Sociocultural theory 177-181, 271, 362
413,418,432 Sociolinguistics 50, 131, 134, 162, 214, 269
Sanger, K. 427 Soule-Susbielles, N. 294-296
Santana-Williamson, E. 296 Spada, N. 14-15, 268, 283,445
Sato, C. 74-78, 122, 250, 388-389 Spencer Foundation 460
Saunders, S. 391 Spradley, J. P. 193, 314, 334
Scaffolding 178-179, 181-182, 184,212-214, Springer, S. E. 49-52, 130,136, 154, 212-214
304-305, 362-363, 368 Sreedharan, N. 3
Scales, J. 372 Stake, R. E. 172-173, 183
Scatterplot, scattergram 396-398,400, 410 Standard deviation 106, 110-115, 153,
Schachter, J. 54,119,438,456,461 373, 376-379, 390, 392,405,408,413,
Scherer, G. A. C. 12-13,275 418,441
Schleicher, L. 235, 296 Stanley, J. C. 102
Schmidt, R. W. 173-174, 294-295, 302, Statistical regression 85-86
417, 434 Statistical significance 6, 10, 12-13, 15, 22-23,
Schrank, A. 312 34,44, 57-58,68, 71-73, 76, 83,114-117,
Schumann, F. E. 294-295, 302 120, 152, 281, 330, 348, 350, 359, 372,
Schumann, J. H. 168, 174, 294-295, 302 379-396, 399-400, 402, 440
Scientific method 83,123, 188, 230 Statistics 10-11, 38, 51, 58, 61, 70-71, 73, 75,
SCORE data 272-273, 282, 339 77, 79, 81, 83, 86, 100-102, 104-115,
Second language acquisition 6, 14, 20-21, 77, 79, 81, 83, 86, 100-102, 104-115,
30, 32,49-50, 124,138, 157, 167-168, 164-166, 172, 174,208,218-219,247,
174-175, 183, 185, 212, 268, 271, 292, 249, 292, 304, 330, 333, 338, 350,
297, 299, 311-312, 321, 326, 333-334, 372-381, 384, 393, 396-411, 418, 447
349, 353 Stenhouse, L. 165
Second language context 48, 356, 448 Stewart, D. 314
Secondary research (or library research) Stewart, T. 462
33,35 Stimulated recall 35, 255, 259, 281, 284, 286,
Selection threat 85, 87-88, 103, 205-206 288-292, 306-308, 311, 350, 369,413,
Self-repetition 22,43 415-416
Self-report 42, 147, 150, 153, 280, 287, 302, Storch, N. 364-368, 370
305, 350 Strauss, A. 424
Seliger, H. W. 24, 30-31, 125, 311 Strong, M. 356
Semantic differential scale 133-135, 167 Subjectivity 7, 24, 119, 173, 195, 200, 243,
Settle, R. 140, 142, 155 270,323,428
Shamdasani, P. 314 Suggestopedia 116-119
Shamim, F. 274 Sullivan, P. N. 363
Shavelson, R. J. 100, 123,411 Summarizing 10, 33,421, 432-434,440
Shaw, P. A. 189,193-194 Survey 13, 50, 81-82, 102, 124-156, 162,
Shohamy, E. 30-31 164, 174, 222, 297, 313, 365,415,439,
Shuy, R.W. 176 445, 452
Simard, D. 194 Swaffar, J. 13-14
Simon, H. A. 285, 287-288, 305, 311 Swain, M. 294, 363
Sinclair, J. 270-271, 340-341, 343, 350, System 11
434,440 Szostek, C. 253
Skills and systems mode 343-345, 369 Szulc-Kurpaska, M. 253
SLEP 324 t-test 115, 338, 389-392
Smith, J. K. 462 T-unit 177-179
Smith, L. M. 187 Task 3-6, 9, 14, 16, 24, 28-30, 35, 76-77, 114,
Smith, P. D. 25 135, 138, 154-155, 159-160, 169, 184,
Smythe, P. C. 147 202, 212-214, 222, 237, 242-243, 245,
Snow, M. A. 328, 330,461 248-249, 251, 256-257, 262-263,
265-266, 272, 279, 282-283,286-289, Trouble source 359, 370
293, 304-308, 311-312, 323,325-330, True experimental design 44,46, 53, 70-71,
334, 340-341, 357, 361-366, 369, 73,77,89,92, 94, 98-100,380,445
396-397,410,414, 443 Trueba, G. 189
Teacher research 4, 19-20, 24, 53, 59, 82, 203, Tsang, W. K. 296
263,364,462 Tsui, A. B. M. 185, 250-252, 360, 370
Teaching assistants (TAs) 196, 273-274, Tucker, G. R. 11, 163
417-418, 430-435, 440-442, 459 Tuckman, B. 41-42, 84-87, 95-96, 123, 136
Teaching style 10, 89, 222, 274, 300, 417, 433, Tuman, J. 299
441, 454, 456, 458 Turner, J. L. 136, 156, 219
Technology 5,15, 20-22,24-25, 34,50, Turn-taking 63,74, 76, 122,125, 250,
212,314,318,338,340,401,412, 259, 3£3
425-428 Tuval-Mashiach, R. 297,423
TESOL 49, 119, 436, 462 Tyacke, M. 294
TESOL Quarterly 11, 225, 436, 447, 460 Ulichny, P. 164-165, 356
Testing threat 86-87, 99, 103, 123 Validity 1-2, 61-70, 75, 77-79, 81-82, 84-85,
The International Research Foundation for 87, 89, 99-100, 103, 116, 118-119,
English Language Education (TIRF) 460 122-123, 128, 152, 155, 157-158,
Theory 18, 19, 52, 57-58, 73, 159, 175, 181, 170-171, 173-174, 182-184, 187, 194,
185, 192, 194-195, 208, 211, 213, 218, 198-200, 204-210, 222-224, 249-250,
227, 235, 249, 421, 436, 437, 440, 443, 277, 281, 305-306, 324-325, 333,
451 428-430,446,449,453
Think-aloud protocols 255, 284, 286-289, Variables 1-2, 6-7, 15, 17, 27, 36-41, 44,
305-308, 311, 350 46-48, 53, 56, 59-61, 63-64, 66-73, 75,
Thorpe, J. 135, 235, 236-246, 251 77-78, 83-84, 86, 88-89, 91, 93, 97,
Threats to validity and/or reliability 56, 67-70, 99-100,103,110,114,117-121,123,126,
75,77-78, 81, 84-89, 99,103, 116-119, 149,158,160,162,171-172,194,198,
122-123,128,152,155,171,173,182, 210,230,249,297,319,324,328,350,
202,204-206,333,449 366,372, 380-383, 385, 389, 393-402,
Time sampling 432
Time series design 94-97, 100-101 404,409-410,421,431,438-439,444,
Tinker Sachs, G. 253, 294-295 449, 460
TOEFL 90-96, 98-99, 280, 324, 399, 408, Variance 110, 114, 338, 372, 376, 378-379,
440 392-393, 407
TOEIC 324 Video recording 16, 74, 76, 87, 121, 159,
Topic-oriented stage 192 162-163, 187, 190, 200, 203, 208, 216,
Total physical response (TPR) 45, 57.53 218, 235-236, 238, 242, 244, 246, 252,
Transaction 8, 271 258-260, 267-268, 271, 277-278,
Transcripts, transcription 8, 11, 16, 20, 35, 281-282, 287, 289-290, 306, 315, 332,
162-163, 165, 182, 193, 199, 216-217, 339-340, 34, 3686, 353, 362-363, 391,
236-237, 242-244, 259, 263-264, 396, 413-416, 424, 446
270-271, 273, 280-281, 283, 289, 291, Virtual classroom 15, 20, 21, 257
306, 337, 340, 344, 346-349, 352-356, van Lier, L. 7-8, 15, 18, 20, 25, 55-56,
362-363, 365-367, 369-370, 401, 67, 119, 170, 172, 185, 188, 198, 203,
413-414, 418, 424, 429, 435, 454, 207, 208-212, 218-219, 221, 223-224,
455-457 226, 229-231, 245, 246, 251, 253, 257,
Transferability 171-173, 184, 209, 224 279, 282, 322, 362-363, 366, 435, 440,
Treatment 44-46, 53, 60-61, 63-64, 66, 454, 457
68-70, 74, 85-89, 92, 94-97, 100, Verity, D. P. 295-296, 304
Vygotsky, L. S. 178, 362
102-104, 115, 119, 126, 158, 202, 204, Waissbluth, X. 299
210, 218, 231, 236, 380-381, 391, 393, Wait-time 29, 37, 47, 55, 202, 251, 414
395, 404
Wajnryb, R. 283
Triangulation 82, 163, 202, 211-214, 223, Wallace, M. J. 25, 226, 253
234-235, 278, 280-282, 302, 307, 318, Wallat, C. 189
320, 332, 350
Walsh, S. 341, 343-346, 369-370
Walters, J. 322 Winer, L. 296
Wang, J. 168 Wong, L. 148, 150, 152, 154,444
Warden, M. 294 Wong, M. 296
Watson-Gegeo, K. A. 186, 189, 190-192, Woodfield, H. 194
194-195, 197, 207, 224 Woods, D. 289-290
Weisner, T. S. 219 Woods, D. 289-290
Weitzman, E. A. 426,436 Yahya, N. 296
Wen, Q. 156 Yates correction factor 409
Wennerstrom, A. 372 Youngman, 132
Wertheimer, M. 12-13, 275 Yin, R. 161, 166, 170-171,174
Wesche, M. B. 323, 461 Yu, B. 296
Whyte, W. F. 187 Yu, J. H. 328
Wiersma, W. 30-31, 35 Zambo, L. 299
Willing, K. 150 Zilber, T. 297, 423
Wilson, T. 288 Zone of proximal development (ZPD)
Wilson-Duffy, C. 295-296 177-178,362
Text Credits
Ch. 3: P. 75, adapted from Nunan, D. 1992. Research Methods in Language Learning. Cambridge University Press.
Reprinted with permission.
Ch. 7: P. 190, 204, 206, adapted from Nunan, D. 1992. Research Methods in Language Learning. Cambridge
University Press. Reprinted with permission. P. 213, adapted from Springer, S. E. (2003). Contingent language use
and scaffolding in a project-based ESL course. Unpublished manuscript, Monterey Institute of International
Studies, Monterey, California. Reprinted with permission.
Ch. 8: P. 88, adapted from Quirke, P. (2001). Hearing voices: A robust and flexible framework for gathering and using
student feedback. In J. Edge (Ed.), Action research (pp. 81-91). Alexandria, VA: TESOL. Reprinted with permission.
P. 75, adapted from Nunan, D. 1992. Research Methods in Language Learning. Cambridge University Press. Reprinted
with permission. P. 231, adapted from Quirke, P. (2001). Hearing voices: A robust and flexible framework for
gathering and using student feedback. In J. Edge (Ed.), Action research (pp. 81-91). Alexandria, VA: TESOL. Reprinted
with permission. P. 238-240, adapted from Thorpe, J. (2004). Coal miners, dirty sponges, and the search for Santa:
Exploring options in teaching listening comprehension through TV news broadcasts. Unpublished manuscript,
Monterey Institute of International Studies, Monterey, California. Reprinted with permission.
Ch. 9: P. 258, 273, adapted from [illegible]
University Press. P. 111. Reprinted with permission. P. 269, 276, adapted from [illegible]
[illegible] University Press [illegible]
[illegible] University Press, P. 51. Reprinted with permission.