Short Biographical Sketch: Fundamental Considerations in Language Testing. Oxford University Press, 1990
to study or a means of reviewing material taught, in which case no evaluative decision is made on the basis of the test results. They may also be used for purely descriptive purposes. The majority of tests are used for the purpose of making decisions about individuals. Evaluation can be defined as the systematic gathering of information for the purpose of making decisions. It is the collection of reliable and relevant information. Therefore it does not necessarily entail testing. It is only when the results of tests are used as a basis for making a decision that evaluation is involved. So it is important to distinguish the information-providing function of measurement from the decision-making function of evaluation.

Essential measurement qualities. Reliability is a quality of test scores, and a perfectly reliable score, or measure, would be one which is free from errors of measurement. The most important quality of test interpretation or use is validity, or the extent to which the inferences or decisions we make on the basis of test scores are meaningful, appropriate, and useful. While reliability is a quality of test scores themselves, validity is a quality of test interpretation and use. They are both essential to the use of tests.

In summary, 'measurement' and 'test' involve the quantification of observations and are thus distinct from qualitative descriptions. Tests are a type of measurement designed to elicit a specific sample of behavior. 'Evaluation' involves decision making and is thus distinct from measurement, which essentially provides information. Thus neither measures nor tests are in and of themselves evaluative, and evaluation need not involve measurement or testing.

Uses of language tests

The two major uses of language tests are: (1) as sources of information for making decisions within the context of educational programs; and (2)
as indicators of abilities or attributes that are of interest in research on language, language acquisition, and language teaching. The fundamental use of testing in an educational program is to provide information for making decisions, that is, for evaluation. The use of tests as a source of evaluation information requires three assumptions. First, we must assume that information regarding educational outcomes is essential to effective formal education. Second, we must assume that it is possible to improve learning and teaching through appropriate changes in the program, based on feedback. Third, we must assume that the educational outcomes of the given program are measurable. In addition to these assumptions, we must also consider how much and what kind of testing is needed, as well as the quality of information provided by our tests. In a word, the main point of this chapter is that the most important consideration in the development and use of language tests is the purpose or purposes for which the particular test is intended. By far the most prevalent use of language tests is for purposes of evaluation in educational programs.

Communicative language ability

According to Bachman, communicative language ability (CLA) can be described as consisting of both knowledge, or competence, and the capacity for implementing, or executing, that competence in appropriate, contextualized communicative language use. The framework of CLA includes three components:

Language competence.
[Figure: Components of communicative language ability in communicative language use. Labels: knowledge structures (knowledge of the world); language competence (knowledge of language); strategic competence; psychophysiological mechanisms; context of situation.]

[Figure: Components of language competence. Surviving labels: language competence; organizational competence; pragmatic competence; textual competence; cohesion (Cohes.); rhetorical organization (Rhet. Org.).]

Language competence includes organizational competence, which consists of grammatical and textual competence, and pragmatic competence, which consists of illocutionary and sociolinguistic competence. Grammatical competence includes those competencies involved in language usage, consisting of a number of relatively independent competencies such as the knowledge of vocabulary, morphology, syntax, and phonology/graphology. Textual competence consists of cohesion and rhetorical organization. Illocutionary competence is related to four macro-functions: ideational, manipulative, heuristic, and imaginative. Abilities under sociolinguistic competence are sensitivity to differences in dialect or variety, to differences in register, and to naturalness, and the ability to interpret cultural references and figures of
speech.

Strategic competence. Three components are included in strategic competence: assessment, planning, and execution.

Psychophysiological mechanisms. These are essentially the neurological and physiological processes, and they characterize the channel (auditory, visual) and mode (receptive, productive) in which competence is implemented.

Test methods

The characteristics of test methods can be seen as restricted or controlled versions of these contextual features that determine the nature of the language performance that is expected for a given test or test task. Performance on language tests varies as a function both of an individual's language ability and of the characteristics of the test method. It is also affected by individual attributes that are not part of test takers' language ability. The five major categories of test method facet are: (1) the testing environment, which includes the facets of familiarity of the place and equipment used in administering the test, the personnel involved in the test, the time of testing, and physical conditions; (2) the test rubric, which consists of the facets that specify how test takers are expected to proceed in taking the test, including the test organization, time allocation, and instructions; (3) the nature of the input the test taker receives; (4) the nature of the expected response to that input; and (5) the relationship between input and response. The frameworks described here have been presented as a means for describing performance on language tests, and they are intended as a guide for both the development and use of language tests and for research in language testing. These frameworks provide the applied linguistic foundation that informs the discussions in the remainder of the book.

Reliability

A high score on a language test is determined or caused by high communicative language ability, and a theoretical framework defining this ability is thus necessary if we want to make inferences about ability from test scores.
Performance on language tests is also affected by factors other than communicative language ability. These can be grouped into the following broad categories: (1) test method facets, as discussed in Chapter 5; (2) attributes of the test taker that are not considered part of the language abilities we want to measure; and (3) random factors that are largely unpredictable and temporary.

[Figure: communicative language ability, personal attributes, and random factors shown as influences on the TEST SCORE.]
Fundamental to the development and use of language tests is being able to identify and estimate the effect of various factors on language test scores. Test scores should be influenced as much as possible by a given language ability, and any factors other than the ability being tested that affect test scores are potential sources of error that decrease both the reliability of scores and the validity of their interpretations. Therefore it is essential that we be able to identify these sources of error and estimate the magnitude of their effect on test scores. Measurement theory provides several models that specify the relationships between measures, or observed scores, and factors that affect these scores. Generalizability theory is an extension of the classical model that overcomes many of its limitations, in that it enables test developers to examine several sources of variance simultaneously and to distinguish systematic from random error. Estimates of reliability based on classical measurement theory are inappropriate for use with criterion-referenced tests because of differences in the types of comparisons and decisions made. Systematic error, such as that associated with test method, is different from random error.

Validation

The primary concern in test development and use is demonstrating not only that test scores are reliable, but that the interpretations and uses we make of test scores are valid. It has been traditional to classify validity into different types, such as content, criterion, and construct validity. Validity is nevertheless a unitary concept, and it always refers to the degree to which the available evidence supports the inferences that are made from the scores. In addition to the test's content and method, validation must consider how test takers perform. The examination of content relevance and content coverage is a necessary part of the validation process. Information about criterion relatedness, whether concurrent or predictive, is by itself insufficient evidence for validation.
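The idea of 'reliable variance' used in this summary can be illustrated with a minimal sketch. It assumes the classical true-score model, in which an observed score is a true score plus an uncorrelated error and reliability is the share of observed variance that is true-score variance. The summary itself only names 'the classical model', so the function name `reliability` and the four-person data set are hypothetical and are not taken from the book.

```python
from statistics import pvariance


def reliability(true_var: float, error_var: float) -> float:
    """Share of observed-score variance that is true-score variance.

    Assumes true scores and errors are uncorrelated, so that
    observed variance = true variance + error variance.
    """
    observed_var = true_var + error_var
    if observed_var <= 0:
        raise ValueError("observed variance must be positive")
    return true_var / observed_var


# Hypothetical true scores and measurement errors for four test takers.
# The errors are chosen so that they are uncorrelated with the true scores.
true_scores = [50.0, 60.0, 70.0, 80.0]
errors = [2.0, -2.0, -2.0, 2.0]
observed = [t + e for t, e in zip(true_scores, errors)]

# True variance is 125 and error variance is 4, so the estimate is 125 / 129.
estimated = reliability(pvariance(true_scores), pvariance(errors))
print(round(estimated, 3))  # about 0.969
```

With an error variance of zero the function returns 1.0, which matches the description of a perfectly reliable score as one that is free from errors of measurement. In practice true scores are not observable, so reliability has to be estimated indirectly; the sketch only shows the definition.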
The process of construct validation, of providing evidence for 'the adequacy of a test as a measure of the characteristic it is interpreted to assess', is a complex and continuous undertaking, involving both (1) theoretical, logical analysis leading to empirically testable hypotheses, and (2) a variety of appropriate approaches to empirical observation and analysis. Reliability is a requirement for validity, and the investigation of reliability and validity can be viewed as complementary aspects of identifying, estimating, and interpreting different sources of variance in test scores. Validity is concerned with identifying the factors that produce the reliable variance in test scores. Reliability is concerned with determining how much of the variance in test scores is reliable variance, while validity is concerned with determining what abilities contribute to this reliable variance. Another way to distinguish reliability from validity is to consider the theoretical frameworks upon which they depend. The most important quality to consider in the development, interpretation, and use of language tests is validity, which has been described as a unitary concept related to the adequacy and appropriateness of the way we interpret and use test scores, whereas reliability is a necessary condition for validity, in the sense that test scores that are not reliable cannot provide a basis for valid interpretation and use. In order to examine validity, we need a theory that specifies the language abilities that we hypothesize will affect test performances. Distinguishing between reliability and validity, then, involves differentiating sources of measurement error from other factors that affect test scores. In order to maximize the reliability of test scores and the validity of test use, we should follow three
fundamental steps in the development of tests: (1) provide clear and unambiguous theoretical definitions of the abilities we want to measure; (2) specify precisely the conditions, or operations, that we follow in eliciting and observing performance; and (3) quantify our observations so as to assure that our measurement scales have the properties we require.

Some persistent problems and future directions

The challenge facing us is to utilize insights from linguistics, language learning, and language teaching to develop tests as instruments of research that can lead to a better understanding of the factors that affect performance on language tests. As developers and users of language tests, our task is to incorporate this increased understanding into practical test design, construction, and use. Another major challenge will be either to adapt current measurement models to the analysis of language test scores or to develop new models that are appropriate for such data. Meeting these challenges will require innovation and the re-examination of existing assumptions, procedures, and technology. The most complex and persistent problems in language testing are those presented by the consideration of the relationship between the language use required by tasks on language tests and that which is part of our everyday communicative use of language. There are two distinct approaches to describing this vital relationship, or test 'authenticity'. One is to identify the 'real-life' language use that we expect will be required of test takers and, with this as a criterion, attempt to design test tasks that mirror it. The other is to examine actual non-test communicative language use in an attempt to identify the critical or essential features of such language use.
The author sees pressing needs for language tests suitable for use in making minimum competency decisions about foreign language learners and language teachers, and in the evaluation of foreign language teaching methods. First, our highest priority must be given to the continued development and validation of authentic tests of communicative language ability. Second is the development of criterion-referenced measures of communicative language ability. A third area of need is in second language acquisition research, where criterion measures of language abilities that can be used to assess learners' progression through developmental sequences are still largely absent.

This book is an authoritative and inspiring monograph, especially suitable for doctoral and postgraduate students majoring in applied linguistics and foreign language teaching theory, as well as those who specialize in the development and use of language tests.