The International Research Foundation: Bias in Language Assessment: Selected References (Last Updated 1 June 2012)
Berk, R. A. (Ed.). (1982). Handbook of methods for detecting test bias. Baltimore, MD:
Johns Hopkins University Press.
Chen, Z., & Henning, G. (1985). Linguistic and cultural bias in language proficiency tests.
Language Testing, 2(2), 155-163.
Cole, N. S., & Moss, P. A. (1989). Bias in test use. In R. L. Linn (Ed.), Educational
measurement (3rd ed.) (pp. 201-219). New York, NY: American Council on
Education and Macmillan Publishing.
Holland, P. W., & Thayer, D. (1985). An alternative definition of the ETS delta scale of
item difficulty (Research Report RR-85-43). Princeton, NJ: Educational Testing
Service.
Holland, P. W., & Thayer, D. (1988). Differential item performance and the Mantel-
Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145).
Hillsdale, NJ: Lawrence Erlbaum.
Kunnan, A. J. (2000). Fairness and justice for all. In A. J. Kunnan (Ed.), Fairness and
validation in language assessment (pp. 1-14). Cambridge, UK: Cambridge
University Press.
Linacre, J. M., & Wright, B. D. (1986). Item bias: Mantel-Haenszel and the Rasch
Model. Chicago, IL: University of Chicago, MESA Psychometric Laboratory,
Memorandum #9.
Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications
for training. Language Testing, 12, 54-71.
Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from
retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-
748.
Pae, T. (2004). DIF for learners with different academic backgrounds. Language Testing,
21(1), 53-73.
Park, G-P. (2008). Differential item functioning on an English learning test across
gender. TESOL Quarterly, 42(1), 115-123.
Phillips, A., & Holland, P. W. (1987). Estimators of the variance of the Mantel-Haenszel
log-odds ratio estimate. Biometrics, 43, 425-431.
Raju, N. S., Bode, R. K., & Larsen, V. S. (1989). An empirical assessment of the Mantel-
Haenszel statistic for studying differential item performance. Applied Measurement
in Education, 2(1), 1-13.
Roznowski, M., & Reith, J. (1999). Examining the measurement quality of tests
containing differentially functioning items: Do biased items result in poor
measurement? Educational and Psychological Measurement, 59(2), 248-270.
Schaefer, E. (2008). Rater bias patterns in an EFL writing assessment. Language Testing,
25, 465-493.
Tittle, C. K. (1982). Use of judgmental methods in item bias studies. In R. A. Berk (Ed.),
Handbook of methods for detecting test bias (pp. 31-63). Baltimore, MD: Johns
Hopkins University Press.