
PISA

NCES HANDBOOK OF SURVEY METHODS

Program for International Student Assessment (PISA)
Website: http://nces.ed.gov/Surveys/PISA/
Updated: August 2014

1. OVERVIEW

INTERNATIONAL ASSESSMENT OF 15-YEAR-OLDS: Assesses literacy skills in the following areas: reading literacy, mathematics literacy, and science literacy.

The Program for International Student Assessment (PISA) is a system of international assessments that measures 15-year-old students' capabilities in reading literacy, mathematics literacy, and science literacy every three years. PISA, first implemented in 2000, was developed and is administered under the auspices of the Organization for Economic Cooperation and Development (OECD), an intergovernmental organization of industrialized countries.¹ PISA 2012 was the fifth in this series of assessments; the next cycle of data is being collected in 2015. The PISA Consortium, a group of international organizations engaged by the OECD, is responsible for coordinating the study operations across countries and currently consists of the German Institute for Educational Research and the Educational Testing Service. The National Center for Education Statistics (NCES), in the Institute of Education Sciences at the U.S. Department of Education, is responsible for the implementation of PISA in the United States.

PISA was implemented in 43 countries and economies in the first cycle (32 in 2000
and 11 in 2002), 41 in the second cycle (2003), 57 in the third cycle (2006), and 75
in the fourth cycle (65 in 2009 and 10 in 2010). In PISA 2012, 65 countries and
economies participated. The test is typically administered to between 4,500 and
10,000 students in each country/economy. Economies are regions of a country that
participate in PISA separately from the whole country.

Purpose
PISA provides internationally comparative information on the reading,
mathematics, and science literacy of students at an age that, for most economies, is
near the end of compulsory schooling. The objective of PISA is to measure the
yield of economies, or what skills and competencies students have acquired and
can apply in reading, mathematics, and science to real-world contexts by age 15.
The literacy concept emphasizes the mastery of processes, the understanding of
concepts, and the application of knowledge and functioning in various situations.
By focusing on literacy, PISA draws not only from school curricula but also from
learning that may occur outside of school.

¹ Countries that participate in PISA are referred to as jurisdictions or economies throughout this chapter.

Components

Assessment. PISA is designed to assess 15-year-olds' performance in reading, mathematics, and science literacy. PISA 2012 also included a problem solving assessment, in which not all countries participated because of technical issues, and computer-based reading, mathematics, and financial literacy assessments, which participating economies had the option of administering.

PISA 2015 will include collaborative problem solving and financial literacy assessments in addition to the core assessment subjects. In 2012, PISA was administered as a paper-and-pencil assessment; in 2015, PISA will be entirely computer-based. Each student takes a two-hour assessment. Assessment items include a combination of multiple-choice questions, closed- or short-response questions (for which answers are either correct or incorrect), and open constructed-response questions (for which answers can receive partial credit).

Questionnaires. Students complete a 30-minute questionnaire providing information about their backgrounds, attitudes, and experiences in school. In addition, the principal of each participating school completes a 30-minute questionnaire on school characteristics and policies. Teacher questionnaires will be a new addition to the PISA data collection starting in 2015.

Periodicity

PISA operates on a three-year cycle. Each PISA assessment cycle focuses on one subject in particular, although all three subjects are assessed in every cycle. In 2000, PISA focused on reading literacy; in 2003, on mathematics literacy (including problem solving); and in 2006, on science literacy. In 2009, the focus was again on reading literacy, and PISA 2012 focused on mathematics (including problem solving and financial literacy). In 2015, PISA will focus on science literacy (including collaborative problem solving and financial literacy).

2. USES OF DATA

PISA provides valuable information for comparisons of student performance across jurisdictions and over time at the national level and, for some jurisdictions, at the subnational level. Performance in each subject area can be compared across jurisdictions in terms of:

- economies' mean scores;
- the proportion of students in each jurisdiction reaching PISA proficiency levels;
- the scores of economies' highest performing and lowest performing students;
- the standard deviation of scores in each jurisdiction; and
- other measures of the distribution of performance within jurisdictions.

PISA also supports cross-jurisdictional comparisons of the performance of some subgroups of students, including students grouped by sex, immigrant status, and socioeconomic status. PISA data are not useful for comparing the performance of racial/ethnic groups across jurisdictions because the relevant racial/ethnic groups differ across jurisdictions. However, PISA datasets for the United States include information that can be used in comparing groups of students by race/ethnicity and school poverty level.

Contextual measures taken from student and principal questionnaires can be used to compare the educational contexts of 15-year-old students across jurisdictions. Caution should be taken, however, in attempting to interpret associations between measures of educational context and student performance. The PISA assessment is intended to tap factual knowledge and problem-solving skills that students learn over several years, whereas PISA contextual measures typically reference students' current school context. In the United States, for example, data collection occurs in the fall of the school year; therefore, contextual measures may apply to schools that children have attended for only 1 or 2 months.

Through the collection of comparable information across jurisdictions at the student and school levels, PISA adds significantly to the knowledge base that was previously available only from official national statistics.

3. KEY CONCEPTS

The types of literacy measured by PISA are defined as follows.

Reading literacy. An individual's capacity to understand, use, reflect on, and engage with written texts in order to achieve one's goals, to develop one's knowledge and potential, and to participate in society.

Mathematics literacy. An individual's capacity to identify and understand the role that mathematics plays in the world, make well-founded judgments, and use and engage with mathematics in ways that meet one's needs as a constructive, concerned, and reflective citizen.


The PISA mathematics framework was updated for the 2012 assessment. The revised framework is intended to clarify the mathematics relevant to 15-year-old students, while ensuring that the items developed remain set in meaningful and authentic contexts, and it defines the mathematical processes in which students engage as they solve problems. These processes are being used for the first time in 2012 as a primary reporting dimension. Although the framework has been updated, it is still possible to measure trends in mathematics literacy over time, as the underlying construct is intact.

Science literacy. An individual's scientific knowledge and the use of that knowledge to identify questions, acquire new knowledge, explain scientific phenomena, and draw evidence-based conclusions about science-related issues; an understanding of the characteristic features of science as a form of human knowledge and inquiry; an awareness of how science and technology shape our material, intellectual, and cultural environments; and a willingness to engage in science-related issues, and with the ideas of science, as a reflective citizen.

4. SURVEY DESIGN

The survey design for PISA data collections is discussed in this section.

Target Population

The desired PISA target population consisted of 15-year-old students attending public or private educational institutions located within the jurisdiction, in grades 7 through 12. Jurisdictions were to include 15-year-old students enrolled either full time or part time in an educational institution, in a vocational training or related type of educational program, or in a foreign school within the jurisdiction (as well as students from other jurisdictions attending any of the programs in the first three categories). It was recognized that no testing of persons schooled in the home, workplace, or out of the jurisdiction occurred; therefore, these students were not included in the international target population.

The operational definition of an age population depends directly on the testing dates. International standards required that students in the sample be 15 years and 3 months to 16 years and 2 months old at the beginning of the testing period. The technical standard for the maximum length of the testing period was 42 days, but for PISA 2012 the United States requested and was granted permission to expand the testing window to 60 days (from October 2, 2012, to November 30, 2012) to accommodate school requests. In the United States, students born between July 1, 1996, and June 30, 1997, were eligible to participate in PISA 2012.
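
A quick way to see how the adjusted birth-date window maps onto ages across the extended U.S. testing window is to compute the ages directly. The short Python sketch below uses only the dates quoted above; it is an illustration, not part of any PISA procedure.

```python
from datetime import date

def age_in_months(born, on):
    """Whole months of age on a given date."""
    months = (on.year - born.year) * 12 + (on.month - born.month)
    if on.day < born.day:
        months -= 1
    return months

# U.S. PISA 2012: eligible birth dates July 1, 1996 - June 30, 1997;
# testing window October 2 - November 30, 2012 (dates from the text above).
window = (date(2012, 10, 2), date(2012, 11, 30))
for born in (date(1996, 7, 1), date(1997, 6, 30)):
    for on in window:
        years, months = divmod(age_in_months(born, on), 12)
        print(f"born {born}, tested {on}: {years}y {months}m")
```

The oldest eligible students are slightly older than the standard 15-years-3-months to 16-years-2-months rule because, as noted in the Data Collection section below, the birth-date range was shifted so that the mean age stayed the same over the later testing window.
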
The U.S. PISA 2012 national school sample consisted of 240 schools. This number was increased from the international minimum requirement of 150 to offset school nonresponse and reduce design effects. Schools were selected with probability proportionate to the school's estimated enrollment of 15-year-olds. The data for public schools were from the 2008-09 Common Core of Data, and the data for private schools were from the 2009-10 Private School Universe Survey. Any school containing at least one 7th- through 12th-grade class was included in the school sampling frame. Participating schools provided a list of 15-year-old students (typically in August or September 2012) from which the sample was drawn using sampling software provided by the international contractor.

International Sample Design

The sample design for PISA 2012 was a stratified systematic sample, with sampling probabilities proportional to the estimated number of 15-year-old students in the school based on grade enrollments. Samples were drawn using a two-stage sampling process. The first stage was a sample of schools, and the second stage was a sample of students within schools. The PISA international contractors responsible for the design and implementation of PISA internationally (hereafter referred to as the PISA consortium) drew the sample of schools for each economy.

A minimum of 4,500 students from a minimum of 150 schools was required in each country. Following the PISA consortium guidelines, replacement schools were identified at the same time the PISA sample was selected by assigning the two schools neighboring the sampled school in the frame as replacements. The international guidelines specified that within schools, a sample of 42 students was to be selected with equal probability, unless fewer than 42 students age 15 were available (in which case all 15-year-old students were selected).

International within-school exclusion rules for students were specified as follows:

- Students with functional disabilities. These were students with a moderate to severe permanent physical disability such that they could not perform in the PISA testing environment.

- Students with intellectual disabilities. These were students with a mental or emotional disability who had been tested as cognitively delayed, or who were considered in the professional opinion of qualified staff to be cognitively delayed, such that they could not perform in the PISA testing situation.


- Students with insufficient language experience. These were students who met the three criteria of (1) not being a native speaker in the assessment language, (2) having limited proficiency in the assessment language, and (3) having received less than a year of instruction in the assessment language. In the United States, English was the exclusive language of the assessment.

A school attended only by students who would be excluded for functional, intellectual, or linguistic reasons was considered a school-level exclusion. International exclusion rules for schools allowed for schools in remote regions or very small schools to be excluded. School-level exclusions for inaccessibility, feasibility, or other reasons were required to cover fewer than 0.5 percent of the total number of students in the international PISA target population. International guidelines state that no more than 5 percent of a jurisdiction's desired national target population should be excluded from the sample.

A minimum of 150 schools (or all schools, if there were fewer than 150 in a participating jurisdiction) had to be selected in each jurisdiction. Within each participating school, a sample of the PISA-eligible students was selected with equal probability. In total, a minimum sample size of 4,500 assessed students was to be achieved in each jurisdiction. If a jurisdiction had fewer than 4,500 eligible students, then the sample size was the national defined target population. The national defined target population included all eligible students in the schools that were listed in the school sampling frame.

Response Rate Targets

School response rates. The PISA international guidelines for the 2012 assessment required that jurisdictions achieve an 85 percent school response rate. However, while stating that each jurisdiction must make every effort to obtain cooperation from the sampled schools, the requirements also recognized that this is not always possible. Thus, it was allowable to use substitute, or replacement, schools as a means to avoid the loss of sample size associated with school nonresponse. The international guidelines stated that at least 65 percent of participating schools must be from the original sample. Economies were only allowed to use replacement schools (selected during the sampling process) to increase the response rate once the 65 percent benchmark had been reached.

Each sampled school was to be assigned two replacement schools in the sampling frame. If the original sampled school refused to participate, a replacement school was asked to participate. One sampled school could not substitute for another sampled school, and a given school could only be assigned to substitute for one sampled school. A requirement of these substitute schools was that they be in the same explicit stratum as the original sampled school. The international guidelines define the response rate as the number of participating schools (both original and replacement schools) divided by the total number of eligible original sampled schools.²

² The calculation of response rates described here is based on the formula stated in the international guidelines and is not consistent with NCES standards. A more conservative way to calculate response rates would be to include participating replacement schools in the denominator as well as in the numerator and to add replacement schools that were hard refusals to the denominator.

Student response rates. A minimum response rate of 80 percent of selected students across participating schools was required. A student was considered to be a participant if he or she participated in the first testing session or a follow-up or makeup testing session.

Within each school, a student response rate of 50 percent was required for a school to be regarded as participating: the overall student response rate was computed using only students from schools with at least a 50 percent response rate. Weighted student response rates were used to determine if this standard was met; each student's weight was the reciprocal of his or her probability of selection into the sample.
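
The response-rate rules above lend themselves to a short worked sketch. In the Python below, the 207 eligible original schools and 139 original participants are U.S. PISA 2012 counts reported later in this chapter; the replacement-school counts and the student example are invented, and the second function implements the more conservative formula described in the footnote.

```python
# Hedged illustration of the response-rate formulas described above.

def intl_school_response_rate(orig_part, repl_part, eligible_orig):
    """International formula: participating schools (original plus
    replacement) divided by eligible original sampled schools."""
    return (orig_part + repl_part) / eligible_orig

def conservative_school_response_rate(orig_part, repl_part, eligible_orig,
                                      repl_hard_refusals):
    """Footnote variant: participating replacements are also added to the
    denominator, along with replacement schools that were hard refusals."""
    return ((orig_part + repl_part)
            / (eligible_orig + repl_part + repl_hard_refusals))

def weighted_student_response_rate(participated, selection_probs):
    """Weighted student response rate: each student's weight is the
    reciprocal of his or her selection probability."""
    weights = [1.0 / p for p in selection_probs]
    responding = sum(w for w, ok in zip(weights, participated) if ok)
    return responding / sum(weights)

# 22 participating replacements and 5 hard refusals are hypothetical counts.
print(round(intl_school_response_rate(139, 22, 207), 3))             # 0.778
print(round(conservative_school_response_rate(139, 22, 207, 5), 3))  # 0.688
print(round(weighted_student_response_rate(
    [True, True, False], [0.2, 0.5, 0.25]), 3))                      # 0.636
```

As the footnote suggests, the conservative variant always yields a rate at least as low as the international formula, because replacement schools can only enlarge the denominator.
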
Sample Design in the United States

The design of the U.S. school sample for PISA 2012 was developed to achieve each of the international requirements set forth in the PISA sampling manual. The U.S. school sample was a stratified systematic sample, consisting of two stages, and was intended to approximate a self-weighting sample of students, with each 15-year-old student having an equal probability of being selected. In the first stage, schools were selected with a probability proportionate to the school's estimated enrollment of 15-year-olds.
In the second stage, a sample of 50 students was selected from each school in an equal probability sample, regardless of school size (all eligible students were selected if there were fewer than 50). The United States set a target cluster size (TCS) of 50 students per school in order to achieve the required student yield of 35 assessed students per school (taking into account student exclusions and absences). The TCS for the main study was slightly larger than the TCS used in PISA 2009 in the United States, to account for the financial literacy assessment. Out of the 50 students, 42 were sampled to take the paper-based mathematics, science, and reading literacy assessment. Out of these 42 students, 20 were subsampled to also take the computer-based assessment. The remaining eight students were sampled to take the financial literacy assessment. If fewer than 50 age-eligible students were enrolled in a school, all 15-year-old students in that school were selected. The U.S. national TCS and student sampling plans were approved by the international consortium. Within each stratum, the frame was implicitly stratified (i.e., sorted for sampling) by five categorical stratification variables: grade range of the school (five categories); type of location relative to populous areas (city, suburb, town, rural); first three digits of the zip code; combined percentage of Black, Hispanic, Asian, Pacific Islander, and American Indian/Alaska Native students (above or below 15 percent); and estimated enrollment of 15-year-olds.

The 2012 PISA sampling employed techniques to undersample very small schools (those with fewer than twenty-one 15-year-olds) and to minimize overlap with the High School Longitudinal Study of 2012 (HSLS:12), a U.S. education study with data collection conducted in fall 2012. If any PISA substitute school overlapped with an originally sampled or first substitute HSLS school, the substitute was not to be contacted for PISA. Under this rule, none of the schools were eliminated from the list of PISA substitute schools.

In 2012, as in each cycle since 2003, schools were selected in the first stage with probability proportional to size (PPS) sampling, and students were sampled in the second stage, yielding overall equal probabilities of selection. By comparison, in PISA 2000, the U.S. school sample had a three-stage design, the first stage of which was the selection of a sample of geographic primary sampling units (PSUs). The change to a two-stage model was made in PISA 2003 to reduce the design effects observed in the 2000 data and to minimize respondent burden on individual districts by distributing the response burden of the study across districts as much as possible.
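
The two-stage, self-weighting logic described above can be sketched in a few lines. The Python below draws a systematic probability-proportional-to-size (PPS) school sample and then a fixed-size student sample per school; it is a simplified toy with an invented frame, not the consortium's sampling software.

```python
import random

def pps_systematic_sample(frame, n_schools):
    """Systematic PPS sampling: walk the cumulative measure of size and
    select a school at every fixed interval from a random start.
    frame: list of (school_id, measure_of_size) in frame (sort) order."""
    total = sum(size for _, size in frame)
    interval = total / n_schools
    start = random.uniform(0, interval)
    selected, cumulative, idx = [], 0.0, 0
    for school_id, size in frame:
        cumulative += size
        while idx < n_schools and start + idx * interval <= cumulative:
            selected.append((school_id, size))
            idx += 1
    return selected

random.seed(12345)  # toy frame: 20 schools with 20-400 age-eligible students
frame = [(f"school_{i:02d}", random.randint(20, 400)) for i in range(20)]

# Second stage: 42 students per school (or all 15-year-olds when fewer are
# enrolled) gives each student a roughly equal overall selection probability.
for school_id, size in pps_systematic_sample(frame, n_schools=5):
    print(school_id, "size", size, "-> student sample", min(42, size))
```

Because a school's selection probability is proportional to its size while the within-school probability is inversely proportional to it, the product (the overall student selection probability) is approximately constant, which is what makes the design self-weighting.
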
Once the list of students was received from a school, it was formatted for importing into KeyQuest, the sampling and data management software provided by ACER. KeyQuest was used to manage the sample, draw the student sample, track participation, and produce verification reports used to clean the data in preparation for submitting the data file to ACER.

A list of schools for the U.S. sample was prepared using data from the 2008-09 Common Core of Data (CCD) and the 2009-10 Private School Universe Survey (PSS), the two NCES surveys with the most current data at the time of the PISA frame construction. The U.S. school sample for PISA 2012 consisted of 240 schools containing at least one 7th- through 12th-grade class: 194 large schools with at least 50 estimated eligible students; 11 moderately small schools with between 25 and 50 estimated eligible students; 16 very small schools with fewer than 25 but more than 2 estimated eligible students; and 19 very small schools with an estimated eligible enrollment of 2 or fewer students. Eligible schools in the PISA 2012 school frame included 207 of the 240 original sampled schools in the U.S. national sample (18 schools did not have any 15-year-olds enrolled, 6 had closed, and 9 were otherwise ineligible), and 139 agreed to participate.

Assessment Design

Test scope and format. In PISA 2012, the three subject domains were tested, with mathematics as the major domain and reading and science as the minor domains. Every student answered mathematics items, and some students answered reading items, science items, or both reading and science items.

The development of the PISA 2012 assessment instruments was an interactive process among the PISA Consortium, various expert committees, and OECD members. The assessment included items submitted by participating jurisdictions and items developed by the consortium's test developers. Representatives of each jurisdiction reviewed the items for possible bias and for relevance to PISA's goals. The intention was to reflect in the assessment the national, cultural, and linguistic variety of the OECD jurisdictions. Following a field trial that was conducted in most jurisdictions, test developers and expert groups considered a variety of aspects in selecting the items for the main study: (a) the results from the field trial, (b) the outcome of the item review from jurisdictions, and (c) queries received about the items.

PISA 2012 was a paper-and-pencil assessment. Approximately half of the items were multiple choice, about 20 percent were closed- or short-response items (for which students wrote an answer that was simply either correct or incorrect), and about 30 percent were open constructed-response items (which were graded by trained scorers using an international scoring guide and could be assigned partial credit).
Multiple-choice items were either (a) standard multiple choice, with a limited number (usually four) of responses from which students were required to select the best answer; or (b) complex multiple choice, which presented several statements, each of which required students to choose one of several possible responses (true/false, correct/incorrect, etc.). Closed- or short-response items included items that required students to construct their own responses from a limited range of acceptable answers, or to provide a brief answer from a wider range of possible answers, such as mathematics items requiring a numeric answer and items requiring a word or short phrase. Open constructed-response items required more extensive writing, or showing a calculation, and frequently included some explanation or justification. Pencils, erasers, rulers, and (in some cases) calculators were provided.

In 2012, computer-based assessments in mathematics and reading were offered as optional assessments for participating economies. Thirty-two economies, including the United States, chose to administer them. In these economies, a subset of students who took the paper-based assessment also took an additional computer-based assessment. Although the paper-based assessment items and the computer-based assessment items were derived from the same frameworks, there was no overlap in the assessment items between the two assessment modes. The interactive nature of computer-based assessment allowed PISA to assess students in novel contexts that are not possible with a traditional paper-based format.

Test design. The PISA 2012 final paper-based assessment consisted of 85 mathematics items, 44 reading items, 53 science items, and 40 financial literacy items allocated to 17 test booklets (in economies that did not administer the optional financial literacy assessment, there were 13 test booklets). Each booklet was made up of four test clusters. Altogether there were seven mathematics clusters, three reading clusters, three science clusters, and two financial literacy clusters. The mathematics, science, and reading clusters were allocated in a rotated design to 13 booklets. The financial literacy clusters, in conjunction with mathematics and reading clusters, were allocated in a rotated design to four booklets. The average number of items per cluster was 12 items for mathematics, 15 items for reading, 18 items for science, and 20 items for financial literacy. Each cluster was designed to average 30 minutes of test material. Each student took one booklet, with about 2 hours' worth of testing material. Approximately half of the items were multiple choice, about 20 percent were closed- or short-response items (for which students wrote an answer that was simply either correct or incorrect), and about 30 percent were open constructed-response items (for which students wrote answers that were graded by trained scorers using an international scoring guide). In PISA 2012, with the exception of students participating in the financial literacy assessment, every student answered mathematics items. Not all students answered reading, science, problem solving, and/or financial literacy items.
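
The rotated cluster design can be illustrated with a toy balanced rotation. The cyclic scheme below allocates 13 clusters (seven mathematics, three reading, three science) to 13 four-cluster booklets; the offsets are arbitrary, and this is not the actual PISA 2012 booklet map, which is documented in the PISA 2012 Technical Report.

```python
# Toy rotated booklet design, for illustration only.
clusters = ([f"M{i}" for i in range(1, 8)]      # 7 mathematics clusters
            + [f"R{i}" for i in range(1, 4)]    # 3 reading clusters
            + [f"S{i}" for i in range(1, 4)])   # 3 science clusters

offsets = (0, 1, 4, 9)  # arbitrary distinct offsets -> 4 clusters per booklet
booklets = [[clusters[(b + o) % len(clusters)] for o in offsets]
            for b in range(len(clusters))]

for i, booklet in enumerate(booklets, start=1):
    print(f"Booklet {i:2d}: {' '.join(booklet)}")
```

Under this rotation every cluster appears in exactly four booklets (once per booklet position), so each item is answered by a predictable fraction of the sample even though no student sees more than a fraction of the item pool.
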
For 2012, a subset of students who took the paper-based assessment also took a 40-minute computer-based assessment. The computer-based assessment consisted of 168 problem-solving items, 164 mathematics items, and 144 reading items allocated to 24 assessment forms. Each form was made up of two clusters that together contained 18 to 22 items. Altogether there were four clusters of problem solving, four clusters of mathematics, and two clusters of reading. In addition to the cognitive assessment, students also completed a 30-minute questionnaire designed to provide information about their backgrounds, attitudes, and experiences in school. Principals in schools where PISA was administered also completed a 30-minute questionnaire about their schools.

Data Collection and Processing

PISA 2012 was coordinated by the OECD and managed at the international level by the PISA Consortium. PISA is implemented in each education system by a National Project Manager (NPM). In the United States, the NPM works with a national data collection contractor to implement procedures prepared by the PISA Consortium and agreed to by the participating jurisdictions. In 2012, the U.S. national data collection contractor was Westat, with Pearson as a subcontractor. A steering committee also gave input on the dissemination and development of PISA in the United States.

The 2012 PISA multicycle study was again a collaboration between the governments of participating countries, the Organization for Economic Cooperation and Development (OECD), and a consortium of various international organizations, referred to as the PISA Consortium. This consortium in 2012 was led by the Australian Council for Educational Research (ACER) and included the German Institute for International Educational Research (DIPF), the German Social Sciences Infrastructure Services Centre for Survey Research and Methodology (GESIS-ZUMA), the University of Maastricht's Research Centre for Education and the Labour Market (ROA), the U.S. research company Westat, the International Association for the Evaluation of Educational Achievement (IEA), and the Belgian firm CapStan.
There have been changes to the PISA consortium in charge of administering the 2015 data collection.

Reference dates. Each economy collected its own data, following international guidelines and specifications. The technical standards required that students in the sample be 15 years and 3 months to 16 years and 2 months old at the beginning of the testing period. The maximum length of the testing period was 42 days. Most economies conducted testing from March through August 2012. The United States and the United Kingdom were given permission to move the testing dates to September through November in an effort to improve response rates. The range of eligible birth dates was adjusted so that the mean age remained the same (i.e., 15 years and 3 months to 16 years and 2 months at the beginning of the testing period). In 2003, the United States conducted PISA in the spring and fall and found no significant difference in student performance between the two time points.

Incentive. School packages were mailed to principals in mid-September, with phone contact from recruiters beginning a few days after the mailing. As part of the PISA 2012 school recruitment strategy, the materials included a description of school and student incentives. Schools and school coordinators were each paid $200, and students received $25 and 4 hours of community service for participating in the paper-based session, plus an additional $15 if they were selected for and participated in the computer-based assessment.

Data collection. Each economy collected its own data. The PISA consortium emphasizes the implementation of standardized procedures in all jurisdictions (data collection followed a manual developed by the PISA Consortium). Professional staff, trained in the international guidelines, were responsible for test administration. School staff members were only responsible for specifying parental consent requirements, listing students, and providing testing space.

To ensure quality, test administrators were observed in a sample of schools in each jurisdiction by a PISA Quality Monitor (PQM). PQMs were engaged by the PISA Consortium itself. Observed schools were chosen jointly by the PISA Consortium and the PQMs; in the United States, a total of 7 schools were observed by the PQM. The main responsibility of the PQM was to record the extent to which testing procedures in schools were implemented in accordance with the standard test administration procedures. In U.S. schools, the PQMs' observations indicated that international procedures for data collection were applied consistently.

The students were randomly assigned one of 17 test booklets, which test administrators distributed. U.S. students who took the mathematics assessment in PISA 2012 were allowed to use, and were provided, calculators.

Scoring. A substantial portion of the PISA 2012 assessment was devoted to open constructed-response items. The process of scoring these items is an important step in ensuring the quality and comparability of the PISA data. Detailed guidelines were developed for the scoring guides themselves, the training materials used to recruit scorers, and the workshop materials used for the training of national scorers. Prior to the national training, the PISA Consortium organized international training sessions to present the material and train scoring coordinators from the participating jurisdictions, who in turn trained the national scorers.

For each test item, the scoring guides described the intent of the question and how to code students' responses. This description included the credit labels (full credit, partial credit, or no credit) attached to the possible categories of response. Also included was a system of double-digit coding for some mathematics and science items, where the first digit represented the score and the second digit represented the different strategies or approaches that students used to solve the problem. The second digit was used to generate national profiles of student strategies and misconceptions. In addition, the scoring guides included real examples of students' responses, accompanied by a rationale for their classification, for purposes of clarity and illustration.
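
A minimal sketch of how a double-digit code splits into its two components is shown below. The code values and strategy labels are hypothetical examples, not actual PISA scoring-guide entries.

```python
# Hypothetical double-digit codes: first digit = score, second = strategy.
STRATEGY_LABELS = {1: "algebraic solution", 2: "trial and error",
                   3: "graphical reasoning", 9: "other/unclassified"}

def decode(code):
    """Split a two-character code into (score, strategy description)."""
    score, strategy = int(code[0]), int(code[1])
    return score, STRATEGY_LABELS.get(strategy, "unknown strategy")

for code in ("21", "12", "03"):
    score, strategy = decode(code)
    print(f"code {code}: score {score}, strategy: {strategy}")
```

Tabulating the second digit across a jurisdiction's responses is what produces the national profiles of strategies and misconceptions mentioned above.
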
To examine the consistency of this marking process in more detail within each jurisdiction (and to estimate the magnitude of the variance components associated with the use of scorers), the PISA Consortium generated an inter-rater reliability report on a subsample of assessment booklets. The results of the homogeneity analysis showed that the marking process for items was largely satisfactory and that, on average, countries were more or less reliable in the coding of the open-ended responses.

For PISA 2012, approximately half of the items were multiple choice, about 20 percent were closed- or short-response items (for which students wrote an answer that was simply either correct or incorrect), and about 30 percent were open constructed-response items (for which students wrote answers that were graded by trained scorers using an international scoring guide).

Data entry and verification. In PISA 2012, each jurisdiction was responsible for entering data into data files following a common international format. Variables could be added or deleted as needed for different national options; approved adaptations to response categories could also be accommodated.


Student response data were entered directly from the test booklets and questionnaires using specialized software that allowed the data files to be merged into KeyQuest and facilitated the checking and correction of data through various data consistency checks. After these checks, the data were sent to ACER for data cleaning; there, the data were checked to ensure they followed the international structure, the identification system was reviewed, single-case problems were corrected manually, and standard data cleaning procedures were applied to the questionnaire files.

During data cleaning, analysts identified as many anomalies and inconsistencies as possible, and through a process of extensive discussion between each national center and ACER, an effort was made to correct and resolve all data issues. After this, ACER compiled background univariate statistics and performed preliminary classical and Rasch item analyses.

Estimation Methods

Weighting. The use of sampling weights is necessary for the computation of statistically sound, nationally representative estimates. Adjusted survey weights account for the probabilities of selection for individual schools and students, for school or student nonresponse, and for errors in estimating the size of the school or the number of 15-year-olds in the school at the time of sampling.

The internationally defined weighting specifications for PISA 2012 included base weights and adjustments for nonresponse. The school base weight was defined as the reciprocal of the school's probability of selection. (For substitute schools, the school base weight was set equal to that of the original school it replaced.) The student base weight was given as the reciprocal of the probability of selection for each student selected from within a school.

These base weights were then adjusted for school and student nonresponse. The school nonresponse adjustment was done individually for each jurisdiction, using implicit and explicit strata defined as part of the sample design. In the case of the United States, two variables were used: school control and Census region. The student nonresponse adjustment was done within cells based first on students' final school nonresponse rate and their explicit stratum; within that, grade and gender were used.

All PISA 2012 analyses were conducted using these adjusted sampling weights.
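
The weighting steps just described reduce to a few arithmetic operations. The sketch below shows base weights as reciprocals of selection probabilities and a one-cell nonresponse adjustment; the probabilities and weights are invented, and real PISA weighting involves additional adjustment cells and checks.

```python
# Hedged sketch of PISA-style base weights and a nonresponse adjustment.

def school_base_weight(p_school):
    """Reciprocal of the school's selection probability."""
    return 1.0 / p_school

def student_base_weight(p_student_within_school):
    """Reciprocal of the student's within-school selection probability."""
    return 1.0 / p_student_within_school

def nonresponse_adjustment(weights, responded):
    """Within one adjustment cell, scale respondents' weights up so they
    also carry the weight of the cell's nonrespondents."""
    factor = sum(weights) / sum(w for w, ok in zip(weights, responded) if ok)
    return [w * factor if ok else 0.0 for w, ok in zip(weights, responded)]

# A school drawn with probability 0.02 and a student drawn with probability
# 42/180 within it: the final base weight is the product of the two.
print(round(school_base_weight(0.02) * student_base_weight(42 / 180), 1))

# Toy one-cell adjustment: three respondents absorb the fourth's weight.
print(nonresponse_adjustment([50.0, 50.0, 50.0, 50.0],
                             [True, True, True, False]))
```
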
Scaling. There were 13 test booklets, each containing a slightly different subset of items, in the PISA 2012 design; economies that participated in the financial literacy assessment had a total of 17 booklets each. Each student completed one test booklet. The fact that each student completed only a subset of items means that classical test scores, such as the percent correct, are not accurate measures of student performance. Instead, scaling techniques were used to establish a common scale for all students. In PISA 2009, item response theory (IRT) was used to estimate average scores in each jurisdiction for science, mathematics, and reading literacy, as well as for three reading literacy subscales: integrating and interpreting, accessing and retrieving, and reflecting and evaluating. Subscale scores were not available for mathematics literacy or science literacy for 2009 because not all students answered science and/or mathematics items.

IRT identifies patterns of response and uses statistical models to predict the probability of a student answering an item correctly as a function of his or her proficiency in answering other questions. PISA 2009 used a mixed coefficients multinomial logit IRT model. This model is similar in principle to the more familiar two-parameter logistic IRT model. With the multinomial logit IRT model, the performance of a sample of students in a subject area or subarea can be summarized on a simple scale or series of scales, even when students are administered different items.

IRT was also used for PISA 2012, to estimate average scores for mathematics, science, and reading literacy for each economy, as well as for three mathematics process scales and four mathematics content scales. For economies participating in the financial literacy assessment and the computer-based assessment, these assessments will be scaled separately and assigned separate scores.
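
Although PISA's operational model is the mixed coefficients multinomial logit model, the two-parameter logistic (2PL) model it resembles is easy to state directly, and it conveys why different item subsets can be mapped to one scale. The sketch below is offered only as that analogy, not as the operational PISA model.

```python
import math

def p_correct_2pl(theta, a, b):
    """2PL item response function: probability that a student with
    proficiency theta answers correctly an item with discrimination a
    and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A moderately discriminating item (a = 1.2) of average difficulty (b = 0).
for theta in (-2, -1, 0, 1, 2):
    print(f"theta = {theta:+d}: P(correct) = "
          f"{p_correct_2pl(theta, 1.2, 0.0):.3f}")
```

Because the model ties every item's response probability to the same underlying proficiency, students who answered different booklets can still be placed on a common scale.
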
Plausible values. Scores for students are estimated as plausible values because each student completed only a subset of items. These values represent the distribution of potential scores for all students in the population with similar characteristics and identical patterns of item response. It is important to recognize that plausible values are not test scores and should not be treated as such. Plausible values are randomly drawn from the distribution of scores that could reasonably be assigned to each individual. As such, the plausible values contain random error variance components and are not optimal as scores for individuals. Five plausible values were estimated for each student for each scale in PISA 2012. Thus, statistics describing performance on the PISA science, reading, and mathematics literacy scales are based on plausible values.

If an analysis is to be undertaken with one of these cognitive scales, then (ideally) the analysis should be undertaken five times, once with each of the five relevant plausible value variables. The results of these five analyses are averaged; then, significance tests that adjust for variation between the five sets of results are computed.
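
The five-fold procedure above corresponds to standard combining rules for multiply imputed values. The sketch below applies them to a jurisdiction mean; all numbers are invented, and in practice each of the five estimates would itself be computed with the adjusted sampling weights and a replication variance estimator.

```python
import statistics

def combine_plausible_values(estimates, sampling_variances):
    """Combine a statistic computed once per plausible value: average the
    five estimates, then add the between-PV variance (inflated by 1 + 1/m)
    to the average sampling variance (Rubin-style combining rules)."""
    m = len(estimates)
    point = statistics.mean(estimates)
    within = statistics.mean(sampling_variances)
    between = statistics.variance(estimates)
    total_variance = within + (1 + 1 / m) * between
    return point, total_variance ** 0.5

# Hypothetical mean scores from five plausible values, with hypothetical
# sampling variances for each.
means = [481.2, 482.9, 480.5, 483.4, 481.8]
variances = [13.1, 12.7, 13.4, 12.9, 13.2]
estimate, standard_error = combine_plausible_values(means, variances)
print(round(estimate, 1), round(standard_error, 2))
```
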
Imputation. Missing background data from student and principal questionnaires were not imputed for PISA 2009 reports. PISA 2012 also did not impute missing information for questionnaire variables.

In general, item response rates for variables discussed in NCES PISA reports exceed the NCES standard of 85 percent.

Measuring trends. Although the PISA 2012 framework was updated, it is still possible to measure trends in mathematics literacy over time, as the underlying construct is intact. For specific trends in performance results, please see the NCES PISA website (http://nces.ed.gov/surveys/pisa/pisa2012/). Reading literacy scales used in PISA 2000, PISA 2003, PISA 2006, PISA 2009, and PISA 2012 are directly comparable, which means that the value of 500 in PISA 2012 has the same relative meaning as it did in PISA 2000, PISA 2003, PISA 2006, and PISA 2009. However, for PISA 2003, the mathematics assessment underwent major development work and was broadened to include four subdomains, only two of which appeared in PISA 2000. As such, mathematics literacy scales are only comparable among PISA 2003, PISA 2006, PISA 2009, and PISA 2012. Likewise, PISA 2006 was the first major assessment of science literacy; thus, the science literacy scale in PISA 2006 is only directly comparable with PISA 2009 and PISA 2012.

The PISA 2000, 2003, 2006, 2009, and 2012 assessments of reading, mathematics, and science are linked assessments. That is, the sets of items used to assess each domain in each year include a subset of common items; these common items are referred to as link items. In PISA 2000 and PISA 2003, there were 28 reading items, 20 mathematics items, and 25 science items that were used in both assessments. The same 28 reading items were retained in 2006 to link the PISA 2006 data to PISA 2003. The PISA 2009 assessment included 26 of these 28 reading items, and a further 11 reading items from PISA 2000, not used since that administration, were also included in PISA 2009. The PISA 2012 assessment included 37 of these link items from 2009 as well as an additional 7 items included in 2009 to establish the reading trend scale. In mathematics, 48 math items from PISA 2003 were used in PISA 2006; PISA 2009 included 35 of the 48 mathematics items that were used in PISA 2006, and of these, 34 were used in PISA 2012. For the science assessment, 14 items were common to PISA 2000 and PISA 2006, and 22 items were common to PISA 2003 and PISA 2006. The science assessment for PISA 2012 consisted of 53 items that were used in PISA 2009 and 2006.

To establish common reporting metrics for PISA, the difficulty of the link items, measured on different occasions, is compared. Using procedures that are detailed in the PISA 2012 Technical Report, the comparison of item difficulty on different occasions is used to determine a score transformation that allows the reporting of the data for a particular subject on a common scale. The change in the difficulty of the individual link items is used in determining the transformation; as a consequence, the sample of link items that has been chosen will influence the choice of transformation. This means that if an alternative set of link items had been chosen, the resulting transformation would be slightly different. The consequence is an uncertainty in the transformation due to the sampling of the link items, just as there is an uncertainty in values such as jurisdiction means due to the use of a sample of students.
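
One common way to turn link-item difficulties into such a score transformation is mean-sigma linking, sketched below. This is a generic illustration under that assumption, not the exact PISA procedure (which is detailed in the PISA 2012 Technical Report), and the difficulties are invented.

```python
import statistics

def mean_sigma_link(difficulties_old, difficulties_new):
    """Estimate a linear transformation (new scale -> old scale) from the
    difficulties of the same link items calibrated on two occasions."""
    slope = (statistics.stdev(difficulties_old)
             / statistics.stdev(difficulties_new))
    intercept = (statistics.mean(difficulties_old)
                 - slope * statistics.mean(difficulties_new))
    return slope, intercept

# Hypothetical link-item difficulties from two assessment cycles.
old_cycle = [-1.2, -0.4, 0.1, 0.8, 1.5]
new_cycle = [-1.0, -0.3, 0.2, 0.9, 1.7]
slope, intercept = mean_sigma_link(old_cycle, new_cycle)
print(f"score_old = {slope:.3f} * score_new + {intercept:.3f}")
```

Dropping or swapping link items changes the slope and intercept slightly, which is exactly the linking uncertainty described above.
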
Future Plans

The next cycle of PISA data collection will take place in 2015 (http://nces.ed.gov/Surveys/PISA/).

5. DATA QUALITY AND COMPARABILITY

A comprehensive program of continuous quality monitoring was central to ensuring full, valid implementation of the PISA procedures and the recording of deviations from these procedures. Quality monitors from the PISA Consortium visited a sample of schools in every jurisdiction to ensure that testing procedures were carried out in a consistent manner. The purpose of quality monitoring is to observe and record the implementation of the described procedures; therefore, the field operations manuals provided the foundation for all the quality monitoring procedures.

The manuals that formed the basis for the quality monitoring procedures were the PISA Consortium data collection manual and the PISA data management manual. In addition, the PISA data were verified at several points, starting at the time of data entry.

Despite the efforts taken to minimize error, as with any study, PISA has limitations that researchers should take into consideration. This section contains a discussion of two possible sources of error in PISA: sampling and nonsampling errors.


Sampling Error

Sampling errors occur when a discrepancy between a population characteristic and the sample estimate arises because not all members of the target population are sampled for the survey. The size of the sample relative to the population and the variability of the population characteristics both influence the magnitude of sampling error. The particular sample of 15-year-old students from the 2011-12 school year was just one of many possible samples that could have been selected. Therefore, estimates produced from the PISA 2012 sample may differ from estimates that would have been produced had another sample of students been selected. This type of variability is called sampling error because it arises from using a sample of 15-year-old students rather than all 15-year-old students in that year.

The standard error is a measure of the variability owing to sampling when estimating a statistic. The approach used for calculating sampling variances in PISA is Fay's method of balanced repeated replication (BRR). This method of producing standard errors uses information about the sample design to produce more accurate standard errors than would be produced using simple random sample (SRS) assumptions for non-SRS data. Thus, the standard errors reported in PISA can be used as a measure of the precision expected from this particular sample.
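
Fay's BRR estimator can be written compactly. The sketch below assumes the statistic has already been recomputed with each set of replicate weights; the Fay coefficient of 0.5 follows the general method, while the numbers themselves are invented.

```python
def fay_brr_standard_error(full_estimate, replicate_estimates, fay_k=0.5):
    """Fay's balanced repeated replication: average the squared deviations
    of the replicate estimates from the full-sample estimate and rescale
    by 1 / (1 - k)^2."""
    g = len(replicate_estimates)
    variance = (sum((r - full_estimate) ** 2 for r in replicate_estimates)
                / (g * (1.0 - fay_k) ** 2))
    return variance ** 0.5

# Toy example: a full-sample mean score and eight replicate estimates
# (an operational PISA analysis would use the full set of replicate weights
# shipped with the data).
full = 481.4
replicates = [480.9, 482.0, 481.1, 482.3, 480.6, 481.8, 481.0, 482.1]
print(round(fay_brr_standard_error(full, replicates), 2))
```
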
frame.
Nonsampling Error

Nonsampling error is a term used to describe variations in the estimates that may be caused by population coverage limitations, nonresponse bias, and measurement error, as well as by data collection, processing, and reporting procedures. For example, the sampling frame in the United States was limited to regular public and private schools in the 50 states and the District of Columbia and cannot be used to represent Puerto Rico or other jurisdictions (e.g., other U.S. territories and DoD schools overseas). The sources of nonsampling errors are typically problems such as unit and item nonresponse, differences in respondents' interpretations of the meaning of survey questions, response differences related to the particular time the survey was conducted, and mistakes in data preparation.

In general, it is difficult to identify and estimate either the amount of nonsampling error or how much bias it causes. In PISA 2012, efforts were made to prevent such errors from occurring and to compensate for them when possible. For example, the design phase entailed a field test that evaluated items as well as the implementation procedures for the survey. One type of nonsampling error that may be present in PISA is respondent bias, which occurs when respondents systematically misreport (intentionally or not) information in a study; a potential source of respondent bias in this survey was social desirability bias. For example, students may overstate their parents' educational attainment or occupational status. If there were no systematic differences among specific groups under study in their tendency to give socially desirable responses, then comparisons of the different groups would accurately reflect differences among the groups. Readers should be aware that respondent bias may be present in this survey as in any survey; however, it is not possible to state precisely how such bias may affect the results.

Coverage error. Every National Project Manager (NPM) was required to define and describe their jurisdiction's national desired target population and explain how and why it might deviate from the international target population. Any hardships in accomplishing complete coverage were specified, discussed, and approved (or not) in advance. Where the national desired target population deviated from full national coverage of all eligible students, the deviations were described, and enrollment data were provided to measure how much that coverage was reduced. School-level and within-school exclusions from the national desired target population resulted in a national defined target population corresponding to the population of students recorded in each jurisdiction's school sampling frame.

In PISA 2009, the United States reported 82 percent coverage of the 15-year-old population and 95 percent coverage of the national desired target population (OECD 2010). The United States reported a 5.2 percent overall exclusion rate, which was higher than the internationally acceptable exclusion rate of 5 percent. However, when language exclusions were accounted for (i.e., removed from the overall exclusion rate), the United States no longer had an exclusion rate greater than 5 percent. For PISA 2012, 95 percent coverage of the national desired target population was achieved.

Nonresponse error. Nonresponse error results from the nonparticipation of schools and students. School nonresponse, without replacement schools, will lead to the underrepresentation of students from the type of school that did not participate, unless weighting adjustments are made. It is also possible that only a part of the eligible population in a school (such as those 15-year-olds in a single grade) was represented by the school's student sample; this also requires weighting to compensate for the missing data from the omitted grades. Student nonresponse within participating schools occurred to varying extents. Students who could not be given achievement test scores but were not excluded for linguistic or disability reasons will be underrepresented in the data unless weighting adjustments are made.


Unit nonresponse. Of the 240 original sampled schools in the PISA 2012 U.S. national sample, 207 were eligible (18 schools did not have any 15-year-olds enrolled, 6 had closed, and 9 were otherwise ineligible), and 139 agreed to participate. The weighted school response rate before replacement was 67 percent, requiring the United States to conduct a nonresponse bias analysis, which was used by the PISA consortium and the Organization for Economic Cooperation and Development (OECD) to evaluate the quality of the final sample. However, investigation into nonresponse bias at the school level in the United States in PISA 2012 provides evidence that there is little potential for nonresponse bias in the PISA participating sample based on the characteristics studied. It also suggests that, while there is little evidence that the use of substitute schools reduced the potential for bias, it has not added to it. Moreover, the application of school nonresponse adjustments substantially reduced the potential for bias.

Table PISA-1. U.S. weighted school and student response rates: PISA 2012

                            Weighted response rate (percent)
  School
    Before replacement      67
    After replacement       77
  Student                   89

SOURCE: Kelly, D., Xie, H., Nord, C.W., Jenkins, F., Chan, J.Y., and Kastberg, D. (2013). Performance of U.S. 15-Year-Old Students in Mathematics, Science, and Reading Literacy in an International Context: First Look at PISA 2012 (NCES 2014-024). U.S. Department of Education. Washington, DC: National Center for Education Statistics. (pp. B-3, B-5)

A total of 6,110 students in the United States were sampled for the PISA 2012 assessment. The overall student exclusion rate for the United States was 5.3 percent, with an overall weighted student participation rate after replacement of 89 percent.

For PISA 2012, a bias analysis was conducted in the United States to address potential problems in the data owing to school nonresponse; however, the investigation into nonresponse bias at the school level in the United States in PISA 2012 provided evidence that there is little potential for nonresponse bias in the PISA participating sample based on the characteristics studied. To compare PISA participating schools to the total eligible sample of schools, it was necessary to match the sample of schools to the sampling frame to identify as many characteristics as possible that might provide information about the presence of nonresponse bias. Frame characteristics were taken from the 2008-09 Common Core of Data for public schools and from the 2009-10 Private School Universe Survey for private schools. The available school characteristics included affiliation (public or private), locale (city, suburb, town, rural), Census region, number of age-eligible students, total number of students, and percentage of various racial/ethnic groups (White, Black, Hispanic, non-Hispanic, Asian, American Indian or Alaska Native, Native Hawaiian/Pacific Islander, and multiracial). The percentage of students eligible for free or reduced-price lunch was available for public schools only.

For original sample schools, participating schools had a higher mean percentage of Hispanic students than the total eligible sample of schools (21.1 versus 18.1 percent, respectively). Participating original sample schools also had a higher mean percentage of students eligible for free or reduced-price lunch than did the total eligible sample of schools (39.3 versus 36.1 percent, respectively). All factors were then considered simultaneously in a logistic regression analysis, and only town (a territory inside an urban cluster with a core population between 25,000 and 50,000) was a significant predictor of participation. The percentage of students eligible for free or reduced-price lunch was not included in the logistic regression analysis, as public and private schools were modeled together using only the variables available for all schools. For final sample schools (with substitutes), participating schools had a higher mean percentage of students eligible for free or reduced-price lunch than the total eligible sample of schools (38.4 versus 36.2 percent, respectively). When all factors were considered simultaneously in a logistic regression analysis (again with free or reduced-price lunch eligibility omitted), no variables were statistically significant predictors of participation.

With the inclusion of substitute schools and school nonresponse adjustments applied to the weights, only the percentage of students eligible for free or reduced-price lunch remained statistically significant. Specifically, the participating schools had a higher mean percentage of students eligible to receive free or reduced-price lunch than the total eligible sample of schools (38.4 versus 36.2 percent, respectively). However, there was not a statistically significant relationship between participating schools and the total frame of eligible schools for the percentage of students eligible for free or reduced-price lunch (38.4 versus 37.1 percent, respectively); this means that despite the tendency of schools with higher percentages of students eligible for free and reduced-price lunch to participate at a greater rate than other sampled schools, there is little evidence of resulting potential bias in the final sample.


Measurement error. Measurement error is introduced into a survey when its test instruments do not accurately measure the knowledge or aptitude they are intended to assess.

Data Comparability
A number of international comparative studies already exist to measure achievement in mathematics, science, and reading, including the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS). The Adult Literacy and Lifeskills Survey (ALL) was last conducted in 2003 and measured the literacy and numeracy skills of adults. A new study, the Program for the International Assessment of Adult Competencies (PIAAC), was administered for the first time in 2011 and assessed the level and distribution of adult skills required for successful participation in the economy of participating jurisdictions. In addition, the United States has been conducting its own national surveys of student achievement for more than 35 years through the National Assessment of Educational Progress (NAEP). PISA differs from these studies in several ways.

Content. PISA is designed to measure literacy broadly, whereas studies such as TIMSS and NAEP have a stronger link to curricular frameworks and seek to measure students' mastery of specific knowledge, skills, and concepts. The content of PISA is drawn from broad content areas (e.g., space and shape in mathematics), in contrast to more specific curriculum-based content, such as geometry or algebra. For example, the PISA reading assessment must contain passages applicable to a wide range of cultures and languages, making it unlikely that the passages will be intact, existing texts.

Tasks. PISA also differs from other assessments in that it emphasizes the application of reading, mathematics, and science literacy to everyday situations, asking students to perform tasks that involve interpretation of real-world materials as much as possible. A study comparing the PISA, NAEP, and TIMSS mathematics assessments found that the mathematics topics addressed by each assessment are similar, although PISA places greater emphasis on data analysis and less on algebra than does either NAEP or TIMSS. It is how that content is presented, however, that makes PISA different: PISA uses multiple-choice items less frequently than NAEP or TIMSS, and it contains a higher proportion of items reflecting moderate to high mathematical complexity than do those two assessments.

An earlier comparative analysis of the PISA, TIMSS, and NAEP mathematics and science assessments also found differences between PISA and the other two studies. In science, it found that, compared with NAEP and TIMSS, more items in PISA built connections to practical situations and required students to demonstrate multistep reasoning, and fewer items used a multiple-choice format. In mathematics, it found that more items in PISA than in NAEP or TIMSS were set in real-life situations or scenarios, required multistep reasoning, and required interpretation of figures and other graphical data. These tasks reflect the underlying assumption of PISA: as 15-year-olds begin to make the transition to adult life, they need to know how to read and how to use particular mathematical formulas or scientific concepts, as well as how to apply this knowledge and these skills in the many different situations they will encounter in their lives.

Age-based sample. In contrast with TIMSS and PIRLS, which are grade-based assessments, PISA's sample is based on age. TIMSS assesses fourth- and eighth-graders, while PIRLS assesses only fourth-graders. The PISA sample, however, is drawn from 15-year-old students, regardless of grade level. The goal of PISA is to represent outcomes of learning rather than outcomes of schooling. By placing the emphasis on age, PISA intends to show not only what 15-year-olds have learned in school in a particular grade but also what they have learned outside of school and over the years. PISA thus seeks to show the overall yield of an economy and the cumulative effects of all learning experiences. Focusing on age 15 provides an opportunity to measure broad learning outcomes while all students are still required to be in school across the many participating jurisdictions. Finally, because years of education vary among jurisdictions, choosing an age-based sample makes comparisons across jurisdictions somewhat easier.

6. CONTACT INFORMATION

For content information about PISA, contact

Patrick Gonzales
Phone: 415-920-9229
Email: [email protected]

Mailing Address:
National Center for Education Statistics
Institute of Education Sciences
U.S. Department of Education
1990 K Street NW
Washington, DC 20006-5651

7. METHODOLOGY AND EVALUATION REPORTS

Most of the technical documentation for PISA is published by the OECD. The U.S. Department of
Education, NCES, is the source of several additional references listed below.

General

Baldi, S., Jin, Y., Skemer, M., Green, P., Herget, D., and Xie, H. (2007). Highlights From PISA 2006: Performance of U.S. 15-Year-Olds in Science and Mathematics Literacy in an International Context (NCES 2008-016). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2008016.

Fleischman, H.L., Hopstock, P.J., Pelczar, M.P., and Shelley, B.E. (2010). Highlights From PISA 2009: Performance of U.S. 15-Year-Old Students in Reading, Mathematics, and Science Literacy in an International Context (NCES 2011-004). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2011004.

Kastberg, D., Roey, S., Lemanski, N., Chan, J.Y., and Murray, G. (2014). Technical Report and User Guide for the Program for International Student Assessment (PISA): 2012 Data Files and Database with U.S.-Specific Variables (NCES 2014-025). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Available at http://nces.ed.gov/pubs2014/2014025.pdf.

Hopstock, P. and Pelczar, M. (2011). Technical Report and User's Guide for the Program for International Student Assessment (PISA): 2009 Data Files and Database with U.S.-Specific Variables (NCES 2011-025). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC: U.S. Government Printing Office. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2011025.

Kelly, D., Xie, H., Nord, C.W., Jenkins, F., Chan, J.Y., and Kastberg, D. (2013). Performance of U.S. 15-Year-Old Students in Mathematics, Science, and Reading Literacy in an International Context: First Look at PISA 2012 (NCES 2014-024). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Available at http://nces.ed.gov/pubs2014/2014024rev.pdf.

Lemke, M., Sen, A., Pahlke, E., Partelow, L., Miller, D., Williams, T., Kastberg, D., and Jocelyn, L. (2004). International Outcomes of Learning in Mathematics Literacy and Problem Solving: PISA 2003 Results From the U.S. Perspective (NCES 2005-003). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2005003.

Organization for Economic Cooperation and Development (OECD). (2005). PISA 2003 Data Analysis Manual. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2007). PISA 2006: Science Competencies for Tomorrow's World. Volume 1: Analysis. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2009). PISA 2009 Assessment Framework - Key Competencies in Reading, Mathematics and Science. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2010). PISA 2009 Results: What Students Know and Can Do - Performance in Reading, Mathematics and Science (Volume I). Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2012). PISA 2009 Technical Report. Paris: Author.

Survey Design

Green, P., Herget, D., and Rosen, J. (2009). User's Guide for the Program for International Student Assessment (PISA): 2006 Data Files and Database With United States Specific Variables (NCES 2009-055). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2009055.

Organization for Economic Cooperation and Development (OECD). (2009). PISA 2006 Technical Report. Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2010). PISA 2009 Results: What Students Know and Can Do - Student Performance in Reading, Mathematics and Science (Volume I). Paris: Author.

Organization for Economic Cooperation and Development (OECD). (2013). PISA 2012 Assessment Framework - Key Competencies in Reading, Mathematics and Science. Paris: Author.
Organization for Economic Cooperation and Development (OECD). (2013). PISA 2012 Assessment and Analytical Framework: Mathematics, Reading, Science, Problem Solving and Financial Literacy. OECD Publishing. Available at http://www.oecd.org/pisa/pisaproducts/PISA%202012%20framework%20e-book_final.pdf.

Organization for Economic Cooperation and Development (OECD). (2014). PISA 2012 Technical Report. Available at http://www.oecd.org/pisa/pisaproducts/pisa2012technicalreport.htm.

PISA Project Consortium. (2005a). Main Study National Project Manager's Manual. Retrieved November 14, 2008, from https://mypisa.acer.edu.au/images/mypisadoc/opmanual/pisa2003_national_project_manager_manual.pdf.

PISA Project Consortium. (2005b). School Sampling Preparation Manual: PISA 2006 Main Study. Retrieved November 14, 2008, from https://mypisa.acer.edu.au/images/mypisadoc/opmanual/pisa2003_school_sampling_manual.pdf.

PISA Project Consortium. (2005c). PISA 2006 Main Study Test Administrator's Manual. Retrieved November 14, 2008, from https://mypisa.acer.edu.au/images/mypisadoc/opmanual/pisa2006_test_administrator_manual.pdf.

PISA Project Consortium. (2005d). PISA 2006 Main Study School Coordinator's Manual. Retrieved November 14, 2008, from https://mypisa.acer.edu.au/images/mypisadoc/opmanual/pisa2006_school_coordinators_manual.pdf.

Data Quality and Comparability

Dossey, J.A., McCrone, S.S., and O'Sullivan, C. (2006). Problem Solving in the PISA and TIMSS 2003 Assessments (NCES 2007-049). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2007049.

Neidorf, T.S., Binkley, M., Gattis, K., and Nohara, D. (2006). Comparing Mathematics Content in the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and Program for International Student Assessment (PISA) 2003 Assessments (NCES 2006-029). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2006029.

Nohara, D. (2001). A Comparison of the National Assessment of Educational Progress (NAEP), the Third International Mathematics and Science Study Repeat (TIMSS-R), and the Programme for International Student Assessment (PISA) (NCES 2001-07). National Center for Education Statistics, U.S. Department of Education. Washington, DC. Available at http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=200107.

Stephens, M., and Coleman, M. (2007). Comparing PIRLS and PISA With NAEP in Reading, Mathematics, and Science (Working Paper). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC. Retrieved November 14, 2008, from http://nces.ed.gov/Surveys/PISA/pdf/comppaper12082004.pdf.