Teaching Statistics and Data Analysis With R
To cite this article: Mary C. Tucker, Stacy T. Shaw, Ji Y. Son & James W. Stigler (2023) Teaching
Statistics and Data Analysis with R, Journal of Statistics and Data Science Education, 31:1,
18-32, DOI: 10.1080/26939169.2022.2089410
ABSTRACT
We developed an interactive online textbook that interleaves R programming activities with text as a way to facilitate students’ understanding of statistical ideas while minimizing the cognitive and emotional burden of learning programming. In this exploratory study, we characterize the attitudes and experiences of 672 undergraduate students as they used our online textbook as part of a 10-week introductory course in statistics. Students expressed negative attitudes and concerns related to R at the beginning of the course, but most developed more positive attitudes after engaging with course materials, regardless of demographic characteristics or prior programming experience. Analysis of a subgroup of students revealed that change in attitudes toward R may be linked to students’ patterns of engagement over time and students’ perceptions of the learning environment.
KEYWORDS
Data science education; Higher education; R; Statistics education research
CONTACT Mary C. Tucker [email protected] University of California, Los Angeles, Los Angeles, CA.
Supplementary materials for this article are available online. Please go to www.tandfonline.com/ujse.
© 2022 The Author(s). Published with license by Taylor and Francis Group, LLC.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution,
and reproduction in any medium, provided the original work is properly cited. The moral rights of the named author(s) have been asserted.
JOURNAL OF STATISTICS AND DATA SCIENCE EDUCATION 19
Our interest is in how to introduce statistical programming languages (in our case using R) into the teaching of data analysis and in how to teach modern methods of data analysis as a means of helping students create a deep and transferable understanding of the core concepts that underlie statistics. We also want to connect our students to the fast-growing world of data science, and increase the numbers of students, especially those who are traditionally disadvantaged, who can envision the field of data science as a possible career pathway.
2. Teaching Data Analysis with R
R is a statistical computing environment for data analysis that has been widely adopted by researchers and industry professionals in STEM, the social sciences, and the humanities (R Core Team 2019). Though many data analysis tools are available, R offers several advantages. First, R is open-source and free, which reduces barriers to access. Second, R is flexible and powerful; it can be used for both simulation and data analysis. Unlike analyzing data using point-and-click interfaces, analyzing data in R involves writing code, which can make statistical thinking and data analysis processes more visible and reproducible; thus, the use of R may offer students an additional representational tool to build their understanding. Third, R is used by a growing number of people across different fields and thus is more generalizable than other statistical software programs.
Although more and more educators are using R to teach data analysis (e.g., Baumer et al. 2014; Nolan and Temple Lang 2015; Mascaró, Sacristán, and Rufino 2016), many barriers remain. One barrier is the lack of integration between text-based instructional materials and R coding environments. For example, a growing number of textbooks supplement statistics content with R, but most require students to switch between the text and some other platform (such as RStudio installed on their own computer) to try out any R code (e.g., Dalgaard 2008; Field, Miles, and Field 2012; Nolan and Temple Lang 2015; Navarro 2020). For novice students, the initial difficulty of setting up R may be a barrier to the R component of the class (Çetinkaya-Rundel and Rundel 2018), and even though R is interspersed with the statistics concepts in the textbook, it might feel like something separate they have to do after they read the book.
In addition to challenges integrating R with course materials, many instructors report (both formally and informally) that students “hate programming” and fear that students’ initial feelings toward R will turn them off of statistics (Ward 2013; Rode and Ringel 2019). Because R is syntax-based (rather than having a point-and-click interface), there is a general perception that it may be harder to learn for some students, especially those who come in with no prior programming experience (Biehler 1997). For many instructors, just the idea of teaching R may be challenging, particularly when they themselves may have little or no experience with R. All of these factors lead to a persistent worry that the cognitive and emotional burden of learning R will make it more difficult for students to focus their attention on understanding statistical concepts (rOpenSci 2018).
We share some of these concerns. However, it is possible that these presumed negative effects of learning R might be mitigated by the way in which R is introduced and taught. If R is taught as a set of procedures to be learned separately from the main course content (e.g., concepts and algebraic formulas), then it will be hard for students to view it as a tool to support statistical thinking and understanding. Furthermore, students may not be able to see its relevance and value in their chosen field of study or to their lives more broadly. It is also possible that teachers who incorporate R attempt to teach too much programming, including concepts (e.g., for loops, pointers) that are not necessary for students taking an introductory statistics class nor directly aligned with the goal of teaching statistics for understanding.
2.1. Cognitive and Motivational Considerations
As with many things in education, we believe the effects of integrating R into the introductory statistics course will depend on how the integration is carried out. If students perceive R as something extra that must be learned, over and above the already challenging subject matter of statistics, they may feel that learning R is too costly and stressful. But if students see R as a tool for understanding and doing statistics, they may see R as valuable and relevant, and thus develop more positive attitudes toward R and toward statistics.
Our perspective draws on theory and research from cognitive psychology. According to Cognitive Load Theory (Sweller 1988, 1999, 2018), inherently complex learning materials (known as high intrinsic cognitive load) or materials that introduce too much information simultaneously (high extraneous cognitive load) can deplete students’ cognitive resources and impede meaningful learning (Sweller 1988, 1999, 2018). Conversely, materials that stimulate mental activity in ways that are challenging, yet relevant (high germane cognitive load), can promote learning and transfer (Paas, Renkl, and Sweller 2004). Our perspective also draws on the expectancy-value-cost theory of achievement motivation (Barron and Hulleman 2015), which provides a useful framework for understanding how students’ perceptions of the learning environment shape their choices, persistence, and motivation. When students perceive instructional activities to be valuable and relevant, their motivation and interest increase (Hulleman and Harackiewicz 2009; Hulleman et al. 2017). When students perceive instructional activities to be too costly (in time, resources, or stress), motivation and interest decrease (Barron and Hulleman 2015).
2.2. Setting the Stage
With these considerations in mind, in this project we aimed to engineer a new way of integrating the teaching of R into the teaching of statistics so as to promote students’ statistical thinking and transfer. Our project builds directly on three developments: the MOSAIC project (making R syntax more transparent to novices), web 2.0 technology (making R programming environments more available to novices), and the emergence of the field of learning engineering (making student data from learning R more available to researchers).
The MOSAIC Project. The first of these is the MOSAIC project (Pruim, Kaplan, and Horton 2017). Though R was not designed to support students’ development of deep understanding of statistics concepts, the MOSAIC team has
20 M. C. TUCKER ET AL.
written a number of R packages designed to do just that: to simplify the syntax of functions so that students can more easily understand the underlying structure. MOSAIC R functions not only enable students to explore and analyze data, but they also provide tools for representing and understanding complex statistical ideas, such as randomness.
Widespread Availability of Web 2.0 Technology. A second development is the ability to integrate R programming environments in cloud-hosted online instructional materials. Even with tools like MOSAIC, students still, until recently, had to either download and install an R integrated development environment (IDE) such as RStudio on their own computers to run R or use computer labs with R already installed. This necessitated a pedagogical separation between statistics and data analysis. Today, next generation web technologies make it possible to interleave explanatory text with R programming examples, right on the same web page. The removal of the barriers to full integration of statistics and data analysis paves the way for researchers to understand how to exploit the new platforms to benefit student learning.
Availability of Student Response Data. Finally, the emergence of the interdisciplinary field of learning engineering (Lieberman 2018; Thille 2018; Dede, Richards, and Saxberg 2019) has fostered new ways of capturing and analyzing students’ interactions, responses, and learning on a moment-by-moment level as students work through course materials. Certainly, individual instructors have devised creative ways of integrating statistics and data analysis, but we have scant data available to assess the actual impact of these different pedagogical innovations on students’ motivation and learning, especially in real time. This leads directly to our current project.
3. Project Overview
This study is part of a larger project in which we are building a technology platform and a set of working routines for developing, implementing, testing, and continuously improving online learning materials based on students’ response data. The first prototype of this approach is a continuously improving online textbook for teaching introductory statistics, CourseKata Statistics and Data Science (available for preview at www.coursekata.org; see Stigler et al. 2020). The online book consists of 12 chapters and 144 pages, each of which interleaves text, graphics, R coding exercises, and formative assessment questions (more than 1200 in all). We briefly describe the book in the next section.
3.1. Course Design
A major feature of the online book is the complete integration of data analysis using R into the introductory statistics curriculum. Students using our online textbook do not have to download, install or configure any software, or even switch from one window on a screen to another, in order to complete the pages of the book. We hypothesize that by interleaving R data analysis activities with the introduction of statistical concepts, and by building on the MOSAIC project (Pruim, Kaplan, and Horton 2017), we can make script-based programming accessible to all learners, regardless of background or prior programming experience, and support deep learning of statistics without increasing cognitive load or eliciting negative emotions in students.
The pedagogical design of the book is based on the practicing connections hypothesis, which is grounded in theory and research from psychology and the learning sciences (Fries et al. 2020). The practicing connections hypothesis argues that in order to develop coherent and transferable knowledge in complex domains, like statistics, students need repeated opportunities to engage in deliberate practice making connections among key concepts, representations, and situations of the domain over time. Whereas most traditional statistics texts use algebraic notation to represent statistics concepts, our textbook uses R as a key representation, and integrates R according to the following principles:
1. R exercises should help students to represent ideas and make connections between concepts, not just be used to compute answers. In our textbook, R exercises focus on using modeling, simulation, and visualization to understand abstract concepts like randomness and the data generating process. The R code should be used to help students learn the meaning of other important statistical representations (e.g., the algebraic notation of the General Linear Model, specific vocabulary).
2. R exercises should be interleaved throughout the text and embedded in the practice of data analysis. The textbook should interleave R exercises with other types of exercises (e.g., writing, multiple choice, categorization tasks) to provide opportunities for productive struggle, feedback, and deliberate practice. In other words, R exercises should be viewed as one part of an active opportunity for students to make connections.
3. Practice with R should start simple and become increasingly sophisticated and complex over time, providing students a gentle on-ramp to the adoption of R as a tool for doing and thinking about statistics and data analysis.
4. Expectations about what authentic users of R do should be taught in an authentic manner. Novices might expect to learn how to flawlessly generate code that works on the first try. Instead, a growth mindset toward R should be cultivated where students are taught to expect errors, forgetting, and frustration as part of the process for all data scientists. Using cheat sheets, asking for help, and searching online are not hacks just for beginners but something that even professional R users do.
5. Students should have opportunities to apply their developing skills to answer meaningful questions using real data so that they can experience, first-hand, the value of computation for understanding and doing statistics and the relevance of statistical thinking in their everyday lives.
In sum, the textbook should reduce the R learning curve and increase motivation wherever possible; focus on teaching R as a tool for understanding and doing statistics, rather than programming as an end in itself; draw on resources created by other statistics educators (e.g., MOSAIC) to reduce the number of functions students need to learn and to make those functions more understandable; scaffold R activities, and provide hints and feedback; cultivate a growth mindset orientation toward learning to program; and make it easy for students to get help, so
they don’t focus on memorization (e.g., R cheat sheet, glossary, help desk).
3.2. Study Aims
The goal of the current research was to present an initial “proof of concept.” We examined the attitudes and experiences of students in three large classes as they used the online textbook throughout a 10-week introductory course in statistics. Prior work has found that students are initially anxious viewing R output, but their anxiety abates over time (Rode and Ringel 2019). However, this finding has yet to be replicated with a larger sample and within a course that requires students to write code, generate plots, and compare models in R (rather than look at R output). If our approach to integrating R as a tool for understanding statistics is effective, we expect that at the end of the course we will find that students not only showed gains in learning and transfer, but also developed positive attitudes toward R (even students who were initially hesitant about learning R programming).
We had three specific aims in this exploratory study. The primary aim was to determine if it is possible to use R in an introductory course with students who have limited mathematical and computer programming backgrounds, without negatively impacting their learning or motivation. Past research suggests that students hold negative beliefs and misconceptions about computer programming, which can influence their motivation and behavior during learning (Siek et al. 2006; Scott and Ghinea 2014; Cheryan, Master, and Meltzoff 2015; Google and Gallup 2015; Tek, Benli, and Deveci 2018). It is possible that even if R is a valuable skill for students to learn, students would, on average, leave with a negative disposition to programming. But if the design of our course is successful, students may end the course feeling positive about R and perceiving the course to be valuable.
A second aim was to understand individual differences in how students experienced the course and in how differences in these experiences might produce different motivational and learning outcomes. In particular, we wanted to make sure that integrating R did not contribute to existing disparities by helping some groups of students and hindering others, such as female students, and students with race/ethnicities that are traditionally underrepresented in STEM fields (Riegle-Crumb and King 2010; Gallup 2016; Charles and Thébaud 2018).
A third aim was to identify potential challenges and opportunities for intervention to improve students’ experiences learning statistics and data analysis with R. Even if most students have a positive experience using R in our course, it is likely that some students may struggle. Thus, our final aim is to identify students who do not develop positive attitudes toward R and identify how they may differ from students who do end up with positive attitudes. By better understanding the experiences of students who struggle, we hope to design improvements that might reduce their numbers in future iterations of our course.
4. Method
4.1. Study Context
The course, Introduction to Psychological Statistics, is an undergraduate course offered by the psychology department at a large public research university located in a major city in the western United States. The course provides a basic introduction to statistics and data analysis with an emphasis on its application to research in psychology. The goals are for students to understand basic concepts that underlie descriptive and inferential statistics and use them to make sense of new situations, to be prepared cognitively and emotionally to learn more advanced techniques in the future, and to be able to do basic data analysis using the R statistical programming language. Students majoring in psychology must complete the course with a grade of C- or better in order to remain in their degree program.
The three sections of the course described in this study are blended courses. In addition to standard lecture and office hours, each class incorporated online components. The online components included the interactive online textbook (CourseKata Statistics and Data Science), a question/answer forum, and links to online resources (e.g., a help desk and reference materials). Much of the course content was conveyed through the online interactive textbook, which included over 1200 embedded formative assessments, embedded R programming exercises, and practice quizzes. The first four chapters in the textbook provide students with a scaffolded initial introduction to R, including a description of the interactive programming environment as well as practice with basic programming concepts such as functions, variables, and data types. Students were not provided with supplemental instruction that introduced R outside of the content in the online textbook.
Lectures occurred twice weekly, were led by instructors, and focused on deepening understanding of concepts, connecting concepts, discussing examples, and answering questions. Students participated during lecture by answering questions posed by the instructor via an audience polling tool. Other face-to-face components included a separate, weekly, large-group discussion section which was used to administer quizzes and answer questions about course content. Quizzes were administered once every other week. On weeks when no quizzes were scheduled, discussion sections involved short review presentations and question and answer sessions led by graduate Teaching Associates. Students’ final course grades comprised performance on quizzes (administered once every two weeks), completion of homework assignments (reading and completing all exercises in the assigned online textbook chapters), and performance on cumulative exam(s).
4.2. Participants
Data were collected from 672 undergraduate students who used our interactive online textbook as part of the course, Introduction to Psychological Statistics, during the 2019–2020 academic year. Students were enrolled in one of three sections of the course at the University of California, Los Angeles. Each section was taught by a different instructor, but all used the same blended course format and interactive online textbook: CourseKata Statistics and Data Science. The data were collected as part of an ongoing project, which was approved by the Institutional Review Board at the University of California, Los Angeles (IRB No: 20-001033).
We initially obtained data from 789 students: 290 who took the course in Fall 2019 (Class A), 244 who took one section of the course in Winter 2020 (Class B), and 255 students who took
another section of the course in Winter 2020 (Class C). However, our analyses focus on only those students who (a) remained enrolled for the duration of the course, (b) completed at least one question on both the pre- and post-course surveys, and (c) agreed to share their data with the research team. Students who did not meet these criteria (n = 117) were excluded. Of these excluded students, 27 (23%) did not agree to share their data, 30 (26%) created an account but did not respond to any questions in the online textbook, and 60 (51%) did not complete either the pre- or post-course survey.
The resulting analytic sample (n = 672) was 73% female and self-identified as 37% East/Southeast Asian, 19% Latino, 27% White, 5% Middle Eastern and North African, 4% Black/African American, and 5% more than one race/ethnicity/other. Table 1 contains sociodemographic information for students in the sample.
Due to small sample sizes, participants who did not disclose gender (n = 18) or selected nonbinary or other gender (n = 6) were not included in formal hypothesis tests comparing gender differences. Similarly, Black/African American students (n = 25), Middle Eastern and North African students (n = 31), and students who self-identified with more than one race/ethnicity/other (n = 37) were not included in formal hypothesis tests of racial/ethnic differences due to low frequencies. Instead, we present the raw data for these demographic subgroups and comment on how those data compare to patterns from larger demographic groups (e.g., male, female and Asian, Latino, White students).
embedded two-thirds of the way through the course content (chapter 8 in a 12-chapter book). During the course, students’ interactions and performance on embedded formative assessments were collected within the online textbook. An overview of the study design is shown in Figure 1.
4.4. Measures
Measures were generated by students’ responses to survey questions administered at three time points before and during the course and log-data from students’ interactions within the online textbook. Questionnaires assessed students’ attitudes and beliefs about learning and students’ background and demographic characteristics. The measures are described below.
Attitudes toward Learning R. Students rated their attitudes toward using R on a single item (“In this course, you will use/have used R (a programming language) to analyze data; how do you feel about this?”), to which they responded using a 5-point Likert scale from strongly negative to strongly positive. R attitude items were administered at the beginning and end of the course (t1 and t3). In addition to the attitude measure, students completed two additional R items at t3. Students rated their confidence in their ability to use R (e.g., “I am confident I could use R to analyze a new dataset”) using a 5-point scale from not at all confident to extremely confident. They also rated how important R was for their learning using a 5-point scale from not at all important to extremely important.
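The 5-point attitude item described in this section lends itself to a simple numeric coding. The following Python sketch is illustrative only and is not the authors' analysis code: the five labels follow the attitude scale as it appears in Table 2, while the function names and the two example students are hypothetical.

```python
# Illustrative sketch (not the authors' code): code 5-point attitude labels as 1-5
# and compute each student's change from the pre-course (t1) to the post-course (t3) survey.

ATTITUDE_LEVELS = [
    "strongly negative",
    "somewhat negative",
    "neither positive nor negative",
    "somewhat positive",
    "strongly positive",
]
ATTITUDE_CODE = {label: i + 1 for i, label in enumerate(ATTITUDE_LEVELS)}


def code_attitude(label):
    """Map a response label to its 1-5 code (case-insensitive)."""
    return ATTITUDE_CODE[label.strip().lower()]


def attitude_change(t1_label, t3_label):
    """Return the t3 - t1 difference in coded ratings."""
    return code_attitude(t3_label) - code_attitude(t1_label)


# Hypothetical responses for two students, for illustration only.
responses = {
    "student_a": ("Somewhat negative", "Somewhat positive"),
    "student_b": ("Strongly positive", "Somewhat positive"),
}
changes = {sid: attitude_change(t1, t3) for sid, (t1, t3) in responses.items()}
print(changes)
```

A positive difference indicates a more favorable rating at the end of the course. Because the ratings are ordinal, such differences are best read descriptively; the article's own inferential analysis uses ordinal regression.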
concerned that…(write “not at all” if you so feel).” Responses that mentioned “R,” “coding,” “computer programming,” or “programming” were coded to create a dichotomous indicator of whether or not students mentioned computer programming as a concern (1 = mentioned computer programming as a concern, 0 = did not mention computer programming as a concern).
Expectancy, Value, and Cost. We measured students’ motivational beliefs using items from the expectancy-value-cost scale (Kosovich et al. 2015). Students rated their expectations for success in the course (e.g., “I know I can learn the material in this course,” “I believe I can be successful in this course”) using a 5-point scale from strongly disagree to strongly agree. Expectancies were measured at two timepoints: t1 and t2. Perceived course value was measured using two items (“The content of this course is important for me,” and “What I learn in this course will be useful in the future”). Students rated agreement using a 5-point scale from strongly disagree to strongly agree. Value items were administered at three timepoints: t1, t2, and t3. Perceived course cost was measured using three items: “I’m unable to put in the time needed to do well in this course,” “I have to give up too much to do well in this course,” and “This course is too stressful for me.” Students rated their agreement using a 5-point scale from strongly disagree to strongly agree. Cost items were administered at a single timepoint, t2. We created aggregate measures of expectancy, value, and cost by averaging the scores for each construct at each time point.
Beliefs about Memorization. On the pre- and post-course surveys (t1 and t3), students rated how much they felt the course required (or would require) memorization (e.g., “I expect that this course will require a lot of memorization” or “This course required a lot of memorization”) using a 5-point scale from strongly disagree to strongly agree.
Engagement. Engagement with course materials was measured using data from students’ interactions within the online textbook. Measures were calculated on a chapter-by-chapter level for each student. We looked at the following engagement measures: chapter review score, calculated as the proportion of correct responses on the embedded review activity at the end of each chapter; word count, calculated as the number of words on a summary students were required to write for each chapter; R performance, calculated as the proportion of R activities that were correct on the first submission; and R attempts to correct, calculated, for R activities that students answered incorrectly on the first attempt, as the average number of attempts it took students to get the correct answer.
Sociodemographic Characteristics. Several sociodemographic variables that could account for differences in students’ attitudes and learning outcomes were included in the analyses. These variables included self-reported gender and race/ethnicity, GPA, and parental education. Students reported their cumulative
grade point average by selecting from one of the following five categories: 3.50–4.00, 3.00–3.49, 2.50–2.99, 2.00–2.49, and less than 2.00. Parental education was measured by asking students to report the highest level of education attained by their mother. Responses were dummy coded to indicate the mother’s education level (1 = college educated, 0 = not college educated).
Previous programming experience also varied as a function of parental education, χ²(1) = 16.77, p < 0.001. Students with college educated mothers (53.7%) were more likely to have prior programming experience than students with noncollege educated mothers (36.4%). Previous programming experience did not vary as a function of gender, χ²(1) = 0.32, p = 0.6.
Table 2. Distribution of initial R attitude ratings overall and broken down by programming experience, gender, and race/ethnicity.
Pre-course R attitude ratings (t1): 1 = Strongly negative; 2 = Somewhat negative; 3 = Neither positive nor negative; 4 = Somewhat positive; 5 = Strongly positive.
Group                      1           2     3     4     5           n
All students               59 (9%)     195   239   138   41 (6%)     672
Programming experience
  None                     41 (12%)    113   120   53    13 (4%)     340
  Some                     18 (5%)     82    119   84    28 (9%)     331
Gender
  Female                   48 (10%)    149   181   93    21 (4%)     492
  Male                     9 (6%)      38    49    41    19 (12%)    156
  Nonbinary/other          0 (0%)      4     2     0     0 (0%)      6
  Did not disclose         2 (11%)     4     7     4     1 (1%)      18
Race/ethnicity
  Asian                    22 (9%)     79    88    48    14 (6%)     251
  Latino                   19 (15%)    32    50    19    5 (4%)      125
  White                    11 (6%)     51    55    52    15 (8%)     184
  Multi-racial/other       1 (3%)      13    16    2     3 (9%)      35
  M. East/N. African       2 (7%)      8     12    7     2 (7%)      31
  Black/African Am.        2 (8%)      7     9     6     1 (4%)      25
  Did not disclose         2 (11%)     5     7     4     1 (5%)      19
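To make the counts in Table 2 easier to interpret, the share of students who began the course with a negative rating (1 or 2) can be computed directly from the rows. The Python sketch below does this for three rows of the table; the dictionary layout and function name are hypothetical choices, and the check that each row sums to its reported n is included as a guard against transcription errors.

```python
# Counts from Table 2: ratings 1-5 at t1, followed by the reported group size n.
TABLE_2 = {
    "All students": ([59, 195, 239, 138, 41], 672),
    "Female": ([48, 149, 181, 93, 21], 492),
    "Male": ([9, 38, 49, 41, 19], 156),
}


def percent_negative(counts):
    """Percentage of ratings that are 1 (strongly negative) or 2 (somewhat negative)."""
    return 100.0 * (counts[0] + counts[1]) / sum(counts)


for group, (counts, n) in TABLE_2.items():
    # Each row of counts should add up to the reported group size.
    assert sum(counts) == n, group
    print(f"{group}: {percent_negative(counts):.1f}% negative at t1")
```

On these counts, roughly 38% of all students, 40% of female students, and 30% of male students rated R negatively at the start of the course.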
Table 3. Frequency of R attitude ratings at the beginning and at the end of the course.
Pre-course R attitude rating (t1)     Post-course R attitude rating (t3)        n
                                      1      2       3        4        5
1 = Strongly negative                 7      9[a]    14[a]    25[b]    4[b]     59
2 = Somewhat negative                 14     31      34[a]    88[b]    28[b]    195
3 = Neither positive nor negative     9      25      53       120      32       239
4 = Somewhat positive                 4      4       16       72       42       138
5 = Strongly positive                 1      3       3        15       19       41
Note: [a] Students who started out negative and increased their attitudes toward R. [b] Students who started out negative and ended the course feeling positive toward R.
To investigate differences in R attitudes across the two timepoints (t3 − t1) and to see if change in R attitude ratings varied based on individual student characteristics, we used repeated ordinal regression analysis. We fit separate models to estimate the effect of time on R attitude ratings and to estimate the main and interactive effects of time and prior programming experience, gender, race/ethnicity, and concern about learning R at the beginning of the course. In each ordinal regression model, R attitude rating was an ordinal outcome variable with time (t1 and t3) and the student characteristic (group) as predictors, with students as the random factor.
There was a discernible difference between the likelihood of rating R positively at the beginning of the course compared to the end of the course (likelihood ratio χ²(1) = 224, p < 0.0001). Across all students, the odds of rating R more positively were higher at the end of the course (t3) than at the beginning of the course (t1). Though, overall, students changed their attitudes toward R to be more positive, some groups of students increased their attitudes more than others. Specifically, there was a statistically discernible difference in the pattern of R ratings for female compared to male students (likelihood ratio χ²(1) = 4.44, p = 0.0001) such that the odds of female students increasing their R attitude ratings from beginning to end of the course were higher than the odds of males increasing their R attitude ratings from beginning to the end of the course. There was also a discernible difference in the pattern of R ratings for students who did and those who did not mention R as a concern
with some prior programming experience (mean rank = 367.3) rated their confidence higher than students without any prior programming experience (mean rank = 302.3) and male students (mean rank = 359.1) rated their confidence higher than female students (mean rank = 359.1). A Kruskal-Wallis test further revealed a detectable difference in the average end-of-course R attitude ratings by students’ race/ethnicity, H(2) = 14.2, p = 0.0008. A post hoc Dunn test with Bonferroni adjustment revealed that White students reported higher confidence than Asian students (Z = 2.5, p = 0.04) and Latinx students (Z = 3.7, p = 0.0007). Although Black/African American students (n = 25) and Middle Eastern/North African students (n = 31) were not included in this analysis due to small sample sizes, their responses are similar to the responses of Latino and Asian students.
In terms of R importance, overall, students felt that R was important for their learning in this course. The most common responses were “very important” (37%) and “extremely important” (36%), and this pattern was similar within each subgroup of students. Mann-Whitney U tests showed that importance ratings did not differ by previous programming experience (Z = 1.5, p = 0.1) or gender (Z = 1.4, p = 0.2). Similarly, a Kruskal-Wallis test revealed that R importance did not vary by race/ethnicity for Asian, Latino, and White students (H(2) = 4.2, p = 0.1). As with R confidence ratings, the R importance ratings of Black/African American students (36% of whom rated R as extremely important) and Middle Eastern/North African students (42% of whom rated R as extremely important) were similar to those of Latino (39% of whom rated R as extremely important), Asian (40% of whom rated R as extremely important), and White students (30% of whom rated R as extremely important).
5.5. Students Who Expressed Negative Attitudes on the Presurvey
Though the majority (80%) of students who started off feeling negatively toward R shifted their feelings to be more positive at the end of the course, there were students whose attitudes did not improve. To better understand the students’ experiences that
(likelihood ratio X2 (1) = 10.0, p = 0.002) such that students might have contributed to these differences, we identified two
who mentioned R as a concern were more likely to increase their subgroups of students:
R attitude ratings to be positive from beginning to end of the
1. The negative-to-positive students were 297 students who rated
course than students who did not mention R as a concern.
their attitude toward learning R as “very negative,” “somewhat
negative,” or “neither positive nor negative” at the beginning
5.4. Confidence and Importance Ratings of the course (t1 ) and at the end of the course (t3 ) rated their
attitude as “somewhat positive” or “very positive.”
In addition to R attitude ratings, we looked at students’ ratings
2. The negative-to-negative students were 196 students who also
of how confident they felt using R to analyze a new dataset
rated their attitudes as “very negative,” “somewhat negative,”
and their rating of how important they felt R was for their
or “neither positive nor negative” at the beginning of the
learning of course material. The frequencies for R confidence
course (t1 ) and again rated their attitudes as “very negative,”
ratings and R importance ratings are shown in Tables 4 and 5,
“somewhat negative,” or “neither positive nor negative” at the
respectively.
end of the course (t3 ).
As Table 4 shows, students reported feeling confident in
their ability to use R at the end of the course. However, We compared these two subgroups of students, which we
some groups of students reported higher confidence than will refer to as the neg-pos and neg-neg, respectively, in an effort
others. Mann-Whitney U tests revealed that R confidence to identify how to improve future students’ experiences with
ratings differed by prior programming experience (Z = 4.6, learning R. We compared socio-demographic characteristics as
p < 0.0001) and gender (Z = 2.8, p = 0.005). Students well as two categories of students’ experiences: learning-related
Table 4. Distribution of R confidence ratings at the end of the course overall and broken down by programming experience, gender, and race/ethnicity.
Post-course R confidence ratings (t3): 1 = Not at all confident, 2 = Somewhat confident, 3 = Moderately confident, 4 = Very confident, 5 = Extremely confident.
Group                            1          2      3      4      5          n
All students                     23 (3%)    84     220    277    65 (10%)   669
Programming experience
  None                           16 (5%)    54     124    116    27 (8%)    337
  Some                           7 (2%)     30     96     160    38 (11%)   331
Gender
  Female                         17 (4%)    66     165    202    40 (8%)    490
  Male                           6 (4%)     11     45     70     23 (15%)   155
  Nonbinary/other                0 (0%)     1      2      1      2 (33%)    6
  Did not disclose               0 (0%)     6      8      4      0 (0%)     18
Race/ethnicity
  Asian                          7 (3%)     30     86     104    23 (9%)    250
  Latino                         7 (6%)     23     38     46     9 (7%)     123
  White                          4 (2%)     14     54     83     29 (16%)   184
  Multi-racial/other             2 (6%)     3      11     19     0 (0%)     35
  Middle Eastern/North African   1 (3%)     6      10     12     2 (7%)     31
  Black/African American         2 (8%)     2      11     8      2 (8%)     25
  Did not disclose               0 (0%)     6      9      4      0 (0%)     19
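The Kruskal-Wallis comparison of confidence ratings by race/ethnicity can be illustrated directly from the counts in Table 4. The following is a minimal sketch rather than the authors' analysis code: it assumes the test was run on the Asian, Latino, and White rows only, that tied ratings receive midranks, and that the standard tie correction was applied. The function name is hypothetical. Under these assumptions the statistic comes out at approximately 14.2, in line with the reported H(2).

```python
import math

# Counts of confidence ratings 1..5 for the three groups in the
# race/ethnicity comparison, copied from Table 4.
counts = {
    "Asian": [7, 30, 86, 104, 23],
    "Latino": [7, 23, 38, 46, 9],
    "White": [4, 14, 54, 83, 29],
}


def kruskal_wallis_from_counts(counts):
    """Return (H, df, mean_ranks) for ordinal ratings given as per-group counts."""
    n_categories = len(next(iter(counts.values())))
    # Number of students (across all groups) who chose each rating.
    category_totals = [
        sum(group[k] for group in counts.values()) for k in range(n_categories)
    ]
    n_total = sum(category_totals)

    # Everyone who chose the same rating is tied, so each rating gets the
    # average (mid) rank of the block of rank positions it occupies.
    midranks = []
    ranks_used = 0
    for t in category_totals:
        midranks.append(ranks_used + (t + 1) / 2)
        ranks_used += t

    h = 0.0
    mean_ranks = {}
    for name, group in counts.items():
        n_group = sum(group)
        rank_sum = sum(c * r for c, r in zip(group, midranks))
        mean_ranks[name] = rank_sum / n_group
        h += rank_sum ** 2 / n_group
    h = 12 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)

    # Standard correction for ties.
    tie_term = sum(t ** 3 - t for t in category_totals)
    h /= 1 - tie_term / (n_total ** 3 - n_total)
    return h, len(counts) - 1, mean_ranks


h_stat, df, mean_ranks = kruskal_wallis_from_counts(counts)
# With df = 2 the chi-square upper-tail probability is exactly exp(-H / 2).
p_value = math.exp(-h_stat / 2)
print(f"H({df}) = {h_stat:.1f}, p = {p_value:.4f}")
for name, rank in mean_ranks.items():
    print(f"{name}: mean rank = {rank:.1f}")
```

Working from the frequency table rather than raw responses is possible here because a rank-based test only needs to know how many students in each group chose each rating.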
Table 5. Distribution of R importance ratings at the end of the course overall and broken down by programming experience, gender, and race/ethnicity.
Post-course R importance ratings (t3): 1 = Not at all important, 2 = Somewhat important, 3 = Moderately important, 4 = Very important, 5 = Extremely important.
Group                            1          2      3      4      5           n
All students                     14 (2%)    42     121    249    241 (36%)   667
Programming experience
  None                           7 (2%)     25     71     114    119 (35%)   336
  Some                           7 (2%)     17     50     134    122 (37%)   330
Gender
  Female                         11 (2%)    28     83     179    186 (38%)   487
  Male                           1 (<1%)    11     33     62     49 (31%)    156
  Nonbinary/other                0 (0%)     1      0      2      3 (50%)     6
  Did not disclose               2 (11%)    2      5      6      3 (17%)     18
Race/ethnicity
  Asian                          3 (1%)     11     44     92     98 (40%)    248
  Latino                         2 (2%)     9      22     43     48 (39%)    124
  White                          4 (2%)     12     35     78     54 (30%)    183
  Multi-racial/other             1 (3%)     3      6      9      16 (46%)    35
  Middle Eastern/North African   1 (3%)     4      4      9      13 (42%)    31
  Black/African American         1 (4%)     1      5      9      9 (36%)     25
  Did not disclose               2 (11%)    2      5      7      3 (16%)     19
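The Mann-Whitney U comparisons of importance ratings can likewise be sketched from the counts in Table 5. This is an illustration under stated assumptions, not the authors' procedure: it uses the normal approximation with a tie correction and no continuity correction, compares female with male students and students with some versus no programming experience, and uses hypothetical function names. Under these assumptions the two Z values come out close to the reported 1.4 and 1.5.

```python
import math

# Counts of importance ratings 1..5 copied from Table 5.
female = [11, 28, 83, 179, 186]
male = [1, 11, 33, 62, 49]
no_experience = [7, 25, 71, 114, 119]
some_experience = [7, 17, 50, 134, 122]


def mann_whitney_z(group_a, group_b):
    """Normal-approximation Z for a Mann-Whitney U test on binned ordinal counts.

    A positive Z means group_a tends to give higher ratings than group_b.
    """
    totals = [a + b for a, b in zip(group_a, group_b)]
    n_a, n_b = sum(group_a), sum(group_b)
    n = n_a + n_b

    # Tied responses share the average rank of the positions they occupy.
    midranks = []
    ranks_used = 0
    for t in totals:
        midranks.append(ranks_used + (t + 1) / 2)
        ranks_used += t

    rank_sum_a = sum(c * r for c, r in zip(group_a, midranks))
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    # Variance of U with the usual correction for ties.
    tie_term = sum(t ** 3 - t for t in totals)
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (u_a - mean_u) / math.sqrt(var_u)


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))


z_gender = mann_whitney_z(female, male)
z_experience = mann_whitney_z(some_experience, no_experience)
print(f"Gender: Z = {z_gender:.1f}, p = {two_sided_p(z_gender):.1f}")
print(f"Experience: Z = {z_experience:.1f}, p = {two_sided_p(z_experience):.1f}")
```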
beliefs (e.g., success expectancies, perceptions of value and cost, and beliefs about memorization) and engagement with course materials.

Characteristics of the Two Subgroups. There were no discernible differences in the two subgroups of students at the start of the course. Students in the two subgroups showed similar distributions of prior programming experience, χ²(1) = 0.88, p = 0.3, gender, χ²(1) = 0.35, p = 0.6, race/ethnicity, χ²(2) = 4.1, p = 0.1, and grade point average, χ²(4) = 8.9, p = 0.06. The groups also did not differ in terms of the proportion of students who mentioned R as a concern at the beginning of the course, χ²(1) = 0.69, p = 0.4. Raw data for the socio-demographic characteristics of students in the neg-neg and neg-pos groups are presented in Table 6.

Belief Trajectories. We used repeated measures ANOVA to compare changes in expectancy and value beliefs over time for the two subgroups of students (neg-neg vs. neg-pos). As Figure 3 shows, there was an interaction between subgroup and time for perceived course value, F(2, 948) = 31.5, p < 0.0001. Though both groups valued the course content at the beginning of the course, the perceived value of the course declined more over time in the neg-neg group than in the neg-pos group. Paired sample t-tests and Cohen’s d calculations revealed that, although both groups decreased their perceived value from timepoint 1 to timepoint 2, students in the neg-neg group, t(180) = 9.0, p < 0.001, d = 0.71, decreased their perceived value more than students in the neg-pos group, t(277) = 3.4, p = 0.0009, d = 0.23. Students in the neg-neg group also showed a decrease in mean value from timepoint 2 to timepoint 3, t(180) = 2.3, p = 0.02, d = 0.16, whereas students in the neg-pos group did not, t(277) = 0.28, p = 0.8, d = 0.02.
A repeated measures ANOVA revealed an interaction between subgroup and time for expectancy, F(1, 465) = 20.9, p < 0.0001. As Figure 4 shows, both groups of students developed lower expectations for success from beginning to end of the course. However, the neg-neg group showed a greater decrease in their expectations from timepoint 1 to timepoint 2, t(179) = 7.8, p < 0.0001, d = 0.72, than did the neg-pos group, t(286) = 2.13, p = 0.03, d = 0.16. A one-way ANOVA
                                 Subgroup
                                 Neg-pos      Neg-neg
Total students                   297          196
Programming experience
  None                           160          114
  Some                           137 (46%)    82 (42%)
Gender
  Female                         233 (79%)    145 (74%)
  Male                           56           40
  Nonbinary/other                4            2
  Did not disclose               4            9
Race/ethnicity
  Asian                          122 (41%)    67 (34%)
  Latino                         54 (18%)     47 (24%)
  White                          76 (26%)     41 (21%)
  Multi-racial/other             21 (6%)      11 (6%)
  Middle Eastern/North African   10 (3%)      12 (6%)
  Black/African American         9 (3%)       9 (5%)
  Did not disclose               5 (2%)       9 (5%)
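The chi-square comparisons of the two subgroups can be checked against the counts in the table above. The sketch below is illustrative and rests on assumptions not stated in the text: a Pearson chi-square without continuity correction, gender restricted to female and male students, and race/ethnicity restricted to Asian, Latino, and White students (which matches the reported degrees of freedom). Function names are hypothetical. Under these assumptions the statistics are approximately 0.88, 0.35, and 4.1, in line with the reported values.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if subgroup membership were independent of the row category.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_p(stat, df):
    """Upper-tail p-value using the closed forms available for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Rows are categories; columns are (neg-pos, neg-neg) counts from the table.
tables = {
    "programming experience": [[160, 114], [137, 82]],
    "gender": [[233, 145], [56, 40]],
    "race/ethnicity": [[122, 67], [54, 47], [76, 41]],
}

results = {}
for label, table in tables.items():
    stat, df = pearson_chi_square(table)
    results[label] = (stat, df, chi_square_p(stat, df))
    print(f"{label}: chi-square({df}) = {stat:.2f}, p = {results[label][2]:.2f}")
```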
Figure 4. Expectations for success at the beginning of the course (t1) and after chapter 8 (t2) for students in the neg-neg and neg-pos subgroups.

Figure 5. Performance on end-of-chapter review questions for students in the neg-neg and neg-pos groups.
Note: Performance was measured as the proportion of end-of-chapter review questions answered correctly. A z-score was calculated for each student as a measure of average performance on the review questions in each chapter relative to other students. The graph shows estimates and 95% confidence intervals for each group by chapter.

Figure 6. Proportion of R exercises answered correctly on the first attempt for each chapter for students in the neg-neg and neg-pos subgroups.
Note: Performance was measured as the proportion of R activities that were submitted correctly on the first attempt. A z-score was calculated for each student as a measure of average performance on all R activities in each chapter relative to other students. The graph shows estimates and 95% confidence intervals for each group by chapter.

Figure 7. Attempts to arrive at the correct answer on incorrect R activities for each chapter for students in the neg-neg and neg-pos subgroups.
Note: Average attempts to arrive at the correct answer for each chapter for each student were calculated by dividing the total number of attempts on R exercises in each chapter by the total number of R exercises in that chapter the student answered incorrectly on the first attempt. A z-score was calculated for each student as a measure of average attempts to correct on R activities in each chapter relative to other students. The graph shows estimates and 95% confidence intervals for each group by chapter.

Figure 8. Word count for end-of-chapter summaries for students in the neg-neg and neg-pos subgroups.
Note: Word count was calculated as the total number of words students used in their end-of-chapter summaries. A z-score was calculated for each student as a measure of summary length for each chapter relative to other students. The graph shows estimates and 95% confidence intervals for each group by chapter.
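The engagement measures described in the figure notes share one recipe: compute a per-student value for a chapter, then standardize it against the other students in that chapter. A minimal sketch for the attempts measure follows. The data, student identifiers, and function names are hypothetical, and two choices are assumptions: students with no incorrect first attempts are skipped (the ratio is undefined for them), and the sample standard deviation is used for standardizing.

```python
from statistics import mean, stdev

# Hypothetical log for one chapter: student id -> (total attempts on R exercises,
# number of R exercises answered incorrectly on the first attempt).
chapter_log = {
    "s01": (12, 4),
    "s02": (9, 3),
    "s03": (20, 5),
    "s04": (6, 3),
    "s05": (15, 3),
    "s06": (0, 0),  # no incorrect first attempts, so no ratio is defined
}


def attempts_per_incorrect(log):
    """Total attempts divided by the number of exercises missed on the first try."""
    return {
        student: attempts / incorrect
        for student, (attempts, incorrect) in log.items()
        if incorrect > 0
    }


def z_scores(values):
    """Standardize each student's value relative to the other students."""
    centre = mean(values.values())
    spread = stdev(values.values())
    return {student: (v - centre) / spread for student, v in values.items()}


ratios = attempts_per_incorrect(chapter_log)
standardized = z_scores(ratios)
for student, z in standardized.items():
    print(f"{student}: attempts per incorrect = {ratios[student]:.2f}, z = {z:+.2f}")
```

Standardizing within each chapter puts chapters with different numbers of exercises on a common scale, which is what allows the group trajectories to be compared across chapters.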
6. Discussion
The goal of this study was to explore whether integration of R coding within an interactive online textbook can reduce the perceived cost of learning programming for students while increasing its value and relevance to statistical understanding and practice. Specifically, we were interested in whether integrating R programming in a way that facilitates student understanding of statistics would lead psychology students taking an introductory statistics course to develop more positive attitudes toward R and improve their motivation to learn statistics.
In line with previous research (e.g., Anderson et al. 2008; Baser 2013), our results show that, in the beginning, some students tend to hold negative attitudes toward programming. Yet, we found that most students (84%) ended the course either positively disposed or neutral toward R. This corroborates past evidence suggesting that, although students may initially be more anxious about computing, their anxiety greatly abates with practice (Du, Wimmer, and Rada 2016; Rode and Ringel 2019).
Even more promising, we found that students, in general, developed more positive attitudes toward R over time, and this pattern appeared similar for students of different genders, racial/ethnic backgrounds, and levels of prior experience. In fact, students who were the most concerned about learning programming at the beginning of the course showed the greatest increase in their attitudes toward programming after engaging with course materials. These findings suggest, in line with our hypothesis and findings from other successful projects like MOSAIC, that when students are introduced to programming languages like R in a way that supports understanding, they can develop more positive attitudes toward programming. Similar results were found by Charters et al. (2014) in a population of adults learning computer programming for the first time. Our study extends these findings to a new population: undergraduate students learning computer programming in the context of an introductory course in statistics.
Although the disparities in R attitudes between students from different backgrounds became less pronounced over time, there were still detectable differences in students’ R confidence and perceptions of the importance of R for their learning. For example, male students and White students felt more confident in their ability to use R to analyze a new dataset than female students and Asian and Latino students. Thus, although providing students with opportunities to work with R to promote understanding may help to narrow discrepancies between demographic groups, it does not completely erase them. This finding is similar to other findings showing that female students tend to have lower confidence in STEM domains than male students, despite equal levels of preparation and previous performance (Ellis, Fosdick, and Rasmussen 2016).
While most students who entered the course with negative attitudes toward R developed positive attitudes toward R by the end of the course, some students maintained their negative attitudes throughout the course. Students’ difficulties learning programming are well documented in the literature (e.g., Qian and Lehman 2017), but little is known about what differentiates students who come to enjoy programming from those who do not. Most studies have focused on the relationship between initial attitudes and learning. For example, Tai (2003) found that students who held positive attitudes toward computer-assisted learning environments demonstrated greater learning of computer programming. However, there is considerable value in studying, at a more fine-grained level over time, the experiences of students who do not go on to form positive identities.
Our findings provide some preliminary insights into students’ experiences and belief trajectories over time. Students who began the class with negative attitudes toward R but later reported positive attitudes (neg-pos) and students who never adopted positive attitudes (neg-neg) did not differ in terms of demographics or initial course perceptions and expectations. However, partway through the course, students in the neg-neg group valued the course less, had lower expectations for success, and believed that the course was more costly. What might drive these differences?
Some clues as to why students had such different experiences can be found in their approaches to learning: the neg-neg group believed (both at the beginning and end of the course) that learning would require a lot of memorization. They generally performed worse on the R coding exercises and end-of-chapter review activities. Past research has shown that students who conceive of learning as memorization hold an interconnected pattern of beliefs that can be maladaptive to learning (Säljö 1979; Van Rossum, Deijkers, and Hamer 1985). For example, students who view learning as memorization often tend to view their ability to learn as fixed, stable, and unchanging (Chan 2008) and are more likely to believe that learning is about testing, calculation, and practice than understanding and connecting concepts (Liang and Tsai 2010). Students who conceive of learning as memorization may also adopt a “surface level” approach to learning, be less interested in the course material, and show lower self-efficacy (Tsai et al. 2011). Our results show partial support for a relation between beliefs and performance, as students in the neg-neg group performed worse on the end-of-chapter review activities as well as the R programming activities, suggesting that they may have a different approach to the course material that leads to lower performance in general.
Further, we found that although the neg-neg students perceived the course to be more time-consuming than the neg-pos students, their patterns of engagement with course materials suggest they spent equivalent amounts of time on the course, and they did not spend more time on R exercise attempts, nor did they write more on open-response questions. In fact, students who adhered to their negative attitudes tended to make fewer attempts and use fewer words than students who developed more positive attitudes. This suggests that although, objectively, both groups of students put roughly equal effort into the course, with neg-pos students sometimes showing greater effort, students who had negative attitudes toward R at the end of the course subjectively felt the course was more time-consuming.
These results, while exploratory, raise questions about the role of students’ experiences and behavior in attitude formation. Past studies have found experience factors to be only moderately correlated with students’ attitudes (e.g., Garland and Noyes 2004). Our results indicate that experiences and behaviors during learning (specifically, students’ performance and perceptions of how valuable and costly the course is) may differentiate students who develop more positive attitudes from those who maintain initial negative attitudes.

Strengths, Limitations, and Areas for Future Research. This exploratory study contributes to the growing body of knowledge of how computational tools can be integrated to improve statistics teaching and learning. A strength of this study is that we were able to longitudinally track attitudes and link these beliefs to demographic and performance data. As a proof of concept, our results were promising. Students reported positive experiences using R during our course, and this pattern was similar for students across varying levels of prior experience. One reason for the success may be the way we integrated R in the course. For example, in addition to designing our integration to reduce the cognitive load of learning R and making the experience of using R more valuable to students, we implemented additional supports such as a help desk that was embedded directly into the online textbook, a glossary of R functions, and a cheat sheet that students could download and use as a reference. The effect of these approaches warrants more exploration in the future.
This study was limited in a number of ways. First, we measured students’ attitudes toward R using a single rating item at the beginning and end of the course. However, students’
attitudes toward R are likely more complex and may vary substantially at different points in the course. As our results show, students’ attitudes toward learning R may also differ from their confidence and from how important they perceived R to be for their learning. Future studies should investigate the different dimensions of students’ R attitudes and how they change over time throughout the course in relation to contextual factors such as overall course performance or the difficulty of the material. It may also be helpful to include more qualitative measures of students’ R attitudes to develop a richer and more complete understanding of how students are feeling about learning R.
Second, our investigations of the relationship between change in students’ attitudes toward R and their motivational beliefs and experiences throughout the course were mostly exploratory. More research is needed to understand the relationship and direction between these variables. Future studies could investigate whether interventions that target students’ motivational beliefs or early experiences using R might lead those students who maintained negative attitudes toward R to develop more positive attitudes by the end of the course. For example, past research has shown that feedback is important for learning programming. Perhaps changing the nature or content of the feedback students receive when they answer a question incorrectly may buffer against negative experiences when students make mistakes learning R. Similarly, adding additional scaffolds to make learning R less costly or identifying and reaching out to students who struggle with R early in the course may improve their subsequent learning experiences.
Other important limitations of the present study include the small sample sizes in some socio-demographic subgroups (e.g., African American/Black, Middle Eastern/North African, Multi-racial/other race/ethnic identity, and nonbinary people) and the treatment of demographic subgroups in our survey. To better capture the diversity of students’ experiences, we are collecting data from a more diverse sample of students at the university, community college, and high school levels. Future studies will include expanded formal statistical analyses that include important, yet traditionally excluded, groups whose experiences we were not able to fully capture in this study. Additionally, we have updated our demographic survey items to better capture the diversity of student backgrounds. For example, we included additional questions to take into account the vastly different backgrounds of Asian American students and to differentiate between Asian American students and international students from Asia.
Finally, an important unanswered question concerns how learning R might benefit students’ learning of statistics concepts. Though we did not address the relationship between learning R and students’ statistics understanding in this study, we encourage future research in this area. For example, it would be interesting to know what types of experiences using R are useful for developing understanding and transfer of statistics concepts, and whether, once they have learned how to use R to explore data, students apply those strategies to understand and explore datasets they encounter in the future.

Funding
This project has been made possible in part by a grant from the Chan Zuckerberg Foundation, and a Learning Lab from the Governor’s Office of Planning and Research, California (OPR18115).

References
Anderson, N., Lankshear, C., Timms, C., and Courtney, L. (2008), “‘Because it’s Boring, Irrelevant and I Don’t Like Computers’: Why High School Girls Avoid Professionally-Oriented ICT Subjects,” Computers & Education, 50, 1304–1318. https://doi.org/10.1016/j.compedu.2006.12.003.
ASA GAISE College working group, Carver, R., Everson, M., Gabrosek, J., Rowell, G. H., Horton, N. J., Lock, R., Mocko, M., Rossman, A., Velleman, P., Witmer, J., and Wood, B. (2016), “Guidelines for Assessment and Instruction in Statistics Education: College report, 2016,” Available at https://www.amstat.org/asa/education/Guidelines-for-Assessment-and-Instruction-in-Statistics-Education-Reports.aspx.
Barron, K. E., and Hulleman, C. S. (2015), “Expectancy-Value-Cost Model of Motivation,” in International Encyclopedia of the Social & Behavioral Sciences, ed. J. D. Wright, pp. 503–509, Amsterdam: Elsevier. https://doi.org/10.1016/b978-0-08-097086-8.26099-6.
Baser, M. (2013), “Attitude, Gender, and Achievement in Computer Programming,” Middle East Journal of Scientific Research, 14, 248–255. https://doi.org/10.5829/idosi.mejsr.2013.14.2.2007.
Baumer, B., Çetinkaya-Rundel, M., Bray, A., Loi, L., and Horton, N. J. (2014), “R Markdown: Integrating a Reproducible Analysis Tool into Introductory Statistics,” Technology Innovations in Statistics Education, 8, 1–30.
Biehler, R. (1997), “Software for Learning and for Doing Statistics,” International Statistical Review, 65, 167–189. https://doi.org/10.1111/j.1751-5823.1997.tb00399.x.
Çetinkaya-Rundel, M., and Rundel, C. (2018), “Infrastructure and Tools for Teaching Computing Throughout the Statistical Curriculum,” The American Statistician, 72, 58–65. https://doi.org/10.1080/00031305.2017.1397549.
Chan, K. (2008), “Hong Kong Teacher Education Students’ Epistemological Beliefs and their Relations with Conceptions of Learning and Learning Strategies,” The Asia Pacific Education Researcher, 16, 36–50. https://doi.org/10.3860/taper.v16i2.265.
Charles, M., and Thébaud, S. (2018), Gender and STEM: Understanding Segregation in Science, Technology, Engineering, and Mathematics. Basel: MDPI.
Charters, P., Lee, M. J., Ko, A. J., and Loksa, D. (2014), “Challenging Stereotypes and Changing Attitudes,” in SIGCSE Proceedings. https://doi.org/10.1145/2538862.2538938.
Cheryan, S., Master, A., and Meltzoff, A. N. (2015), “Cultural Stereotypes as Gatekeepers: Increasing Girls’ Interest in Computer Science and Engineering by Diversifying Stereotypes,” Frontiers in Psychology, 6, 49–49. https://doi.org/10.3389/fpsyg.2015.00049.
Christensen, R. (2018), “ordinal – Regression Models for Ordinal Data”: R package version 2018.8-25. Available at http://www.cran.r-project.org/package=ordinal/.
Cobb, G. (2007a), “One Possible Frame for Thinking about Experiential Learning,” International Statistical Review, 75, 336–347. https://doi.org/10.1111/j.1751-5823.2007.00034.x.
Cobb, G. (2007b), “The Introductory Statistics Course: A Ptolemaic Curriculum?,” Technology Innovations in Statistics Education, 1, 1–17. https://escholarship.org/content/qt6hb3k0nz/qt6hb3k0nz.pdf.
Cobb, G. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69, 266–282. https://doi.org/10.1080/00031305.2015.1093029.
Cobb, P., Gravemeijer, K. P., Bowers, J., and Doorman, M. (1997), Statistical Minitools [Applets and Applications], Nashville and Utrecht: Vanderbilt University and Freudenthal Institute – Utrecht University.
Dalgaard, P. (2008), Introductory Statistics with R. New York: Springer.
Dede, C., Richards, J., and Saxberg, B. (2019), Learning Engineering for Online Education: Theoretical Contexts and Design-Based Examples, New York: Routledge.
Du, J., Wimmer, H., and Rada, R. (2016), “‘Hour of Code’: Can it Change Students’ Attitudes Toward Programming?” Journal of Information Technology Education: Innovations in Practice, 15, 053–073. http://doi.org/10.28945/3421.
Ellis, J., Fosdick, B. K., and Rasmussen, C. (2016), “Women 1.5 Times More Likely to Leave STEM Pipeline after Calculus than Men: Lack of Mathematical Confidence a Potential Culprit,” PloS One, 11, e0157447. https://doi.org/10.1371/journal.pone.0157447.
Field, A., Miles, J., and Field, Z. (2012), Discovering Statistics Using R, London: Sage Publications Ltd.
Finzer, W. (2001), Fathom Dynamic Statistics [Computer Software], Emeryville, CA: Key Curriculum Press.
Fries, L., Son, J. Y., Givvin, K. B., and Stigler, J. W. (2020), “Practicing Connections: A Framework to Guide Instructional Design for Developing Understanding in Complex Domains,” Educational Psychology Review, 33, 739–762. https://doi.org/10.1007/s10648-020-09561-x.
Gal, L., and Ginsburg, L. (1994), “The Role of Beliefs and Attitudes in Learning Statistics: Towards an Assessment Framework,” Journal of Statistics Education, 2, 1–16. https://doi.org/10.1080/10691898.1994.11910471.
Gallup (2016), Trends in the State of Computer Science in U.S. K-12 Schools. Gallup.
Garland, K. J., and Noyes, J. M. (2004), “Computer Experience: A Poor Predictor of Computer Attitudes,” Computers in Human Behavior, 20, 823–840. https://doi.org/10.1016/j.chb.2003.11.010.
Google and Gallup. (2015), Images of Computer Science: Perceptions Among Students, Parents, and Educators in the US.
Gould, R., Wong, R., and Ryan, C. N. (2017), Essential Statistics: Exploring the World through Data. Boston, MA: Pearson.
Hulleman, C. S., and Harackiewicz, J. M. (2009), “Promoting Interest and Performance in High School Science Classes,” Science, 326, 1410–1412. https://doi.org/10.1126/science.1177067.
Hulleman, C. S., Kosovich, J. J., Barron, K. E., and Daniel, D. B. (2017), “Making Connections: Replicating and Extending the Utility Value Intervention in the Classroom,” Journal of Educational Psychology, 109, 387–404. https://doi.org/10.1037/edu0000146.
Konold, C., and Miller, C. (2004), Tinkerplots [Computer Software], Amherst, MA: University of Massachusetts Amherst.
Kosovich, J. J., Hulleman, C. S., Barron, K. E., and Getty, S. (2015), “A Practical Measure of Student Motivation: Establishing Validity Evidence for the Expectancy-Value-Cost Scale in Middle School,” The Journal of Early Adolescence, 35, 790–816. https://doi.org/10.1177/0272431614556890.
Liang, J., and Tsai, C. (2010), “Relational Analysis of College Science-Major Students’ Epistemological Beliefs toward Science and Conceptions of Learning Science,” International Journal of Science Education, 32, 2273–2289. https://doi.org/10.1080/09500690903397796.
Lieberman, M. (2018), “Learning Engineers Inch Toward the Spotlight.” Inside Higher Ed. Available at https://www.insidehighered.com/digital-learning/article/2018/09/26/learning-engineers-pose-challenges-and-opportunities-improving.
Lock, R. H., Lock, P. H., Morgan, K. L., Lock, E. F., and Lock, D. F. (2021), Statistics: Unlocking the Power of Data, Hoboken, NJ: Wiley.
Mascaró, M., Sacristán, A. I., and Rufino, M. M. (2016), “For the Love of Statistics: Appreciating and Learning to Apply Experimental Analysis and Statistics through Computer Programming Activities,” Teaching Mathematics and Its Applications, 35, 74–87. https://doi.org/10.1093/teamat/hrw006.
McCulloch, R. S. (2017), “Learning Outcomes in a Laboratory Environment vs. Classroom for Statistics Instruction: An Alternative Approach Using Statistical Software,” International Journal of Higher Education, 6, 131–142. https://doi.org/10.5430/ijhe.v6n5p131.
Moore, D. S. (1992), “Teaching Statistics as a Respectable Subject.” In
Qian, Y., and Lehman, J. (2017), “Students’ Misconceptions and Other Difficulties in Introductory Programming: A Literature Review,” ACM Transactions on Computing Education, 18, 1–24. https://doi.org/10.1145/3077618.
R Core Team (2019), “R: A Language and Environment for Statistical Computing.” R Foundation for Statistical Computing, Vienna, Austria. Available at http://www.R-project.org/.
Reid, A., and Petocz, P. (2002), “Students’ Conceptions of Statistics: A Phenomenographic Study,” Journal of Statistics Education, 10, 1–18.
Riegle-Crumb, C., and King, B. (2010), “Questioning a White Male Advantage in STEM: Examining Disparities in College Major by Gender and Race/Ethnicity,” Educational Researcher, 39, 656–664. https://doi.org/10.3102/0013189x10391657.
Rode, J. B., and Ringel, M. M. (2019), “Statistical Software Output in the Classroom: A Comparison of R and SPSS,” Teaching of Psychology, 46, 319–327. https://doi.org/10.1177/0098628319872605.
rOpenSci (2018), “rOpenSci Educators Collaborative: What are the challenges when teaching science with R?” R Bloggers. Available at https://www.r-bloggers.com/2018/07/ropensci-educators-collaborative-what-are-the-challenges-when-teaching-science-with-r/.
Rossman, A., and Chance, B. (2014), “Using Simulation-based Inference for Learning Introductory Statistics,” Wiley Interdisciplinary Reviews: Computational Statistics, 6, 211–221.
Säljö, R. (1979), “Learning about Learning,” Higher Education, 8, 443–451. https://doi.org/10.1007/bf01680533.
Scott, M. J., and Ghinea, G. (2014), “On the Domain-Specificity of Mindsets: The Relationship between Aptitude Beliefs and Programming Practice,” IEEE Transactions on Education, 57, 169–174. https://doi.org/10.1109/te.2013.2288700.
Siek, K. A., Connelly, K., Stephano, A., Menzel, S., Bauer, J., and Plale, B. (2006), “Breaking the Geek Myth: Addressing Young Women’s Misperceptions about Technology Careers,” Learning & Leading with Technology, 33, 19–22.
Son, J. Y., Blake, A. B., Fries, L., and Stigler, J. W. (2021), “Modeling First: Applying Learning Science to the Teaching of Introductory Statistics,” Journal of Statistics and Data Science Education, 29, 4–18. https://doi.org/10.1080/10691898.2020.1844106.
Stigler, J. W., Son, J. Y., Givvin, K. B., Blake, A. B., Fries, L., Shaw, S. T., and Tucker, M. C. (2020), “The Better Book Approach for Education Research and Development,” Teachers College Record: The Voice of Scholarship in Education, 122, 1–32.
Sweller, J. (1988), “Cognitive Load during Problem Solving,” Cognitive Science, 12, 257–285.
Sweller, J. (1999), Instructional Design in Technical Areas, Melbourne: ACER Press.
Sweller, J. (2018), “Instructional Design,” in Encyclopedia of Evolutionary Psychological Science, eds. T. K. Shackelford and V. A. Weekes-Shackelford, pp. 1–5, Cham: Springer. https://doi.org/10.1007/978-3-319-16999-6_2438-1.
Tai, D. Y. (2003), “A Study on the Effects of Spatial Ability in Promoting the Logical Thinking Abilities of Students with Regard to Programming Language,” World Transactions on Engineering and Technology Educa-
Statistics for the Twenty-First Century, eds. F. Gordon and S. Gordon, tion, 2, 251–254.
pp. 14–25, Washington, DC: Mathematical Association of America. Tek, F. B., Benli, K. S., and Deveci, E. (2018), “Implicit Theories and Self-
(1997), “New Pedagogy and New Content: The Case of Statistics,” Efficacy in an Introductory Programming Course,” IEEE Transactions on
International Statistical Review, 65, 123–165. Education, 61, 218–225. https://doi.org/10.1109/te.2017.2789183.
Navarro, D. (2020), “Learning Statistics with R.” Available at https:// Thille, C. (2018), Bridging Learning Research and Teaching Practice
learningstatisticswithr.com. for the Public Good: The Learning Engineer, New York: TIAA
Nolan, D., and Temple Lang, D. (2007), “Dynamic, Interactive Documents Institute.
for Teaching Statistical Practice,” International Statistical Review, 75, Tsai, C., Jessie Ho, H. N., Liang, J., and Lin, H. (2011), “Scientific Epistemic
295–321. Beliefs, Conceptions of Learning Science and Self-efficacy of Learning
(2010), “Computing in the Statistics Curricula,” The American Science Among High School Students,” Learning and Instruction, 21,
Statistician, 64, 97–107. 757–769. https://doi.org/10.1016/j.learninstruc.2011.05.002.
(2015), Data Science in R, Boca Raton, FL: Chapman and Hall/CRC. Van Rossum, E. J., Deijkers, R., and Hamer, R. (1985), “Students’ Learn-
Paas, F., Renkl, A., and Sweller, J. (2004), “Cognitive Load Theory: Instruc- ing Conceptions and their Interpretation of Significant Educational
tional Implications of the Interaction between Information Structures Concepts,” Higher Education, 14, 617–641. https://doi.org/10.1007/
and Cognitive Architecture,” Instructional Science, 32, 1–8. http://www. bf00136501.
jstor.org/stable/41953634. Ward, B. W. (2013), “What’s Better—R, SAS®, SPSS®, or Stata®? Thoughts
Pruim, R., Kaplan, D. T., and Horton, N. J. (2017), “The Mosaic Package: for Instructors of Statistics and Research Methods Courses,” Journal
Helping Students to ’Think with Data’ Using R,” The R Journal, 9, 77– of Applied Social Science, 7, 115–120. https://doi.org/10.1177/193672
102. 441245057.