Teaching Machine Learning in Elementary School
Article history: Received 30 December 2020; Received in revised form 9 September 2021; Accepted 13 September 2021; Available online 23 September 2021

Keywords: Computational thinking; Constructionism; Artificial intelligence; Machine learning; Elementary school; Programming

Abstract: The emergence and ubiquity of Artificial Intelligence in the form of Machine Learning (ML) systems have revolutionized daily life. However, scant if any attention has been paid to ML in computing education, which continues to teach rule-based programming. A new, promising research field in education consists of acquainting children with ML to foster this much-needed shift from traditional rule-driven thinking to ML-based data-driven thinking. This article presents the development of computational thinking competencies in 12-year-old students who participated in a learning-by-design or a learning-by-teaching ML course. The results, based on a qualitative and quantitative evaluation of the students' achievements, indicate that they demonstrated computational thinking competencies at various levels. The learning-by-design group evidenced greater development in computational skills, whereas the learning-by-teaching group improved in terms of computational perspective. These findings are discussed with respect to promoting children's problem-solving competencies within a constructionist approach to ML.

https://doi.org/10.1016/j.ijcci.2021.100415
© 2021 Elsevier B.V. All rights reserved.
1. Introduction

Societies and education are undergoing radical changes as a result of what is known as the digital turn (Levin, & Mamlok, 2021). These changes impact our daily lives, but also raise fundamental questions about our understanding of the world around us. Preparing the younger generation for life in the new digital world is of crucial interest to educators, and in particular the best ways to enable elementary school students to master emerging technological phenomena. Students can and should understand not only the basics of interactions with digital reality, but also its construction and ontological principles. The use of ML systems in education has grown exponentially in the last few years, mainly due to the increased availability of ML resources (Domingos, 2015; Druga, Vu, Likhith, & Qiu, 2019). The current study reports on a pilot experimental program to foster students' computational thinking (CT) skills as they develop through the implementation of ML systems in elementary school.

This paper employs several key concepts that are widely used in both the mass media and academic sources. Because these concepts are often ascribed different meanings, it is crucial to define them at the outset in terms of the context in which they are used in this article.

• Computational Thinking (CT) can be defined as the thought processes involved in formulating problems and their solutions in ways that a computer could also execute (Wing, 2006). The problems can be tackled by an information-processing agent, which can be a computer or a human. CT attempts to solve complex problems by implementing a three-step problem-solving process: (1) decomposing the problem into sub-problems, (2) solving the sub-problems, and then (3) combining the solutions to the sub-problems into an overall solution (Leiser, 1996). This process requires the use of competencies such as decomposition, pattern recognition, abstraction, and algorithm design. These are now widely accepted as comprising CT and form the basis of curricula that aim to support its learning and assess its development worldwide (Grover, 2017), and in particular in Israel (Ministry of Education Israel, 2021). They can be referred to as "traditional CT" or "automation CT" competencies because they are used in the cognitive effort of automating a solution to a problem. It is worth noting that the computing education community has yet to find a consensual definition of CT (Tedre, & Denning, 2016).

• Artificial Intelligence (AI) systems are computer systems that can perform tasks that would normally require human intelligence (McCarthy, Minsky, Rochester, & Shannon, 2006). AI systems are characterized by their ability to learn. The data generated by an AI learning process is used to make decisions akin to those of humans in similar situations. This is often framed by considering AI as the computer's ability to create new knowledge from data and use that knowledge to make human-level decisions (Rouhiainen, 2018).

• An Artificial Neural Network (ANN) can be viewed as an AI system that simulates the interaction of human neurons in the brain.
An ANN can be represented as a graph consisting of connected neurons as binary elements, in which outputs of elements are connected to inputs of others (Yegnanarayana, 2009).

• Machine Learning (ML) is one way to implement AI. The first personal computers were programmed to run predesigned algorithms. In contrast, thanks to ML, today's computers and their digital artifacts can accumulate experience and data and improve their functioning based on this data to become self-evolving AI systems (a toy contrast between the two paradigms is sketched after these definitions).

• Deep learning is a kind of ML. It is based on multilevel neural networks consisting of multiple layers of artificial neurons. The development of deep neural networks is considered one of the most important advances in digital technologies in recent years. These technologies have made it possible to come very close to achieving human-like visual perception on recognition tasks. This fact in itself is unprecedented and remarkable. The traditional concept of a computer's capabilities simply viewed the computer as the executor of a predesigned rigid algorithm. Modern digital devices equipped with deep learning capabilities have changed the traditional paradigm of computing significantly. Computers can now not only execute a program written into their memory in advance but also learn and develop from experience.
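To make the rule-driven versus data-driven distinction above concrete, the following minimal Python sketch contrasts a hand-written rule with a rule inferred from labeled examples. The temperature task, thresholds, and function names are illustrative assumptions, not material from the study.

```python
# Rule-driven: a human fixes the decision rule in advance.
def is_hot_rule_based(temperature_c):
    return temperature_c > 30  # the "30" was chosen by the programmer

# Data-driven: the decision boundary is derived from labeled examples.
def learn_threshold(examples):
    """examples: list of (temperature_c, is_hot) pairs."""
    hot = [t for t, is_hot in examples if is_hot]
    cold = [t for t, is_hot in examples if not is_hot]
    # Place the boundary midway between the warmest cold and coolest hot example.
    return (max(cold) + min(hot)) / 2

data = [(18, False), (24, False), (31, True), (35, True)]
threshold = learn_threshold(data)   # 27.5, inferred from the data

print(is_hot_rule_based(28))        # False: the hand-written rule says so
print(28 > threshold)               # True: the learned rule disagrees
```

With different training data the learned boundary moves, while the hand-written rule stays fixed; this is the shift from automating a solution to automating the construction of the solution.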
Both academia and the popular media have taken a growing interest in ML, its characteristics and its social impact. However, some crucial issues remain relatively unexplored although they are essential for understanding processes taking place in society, especially with respect to education. One of these neglected issues, which in fact prompted this study, is the change in the concept of computing itself. The fact that a computer with ML abilities no longer merely carries out the program stored in its memory has had major ramifications for the concept of CT and how it is taught. The second key insight motivating this study is that one of the most important contributions of ML has gone mostly unnoticed: it can supplement human intelligence (Domingos, 2015). Thus the ways in which ML can be integrated into education are both fascinating and of critical importance. Specifically, how has the emergence of ML affected current computing school curricula and what effects will it have on students' knowledge, skills, and perspectives? To answer these questions, we explored the feasibility and effectiveness of introducing ML in elementary school, not as a new stand-alone discipline, but as a way to expand and enrich existing fields, based on CT.

The idea to include ML education as part of CT learning has wide and justified support (Mariescu-Istodor, & Jormanainen, 2019). The current study constitutes a step towards achieving this goal. Below we describe ways to develop elementary school students' competencies to construct ML systems, inspired by the Constructionist approach. The choice of this pedagogical approach is related to the history of Constructionist ideas. The founders of Constructionism had a ML research background. Seymour Papert (Papert, 1980) and his followers considered that there is a profound interconnection between human learning and machine learning (Badie, 2016; Kahn, & Winters, 2020). Today this interrelationship continues to elicit considerable scientific and practical interest (Levin, & Tsybulsky, 2017).

In Shamir and Levin (2021) we presented an experimental ML course and its outcomes in terms of students' motivation to learn and to understand the fundamentals of ML. Results showed high engagement during constructionist learning and that the novel programmable learning environment, Single-Neuron, helped make machine learning understandable. Using the perceptron mechanism allowed students to create their own ML-based artifacts and explore a wonder of AI: how one artifact satisfies different purposes when trained with different datasets. This paper further develops the study and compares the CT gains of two different ML courses based on a CT-ML framework that we designed to study and evaluate elementary school students' CT development while constructing ML systems.

The remainder of the paper is organized as follows. The theoretical background is provided in Section 2. Section 3 deals with the methodology, research design, and procedure. The results are presented in Section 4 and the discussion in Section 5.

2. Background

Machine Learning (ML) has changed practically all the areas of our lives, from health care to politics to journalism. Whereas the Industrial Revolution automated manual work and the Information Revolution did the same for cognitive endeavors, ML has automated automation itself. When working in the traditional Automation paradigm, a software developer intends these systems to be internally controlled and self-regulated. In other words, humans who design programs set down rules for the system agents and their interrelations. During runtime, when the software executes, these agents implement the program with no further human intervention. However, this is not the case in the automation of automation paradigm. Rather, the system of agents changes the system's behavior autonomously in a way that its creator is not aware of and cannot predict. In such cases, once the program is executed, there is no human intervention, and the behavior of the agents changes with no outside intervention (Domingos, 2015).

One of the most exciting aspects of building ML systems is that it challenges the traditional computational thinking (CT) skill set. ML does not follow the three classical CT problem-solving steps mentioned above. In this sense, ML makes problem-solving different and can redefine CT's educational goals. If indeed ML is the automation of automation, what are the CT skills an individual needs to develop an "automation of automation" solution to a problem? How can these skills be imparted to elementary school students?

Here we harnessed the Constructionist approach to teach ML in elementary school. Constructionism is usually referred to as learning-by-design. It applies to all domains, and suggests that students learn best when creating and using external representations for modeling and reasoning (Blikstein, & Wilensky, 2009; Papert, & Harel, 1991). The Constructionist approach emerged from the famous and widely known Constructivist approach (Ackermann, 2001). Papert was influenced by Piagetian principles, and many of Jean Piaget's Constructivist ideas were adopted by Constructionism. However, Constructionism also enriched Constructivism with remarkable and important innovations. These innovations relate primarily to the emergence of digital technology in the 1980s and include the fact that learning-by-design involves a new kind of activity; namely, designing software artifacts. Papert realized that program/algorithm design, which includes software development, would be similar in many ways to teaching in the upcoming digital era. This realization was rooted in the nature of CT itself, which requires the learner to formulate rules for the behavior of an artifact while in the process of creating it. Since then, the Constructionist approach has gone beyond traditional learning-by-design to encompass the new component of learning-by-teaching, a computer metaphor (Vartiainen, Tedre, & Valtonen, 2020). As ML has gradually penetrated educational practices, this component of learning activity has become more prevalent. This can be seen in the transition to data-driven learning from the rule-driven learning that characterized traditional programming. This trend is also reflected in the structure of this study, which is based on observations of students in the two different courses developed for our research purposes. The first course involved designing an artificial neural network and was
3.1. The ML curriculum

To study students' acquisition and development of CT ML competencies, we developed four learning modules that actively engage students with ML using a variety of learning platforms. The modules were: (1) Introduction to ML, (2) Practicing the ML process, (3) Constructing a data-driven ML system, and (4) Constructing a rule-driven ML system.

The modules comprising the ML curriculum were administered in two different courses. The Learning ML by Design course consisted of modules 1, 2, and 4, and thus focused on all aspects of ML, including creating an Artificial Neural Network (ANN). The second course, Learning ML by Teaching, consisted of modules 1, 2, and 3, which emphasized training and validating an ML system but omitted the ANN construction.

The module activities were based on the Use-Modify-Create learning progression model, which facilitates learning using scaffolds. It consists of a three-stage progression for engaging in CT within computational environments. It is based on the premise that scaffolding increasingly deep interactions promotes the acquisition and development of CT (Lee et al., 2011). Each learning module focused on one progression level of the model, as illustrated in Fig. 3.

Fig. 3. A sequence diagram of the order of the modules for the two ML courses.
Module 1 — Introduction to ML

In this module, the students were involved in discovering how ML works and the ways in which it differs from rule-based programming. They were given traditional algorithm creation tasks to demonstrate the uniformity of their results in rule-driven computing. As a contrast, they were shown examples of ML mistakes. While doing so they were introduced to data-driven computing. The students were also given an opportunity to engage in a conversation with an AI chatbot and were encouraged to decide whether it passed the Turing test. This module used the Code.org platform for algorithm creation and the Mitsuku website for AI conversation. In this module the students' CT progression level was the initial 1 – 'use', and they did not program ML or train a machine.

Module 2 — Practicing the ML process

This module advanced students from ML users to modifiers, which is the next level in the CT learning progression model. The students carried out a set of training activities on an ML system that involved classifying images into two categories. The categories increased in complexity as the activities progressed. They required the students to train and validate the classifier model. The context was ecological systems, based on the 6th grade science curriculum in Israel. The ML platform used was "AI for Oceans" by the Code.org organization, which consists of a ready-made ML algorithm and a given ML training interface.
Module 3 — Constructing a data-driven ML system

In this module the students progressed from modifiers of pre-made ML systems to actual creators, which is the top level in the use-modify-create CT learning progression. They were required to train a ready-made ML machine and program the software that served as its client. The participants chose what the ML would classify; in other words, they decided which categories to form. Module 3 had more requirements than Module 2, which only had pre-made categories. The students also created the text-based dataset to train the machine. The client software programming was done using a dedicated programming interface to provide interactions with the pre-made server. Finally, the students activated their software, using a validation dataset of their own to evaluate the machine's performance with respect to their predictions. The module used the Machine Learning for Kids platform, which provides an interface for users to define labeled buckets and populate them with text. The module also uses the Scratch micro-world.

One student's project in Module 3 was a ML system that determines whether a person is a Minecraft fan or a Fortnite fan based on inputting a statement. The student selected the Fortnite and Minecraft categories and created the training datasets, which consisted of statements related to each category. The training set is shown in Fig. 4.

The next step in Module 3 was to use the IBM Watson engine to train the ML model, which results in a ready-to-use server. Once the server was ready, the students constructed client software that interacted with the model and validated its prediction accuracy. The client software was created with Scratch. The corresponding screenshot is presented in Fig. 5.

As shown in Fig. 5, the egg character instructs the user to type in a sentence that the ML system classifies. Once the user types in the sentence, the egg character outputs the classification the ML model returned and its accuracy probability. During in-class observations, the students who created the example in Fig. 5 explained they added the saxophone just for laughs. The IBM Watson engine does not support classification of text in Hebrew; hence Fig. 5 shows the student's project with text in English, including mistakes.
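For readers who want to reproduce the flow of Module 3 outside the Machine Learning for Kids / IBM Watson service, the sketch below approximates the same train-then-classify pipeline with scikit-learn. The library choice, the tiny dataset, and the test sentence are our illustrative assumptions; the students worked with the Watson text classifier through a Scratch client, not with this code.

```python
# Approximate analogue of the Module 3 pipeline: two labeled text buckets,
# a trained model, and a client that returns a label plus a confidence score.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical training statements in the spirit of the student's project.
statements = [
    "I built a castle out of blocks",      # Minecraft
    "Creepers blew up my house again",     # Minecraft
    "I got a victory royale yesterday",    # Fortnite
    "The storm circle is shrinking fast",  # Fortnite
]
labels = ["Minecraft", "Minecraft", "Fortnite", "Fortnite"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(statements, labels)

# The "client" step, which the students implemented in Scratch: take a typed
# sentence and report the predicted category with its probability.
sentence = "my house is built from blocks"
predicted = model.predict([sentence])[0]
confidence = model.predict_proba([sentence]).max()
print(predicted, round(confidence, 2))  # expected: Minecraft with p > 0.5
```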
Module 4 — Constructing a rule-driven ML system

In this module students moved from modifiers of ready-made ML systems, as was done in Module 2, to actual creators, but in a different way than in Module 3. In Module 4, they did not interact with a ready-made ANN, but instead constructed it themselves. The neural network algorithm that the students were asked to create was based on the logic gate activation function. To enable a Constructionist pedagogy, we used the Single-Neuron toolkit. As ML algorithms have become more complex, their reasoning and the rationales for their judgments have become less accessible and difficult to examine. This so-called "black box" problem has been exacerbated by recent developments in deep learning that make use of very complex multi-layered neural networks (Webb et al., 2019). For this reason, to enable students to understand the ML algorithm, we asked the students to create an ANN consisting of just one neuron.

Initially, they learned about logic gates through examples of an AND gate and an OR gate. Then they learned about truth tables. Next, they created an ANN to operate a simulation of a water system with two faucets that could be in only one of two modes, either open or shut. After programming it, the students tested it on a truth table of an AND gate, and then changed their inputs to an OR truth table and tested it again. They were encouraged to monitor the neural network's weights to better understand why/where the system was wrong in the initial training iterations and to improve its accuracy as the training process integrated additional inputs. At the end of the module the students enhanced their ML system with an additional programmed character that explained the system. This gave them time to reflect on the system's purpose and mechanisms.

One student's project was a single-neuron system that controlled the output of a water pipe flow according to the OR gate. The student set the truth table values to reflect the OR gate rules, and constructed the neuron model. The little green monster character in the system acted as the 'storyteller' who was programmed by the student to explain the system, as shown in Fig. 6.
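A single neuron of this kind can be written down in a few lines. The sketch below uses the classic textbook perceptron update rule to learn the AND and OR truth tables; it illustrates the weight-monitoring behavior described above, but the exact learning rule inside the Single-Neuron toolkit is an assumption on our part.

```python
# A minimal single-neuron (perceptron) sketch in the spirit of Module 4.
# The students built theirs in the Scratch-based Single-Neuron toolkit; this
# code only illustrates how the weights settle as training proceeds.

def train_neuron(truth_table, epochs=10, lr=0.1):
    w1 = w2 = bias = 0.0
    for _ in range(epochs):
        for x1, x2, target in truth_table:
            output = 1 if (w1 * x1 + w2 * x2 + bias) > 0 else 0
            error = target - output        # monitor this, as the students did
            w1 += lr * error * x1
            w2 += lr * error * x2
            bias += lr * error
    return w1, w2, bias

# AND gate: water flows only when both faucets are open.
AND = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
# OR gate: water flows when at least one faucet is open.
OR = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]

for name, table in [("AND", AND), ("OR", OR)]:
    w1, w2, b = train_neuron(table)
    outputs = [1 if (w1 * x1 + w2 * x2 + b) > 0 else 0 for x1, x2, _ in table]
    print(name, outputs)  # matches the target column of each truth table
```

Retraining the same neuron on a different truth table changes only the weights, which is exactly the "one artifact, different purposes when trained with different datasets" observation made earlier.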
3.2. Method and procedure

To examine the students' acquisition and development of CT ML, the two different ML courses were administered to elementary school students, and addressed the following research questions: In what ways are students' CT reflected in ML learning using the Constructionist approach? Do the 'Learning ML by Design' and 'Learning ML by Teaching' courses differ in effectiveness and if so, how?

This pilot study was conducted with two groups of students. Seven students participated in the Learning ML by Design course and eleven in the Learning ML by Teaching course. All the students were in 6th grade, were 12 years old, and volunteered for the 12-hour, teacher-guided, online ML course. Each participant and their parents gave their informed consent to participate in the study.

This study evaluated the growth of students' CT competencies as they experienced a novel Constructionist ML course for elementary school children during the 2020 academic year. A sequential explanatory mixed-method approach was used to evaluate the quantitative and qualitative data (Shamir & Levin, 2020). Data collection and analysis were based on evaluations. A detailed description of each instrument is provided below.

Prior to the course, the students' basic acquaintance with online conference tools and programming with MIT Scratch was assessed. In Israel many students learn computer science (CS), mainly Scratch programming, starting in 4th grade. Participants in this study who did not have a CS background had to complete self-learning tasks based on the CS curriculum of the Ministry of Education prior to the intervention course. Students took a test at the start of the course. Each course module was composed of a set of activities and a variety of test activities. A typical lesson lasted two hours, and students took a total of six lessons. After the course, a post-test was administered. Semi-structured interviews were also held with small groups and were transcribed for further analysis.

Students completed the evaluations individually, from their homes, using their computers during the online course.
The tests were made up of closed and open-ended questions on perspectives on ML, knowledge of ML processes, CT competencies, and interest in learning ML. The pretest consisted of 19 questions, and the posttest comprised 32 questions. The questions investigated enjoyment of course activities, self-efficacy with ML construction, and CT ML competencies in depth. The questions related to CT competencies were coded, since each question targeted a specific skill. The test items were written by the researchers specifically for this study. The qualitative validity ratings as evaluated by three expert judges indicated an Aiken V above 0.80 on all items.
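Aiken's V for an item rated by n judges on a c-point scale is V = Σ(r_i − lo) / (n(c − 1)), where r_i is each judge's rating and lo the lowest possible rating. The paper reports V above 0.80 with three judges; the 5-point scale in the sketch below is an assumption for illustration.

```python
def aikens_v(ratings, lo=1, hi=5):
    """Aiken's V: proportion of the maximum possible rating mass above the floor."""
    n = len(ratings)
    c = hi - lo + 1                     # number of rating categories
    return sum(r - lo for r in ratings) / (n * (c - 1))

print(aikens_v([5, 4, 5]))  # ~0.92: item passes the 0.80 criterion
print(aikens_v([3, 4, 3]))  # ~0.58: item would be revised or dropped
```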
To the best of our knowledge, ML learning self-efficacy has not been reported in the literature. For this reason, the questionnaire was based on a validated Constructionist robotics learning self-efficacy questionnaire (Tsai, Wang, Wu, & Hsiao, 2021), modified for ML construction. For example, instead of "I can make a robot", the item was phrased as "I can make a ML system". Other items were: "I can discuss how to make ML systems easily with peers" and "I can propose ideas for using ML to solve problems". The self-efficacy items were evaluated on a 5-point Likert scale, ranging from 1 (not confident at all) to 5 (very confident).

To examine students' motivation to learn, we implemented the ARCS Model of instructional design (Keller, 1987), which is based on empirical investigations of students' motivation. The model is composed of four conceptual categories that subsume many of the specific concepts and variables that characterize human motivation:

1. Attention — should be sustained during the learning process.
2. Relevance — why this material is important to me.
3. Confidence — can influence a student's persistence and accomplishment.
4. Satisfaction — makes people feel good about their accomplishments.

To study the Feature selection skill, pre- and post-proficiency questionnaires were used. The questionnaires asked students to identify features in a dataset, and to create a mind map of features for a given dataset.

Students were asked to create a final project, working alone or in pairs. Each of the two courses required a different type of project. In both, the students programmed an ML system, but in one they created their own ANN based on the Single-Neuron toolkit, while in the other they created an ML system based on the Machine Learning for Kids toolkit, as described above.

Semi-structured interviews were conducted in small groups to give participants ample opportunity to express their thoughts in detail. Interviews were 30 min long.

The interview protocol was organized into several major sections:
1. Breaking the ice
a. Why did you decide to sign up for this course?
b. How was the course for you? (Students were referred to the shared bulletin board used during the course to refresh their memory)
2. About ML
a. How would you describe ML to friends or family?
b. How can a ML system help people?
c. Had you had the opportunity to interact with any type of ML system, what would you like it to be able to classify?
3. Final course project
a. How did you get started making your project?
b. What happened when you got stuck?

During the course, an online shared bulletin board tool was used to post ideas and discuss them. This was used to capture students' thoughts and discuss them in the final interviews.
4. Results

The fact that the thinking corresponding to ML differs significantly from the more widely known computational thinking has attracted attention in the literature and has fueled numerous scientific debates (Denning & Tedre, 2021; Shapiro, Fiebrink, & Norvig, 2018). Therefore, the inclusion of ML in a CT course calls for answers to important questions about the relationship between ML and CT.

For this reason, Table 1 lists how ML CT practices relate to traditional CT practices. To date, there is scant CT ML research; thus, in this study we examined students' CT competency in terms of computational perspectives, machine training practices, and machine validation practices. This is the essence of the study. In addition, we used the qualitative methodology of grounded theory (Shapiro et al., 2018; Strauss, & Corbin, 1997) to characterize the effects of the ANN construction activity on the students.

The next section compares the students' CT gains relative to an ML course which did not involve the construction of an ANN (see Fig. 3 for a visual breakdown of the modules of the two courses). An ANN is considered difficult to construct; therefore, we specifically wanted to assess its effects on students' CT to better understand its role in a CT curriculum.

Students' CT competency was examined in terms of computational perspectives, machine training practices, and machine validation practices. The perspectives data from the pre/post questionnaires were rated on a Likert scale ranging from 1 to 5. The training and validation proficiency data were graded from 0 to 100. The results served to compare the Learning ML by Design course and the Learning ML by Teaching course, as detailed below.
4.1. Computational perspectives

The computational perspective results are presented in Fig. 7.

Fig. 7. Results of computational perspectives for the Learning ML by Design course vs. the Learning ML by Teaching course.

Analysis of the post questionnaires in both courses indicated a 5% increase in motivation as compared to the beginning of the course. Given that the students chose to take the course, it was natural that their initial motivation was high, so an increase, even though the course was challenging, was meaningful. This was further supported by the fact that during the course and in the final interview, participants asked to participate in a future course to learn more. The Learning ML by Teaching course participants scored an average of 4.28 out of 5, while the Learning ML by Design score was higher at 4.42. This may indicate that the learning activity of creating an ANN using the Single-Neuron toolkit had a greater impact on students' motivation to learn than interacting with a pre-made ANN using code.

Self-efficacy with modeling: Self-efficacy is an important part of successful learning. Thus, we examined modeling confidence using self-efficacy items. The comparison of the post-questionnaire Learning ML by Design course results to the Learning ML by Teaching course results indicated that the Learning ML by Teaching course participants scored an average of 3.2 out of 5. On the other hand, the Learning ML by Design score was higher at 4.28. This may indicate that constructing an ANN hands-on contributed to students' sense of capability.

Understanding ML processes: Using a questionnaire and interviews, we collected data on what the participants considered key to good ML system training and evaluation. For example, during the course they were shown a video of an autonomous car that failed to stop at a stop sign. In the interview one student explained this error: "The car needs to be trained better. If you want it to identify these things, you put the sign in all sorts of places, not just in one place, so the car can be trained in different situations to recognize the sign".

The Learning ML by Teaching course post-questionnaire had an average score of 4.6, which was lower than the Learning ML by Design course score of 5 out of 5. This may indicate that constructing an ANN can enhance students' ML understanding. Overall, the scores for the computational perspectives in the Learning ML by Design course were higher, thus confirming our hypothesis that constructing an ANN using appropriate scaffolds such as the Single-Neuron toolkit contributes to young students' computational thinking.

Students were asked in the interview in what ways they thought a ML system could help people. Their answers varied in content: "I would like to have a ML robot who is a soccer referee which sees everything, like if someone touches the ball by hand or is offside. It would be better if the robot was not on the field because it interrupts the players". Another student suggested using ML to replace judges in a court of law. Yet another student suggested that ML could help the blind by calling out red lights and other features of the environment it was trained to identify. Another suggested an autonomous scooter, so parents will not need to drive students to after-school activities.
interacting with a pre-made ANN using code. mensions: (1) Remove images with both a zebra and a pedestrian
Self-efficacy with modeling: Self-efficacy is an important part of crossing; (2) Filter out images of a pedestrian crossing that was
successful learning. Thus, we examined modeling confidence us- painted to look like a zebra. The findings indicated that students
ing self-efficacy items. The comparison of the post-questionnaire did better on dimension 1 than dimension 2 but that the scores
in general were low (score ≤ 0.35, see Fig. 8).
Learning ML by Design course results to the Learning ML by
Teaching course results indicated that the Learning ML by Teach- Category Selection Proficiency: Category selection skills make it
ing course participants scored an average of 3.2 out of 5. On the possible to find similarities, including perceptual distances, with
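The three dimensions of the rubric translate directly into code. The sketch below is our reconstruction of the scoring logic, not software used in the study; the file names and the 80/20 ratio are illustrative assumptions.

```python
import random

def split_dataset(images, train_fraction=0.8, seed=0):
    """Shuffle, then cut once, so every image is used exactly once (dimension 1)."""
    pool = list(images)
    random.Random(seed).shuffle(pool)
    cut = int(len(pool) * train_fraction)
    return pool[:cut], pool[cut:]

images = [f"fish_{i:02d}.png" for i in range(20)]
train, validation = split_dataset(images)

# Dimension 2: no image may appear in both sets.
assert not set(train) & set(validation)

# Dimension 3 is about content, not bookkeeping: images that share a salient
# feature should end up spread across both sets rather than clustered in one,
# which is exactly the point the students tended to miss.
print(len(train), len(validation))  # 16 4
```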
Data filtering proficiency: This part of the evaluation assessed the students' ability to filter out data before the training phase. For instance, when creating an ML system to classify images into a "Zebra" category and a "Pedestrian crossing" category, having images with both a zebra and a pedestrian crossing may cause the machine to be trained incorrectly. In one of the questions that tested data filtering proficiency, the students were given a set of images (see Fig. 9). They were asked to decide which images to remove from the dataset. This test was designed to determine whether the students would reason in terms of the following 2 dimensions: (1) remove images with both a zebra and a pedestrian crossing; (2) filter out images of a pedestrian crossing that was painted to look like a zebra. The findings indicated that students did better on dimension 1 than dimension 2, but the scores in general were low (score ≤ 0.35, see Fig. 8).
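A rough sketch of the two filtering dimensions follows. The boolean metadata flags are an illustrative device; the students judged raw pictures, and no such annotations existed in the task.

```python
# Hypothetical annotated records standing in for the pictures in Fig. 9.
dataset = [
    {"file": "img01.png", "zebra": True,  "crossing": False, "painted_as_zebra": False},
    {"file": "img02.png", "zebra": False, "crossing": True,  "painted_as_zebra": False},
    {"file": "img03.png", "zebra": True,  "crossing": True,  "painted_as_zebra": False},
    {"file": "img04.png", "zebra": False, "crossing": True,  "painted_as_zebra": True},
]

def keep_for_training(img):
    if img["zebra"] and img["crossing"]:   # dimension 1: both categories at once
        return False
    if img["painted_as_zebra"]:            # dimension 2: a disguised crossing
        return False
    return True

print([img["file"] for img in dataset if keep_for_training(img)])
# ['img01.png', 'img02.png']
```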
Category Selection Proficiency: Category selection skills make it possible to find similarities, including perceptual distances, with prototype examples of categories. The rationale stems from human perception studies, which posit that categories are not defined by lists of features but rather by similarity to prototypes. The analysis here implemented this argument that categories are not merely a list of features. The evaluation showed that the students did less well when asked to find a category (score ≤ 0.57, see Fig. 8) than when asked to find features for a pre-defined category (score ≥ 0.76, see Fig. 8). In-class observations showed that students frequently switched categories on a task that asked them to select a category for classifying fish images out of a closed set of obfuscated categories such as "Glitchy" or "Awesome" (see Fig. 10). The students were asked in class to classify images as "true", i.e., fit the category, or "false", i.e., did not fit the category. The students verbally explained that it was too difficult for them to justify some of the categories, which is why they switched them during the task.

An interesting observation emerged from an activity that asked the students to create their own categories. One student came up with algorithm primitives (e.g., words which are part of the algorithm lexicon) as categories. He wanted to categorize sentences into two types of games: Fortnite or Minecraft. The five categories he created were "If", "Input sentence", "Equals", "Fortnite", and "Minecraft". The latter two categories represent proper category selection, but the former three are incorrect.
longs in the ocean’’, labeled by ‘‘YES’’, or ‘‘does not belong in the
Data selection proficiency: Data selection is the ability to select
ocean’’, labeled by ‘‘NO’’. The 7th image required a prediction (see
data that are most likely to improve the system’s prediction
Fig. 12). On this specific question, most participants predicted
ability and avoid bias. This skill requires considering the different
the machine would classify the image of a blue can as ‘‘belongs
features of the object to be labeled and selecting a diverse dataset.
in the ocean’’. This indicates that they were able to put aside
For instance, when creating an ML system to identify images
the human intuition of predicting that a can ‘‘does not belong in
with sheep, insufficient data selection occurs when all the sheep
the ocean’’ but instead invoked their ML CT skills. Most student
images have greenery in the background. This could cause the
explanations referred to the ‘‘color feature’’. In other words, the
ML system to mistakenly categorize an image as having sheep
four objects tagged as ‘‘yes’’ (belongs in the ocean), were blue, as
even if it only shows a green pasture. In the interviews we
was the can. Thus, it was more likely that the ML system would
noticed that the students did not consider that sheep with only
predict that the can belongs in the ocean. They explicitly used the
a pasture in the background was a problem. On the other hand,
word ‘‘features’’ in their justification, strengthening our claim that
when evaluating data selection skills on a multiple-choice test,
they had assimilated the notion of feature selection. The results
the students scored relatively high (score ≥ 0.76, see Fig. 8).
Feature selection proficiency: Feature selection is the process of identifying attributes that contain relevant information for an ML system's data classification. The students were evaluated by tests, tasks and interviews, and during class observations. We created several assessment tools to analyze feature selection, including the number of features and their validity ratio. Pilot studies showed that these skills are extremely challenging; therefore, a great deal of care went into the learning module, including a specific task of creating a feature map using explicit instructions on decomposition skills. The students scored relatively high (score > 0.86, see Fig. 8) on this skill.
A comparison of the scores in the two courses showed that the differences in proficiency were small (difference < 10%) except for the Category Selection skill. In this case, Learning ML by Teaching course participants scored an average of 0.57, while the Learning ML by Design score was considerably lower at 0.29. This may suggest that the enhanced engagement with category selection had a stronger influence on this skill than creating an ANN.
post-test was: ‘‘The ML system was trained to identify pictures of
‘‘food’’ or ‘‘not food’’. When tested, it gave inconsistent results for
4.3. Machine validation practices
sandwiches, often putting them in the ‘‘not food’’ class. What can
explain this?’’: (a) The person who trained the system does not
The results of CT practices related to ML validation are pre-
like sandwiches and did not put pictures of them into the training
sented in Fig. 11.
set; (b) sandwiches are not food, so the system is wrong; (c) the
Prediction proficiency: Prediction refers to forecasting the likeli- system is correct about oranges, so there is no problem with it;
hood of an ML model’s outcome after it has been trained on a (d) the programmer developed the system with a bug. The correct
answer is ‘a’. When comparing the course results, the Learning ML involved students in programming their own ANN, thus giving
by Teaching course students scored an average of 0.71, whereas them a white box view of a ML system. The Learning ML by
the Learning ML by Design score was lower at 0.63. This may Teaching course involved using a preexisting ANN, thus giving
imply that enhanced engagement with data throughout the ML them the ANN as a black box without understanding its internals.
process had a stronger influence on the student’s evaluation skills Both courses had favorable results with regards to students
than creating an ANN which only had two Boolean inputs: 0 or 1. gains but there were some differences between the outcomes. The
participants exhibited slightly greater computational perspectives
5. Discussion, limitations and future work in the Learning ML by Design course than in the Learning ML by
Teaching course. This may point to the advantages of teaching
Today, people are surrounded by machine leaning (ML) based a Constructionist ANN activity in a ML course. It may indicate
technology therefore it is only natural that learning ML will is that the worldview the students formed about their surroundings
finding its way into education. The inclusion of ML in a com- and about themselves was enhanced when they worked on a ML
putational thinking (CT) course requires answers to fundamental algorithm that enabled the machine to learn.
questions about the relationship between ML and CT: Is ML part On the other hand, students performed better in the Learning
of CT or an extension to it? Does our concept of traditional CT ML by Teaching course with respect to computational practices.
change after integrating ML components? The course had more data-driven activities than the Learning
This article presents findings exploring the gains of CT com- ML by Design course. The skills that resulted in high proficiency
petencies of elementary school students who took part in two in the Learning ML by Teaching course were category selection,
different ML courses that implemented constructing ML systems data selection, data filtering, data split, category prediction and
using a Constructionist pedagogy. result evaluation. These are all practices that are profoundly
The results showed that the students learned to successfully data-driven, which may explain the students’ greater gains in CT
create a computerized ML system and train it to classify input practices since these skills are based on the identification of data
datasets. They showed knowledge of how data should be se- attributes.
lected and filtered for the system to learn with high probability Research on the educational possibilities afforded by ML con-
of making a successful prediction. They acquired ML concepts struction in elementary school is in its infancy. This study pro-
such as datasets, features and ML-bias, as well as the difference vides new findings and pedagogical insights for future research,
between data-driven and rule-driven programming. They were development, and educational efforts. It suggests key design con-
able to develop rich ML projects using their own categories, siderations for future development of a Constructionist approach
datasets and evaluations. It was clear that students came up to ML learning. It presents teacher-guided activities that can be
with interesting, diverse ML systems for expressing themselves implemented in school. The activities are adapted for elementary
through the constructionist activities. school students’ mathematical background and incorporates the
Several ML environments were used by the students in the appropriate scaffolds. The modules are scalable so students can
two constructionist courses. The Learning ML by Design course be creative with their own intelligent machines and put forward
This work can be extended to different age groups, and more ML platforms can be added to the curricula. This study thus paves the way for incorporating a Constructionist ML learning approach into the elementary school CT curriculum. It addresses the need for constructionist approaches to teaching ML. A CT-ML framework was introduced (Fig. 2) to set the foundations for assessing students' CT while they engage in ML activities.

Nevertheless, this study has several limitations. It was conducted on a small group of students, all of whom were volunteers. While such small-scale qualitative studies are useful for in-depth exploration of a phenomenon, they do not allow for generalization beyond the sample under investigation (Ivankova, Creswell, & Stick, 2006). In future research, the results should be verified on a larger sample. Since the course was online and not in a regular class mode, in-person classes might yield different results.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Declaration of funding source

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Selection and participation of children

This study was approved by our institution's relevant ethics committee. We recruited elementary school students through their parents. We sent the consent form to fully inform parents and youth prior to signing up. Each participant and their parents gave their informed consent to participate in the study. The consent forms for participation were signed and obtained from parents prior to the study. The children were informed that they could withdraw from participation at any given time.

References

Ackermann, E. (2001). Piaget's constructivism, Papert's constructionism: What's the difference. Future of Learning Group Publication, 5(3), 438.
Badie, F. (2016). Concept representation analysis in the context of human-machine interactions. In 14th international conference on e-society (ES 2016) (pp. 55–62). International Association for Development, IADIS.
Blikstein, P., & Wilensky, U. (2009). An atom is known by the company it keeps: A constructionist learning environment for materials science using agent-based modeling. International Journal of Computers for Mathematical Learning, 14(2), 81–119.
Brennan, K., & Resnick, M. (2012). New frameworks for studying and assessing the development of computational thinking. In Proceedings of the 2012 annual meeting of the American educational research association, Vancouver, Canada, Vol. 1 (p. 25).
De Vries, M. J. (2008). Gilbert Simondon and the dual nature of technical artifacts. Techné: Research in Philosophy and Technology, 12(1), 23–35.
Denning, P. J., & Tedre, M. (2021). Computational thinking: A disciplinary perspective. Informatics in Education.
Domingos, P. (2015). The master algorithm: How the quest for the ultimate learning machine will remake our world. Basic Books.
Druga, S., Vu, S. T., Likhith, E., & Qiu, T. (2019). Inclusive AI literacy for kids around the world. In Proceedings of FabLearn 2019 (pp. 104–111).
Grover, S. (2017). Assessing algorithmic and computational thinking in K-12: Lessons from a middle school classroom. In Emerging research, practice, and policy on computational thinking (pp. 269–288). Cham: Springer.
Guyon, I., & Elisseeff, A. (2006). An introduction to feature extraction. In Feature extraction (pp. 1–25). Berlin, Heidelberg: Springer.
International Society for Technology in Education & Computer Science Teachers Association (2011). Operational definition of computational thinking for K–12 education. Retrieved from https://csta.acm.org/Curriculum/sub/CurrFiles/CompThinkingFlyer.pdf.
Ivankova, N. V., Creswell, J. W., & Stick, S. L. (2006). Using mixed-methods sequential explanatory design: From theory to practice. Field Methods, 18(1), 3–20.
Jona, K., Wilensky, U., Trouille, L., Horn, M. S., Orton, K., Weintrop, D., & Beheshti, E. (2014). Embedding computational thinking in science, technology, engineering, and math (CT-STEM). In Future directions in computer science education summit meeting, Orlando, FL.
Kahn, K., & Winters, N. (2020). Constructionism and AI: A history and possible futures.
Keller, J. M. (1987). Development and use of the ARCS model of instructional design. Journal of Instructional Development, 10(3), 2–10.
Khalid, S., Khalil, T., & Nasreen, S. (2014). A survey of feature selection and feature extraction techniques in machine learning. In 2014 science and information conference (pp. 372–378). IEEE.
Lee, I., Martin, F., Denner, J., Coulter, B., Allan, W., Erickson, J., & Werner, L. (2011). Computational thinking for youth in practice. ACM Inroads, 2(1), 32–37.
Leiser, D. (1996). Constructivism, epistemology and information processing. Anuario de Psicología/The UB Journal of Psychology, (69), 93–114.
Levin, I., & Mamlok, D. (2021). Culture and society in the digital age. Information, 12, 68.
Levin, I., & Tsybulsky, D. (2017). The constructionist learning approach in the digital age. Creative Education, 8(15), 2463.
Mariescu-Istodor, R., & Jormanainen, I. (2019). Machine learning for high school students. In Proceedings of the 19th Koli Calling international conference on computing education research (pp. 1–9).
McCarthy, J., Minsky, M. L., Rochester, N., & Shannon, C. E. (2006). A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955. AI Magazine, 27(4), 12.
Ministry of Education Israel (2021). Computational thinking. Retrieved May 1, 2021, from https://pop.education.gov.il/teaching-practices/search-teaching-practices/computational-thinking/.
Papert, S. (1980). Mindstorms: Computers, children, and powerful ideas (p. 255). NY: Basic Books.
Papert, S., & Harel, I. (1991). Situating constructionism. Constructionism, 36(2), 1–11.
Resnick, M., Maloney, J., Monroy-Hernández, A., Rusk, N., Eastmond, E., Brennan, K., & Kafai, Y. (2009). Scratch: programming for all. Communications of the ACM, 52(11), 60–67.
Rouhiainen, L. (2018). Artificial intelligence: 101 things you must know today about our future. Lasse Rouhiainen.
Shamir, G., & Levin, I. (2020). Transformations of computational thinking practices in elementary school on the base of artificial intelligence technologies. In Proceedings of EDULEARN20 conference, Vol. 6.
Shamir, G., & Levin, I. (2021). Neural network construction practices in elementary school. KI-Künstliche Intelligenz, 1–9.
Shapiro, R. B., Fiebrink, R., & Norvig, P. (2018). How machine learning impacts the undergraduate computing curriculum. Communications of the ACM, 61(11), 27–29.
Strauss, A., & Corbin, J. M. (1997). Grounded theory in practice. Sage.
Tedre, M., & Denning, P. J. (2016). The long quest for computational thinking. In Proceedings of the 16th Koli Calling international conference on computing education research (pp. 120–129).
Tsai, M. J., Wang, C. Y., Wu, A. H., & Hsiao, C. Y. (2021). The development and validation of the robotics learning self-efficacy scale (RLSES). Journal of Educational Computing Research, Article 0735633121992594.
Vartiainen, H., Tedre, M., & Valtonen, T. (2020). Learning machine learning with very young children: Who is teaching whom? International Journal of Child-Computer Interaction, 25, Article 100182.
Webb, M., Fluck, A., Deschenes, M., Kheirallah, S., Lee, I., Magenheim, J., & Zagami, J. (2019). Thematic working group 4: State of the art in thinking about machine learning: Implications for education.
Wing, J. M. (2006). Computational thinking. Communications of the ACM, 49(3), 33–35.
Yegnanarayana, B. (2009). Artificial neural networks. PHI Learning Pvt. Ltd.
Zimmermann-Niefield, A., Turner, M., Murphy, B., Kane, S. K., & Shapiro, R. B. (2019). Youth learning machine learning through building models of athletic moves. In Proceedings of the 18th ACM international conference on interaction design and children (pp. 121–132).