

Computers and Education: Artificial Intelligence 6 (2024) 100198


ChatGPT effects on cognitive skills of undergraduate students: Receiving instant responses from AI-based conversational large language models (LLMs)

Harry Barton Essel a, Dimitrios Vlachopoulos b,*, Albert Benjamin Essuman c, John Opuni Amankwa d

a Department of Educational Innovations, Kwame Nkrumah University of Science and Technology, Kumasi, AK, 315-7530, Ghana
b Rotterdam School of Management, Erasmus University, Burgemeester Oudlaan 50, 3062 PA, Rotterdam, the Netherlands
c Department of Industrial Design, Kwame Nkrumah University of Science and Technology, Kumasi, AK, 315-7530, Ghana
d Department of Communication Design, Kwame Nkrumah University of Science and Technology, Kumasi, AK, 315-7530, Ghana

Keywords: ChatGPT; Large language models; Critical thinking; Creative thinking; Reflective thinking; Undergraduate students

Abstract: This study investigated the impact of using ChatGPT, a state-of-the-art generative AI-based model, on the critical, creative, and reflective thinking skills of university students in Ghana. The study utilized a mixed-methods research approach, incorporating quantitative and qualitative data collection instruments, and an experimental procedure with a pretest-posttest control group. The study ultimately enlisted a sample of 125 students randomly allocated to either the experiment group (60 students) or the control group (65 students). The research was conducted in the context of a Research Methodology course, which had adopted the flipped classroom approach. The students in the experiment group engaged with ChatGPT for in-class tasks, while those in the control group used traditional databases and search engines for similar tasks. Data were collected using the Critical Thinking Scale, Creative Thinking Scale, Reflective Thinking Scale, and a semi-structured student interview guide. The study's findings illustrated that incorporating ChatGPT discernibly influenced the students' critical, reflective, and creative thinking skills and their dimensions. As a result, the study provides suggestions for academics, instructional designers, and researchers working in educational technology.

1. Introduction

Recently, there has been an increasing fascination with artificial intelligence (AI) large language models (LLMs) and their practical application. AI has stimulated revolutionary innovations in several industries globally, with LLMs making influential contributions (Li et al., 2023). In education, there is a trend towards enforcing the 5th era of the Internet, also known as the Internet of Things (IoT), resulting in a growing enthusiasm for integrating AI-assisted teaching and learning with LLMs (Al Darayseh, 2023). LLMs represent a significant breakthrough in AI technology, allowing for new natural language generation and understanding opportunities. LLMs are advanced neural network models that incorporate a vast number of parameters (on the order of 175 billion) and are usually trained on large amounts of text data, often gigabytes or even terabytes in size (Mbakwe et al., 2023), obtained from the Internet through supervised and reinforcement learning techniques (Kung et al., 2023). Moreover, LLMs can improve their abilities through fine-tuning, resulting in more sophisticated capabilities and human-like text outputs (Leippold, 2022; Kasneci et al., 2023). With the revolution of generative AI, academics have begun to apply machine learning, natural language processing (NLP), and LLMs to the maturation of conversational AI interfaces, making their application a new study area in higher education (Essel et al., 2022). One contemporary phenomenon in the LLM revolution is ChatGPT (Chat Generative Pre-trained Transformer), which has garnered substantial attention due to its capability to function across a myriad of natural language activities (Tlili et al., 2023).

* Corresponding author.
E-mail addresses: [email protected] (H.B. Essel), [email protected] (D. Vlachopoulos), [email protected] (A.B. Essuman), [email protected] (J.O. Amankwa).

https://doi.org/10.1016/j.caeai.2023.100198
Received 12 July 2023; Received in revised form 17 December 2023; Accepted 20 December 2023
Available online 23 December 2023
2666-920X/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

2. Related studies

2.1. An overview of ChatGPT technology

ChatGPT is a conversational AI interface within the LLM category, developed recently by OpenAI, which can respond to natural language inputs and generate human-like responses (OpenAI, 2023). In contrast to previous AI interfaces, which were mostly Deep Learning (DL) models that retained and identified patterns in data (Rospigliosi, 2023), ChatGPT belongs to a new class of AI algorithms: it is trained to estimate the probability of a specific succession of words based on the context of the words that preceded it (Liu et al., 2023; Naseem et al., 2021). ChatGPT is primarily designed for engaging in human-like conversations, but its capabilities extend much further. It excels in creating original content such as stories, poems, and novels, and can replicate various behaviors it has learned to mimic (Tlili et al., 2023). It demonstrates remarkable proficiency in a range of natural language tasks, from crafting coherent essays and performing translations to answering questions (Rospigliosi, 2023) and creating computer code (Kasneci et al., 2023). In addition, ChatGPT is capable of interactive dialogue: it can follow up on previous responses, acknowledge its own mistakes, challenge incorrect premises, and decline inappropriate requests (OpenAI, 2023). It also sustains ongoing conversations that invite follow-up questions, providing an experience distinct from search engines, which usually do not preserve the chronology of how an answer develops but instead present a list of separate links ranked by the relevance of the keywords used as search queries (Firat, 2023). ChatGPT offers additional questions that enhance and broaden its answers, addressing any challenges the questioner presents. ChatGPT can also support academic research and constructive writing tasks, critical thinking, problem-solving, and the development of research skills (Dwivedi et al., 2023; Sullivan et al., 2023), and it can suggest unexplored aspects and current research topics, giving students a better understanding and analysis of a particular topic (Kasneci et al., 2023). Notwithstanding, the discussion surrounding the application of AI-based models in higher education reached a critical crossroads following the public release of ChatGPT in November 2022.

2.2. Educational potential of ChatGPT in higher education

Recently, informal observations of ChatGPT's usage indicate evidence of deductive reasoning, a sequential thought process, and the ability to maintain long-term dependencies (Kung et al., 2023). Multiple preprints of academic research, blog posts, and media sources have highlighted the benefits of utilizing ChatGPT in the educational ecosystem (Tlili et al., 2023). The expedience of ChatGPT in higher education has been recognized as a possible area of attraction due to its myriad applications. Some authors (Tlili et al., 2023; Mollick & Mollick, 2022) have even recommended integrating ChatGPT into instructional didactics to enhance the interactive learning experience. Moreover, ChatGPT's conversational format encourages students to exchange questions and answers, promoting awareness and deeper personal reflection through scaffolded learning (Darvishi et al., 2022; Rospigliosi, 2023). ChatGPT's ability to answer follow-up questions encourages students to challenge and clarify information, facilitating synthesis with existing knowledge and promoting a more in-depth understanding of numerous meanings and notions. The utilization of ChatGPT in higher education environments demonstrates its potential to foster personalized learning approaches, significantly enhance student engagement, and encourage self-directed learning. Additionally, it shows promise in advancing cognitive competence among students (Sanusi, Olaleye, Agbo, & Chiu, 2022). Despite these advantages, the impact of ChatGPT on critical thinking, creative thinking, and reflective thinking in relation to students' learning outcomes remains an area yet to be fully explored and understood. From its inception, ChatGPT received an overwhelmingly positive reception, particularly noticeable among academics and the general public on various social media platforms. This enthusiasm is largely attributed to the AI's perceived ability to revolutionize our understanding of professional work, cognitive processes, and the essence of human creativity (Mbakwe et al., 2023). This widespread acclaim highlights ChatGPT's potential in transforming educational methods and contributing to a deeper understanding of learning processes.

3. Research gap and questions

In the context of the present study, Ghana, most academics in higher education have been concerned about the potential adverse effect that ChatGPT can pose on students' learning experiences and cognition, since the use of generative AI models has not been analyzed extensively. A comprehensive Twitter sentiment analysis of the general adoption of ChatGPT as an AI model found that users (including academics) hold conflicting perspectives (Haque et al., 2022). The vulnerabilities academics perceive in ChatGPT have prompted profound and immediate protective measures against its adoption in higher education: groups of academic institutions in the United States (New York and Los Angeles) prohibited ChatGPT from academic networks due to the perceived risk of using it to cheat in academic work (Shen-Berro, 2023). Like many developing countries, Ghana has no standard policy governing the integration of AI-based models into educational institutions' learning and teaching processes. This lack of policy highlights the incomplete nature of a techno-centric perspective on the educational ecosystem, since it does not account for the power dynamics involved in implementing such mechanisms, such as government or local governance policies, educational institution governors, individual educators, or even the students themselves (Luckin et al., 2022). Despite the potential of ChatGPT to improve cognitive skills and enhance the learning experience, some academics may still view its use in the classroom negatively, perpetuating discrimination and biases (Kooli, 2023). As a result, these academics may choose to continue relying on conventional teaching resources and methodologies rather than incorporating ChatGPT into their teaching practices. Moreover, many educators and educational institutions may lack the knowledge or expertise to effectively incorporate new technologies into their teaching, especially when leveraging and integrating LLMs, and may label LLMs a negative influence on students' cognitive skills. According to Blomhøj (2011), having solid cognitive abilities means dealing with a particular challenge in an informed and thoughtful manner. These abilities necessitate the use of 21st-century learning skills such as problem-solving (Tsankov, 2018), creative thinking (Kassymova et al., 2020), critical thinking (Biasi, Valencia, & Obregon, 2019), and reflective thinking (Chen et al., 2019). Many students, Ghana being no exception, often need help analyzing, synthesizing, evaluating, and integrating information into new experiences (Akpur, 2020), leading to inadequate problem-solving skills and limited creativity. Despite the growing use of ChatGPT and other conversational AI models in higher education, the effects of these models on such cognitive skills, which have become a central concern of academics, still need to be explored. Thus, this study aims to answer the following research questions:

1. Are the scores related to critical thinking skills of undergraduate students utilizing ChatGPT significantly different from those of students using traditional lecture-based methods?
2. Do the scores reflecting creative thinking skills of undergraduate students who utilize ChatGPT significantly vary from those engaged in traditional lecture-based methods?
3. Are the scores indicative of reflective thinking skills among undergraduate students leveraging ChatGPT significantly distinct from those employing conventional lecture-based methods?
4. What are the perceptions and attitudes of undergraduate students regarding the use of ChatGPT as an educational tool?


In the current academic landscape, the ascendance of AI technologies, particularly those exemplified by systems like ChatGPT, presents a pivotal opportunity to reassess and potentially redefine pedagogical strategies. This paper posits that it is imperative to rigorously evaluate whether and how ChatGPT can augment critical thinking skills in students, a competency that is increasingly indispensable in the digital and information-driven era.

To this end, the first research question investigates the extent to which the utilization of ChatGPT as a pedagogical aid engenders notable variances in students' capacities to analyze and critically evaluate information. This inquiry is juxtaposed against the backdrop of conventional, lecture-based educational methodologies, providing a comparative analysis of efficacy in fostering critical thinking.

Furthermore, the second research question delves into the realm of creative thinking. Recognizing that creativity is a cornerstone of innovation and problem-solving in a multitude of fields, this question seeks to ascertain whether the integration of ChatGPT into educational paradigms can act as a catalyst for enhanced creativity among students. By comparing the creative outputs and thought processes of students engaged with ChatGPT against those adhering to conventional educational techniques, the study aims to uncover any significant disparities in creativity levels attributable to the use of this AI tool.

The third research question pivots towards the concept of reflective thinking, a crucial component of deep learning and critical self-assessment. This inquiry aims to determine the differential impact of ChatGPT, as opposed to conventional educational methods, on students' abilities to engage in reflective thinking. Such an investigation is crucial for understanding whether ChatGPT can foster more profound self-reflection and assimilation of learned content, thus enhancing the depth and quality of learning experiences.

Lastly, given the burgeoning role of AI in educational contexts, it is imperative to comprehend student attitudes and perceptions towards such technologies. The final research question, therefore, focuses on elucidating students' viewpoints and attitudes concerning the use of ChatGPT as an educational tool. This exploration is intended to shed light on the acceptability of AI in educational settings and identify potential barriers to its effective integration.

Each of these research questions is underpinned by a fundamental objective: to thoroughly explore and harness the potential of AI, particularly tools like ChatGPT, in transforming various cognitive skills and shaping student attitudes towards learning in the digital age.

4. Methods

4.1. Research design

This study was designed as a mixed-methods (sequential explanatory) approach: an experiment with a pretest-posttest design combined with a qualitative analysis of responses to follow-up questions. By employing such a methodological approach, a holistic response to the research questions can be given by combining quantitative data (cognitive skills scores) with qualitative data (opinions of undergraduate students). This combination allows for a more effective interpretation of the results and helps to uncover the underlying factors contributing to any observed differences (Ivankova et al., 2006). The addition of qualitative data can offer insights into the underlying reasons behind any differences in critical thinking, creative thinking, and reflective thinking scores. Understanding the students' perspectives, experiences, and attitudes towards the use of ChatGPT can shed light on the potential benefits or challenges associated with each approach. Finally, this design allows for a direct comparison between the cognitive skills scores of undergraduate students using ChatGPT and those using conventional, lecture-based research methods.

4.2. Research protocol and study population

The study cohort consisted of 125 undergraduate students enrolled in a "Quantitative Research Design" course during the second semester of the academic year 2022-2023. This course, which carries 2 TPC (Theory, Practical, Credit) units, is designed to provide students with a comprehensive understanding of research methodologies. Throughout the semester, students receive 1 h of theoretical instruction and engage in 3 h of hands-on training per week, amounting to a total of 48 h over 12 weeks. Students' participation in this study was entirely voluntary, and their decision to opt in or out had no bearing on their grades for the course. At the outset of the experiment, the research team clearly explained the purpose and goals of the study to the participating students. Prior to the commencement of the study, all students underwent a pretest that assessed their levels of Creative Thinking, Critical Thinking, and Reflective Thinking using standardized scales. Those who provided informed consent were then randomly assigned to either the experimental group (EG) or the control group (CG) using a simple random sampling technique facilitated by a random number generator in Microsoft Excel. The CG comprised 65 students, while the EG included 60 students.

In the study group, men comprised 60.8% (76) of the sample, while women comprised 39.2% (49). Students in both the CG and EG were able to gain experience with the learning environment and procedures used throughout the course. The average age of all the students was 21.711 (±4.77) years. Their cumulative weighted average (CWA) scores were in the second upper division, averaging 68.67 (±6.81). Most of the students in both groups were male, with 35 (58.3%) in the EG and 41 (63.1%) in the CG. These characteristics and the CWA scores of the CG and EG were similar. Table 1 provides a summary of the students' descriptive traits.

As presented in Fig. 1, which illustrates the intervention protocol, the Critical Thinking Scale (CTS), Creative Thinking Scale (MCTS), and Reflective Thinking Scale (RTS) were administered as a pretest to both groups of respondents. After participating in sampling-method activities over 3 weeks, the respondents were re-examined using the same scales. A semi-structured opinion guide for students was also implemented to garner qualitative data. This form was employed to acquire qualitative data based on the students' research objectives and, therefore, to achieve triangulation by gathering data from respondents through both quantitative and qualitative approaches, enriching the findings with more detail. The semi-structured guide was used to determine the benefits and hindrances of ChatGPT in the activities and to expose underlying characteristics that could influence the quantitative findings.

Fig. 1. The intervention protocol flowchart.

The present study was conducted under the Helsinki Declaration (1975) and comparable ethical standards, with the approval of the Ethics Committee of the Department of Educational Innovations in Science and Technology (EIST/REF No: January 01, 2023), Kwame Nkrumah University of Science and Technology, Kumasi, Ghana. Digital written and verbal informed consent was solicited from the students who volunteered to be part of the study.

Table 1. Descriptive traits of the respondents.

Trait                                        EG (n = 60) n (%)   CG (n = 65) n (%)   Test statistic; p-value
Gender: Men                                  35 (58.3)           41 (63.1)           X = 5.731 a; p = 0.337
Gender: Women                                25 (41.7)           24 (36.9)
Age, mean ± SD                               21.45 ± 4.61        21.79 ± 4.34        t = 1.121 b; p = 0.325
Academic standing (CWA), mean ± SD           68.21 ± 6.14        68.63 ± 6.53        t = 0.072 b; p = 0.673
Prior experience with chatbots: Yes          29 (48.3)           28 (43.1)           X = 4.504 a; p = 0.205
Prior experience with chatbots: No           31 (51.7)           37 (56.9)

±SD = standard deviation; a Fisher's exact test; b independent-samples t-test.
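The random allocation in Section 4.2 and the baseline equivalence checks summarized in Table 1 can be outlined in a short script. The sketch below is illustrative only: it uses Python (numpy, pandas, scipy) instead of the authors' Excel random number generator and Jamovi workflow, and the function names, seed, and data-frame columns (group, gender, age) are assumptions rather than details taken from the paper.

```python
import numpy as np
import pandas as pd
from scipy import stats

def allocate_groups(student_ids, n_eg=60, seed=2023):
    """Randomly split the 125 students into an experimental group (EG) and a control group (CG)."""
    rng = np.random.default_rng(seed)          # illustrative seed, not from the paper
    shuffled = rng.permutation(np.asarray(student_ids))
    return {"EG": list(shuffled[:n_eg]), "CG": list(shuffled[n_eg:])}

def baseline_checks(df: pd.DataFrame) -> dict:
    """Baseline comparisons analogous to Table 1 (column names are assumed)."""
    eg, cg = df[df["group"] == "EG"], df[df["group"] == "CG"]
    # Independent-samples t-test for a continuous trait such as age
    age_test = stats.ttest_ind(eg["age"], cg["age"], equal_var=True)
    # Fisher's exact test for a 2 x 2 categorical trait such as gender
    gender_table = pd.crosstab(df["group"], df["gender"]).values
    gender_test = stats.fisher_exact(gender_table)
    return {"age_t": age_test, "gender_fisher": gender_test}
```

Non-significant p-values from such checks (as in Table 1) indicate that the two randomly formed groups started out comparable on the measured traits.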


4.3. Measures

4.3.1. Sociodemographic traits
As part of the study, sociodemographic data were garnered from the students. These traits included gender, age, CWA, previous experience with conversational AI chatbots such as ChatGPT or GPT-3, and the type of chatbot experienced.

4.3.2. Critical thinking scale (CTS)
The study utilized the 11-item CTS developed by Sosu (2013) to assess the critical thinking skills of the participating students. This scale is designed to evaluate two key latent constructs: 'Critical Openness,' comprising seven items, and 'Reflective Skepticism,' consisting of four items. Each item was rated by the students on a 5-point Likert-type scale, where '1' corresponded to 'Strongly Disagree' and '5' denoted 'Strongly Agree.' By summing the scores of these 11 items, an overall score for each student was computed, ranging from 11 to 55 points. The scoring range was categorized into three levels: scores between 11 and 34 indicated 'Low Critical Thinking Skills,' scores from 35 to 44 represented 'Moderate Critical Thinking Skills,' and scores in the range of 45-55 indicated 'High Critical Thinking Skills.' The scale also allowed for a more granular assessment. Specifically, 'Critical Openness' scores spanned from 7 to 35, with cut-off points as follows: scores of 29-35 reflected 'High Critical Openness,' scores between 22 and 28 indicated 'Moderate Critical Openness,' and scores from 7 to 21 represented 'Low Critical Openness.' 'Reflective Skepticism' scores ranged from 4 to 20, with cut-off values as follows: scores of 17-20 denoted 'High Reflective Skepticism,' scores from 13 to 16 signified 'Moderate Reflective Skepticism,' and scores in the range of 4-12 indicated 'Low Reflective Skepticism.' The study's internal consistency reliability was robust, with a Cronbach's alpha coefficient (α) of 0.93 for the Critical Thinking Scale and a McDonald's omega coefficient (ω) of 0.94 for this study. To assess the goodness of fit of the Critical Thinking Scale, Confirmatory Factor Analysis (CFA) was performed, and the results were in accordance with established standards. The fit indices included X2/df = 2.98, Comparative Fit Index (CFI) = 0.96, Goodness of Fit Index (GFI) = 0.94, Incremental Fit Index (IFI) = 0.95, CFI = 0.94, Adjusted Goodness of Fit Index (AGFI) = 0.95, Root Mean Square Error of Approximation (RMSEA) = 0.061, and Standardized Root Mean Square Residual (SRMR) = 0.049. These indices collectively indicated an acceptable fit for the CTS within the study's framework (Tabachnick & Fidell, 2013).

4.3.3. Creative thinking scale (MCTS)
The study employed the 25-item Creative Thinking Scale (MCTS) designed by Özgenel and Çetin (2017) to gauge the creative thinking abilities of the participating students. This comprehensive scale encompasses six latent constructs, each measuring a distinct aspect of creative thinking: I) 'Courage,' consisting of four items; II) 'Innovation Search,' with eight items; III) 'Inquisitive,' comprising three items; IV) 'Self-Discipline,' consisting of five items; V) 'Doubt,' including two items; and VI) 'Flexibility,' which encompasses three items. Each item was assessed by students on a five-point Likert-type scale, where '1' represented 'Strongly Disagree' and '5' indicated 'Strongly Agree.' The MCTS yields scores ranging from a minimum of 25 points to a maximum of 125 points, with higher scores indicating greater creative thinking proficiency. To ensure the scale's reliability, the study computed internal consistency reliability coefficients, with Cronbach's alpha (α) estimated at 0.89 and McDonald's omega (ω) calculated as 0.90, underscoring the scale's robustness. To evaluate the goodness of fit of the MCTS within the study's context, CFA was conducted, and the results aligned with established criteria for model fit. The fit indices reported were as follows: X2/df = 3.01, CFI = 0.92, GFI = 0.94, IFI = 0.97, AGFI = 0.90, CFI = 0.93, RMSEA = 0.077, and SRMR = 0.069. These fit indices collectively indicated an acceptable fit for the MCTS within the study's framework.

4.3.4. Reflective thinking scale (RTS)
The 16-item RTS, developed by Kember et al. (1999), was employed to estimate the students' reflective thinking. The RTS is composed of four latent constructs: I) Understanding (4 items); II) Reflection (4 items); III) Critical Reflection (4 items); and IV) Habitual Action (4 items). Each item was rated on a five-point Likert-type scale, with 1 denoting "Strongly disagree" and 5 denoting "Strongly agree." The RTS has a minimum score of 16 points and a maximum score of 80 points, with higher scores illustrating greater levels of reflective thinking. The internal consistency reliability coefficient (Cronbach's alpha) of the RTS was estimated as α = 0.96, with a McDonald's omega coefficient of ω = 0.97 for the current study. The fit indices for this study's CFA of the RTS included X2/df = 2.91, CFI = 0.91, GFI = 0.95, IFI = 0.94, AGFI = 0.93, CFI = 0.94, RMSEA = 0.082, and SRMR = 0.075, which are acceptable according to Tabachnick and Fidell (2013).

4.3.5. Semi-structured opinion guide for students
The researchers designed a semi-structured student interview guide to assess the students' opinions on leveraging ChatGPT to learn sampling methods. The focus group forum was held in one meeting in which 15 EG members participated, and the interview was held virtually a week after the experiment. The researchers wrote the questions in the interview form. Following this, two field experts in educational research evaluated the questions, which were revised in line with their feedback, and the final guide was then developed (see Appendix). The form seeks to capture the students' positive and negative experiences during the implementation of ChatGPT and to evaluate their readiness to continue using this mechanism.
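As a concrete illustration of how a summed Likert scale such as the CTS is scored, banded, and checked for internal consistency, the sketch below implements the cut-offs reported in Section 4.3.2 and the textbook formula for Cronbach's alpha. It is a minimal sketch under stated assumptions, not the authors' code; the helper names and the pandas data layout (one column per item) are illustrative.

```python
import pandas as pd

def score_cts(items: pd.DataFrame) -> pd.Series:
    """Sum the 11 CTS items (each rated 1-5) into a total score in the range 11-55."""
    return items.sum(axis=1)

def cts_level(total: int) -> str:
    """Map a CTS total score to the cut-off bands reported in the paper."""
    if total <= 34:
        return "Low Critical Thinking Skills"       # 11-34
    if total <= 44:
        return "Moderate Critical Thinking Skills"  # 35-44
    return "High Critical Thinking Skills"          # 45-55

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: (k/(k-1)) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

The same scoring-plus-banding pattern applies to the Critical Openness and Reflective Skepticism subscales and, with their own ranges, to the MCTS and RTS described above.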


4.4. Intervention procedure in the experimental group (EG)

ChatGPT, the study's intervention, was introduced at the beginning of the semester. The EG received an intervention designed to improve their critical, creative, and reflective thinking skills using ChatGPT. The intervention was delivered via the flipped classroom (FC) approach through a Learning Management System (LMS). The intervention group was provided with educational resources such as lecture videos and reading materials, and they watched the videos connected to the lesson topics via the LMS one week before each lesson. Additionally, students were provided with a series of prompts related to the topic of the week, which they were required to explore and respond to using ChatGPT. These prompts were developed to stimulate the students to think critically, creatively, and reflectively. Examples of the prompts included "What are the different types of research design? Which one do you think is most suitable for your research question, and why?", "What are some of the challenges that you might encounter when designing a research study? How can you overcome these challenges?" and "Can you think of a real-life example of a research study that had a significant impact on society? What were the key findings, and how did they impact the field?"

During the first 30 min of the lecture, the lecturer provided a brief review of the topic and then encouraged students to ask questions and engage in discussions. Interactive methods such as discussion, problem-solving, and brainstorming were used in class to promote maximum interaction between the instructor and students. Lesson time was used more effectively in the EG because students had already watched the lesson videos and came prepared for class. This approach allowed more interactive methods, such as question-and-answer sessions, to cover the subjects students needed assistance with. The EG students were also encouraged to use ChatGPT to clarify any misconceptions they may have had about the lesson. The theoretical sessions lasted 40-50 min and the practical sessions 60-70 min, for a total duration of 120 min for the research methods course. A topic test (based on discussion and individual written tasks) was given per session.

Students were given a deadline to submit their responses, after which they were assessed based on the differentiation and depth of their responses. The course instructor, teaching assistants, and ChatGPT assessed the responses and provided feedback and directions for improvement. Students were trained on using ChatGPT, including accessing the tool through a web browser and entering their prompts in the dialogue box. The students were also instructed on how to interpret the responses provided by ChatGPT and use them to guide their thinking and research.

4.5. Intervention procedure in the control group (CG)

The students designated to the CG (n = 65) attended the synchronous course with the rest of the class. The CG received the same model via the Schoology LMS but without ChatGPT. Instead, they were given the same prompts as the EG students and asked to respond using conventional research methods, such as reading textbooks, searching for articles online, and using other conventional sources of information. For example, for the topic of "Research Design," the prompts were as follows: "What are the different types of research design? Which one do you think is most suitable for your research question, and why?", "What are some of the challenges that you might encounter when designing a research study? How can you overcome these challenges?" and "Can you think of a real-life example of a research study that significantly impacted society? What were the key findings, and how did they impact the field?"

Like the EG, the students in the CG received 40-50 min of theoretical sessions and 60-70 min of practical sessions, with a total duration of 120 min for the research methods course. A lesson test (based on discussion and individual written tasks) was given after each session. Students were given a deadline to submit their responses, after which they were assessed based on the differentiation and depth of their responses. The course instructor (lead author) and teaching assistants assessed the responses and provided feedback and directions for improvement. Furthermore, the students in the CG differed from the EG in that they were not permitted to use ChatGPT or any other LLMs during their in-class question-and-answer activities and assignments.

Following the study's conclusion, the EG and CG were administered posttest assessments of the CTS, MCTS, and RTS. In addition, students in the EG were administered a semi-structured opinion guide and asked to evaluate the effectiveness of leveraging ChatGPT for their course assignments.

4.6. Statistical analysis

Microsoft Excel 365 was used to compile the data, which were then entered and examined using Jamovi 2.3.24 (The Jamovi Project, 2021; Fox & Weisberg, 2020; Lenth, 2020). Skewness and kurtosis values were used to estimate the normality of the data; the data demonstrated a normal distribution (values within ±2). A Shapiro-Wilk test was also used to determine whether the distributions of the numerical variables were normal, and the results illustrated that the study group was normally distributed. To address the research questions, the posttest scores of the critical, creative, and reflective thinking scales were compared after controlling for the pretest scores. Likewise, the posttest scores of the scales' constructs for the CG and EG were compared while controlling for the pretest scores. ANCOVA was used to compare the scales' posttest scores and their constructs while controlling for the pretest scores, so that variance induced by pretest differences between the two groups preceding the experimental process would not affect the results. A significance level of p < 0.05 and a confidence level of 95% were used to determine statistical significance in all tests.

The EG stated their thoughts on using ChatGPT in terms of positive, negative, and prospective features. Following the study's completion, the researchers created a raw data document in Microsoft Word by transcribing the responses. Two coders then used content analysis to organize the data and categorize related responses under the same topics. The Miles and Huberman (1994) formula was employed to estimate the reliability of the codes produced by the two coders: the number of agreed codes was divided by the total number of codes (agreements plus disagreements) and multiplied by 100%. The reliability ratio was found to be 0.95 (95%). The researchers discussed the remaining 5% of discrepancies and agreed that this level of difference is acceptable for qualitative analysis. Table 5 provides an overview of the themes that emerged; the findings also include a few student statements about their experiences.
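The quantitative workflow of Section 4.6 (normality and homogeneity checks followed by one-way ANCOVA with the pretest as covariate) and the Miles and Huberman agreement ratio can be outlined in a few lines of code. The sketch below is illustrative only: it uses Python with scipy and statsmodels rather than the authors' Jamovi/Excel workflow, and the data-frame column names (group, pretest, posttest) are assumptions.

```python
import pandas as pd
from scipy import stats
import statsmodels.api as sm
import statsmodels.formula.api as smf

def check_assumptions(df: pd.DataFrame, score: str = "posttest") -> dict:
    """Shapiro-Wilk normality per group and Levene's test for homogeneity of variance."""
    eg = df.loc[df["group"] == "EG", score]
    cg = df.loc[df["group"] == "CG", score]
    return {
        "shapiro_EG": stats.shapiro(eg),   # W statistic and p-value for the EG
        "shapiro_CG": stats.shapiro(cg),   # W statistic and p-value for the CG
        "levene": stats.levene(eg, cg),    # equality of variances across groups
    }

def one_way_ancova(df: pd.DataFrame):
    """Posttest ~ group, controlling for the pretest; returns the fitted model,
    the ANOVA table, and partial eta squared for the group effect."""
    model = smf.ols("posttest ~ pretest + C(group)", data=df).fit()
    table = sm.stats.anova_lm(model, typ=2)
    ss_group = table.loc["C(group)", "sum_sq"]
    ss_resid = table.loc["Residual", "sum_sq"]
    eta_p2_group = ss_group / (ss_group + ss_resid)
    return model, table, eta_p2_group

def intercoder_agreement(n_agreements: int, n_disagreements: int) -> float:
    """Miles and Huberman (1994) reliability ratio: agreements / (agreements + disagreements)."""
    return n_agreements / (n_agreements + n_disagreements)
```

Running the same ANCOVA once per scale and once per subscale, as described above, produces the F, p, and effect-size values reported in Section 5.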


5. Results

5.1. Critical thinking

To address RQ1, data were garnered from the respondents and tested for statistically discernible differences between the critical thinking scale and its dimensions (Reflective Skepticism and Critical Openness) for the CG and EG. The CG's critical thinking pretest score was m = 24.9 and its posttest score m = 30.6; in contrast, the pretest score for the EG was m = 28.4 and its posttest score m = 39.2.

The pretest scores of the respondents in both the EG and CG were estimated using the CTS and its dimensions, and one-way ANCOVA was then used to analyze any differences between the two groups. Preliminary checks were completed to evaluate the assumptions of normality, linearity, and homogeneity of variance. A Shapiro-Wilk test indicated that the critical thinking scale scores were normally distributed in the CG and EG: CTS (W = 0.981, p = 0.076), critical openness (W = 0.984, p = 0.090), and reflective skepticism (W = 0.970, p = 0.081). Levene's test demonstrated that the homogeneity of variance assumption was not violated: critical thinking scale [F(1, 123) = 0.368, p = 0.545], critical openness [F(1, 123) = 0.469, p = 0.701], and reflective skepticism [F(1, 123) = 0.563, p = 0.190]. These results confirm that the scores in both the CG and the EG met the assumptions required for the subsequent statistical analysis, supporting the validity and reliability of the data. The estimated marginal means (EMM) for the construct and its dimensions are illustrated in Table 2.

The study used covariance analysis to determine whether there was a statistically discernible difference between the posttest scores of the CTS and its reflective skepticism and critical openness dimensions for the EG and CG. The results are recapitulated in Table 2. After controlling for the pretest score, there was a statistically discernible effect on the posttest score of critical thinking [F(1,122) = 36.3, p < 0.001, ηp2 = 0.229]. In other words, the posttest CTS scores of the EG (m = 39.2, SD = 6.57) were discernibly higher than the critical thinking posttest scores of the CG (m = 30.6, SD = 7.64). The pretest score of critical thinking was discernibly linked to the posttest score [F(1, 122) = 19.2, p < 0.001, ηp2 = 0.136], as illustrated in Table 2.

The critical openness scores of the EG and CG, a CTS dimension, were also compared. Table 2 exhibits the results of the one-way ANCOVA conducted on the posttest critical openness scores of the EG and CG, F(1,122) = 43.6, p < 0.001, ηp2 = 0.263. The posttest scores of the critical openness dimension for the EG (m = 24.7, SD = 5.38) were discernibly higher than those of the CG (m = 17.8, SD = 5.33). The pretest score of the critical openness dimension was discernibly linked to the posttest score [F(1,122) = 10.6, p = 0.001, ηp2 = 0.080].

The reflective skepticism scores of the CG and EG, another CTS dimension, were also compared. Table 2 shows the results of the one-way ANCOVA performed on the posttest reflective skepticism scores of the EG and CG, F(1,122) = 6.53, p = 0.012, ηp2 = 0.051. The posttest scores of the reflective skepticism dimension for the EG (m = 14.5, SD = 2.62) were discernibly higher than those of the CG (m = 12.8, SD = 3.73). The pretest score of this dimension was discernibly linked to the posttest score [F(1,122) = 4.92, p = 0.028, ηp2 = 0.039], suggesting that the initial scores on the dimension had a discernible influence on the subsequent posttest scores.

5.2. Creative thinking

To address RQ2, data were collected from the respondents and tested for statistically discernible differences between the MCTS and its dimensions (Courage, Innovative Search, Inquisitive, Self-Discipline, Doubt, and Flexibility) for the CG and EG. The CG's creative thinking pretest score was m = 59.2 and its posttest score m = 88.9; by contrast, the pretest score for the EG was m = 57.2 and its posttest score m = 92.0.

Preliminary checks were completed to evaluate the assumptions of normality, linearity, and homogeneity of variance. The pretest scores of the respondents in both the CG and EG were measured using the MCTS and its dimensions, and one-way ANCOVA was then used to analyze any disparities between the two groups. A Shapiro-Wilk test showed that the creative thinking scale scores were normally distributed in the CG and EG: creative thinking scale (W = 0.984, p = 0.145), Courage (W = 0.991, p = 0.059), Innovative Search (W = 0.982, p = 0.095), Inquisitive (W = 0.994, p = 0.115), Self-Discipline (W = 0.976, p = 0.298), Doubt (W = 0.976, p = 0.127), and Flexibility (W = 0.985, p = 0.134). Levene's test demonstrated that the homogeneity of variance assumption was not violated: creative thinking scale [F(1, 123) = 0.571, p = 0.180], Courage [F(1, 123) = 1.11, p = 0.293], Innovative Search [F(1, 123) = 0.469, p = 0.701], Inquisitive [F(1, 123) = 0.904, p = 0.152], Self-Discipline [F(1, 123) = 0.802, p = 0.372], Doubt [F(1, 123) = 1.90, p = 0.170], and Flexibility [F(1, 123) = 0.743, p = 0.390]. These results indicate that the scores met the required assumptions for the analysis. The EMM for the construct and its dimensions are illustrated in Table 3.

The study used covariance analysis to determine whether there was a statistically discernible difference between the posttest scores of the MCTS and its Courage, Innovative Search, Inquisitive, Self-Discipline, Doubt, and Flexibility dimensions for the EG and CG. The results are summarized in Table 3. After controlling for the pretest score, there was a statistically significant effect on the posttest score of creative thinking [F(1,122) = 9.91, p = 0.002, ηp2 = 0.075], as illustrated in Table 3. In other words, the posttest creative thinking scores of the EG (m = 92.0, SD = 6.52) were significantly higher than those of the CG (m = 88.9, SD = 6.39). The pretest score of creative thinking was discernibly linked to the posttest score [F(1,122) = 7.02, p = 0.009, ηp2 = 0.054].

There was also a statistically significant effect on the posttest score of the courage dimension [F(1,122) = 5.54, p = 0.020, ηp2 = 0.043] after controlling for the pretest score of the courage dimension, as shown in Table 3. The posttest scores of the courage dimension for the EG (m = 13.1, SD = 2.07) were discernibly higher than those of the CG (m = 12.3, SD = 3.26). The pretest score was discernibly linked to the posttest score [F(1,122) = 12.28, p < 0.001, ηp2 = 0.091] of the courage dimension.

The innovative search dimension scores of the CG and EG were also compared. Table 3 reports the results of the one-way ANCOVA conducted on the posttest innovative search scores of the EG and CG, F(1,122) = 23.9, p < 0.001, ηp2 = 0.164. The posttest scores of the innovative search dimension for the EG (m = 28.12, SD = 3.95) were discernibly higher than those of the CG (m = 24.63, SD = 3.36). The pretest score of the innovative search dimension was discernibly linked to the posttest score [F(1,122) = 11.9, p < 0.001, ηp2 = 0.089].

Further, the inquisitive dimension scores of the EG and CG were compared. Table 3 exhibits the results of the one-way ANCOVA performed on the posttest inquisitive scores of the EG and CG, F(1,122) = 5.56, p = 0.020, ηp2 = 0.044. The posttest scores of the inquisitive dimension for the EG (m = 12.40, SD = 2.11) were discernibly higher than those of the CG (m = 10.89, SD = 3.82). The pretest score of the inquisitive dimension was discernibly linked to the posttest score [F(1,122) = 4.47, p = 0.037, ηp2 = 0.035].

Table 2. Covariate analysis of the critical thinking scale and its dimensions.

Source                            Sum of squares   df    Mean square   F       p       EMM
Critical thinking
  Overall model                   2472             2     1236.1        7.47    <.001   EG: 40.3
  Group                           856              1     856.3         9.91    <.001   CG: 32.8
  Pretest                         1616             1     1615.8        7.02    <.001
  Residuals                       5434             122   44.5
Critical Openness dimension
  Overall model                   1441             2     720.4         7.60    <.001   EG: 25.7
  Group                           1159             1     1159.2        12.28   <.001   CG: 19.4
  Pretest                         282              1     281.7         5.54    0.001
  Residuals                       3245             122   26.6
Reflective Skepticism dimension
  Overall model                   117.0            2     58.5          6.86    0.001   EG: 15.2
  Group                           50.3             1     50.3          4.92    0.028   CG: 13.7
  Pretest                         66.7             1     66.7          6.53    0.012
  Residuals                       1246.2           122   10.2

EMM: estimated marginal means.
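The effect sizes reported throughout Section 5 (printed as n2p in the extracted text and written ηp2 above) are partial eta squared values. For reference, the standard definition in terms of an ANCOVA table's sums of squares is

\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}

where SS_effect is the sum of squares of the effect of interest (for example, the group term) and SS_error is the residual sum of squares from the same model.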


Table 3. Covariate analysis of the creative thinking scale and its dimensions.

Source                            Sum of squares   df    Mean square   F       p       EMM
Creative thinking
  Overall model                   671              2     335.7         7.47    <.001   EG: 92.3
  Group                           393              1     392.9         9.91    0.002   CG: 88.7
  Pretest                         278              1     278.5         23.20   0.009
  Residuals                       4839             122   39.7
Courage dimension
  Overall model                   123.5            2     61.74         7.60    <.001   EG: 13.2
  Group                           38.4             1     38.40         5.54    0.020   CG: 12.1
  Pretest                         85.1             1     85.09         12.28   <.001
  Residuals                       845.2            122   6.93
Innovative Search dimension
  Overall model                   439              2     219.4         21.4    <.001   EG: 27.9
  Group                           293              1     293.1         23.9    <.001   CG: 24.8
  Pretest                         146              1     145.8         11.9    <.001
  Residuals                       1496             122   12.3
Inquisitive dimension
  Overall model                   95.4             2     47.68         12.92   0.003   EG: 12.3
  Group                           53.0             1     53.01         6.48    0.020   CG: 11.0
  Pretest                         42.4             1     42.36                 0.037
  Residuals                       1156.3           122   9.48
Self-Discipline dimension
  Overall model                   115.9            2     58.0          4.36    0.015   EG: 19.3
  Group                           46.4             1     46.4          4.64    0.033   CG: 18.0
  Pretest                         69.5             1     69.5          6.95    0.009
  Residuals                       1220.6           122   10.0
Doubt dimension
  Overall model                   35.0             2     17.49         6.81    0.002   EG: 7.81
  Group                           18.6             1     18.65         6.41    0.013   CG: 8.59
  Pretest                         16.3             1     16.32         5.61    0.019
  Residuals                       355.1            122   2.91
Flexibility dimension
  Overall model                   125.0            2     62.50         17.2    <.001   EG: 12.0
  Group                           59.0             1     58.98         14.5    <.001   CG: 10.6
  Pretest                         66.0             1     66.02         16.2    <.001
  Residuals                       497.4            122   4.08

EMM: estimated marginal means.

After controlling for the pretest score, there was a statistically significant effect on the posttest score of the self-discipline dimension [F(1,122) = 4.64, p = 0.033, ηp2 = 0.037], as described in Table 3. The posttest scores of the self-discipline dimension for the EG (m = 19.0, SD = 2.82) were discernibly higher than those of the CG (m = 18.25, SD = 3.58). The pretest score of the self-discipline dimension was discernibly linked to the posttest score [F(1,122) = 6.95, p = 0.009, ηp2 = 0.054].

There was also a statistically discernible effect on the posttest score of the doubt dimension [F(1,122) = 6.81, p = 0.002, ηp2 = 0.050] after controlling for the pretest score of the doubt dimension, as seen in Table 3. The posttest scores of the doubt dimension for the EG (m = 8.63, SD = 1.48) were significantly higher than those of the CG (m = 7.77, SD = 1.94). The pretest score was discernibly linked to the posttest score [F(1,122) = 5.61, p = 0.019, ηp2 = 0.091] of the doubt dimension.

Lastly, the flexibility dimension scores of the CG and EG were compared. Table 3 shows the results of the one-way ANCOVA performed on the posttest flexibility scores of the EG and CG, F(1,122) = 17.2, p < 0.001, ηp2 = 0.106. The posttest scores of the flexibility dimension for the EG (m = 12.10, SD = 1.66) were discernibly higher than those of the CG (m = 10.55, SD = 2.50). The pretest score of the flexibility dimension was discernibly linked to the posttest score [F(1,122) = 16.2, p < 0.001, ηp2 = 0.117].

5.3. Reflective thinking

To address RQ3, data were garnered from the respondents and tested for statistically discernible differences between the RTS and its dimensions (Understanding, Reflection, Critical Reflection, and Habitual Action) for the CG and EG. The CG's reflective thinking pretest score was m = 36.4 and its posttest score m = 51.9; in contrast, the pretest score for the EG was m = 35.1 and its posttest score m = 56.6.

Preliminary inspections were conducted to evaluate the assumptions of normality, linearity, and homogeneity of variance. The pretest scores of the respondents in both the CG and EG were computed using the reflective thinking scale and its dimensions, and one-way ANCOVA was then used to analyze any disparities between the two groups. A Shapiro-Wilk test indicated that the reflective thinking scale scores were normally distributed in the CG and EG: reflective thinking scale (W = 0.981, p = 0.084), understanding (W = 0.980, p = 0.079), habitual action (W = 0.997, p = 0.094), critical reflection (W = 0.983, p = 0.087), and reflection (W = 0.987, p = 0.090). Levene's test demonstrated that the homogeneity of variance assumption was not violated: reflective thinking scale [F(1, 123) = 1.39, p = 0.186], understanding [F(1, 123) = 1.11, p = 0.533], habitual action [F(1, 123) = 2.20, p = 0.141], critical reflection [F(1, 123) = 0.390, p = 0.081], and reflection [F(1, 123) = 1.84, p = 0.177]. These results indicate that the scores met the required assumptions for the analysis. The EMM for the construct and its dimensions are illustrated in Table 4.

The study employed covariance analysis to determine whether there was a statistically discernible difference between the posttest scores of the reflective thinking scale and its Understanding, Habitual Action, Critical Reflection, and Reflection dimensions for the EG and CG. The results are detailed in Table 4. After controlling for the pretest score, there was a statistically discernible effect on the posttest scores of the RTS [F(1,122) = 23.20, p < 0.001, ηp2 = 0.160], as described in Table 4. The posttest reflective thinking scores of the EG (m = 56.6, SD = 6.21) were discernibly higher than those of the CG (m = 51.9, SD = 5.72). The pretest score of reflective thinking was discernibly linked to the posttest score [F(1,122) = 6.71, p = 0.011, ηp2 = 0.052].

Additionally, there was a statistically discernible effect on the posttest score of the understanding dimension [F(1,122) = 4.63, p = 0.033, ηp2 = 0.037] after controlling for the pretest score of the understanding dimension, as detailed in Table 4. The posttest scores of the understanding dimension for the EG (m = 13.4, SD = 2.44) were discernibly higher than those of the CG (m = 12.9, SD = 1.76).


Table 4. Covariate analysis of the reflective thinking scale and its dimensions.

Source                            Sum of squares   df    Mean square   F       p       EMM
Reflective thinking
  Overall model                   1015             2     507.3         21.10   <.001   EG: 58.1
  Group                           228              1     227.6         41.31   <.001   CG: 53.1
  Pretest                         787              1     787.0         4.11    0.045
  Residuals                       4139             122   33.9
Understanding dimension
  Overall model                   135.7            2     67.83         18.29   <.001   EG: 13.5
  Group                           16.3             1     16.26         4.63    0.033   CG: 12.4
  Pretest                         119.4            1     119.40        33.96   <.001
  Residuals                       428.9            122   3.52
Habitual Action dimension
  Overall model                   89.73            2     44.87         14.67   <.001   EG: 15.6
  Group                           82.38            1     82.38         23.49   <.001   CG: 13.9
  Pretest                         7.35             1     7.35          2.10    0.150
  Residuals                       427.90           122   3.51
Critical Reflection dimension
  Overall model                   300              2     149.78        13.2    <.001   EG: 14.7
  Group                           185              1     184.74        18.8    <.001   CG: 12.2
  Pretest                         115              1     114.83        11.7    <.001
  Residuals                       1201             122   9.84
Reflection dimension
  Overall model                   175.2            2     87.62         17.44   <.001   EG: 15.7
  Group                           137.4            1     137.42        20.84   <.001   CG: 13.5
  Pretest                         37.8             1     37.83         5.74    0.018
  Residuals                       804.5            122   6.59

EMM: estimated marginal means.
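The EMM columns in Tables 2-4 are estimated marginal means: model-predicted posttest means for each group evaluated at the grand mean of the pretest covariate. A minimal sketch of how they can be obtained from a fitted ANCOVA model such as the one in the Section 4.6 sketch is shown below; the column names are again assumptions, and this is not the authors' Jamovi procedure.

```python
import pandas as pd

def estimated_marginal_means(model, df: pd.DataFrame) -> pd.Series:
    """Predict each group's posttest mean while holding the pretest at its grand mean."""
    grid = pd.DataFrame({
        "group": ["EG", "CG"],
        "pretest": [df["pretest"].mean()] * 2,  # covariate fixed at its overall mean
    })
    preds = model.predict(grid)                 # adjusted (marginal) group means
    return pd.Series(list(preds), index=list(grid["group"]))
```

Holding the covariate constant in this way is what makes the group means in the tables directly comparable despite pretest differences.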

5.4. Student opinions on use ChatGPT


Table 5
Distribution of themes arising from students responses (n = 15).
Table 5 displays the results of the main open-ended questions that
Questions Themes n(%) were asked to students after the intervention, inquiring about their ex­
Positive aspects Enhanced self-efficacy and confidence in research 12 (80.0) periences with the ChatGPT mechanism and its advantages and disad­
Heightened knowledge and performance 10 (66.7) vantages. While all students were aware of ChatGPT, none of them was
Interactive, human-like, and engaging 13 (86.7) expecting it to be used by their instructor as a supporting tool because
Pacing in learning 13 (86.7)
Efficient in providing response 15 (100)
they have heard that other institutions have banned it. Their knowledge
Negative aspects Provided incorrect information 5 (33.3) before this experiment, however, was limited to using it “similarly to
Generated false citations/references 12 (80.0) Google search engine” or “as a live Wikipedia”.
Future aspects Should be used in all lessons/courses 11 (73.3) Below there are some illustrative interview responses cited by the
Want to use for research activities 14 (93.3)
students in the experimental group (EG).
“I think the ChatGPT activity was particularly helpful because it gave
to the posttest score [F(1,122) = 33.96, p < 0.001, n2p = 0.218] of the me the chance to practice pinpointing conditions where each sam­
understanding dimension. pling method would be appropriate. This helped me feel more
After controlling for the pretest score, there was a statistically confident in my ability to apply sampling methods in my own
discernible effect on the posttest score of the habitual dimension [F research studies.” (participant 2)
(1,122) = 23.49, p < 0.001, n2p = 0.161], as depicted in Table 4. To
represent it differently, the posttest scores of the habitual dimension for “Using ChatGPT to learn about sampling methods permitted me to
the EG (m = 15.2, SD = 2.08) were discernibly higher than the habitual learn at my own pace, which made the material less intimidating and
dimension posttest scores of the CG (m = 13.4, SD = 1.68). The pretest more manageable." (participant 3)
score of the habitual dimension was not discernibly linked to the posttest “I really appreciated the ChatGPT activity because it allowed me to
score [F(1,122) = 2.10, p = 0.150]. practice applying sampling techniques to real-life scenarios. It hel­
Further, the critical reflection dimension scores of the CG and EG ped me feel more confident in my understanding of the material.”
were compared. Table 4 illustrates the results of one-way ANCOVA (participant 4)
executed on the posttest critical reflection dimension scores of the
CGand EG, [F(1,122) = 13.2, p < 0.001, n2p = 0.133]. Similarly, the “ChatGPT gave intriguing suggestions to the prompts I provided.
posttest scores for the EG (m = 14.5, SD = 3.89) were discernibly higher ChatGPT also helped me understand that selecting the right sampling
than the critical reflection dimension posttest scores of the CG (m = method is critical for obtaining accurate and reliable results. I now
12.3, SD = 2.57). The pretest score of the critical reflection dimension realize that a poorly designed sampling plan can greatly undermine
was discernibly linked to the posttest score [F(1,122) = 11.7, p < 0.001, the credibility of a study.” (participant 8)
n2p = 0.087]. “At a point in time, I felt like I was communicating with my lecturer.
Ultimately, the flexible dimension scores of the CG and EG were It feels more human-like than I thought. Besides, I found the ChatGPT
approximated. Table 4 shows the results of one-way ANCOVA executed activity to be an effective way to learn about the different sampling
on the posttest flexible dimension scores of the EG and CG, F(1,122) = methods. The interactive nature of the platform kept me engaged and
20.84, p < 0.001, n2p = 0.146. In other words, the posttest scores of the made it easier to absorb the material.” (participant 9)
flexible dimension for the EG (m = 15.9, SD = 2.78) were discernibly
higher than the flexible dimension posttest scores of the CG (m = 13.4, “ChatGPT enhanced my understanding of sampling methods. I could
SD = 2.45). The pretest score of the flexible dimension was discernibly control and suggest how I want it to explain my prompts.” (partici­
linked to the posttest score [F(1,122) = 5.74, p < 0.018, n2p = 0.045]. pant 10)
“The ChatGPT activity helped make the material more engaging and
interactive. I learned a lot more than I would have from a traditional
lecture.” (participant 11)

8
H.B. Essel et al. Computers and Education: Artificial Intelligence 6 (2024) 100198

“I think the ChatGPT activity was particularly helpful because it gave me the chance to practice pinpointing conditions where each sampling method would be appropriate. This helped me feel more confident in my ability to apply sampling methods in my own research studies.” (participant 2)

“Using ChatGPT to learn about sampling methods permitted me to learn at my own pace, which made the material less intimidating and more manageable.” (participant 3)

“I really appreciated the ChatGPT activity because it allowed me to practice applying sampling techniques to real-life scenarios. It helped me feel more confident in my understanding of the material.” (participant 4)

“ChatGPT gave intriguing suggestions to the prompts I provided. ChatGPT also helped me understand that selecting the right sampling method is critical for obtaining accurate and reliable results. I now realize that a poorly designed sampling plan can greatly undermine the credibility of a study.” (participant 8)

“At a point in time, I felt like I was communicating with my lecturer. It feels more human-like than I thought. Besides, I found the ChatGPT activity to be an effective way to learn about the different sampling methods. The interactive nature of the platform kept me engaged and made it easier to absorb the material.” (participant 9)

“ChatGPT enhanced my understanding of sampling methods. I could control and suggest how I want it to explain my prompts.” (participant 10)

“The ChatGPT activity helped make the material more engaging and interactive. I learned a lot more than I would have from a traditional lecture.” (participant 11)

“ChatGPT was really helpful in understanding complex topics like sampling methods. Next time, it would be great if it could provide more real-world examples and practical exercises in other courses too, such as historical analyses.” (participant 13)

“In mathematics, ChatGPT could adapt its explanations based on the students’ level, ensuring a more personalized result.” (participant 15)

According to the interviews conducted with EG students, it can be concluded that they found the use of ChatGPT as a learning mechanism for sampling methods to be beneficial and influential. All of the students reported relishing the flexibility of the ChatGPT activity, which allowed them to learn at their own pace and in a less intimidating mode. They also stated that the interactive nature of the platform made it easier to absorb the material and increased their confidence in their familiarity with sampling methods. The students further asserted that the ChatGPT activity assisted them in developing critical thinking skills by enabling them to apply sampling methods to real-life scenarios. Besides, they described that the activity stimulated reflective thinking by illustrating the significance of choosing the appropriate sampling method to obtain valid and reliable results. Ultimately, the students declared that the activity stimulated creative thinking by permitting them to practice recognizing conditions where each sampling technique would be suitable. ChatGPT was also suggested for diverse types of courses, such as history and mathematics.

6. Discussion

This experimental study examined how leveraging the ChatGPT large language model affects students’ critical, reflective, and creative thinking skills. The study involved providing didactic assistance to the students in the EG using ChatGPT during class activities, while the CG received didactic assistance without ChatGPT. The outcomes of the study are explained and discussed below.

The first research question of the study involved examining the students’ critical thinking skills in both the EG and CG. According to the study findings, there was a discernible variance in critical thinking scores between the EG and CG at both pretest and posttest. Moreover, the EG demonstrated a significant increase in critical thinking, reflective skepticism, and critical openness compared to the CG. Thus, the study’s quantitative results illustrated that leveraging ChatGPT for in-class tasks effectively improved students’ critical thinking skills. One feasible explanation for the significance of ChatGPT in enhancing critical thinking skills is that it furnishes students with the possibility to engage in dialogues with an AI model that prompts them to think critically. ChatGPT may offer students feedback and guidance, which can help them develop a deeper understanding (Rospigliosi, 2023) of the topic and enhance their critical thinking skills (Darvishi et al., 2022; Kasneci et al., 2023). Another possible justification is that ChatGPT may provide students with a more personalized learning experience. This personalized learning procedure can benefit students who struggle with critical thinking and may require additional support to improve their skills. Unlike conventional, lecture-based classroom didactics, ChatGPT can adjust the difficulty level and pace of instruction to match individual students’ needs and learning styles (Dwivedi et al., 2023; Tlili et al., 2023). AI chatbot models have demonstrated a significant impact on critical thinking skills in a variety of empirical studies. For instance, multiple studies (Liang, 2022; Long et al., 2016) discovered that leveraging technology-based learning interventions can improve university students’ critical thinking skills. Notably, the improvement in critical thinking skills was observed not only in the overall score of the critical thinking scale but also in its two dimensions. The first dimension, critical openness, refers to the willingness to consider alternative perspectives and to be exposed to novel ideas. The second dimension, reflective skepticism, refers to the ability to question assumptions, pursue evidence, and evaluate arguments. The improvement in these dimensions among the EG points out that ChatGPT can facilitate critical thinking skills by providing a platform for students to interact with diverse perspectives and ideas. Likewise, the study findings illustrated that the pre-critical thinking score was discernibly linked to the post-critical thinking score. This finding illustrates that students who had higher levels of critical thinking skills prior to the intervention exhibited more remarkable advancement in their critical thinking skills after leveraging ChatGPT for in-class tasks. This finding is consistent with prior studies that have illustrated a positive nexus between prior critical thinking skills and the effectiveness of interventions to improve critical thinking (Hapsari & Wu, 2022; Goda et al., 2014). In the same vein, recent studies suggest that AI chatbot models can revolutionize education and develop new prospects for students to comprehend and develop critical thinking skills to enhance their learning experiences (Jamal et al., 2023; Tsai et al., 2023). Thus, leveraging ChatGPT can elicit critical thinking skills and problem-solving (Kasneci et al., 2023) by providing a platform for students to engage with diverse perspectives and ideas.

The study’s second research question involved comparing the creative thinking skills of students who used ChatGPT for their in-class tasks and those who did not use it. The findings indicate that leveraging ChatGPT for students’ in-class activities contributes to developing creative thinking skills, courage, the innovative search dimension, inquisitiveness, self-discipline, doubt, and flexible skills. Thus, students in the EG increased their creative thinking due to their interactions with ChatGPT in executing in-class tasks. The literature has yet to reach a definitive conclusion regarding the influence of AI chatbot models on students’ creative thinking abilities (Tang et al., 2022). However, some studies support our findings. Chang and Yu (2015) discovered that students who engaged in an AI-driven online synchronous learning system (chatbots) exhibited higher scores on their creative thinking outputs. Chang (2017) conducted a study that yielded comparable findings, as it investigated the impact of AI cloud-based mobile learning tools, including Facebook and the Cubie app, on the creative thinking outputs of students. Scholars such as Gangadharbatla (2010) and Middleton (2005) have recognized creative thinking as an essential element of educational technology and, for that matter, of AI chatbot models. ChatGPT is implied to contribute to creative thinking via pleasure (fun), as identified by the EG students interviewed, and this finding is in line with Navarrete (2013). Additionally, Root-Bernstein and Root-Bernstein (1999) based their descriptions of personal pleasure on the premise that creative thinking is closely linked with emotions and intuition. Thus, the findings in the study indicate that incorporating creative thinking in a student-centered learning approach can offer students a rewarding and enjoyable educational experience through an authentic AI chatbot model, such as ChatGPT, that facilitates profound and insightful learning (Navarrete, 2013) and advances creative thinking skills. Although leveraging ChatGPT influenced students’ creative thinking, the findings suggest that using ChatGPT, or any AI chatbot model, may only partially capture the creative experience, which is shaped by personal and contextual factors. While there may be a connection between creative thinking and intelligence, socio-constructivist perspectives or personal and relevant experiences may provide a more accurate description of the creative experience (Gangadharbatla, 2010; Ambrose et al., 2003). Despite this, certain studies have demonstrated that AI chatbot models could occasionally diminish students’ creative thinking because of their higher cognitive load demands (Tang et al., 2022). Rubino et al. (2018) discovered that using a digital platform reduced creative thinking among students. The researchers assessed the creative thinking process by independently observing the behaviors and dialogues of students in the classroom. These findings imply that just as ChatGPT can promote students’ creative thinking, as in this study, it can also potentially diminish
students’ creative thinking. Thus, while ChatGPT may provide valuable insights and generate creative ideas, it may only partially replicate the nuances and complexity of human creativity shaped by socio-cultural factors and personal experiences. Nonetheless, certain studies have specified that the impact of AI chatbot models would rely on the scaffolding strategies employed by teachers (Chang, 2017), which aim to promote independent thinking and learning among students and reduce their dependence on teachers (Wodaj & Belay, 2021). Finally, the fact that the participants in both groups were pursuing a Quantitative Research Design course inherently encouraged a heightened level of analytical thinking. This academic background, coupled with the students’ ability to scrutinize AI-generated content, likely played a pivotal role in their readiness and inclination to engage in metacognitive examination of AI data and discussion of its prompts within their respective zones of proximal development (ZPDs). This unique characteristic of the student groups could account for the significant improvements observed in their critical, creative, and reflective thinking skills. It underscores the importance of considering the composition of the student body in interpreting the study’s outcomes and highlights how their prior preparation and context were conducive to meaningful engagement with the AI-driven learning environment implemented in the course.

The third research question addressed in the study concerned the comparison of reflective thinking skills between students who utilized ChatGPT for in-class activities and those who did not use any LLMs. The study’s findings demonstrated that leveraging ChatGPT for in-class tasks contributed to maturing the reflective thinking skills of students. Regarding the dimensions of reflective thinking, the present study reported a significant influence on the students’ understanding, habitual action, critical reflection, and reflection. In line with the present study, Yılmaz (2020) discovered a statistically significant positive impact on students’ reflective thinking skills, although that study employed AI analytics. The same author also showed that students in the EG demonstrated significantly higher levels of understanding, habitual action, critical reflection, and reflection compared to the CG. A study by Liu et al. (2023) discovered that students leveraging reflective thinking-promoting mechanisms in AI-supported conditions (in the form of a chatbot) reported positive learning experiences and perceptions of the approach, such as reflection, engagement, augmented motivation, and feedback utilization. Moreover, the authors discovered that the AI-supported condition enhanced the writing skills of students in the EG, heightened their self-regulated learning and self-efficacy, and considerably decreased their cognitive load (Shadiev & Huang, 2020). By leveraging an AI-supported feedback model to enhance in-class tasks and elucidate writing standards, multiple studies (Allen & McNamara, 2015; Roscoe et al., 2015; Snow et al., 2015) reported similar findings that encouraged students’ self-regulated learning. In their study, Wang et al. (2023) found that providing AI-supported feedback to students resulted in a statistically significant positive impact on their self-regulated learning abilities and engagement level, and the dimensions of the self-regulation scale (planning before writing, planning during writing, attention regulation, organization, checking and correcting, and content monitoring) showed significant differences favoring the experimental group. Although AI chatbot models (automated feedback systems) can assist learning, technological limitations must be considered for pedagogical purposes (Palermo & Wilson, 2020) to foster reflective thinking successfully. These findings illustrate that when students leverage ChatGPT to assist in in-class tasks, they can observe and monitor their learning patterns, create strategies to improve them, and evaluate their effectiveness. In this study, the significant difference observed in the post-reflective thinking scores between the EG and CG illustrated that presenting excessive information to students from different aggregators and databases may increase the effort required to perform a particular task and affect their reflection; however, leveraging ChatGPT to assist them in constructing their knowledge of this learning information can benefit students. Moreover, the findings indicated that using ChatGPT as a support system emphasized the importance of efficient learning approaches in fostering reflection (Carless, 2019).

The final research question had to do with an interview session in which students in the EG expressed their opinions about the positives and negatives, as well as the future use, of ChatGPT for in-class tasks. The findings from the students’ interviews suggested that leveraging ChatGPT for in-class tasks was effective in terms of enhancing critical, creative, and reflective thinking and reducing the burden on the instructor for correcting in-class tasks. Likewise, the students recognized that leveraging ChatGPT enhanced the effectiveness of learning research methodology, furnishing more opportunities for reflection as they had less information to analyze and synthesize. Lin (2019) argued that too much information could overwhelm students, but constructing their knowledge with effective learning strategies can help them benefit from the presented information. Moreover, the findings confirmed that the students were highly motivated to learn. They expressed willingness to leverage ChatGPT in other courses and to recommend it to different teachers and students. These findings are in line with the study by Liu et al. (2021). They indicate that students perceive ChatGPT as a valuable educational tool that has the potential to enhance their learning experiences in diverse contexts. However, the study also demonstrated some adverse characteristics of ChatGPT’s usage. Some students reported that ChatGPT furnished inaccurate information, while others expressed concerns that it generated false citations and references. These concerns highlight the importance of guaranteeing that AI chatbot models are accurate and reliable in higher education. Thus, McDaniel et al. (2021) stress the importance of incorporating personalized learning services for students in AI-supported learning tasks.

6.1. Conclusions

In conclusion, this experimental study has demonstrated the potential benefits of leveraging ChatGPT to promote students’ cognitive skills. Besides, the findings of this study illustrate that leveraging ChatGPT during didactic assistance in in-class activities can positively impact students’ critical, creative, and reflective thinking skills. Specifically, the EG exhibited significant improvement in their critical, creative, and reflective thinking scores compared to the CG, suggesting that ChatGPT can be an effective didactic mechanism for enhancing these skills. Although this study did not assess learning outcomes, and the tests primarily measured changes in concrete skills rather than specific learning achievements, we can say that leveraging ChatGPT for in-class tasks enhances the development of cognitive skills. As technology in education continues to grow, we suggest that ChatGPT can be a valuable mechanism for academics in higher education to consider. By integrating AI chatbot models in classroom tasks, academics can assist students in developing their critical, reflective, and creative thinking skills.

6.2. Implications for practice

The implications of this study for higher education are manifold. ChatGPT emerges as a valuable tool with the potential to revolutionize the teaching of critical, creative, and reflective thinking in the realm of education. Our research demonstrates that students can make substantial strides in honing these cognitive skills when guided by ChatGPT. This is particularly pertinent in the context of higher education, where fostering critical thinking is of paramount importance. The findings from this study strongly advocate for the integration of ChatGPT into higher education curricula as a means to bolster students’ critical, creative, and reflective thinking abilities. Such an approach holds the promise of enhancing various courses that demand rigorous thinking, such as research methodology classes and those within the domains of social sciences, humanities, and education. Consequently, the results of this study serve as compelling evidence supporting the adoption of innovative teaching methodologies in higher education to elevate the quality of student learning outcomes.


6.3. Limitations and future studies

While the findings of this study are promising, it is crucial to acknowledge some limitations that warrant consideration. First, the study focused exclusively on third-year undergraduate students within a specific department, all of whom were enrolled in a research methodology course that employed the FC approach for one academic semester. This homogeneity of participants and course context may introduce a Hawthorne effect, wherein participants may have altered their behavior due to their awareness of being observed or participating in a novel educational experiment (McCambridge et al., 2014). To address this, future research should broaden its scope by recruiting participants from diverse departments and academic levels, including first-year, second-year, fourth-year, and postgraduate students. Additionally, exploring different teaching methods and strategies in various courses can yield more generalizable results and mitigate potential Hawthorne effects. Moreover, it is important to consider that the initial variance in critical thinking scores between the EG and CG at the pre-test stage may have influenced the observed percentage increase in scores. We acknowledge the importance of ensuring as much equivalence as possible between the two groups, particularly after the pre-test. In this study, the students were initially randomized into the EG and CG to minimize potential biases. Future studies could explore methods to achieve greater equivalence between the groups after the pre-test, ensuring that the results accurately reflect the impact of ChatGPT on critical thinking skills while controlling for any initial variations. Furthermore, future studies should consider conducting more extended investigations to measure the long-term impact of ChatGPT usage on students’ cognitive abilities. This approach would provide valuable insights into the sustainability and durability of the observed improvements. Another limitation to consider is the reliance on self-reported data, which may be susceptible to social desirability bias. Future research could employ a mixed-methods approach, combining self-reported data with objective assessments to enhance the validity of findings. Moreover, this study focused primarily on the influence of ChatGPT on critical, creative, and reflective thinking skills. To gain a comprehensive understanding of AI’s impact on education, future studies should explore its influence on other essential cognitive skills and competencies, such as metacognitive skills, problem-solving, and digital literacy. Additionally, investigating how ChatGPT supports the development of non-cognitive skills like emotional intelligence and social skills would contribute to a more holistic assessment of its educational value. It is essential to recognize that this study was conducted within the specific context of a Ghanaian university. The socio-cultural and economic backgrounds of the participants may not be representative of other settings. Therefore, future research should aim to replicate these findings in diverse educational contexts, universities, and countries to ascertain their consistency and generalizability. Ethical considerations and challenges associated with ChatGPT usage in education also warrant further exploration in future studies. Understanding the ethical implications and potential drawbacks of AI integration is crucial for responsible implementation in educational settings. Additionally, future research could delve into the impact of socio-cultural and personal factors on creative thinking compared to using ChatGPT. This could involve measuring creative thinking abilities in students exposed to different socio-cultural contexts and personal experiences and comparing their results to those who have access to ChatGPT. Lastly, an intriguing avenue for research would be to compare the effectiveness of different language models, such as Google Bard, in the classroom to identify the most suitable AI tools for specific educational contexts.

Ethical approval

Written informed consent was obtained from the participants of the study. This research was approved by the Department of Educational Innovations in Science and Technology Ethics Committee (EIST-EC) at Kwame Nkrumah University of Science and Technology, Kumasi, Ghana (ref. number: EIST-EC/REF No.:/January 01, 2023).

Funding

N/A.

Declarations

No conflict of interest has been reported by the authors. All authors have read and approved the final version of the paper.

Data availability and sharing policy

The data that support the findings of this study are available from the corresponding author, [DV], upon reasonable request.

CRediT authorship contribution statement

Harry Barton Essel: Writing – original draft, Methodology, Formal analysis, Data curation, Conceptualization. Dimitrios Vlachopoulos: Writing – review & editing, Supervision, Project administration, Investigation. Albert Benjamin Essuman: Writing – original draft, Software, Formal analysis, Data curation. John Opuni Amankwa: Writing – original draft, Validation, Formal analysis.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

N/A.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.caeai.2023.100198.

Study acronyms

LLMs Large Language Models
ChatGPT Chat Generative Pre-trained Transformer
AI Artificial Intelligence
IoT Internet of Things
NLP Natural Language Processing
DL Deep Learning
FC Flipped Classroom
TPC Theory Practical Credit
CWA Cumulative Weighted Average
CFA Confirmatory Factor Analysis
CFI Comparative Fit Index
GFI Goodness of Fit Index
IFI Incremental Fit Index
AGFI Adjusted Goodness of Fit Index
RMSEA Root Mean Square Error of Approximation
SRMR Standardized Root Mean Square Residual
EMM Estimated Marginal Means
EG Experimental Group
CG Control Group
CTS Critical Thinking Scale
MCTS Creative Thinking Scale
RTS Reflective Thinking Scale
LMS Learning Management System


References

Akpur, U. (2020). Critical, reflective, creative thinking and their reflections on academic achievement. Thinking Skills and Creativity, 37, Article 100683.
Al Darayseh, A. S. (2023). Acceptance of artificial intelligence in teaching science: Science teachers’ perspective. Computers and Education: Artificial Intelligence, 4, Article 100132.
Allen, L. K., & McNamara, D. S. (2015). Promoting self-regulated learning in an intelligent tutoring system for writing. In Artificial intelligence in education: 17th international conference, AIED 2015, Madrid, Spain, June 22–26, 2015, proceedings 17 (pp. 827–830). Springer International Publishing.
Ambrose, D., Cohen, L. M., & Tannenbaum, A. J. (Eds.). (2003). Creative intelligence: Toward theoretic integration. Cresskill, NJ: Hampton Press.
Biasi, M. R. D., Valencia, G. E., & Obregon, L. G. (2019). A new educational thermodynamic software to promote critical thinking in youth engineering students. Sustainability, 12(1), 110.
Blomhøj, M. (2011). Modelling competency: Teaching, learning and assessing competencies – Overview. Trends in Teaching and Learning of Mathematical Modelling, ICTMA14, 343–347.
Carless, D. (2019). Feedback loops and the longer-term: Towards feedback spirals. Assessment & Evaluation in Higher Education, 44(5), 705–714.
Chang, H. Y. (2017). How to augment the learning impact of computer simulations? The designs and effects of interactivity and scaffolding. Interactive Learning Environments, 25(8), 1083–1097.
Chang, Y. S., & Yu, K. C. (2015). The relationship between perceptions of an innovative environment and creative performance in an online synchronous environment. Computers in Human Behavior, 49, 38–43.
Chen, M. R. A., Hwang, G. J., & Chang, Y. Y. (2019). A reflective thinking-promoting approach to enhancing graduate students’ flipped learning engagement, participation behaviors, reflective thinking and project learning outcomes. British Journal of Educational Technology, 50(5), 2288–2307.
Darvishi, A., Khosravi, H., Sadiq, S., & Gašević, D. (2022). Incorporating AI and learning analytics to build trustworthy peer assessment systems. British Journal of Educational Technology, 53(4), 844–875.
Dwivedi, Y. K., Kshetri, N., Hughes, L., Slade, E. L., Jeyaraj, A., Kar, A. K., … Wright, R. (2023). “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. International Journal of Information Management, 71, Article 102642. https://doi.org/10.36227/techrxiv.21789434.v1
Essel, H. B., Vlachopoulos, D., Tachie-Menson, A., Johnson, E. E., & Baah, P. K. (2022). The impact of a virtual teaching assistant (chatbot) on students’ learning in Ghanaian higher education. International Journal of Educational Technology in Higher Education, 19(1), 1–19.
Firat, M. (2023). How Chat GPT can transform autodidactic experiences and open education? https://doi.org/10.31219/osf.io/9ge8m
Fox, J., & Weisberg, S. (2020). car: Companion to applied regression [R package]. Retrieved from https://cran.r-project.org/package=car
Gangadharbatla, H. (2010). Technology component: A modified systems approach to creative thought. Creativity Research Journal, 22(2), 219–227.
Goda, Y., Yamada, M., Matsukawa, H., Hata, K., & Yasunami, S. (2014). Conversation with a chatbot before an online EFL group discussion and the effects on critical thinking. Journal of Information Systems Education, 13(1), 1–7.
Hapsari, I. P., & Wu, T. T. (2022). AI chatbots learning model in English speaking skill: Alleviating speaking anxiety, boosting enjoyment, and fostering critical thinking. In Innovative technologies and learning: 5th international conference, ICITL 2022, virtual event, August 29–31, 2022, proceedings (pp. 444–453). Cham: Springer International Publishing.
Haque, M. U., Dharmadasa, I., Sworna, Z. T., Rajapakse, R. N., & Ahmad, H. (2022). “I think this is the most disruptive technology”: Exploring sentiments of ChatGPT early adopters using Twitter data. arXiv preprint arXiv:2212.05856.
Ivankova, N. V., Creswell, J. W., & Stick, S. L. (2006). Using mixed-methods sequential explanatory design: From theory to practice. Field Methods, 18(1), 3–20.
Jamal, A., Solaiman, M., Alhasan, K., Temsah, M. H., & Sayed, G. (2023). Integrating ChatGPT in medical education: Adapting curricula to cultivate competent physicians for the AI era. Cureus, 15(8).
Jamovi project. (2021). Jamovi (Version 2.3.24) [Computer software]. https://www.jamovi.org
Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., … Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, Article 102274.
Kassymova, G. K., Kenzhaliyev, O. B., Kosherbayeva, A. N., Triyono, B. M., & Ilmaliyev, Z. B. (2020). E-learning, dilemma and cognitive competence. Journal of Talent Development and Excellence, 12(2s), 3689.
Kember, D., Jones, A., Loke, A., McKay, J., Sinclair, K., Tse, H., Webb, C., Wong, F., Wong, M., & Yeung, E. (1999). Determining the level of reflective thinking from students’ written journals using a coding scheme based on the work of Mezirow. International Journal of Lifelong Education, 18(1), 18–30.
Kooli, C. (2023). Chatbots in education and research: A critical examination of ethical implications and solutions. Sustainability, 15(7), 5614.
Kung, T. H., Cheatham, M., Medenilla, A., Sillos, C., De Leon, L., Elepaño, C., … Tseng, V. (2023). Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digital Health, 2(2), Article e0000198.
Leippold, M. (2022). Thus spoke GPT-3: Interviewing a large-language model on climate finance. Finance Research Letters, Article 103617.
Lenth, R. (2020). emmeans: Estimated marginal means, aka least-squares means [R package]. Retrieved from https://cran.r-project.org/package=emmeans
Li, Y., Sha, L., Yan, L., Lin, J., Raković, M., Galbraith, K., Lyons, K., Gašević, D., & Chen, G. (2023). Can large language models write reflectively. Computers and Education: Artificial Intelligence, 4, Article 100140. https://doi.org/10.1016/j.caeai.2023.100140
Liang, W. (2022). Towards a set of design principles for technology-assisted critical-thinking cultivation: A synthesis of research in English language education. Thinking Skills and Creativity, Article 101203.
Lin, C. J. (2019). An online peer assessment approach to supporting mind mapping flipped learning activities for college English writing courses. Journal of Computers in Education, 6(3), 385–415.
Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2023). Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9), 1–35.
Liu, X., Zheng, Y., Du, Z., Ding, M., Qian, Y., Yang, Z., & Tang, J. (2021). GPT understands, too. arXiv. https://doi.org/10.48550/arXiv.2103.10385
Long, J. D., Gannaway, P., Ford, C., Doumit, R., Zeeni, N., Sukkarieh-Haraty, O., … Song, H. (2016). Effectiveness of a technology-based intervention to teach evidence-based practice: The EBR Tool. Worldviews on Evidence-Based Nursing, 13(1), 59–65.
Luckin, R., Cukurova, M., Kent, C., & du Boulay, B. (2022). Empowering educators to be AI-ready. Computers and Education: Artificial Intelligence, 3, Article 100076.
Mbakwe, A. B., Lourentzou, I., Celi, L. A., Mechanic, O. J., & Dagan, A. (2023). ChatGPT passing USMLE shines a spotlight on the flaws of medical education. PLOS Digital Health, 2(2), Article e0000205.
McCambridge, J., Witton, J., & Elbourne, D. R. (2014). Systematic review of the Hawthorne effect: New concepts are needed to study research participation effects. Journal of Clinical Epidemiology, 67(3), 267–277.
McDaniel, M. A., Einstein, G. O., & Een, E. (2021). Training college students to use learning strategies: A framework and pilot course. Psychology Learning and Teaching, 20(3), 364–382.
Middleton, H. (2005). Creative thinking, values and design and technology education. International Journal of Technology and Design Education, 15(1), 61–71.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook. Sage.
Mollick, E. R., & Mollick, L. (2022). New modes of learning enabled by AI chatbots: Three methods and assignments. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4300783
Naseem, U., Razzak, I., Khan, S. K., & Prasad, M. (2021). A comprehensive survey on word representation models: From classical to state-of-the-art word representation language models. Transactions on Asian and Low-Resource Language Information Processing, 20(5), 1–35.
Navarrete, C. C. (2013). Creative thinking in digital game design and development: A case study. Computers & Education, 69, 320–331.
OpenAI. (2023). ChatGPT: Optimizing language models for dialogue. https://openai.com/blog/chatgpt/
Özgenel, M., & Çetin, M. (2017). Development of the Marmara creative thinking dispositions scale: Validity and reliability analysis. Marmara University Atatürk Education Faculty Journal of Educational Sciences, 46, 113–132.
Palermo, C., & Wilson, J. (2020). Implementing automated writing evaluation in different instructional contexts: A mixed-methods study. Journal of Writing Research, 12(1), 64–108.
Root-Bernstein, R., & Root-Bernstein, M. (1999). Sparks of genius: The thirteen thinking tools of creative people. Houghton Mifflin and Company.
Roscoe, R. D., Snow, E. L., Allen, L. K., & McNamara, D. S. (2015). Automated detection of essay revising patterns: Applications for intelligent feedback in a writing tutor. Grantee Submission, 10(1), 59–79.
Rospigliosi, P. A. (2023). Artificial intelligence in teaching and learning: What questions should we ask of ChatGPT? Interactive Learning Environments, 31(1), 1–3.
Rubino, I., Barberis, C., & Malnati, G. (2018). Exploring the values of writing collaboratively through a digital storytelling platform: A mixed-methods analysis of users’ participation, perspectives and practices. Interactive Learning Environments, 26(7), 882–894.
Sanusi, I. T., Olaleye, S. A., Agbo, F. J., & Chiu, T. K. (2022). The role of learners’ competencies in artificial intelligence education. Computers and Education: Artificial Intelligence, 3, Article 100098.
Shadiev, R., & Huang, Y. M. (2020). Investigating student attention, meditation, cognitive load, and satisfaction during lectures in a foreign language supported by speech-enabled language translation. Computer Assisted Language Learning, 33(3), 301–326.
Snow, E. L., Allen, L. K., Jacovina, M. E., Perret, C. A., & McNamara, D. S. (2015). You’ve got style: Detecting writing flexibility across time. In Proceedings of the fifth international conference on learning analytics and knowledge (pp. 194–202).
Sosu, E. M. (2013). The development and psychometric validation of a critical thinking disposition scale. Thinking Skills and Creativity, 9, 107–119.
Sullivan, M., Kelly, A., & McLaughlan, P. (2023). ChatGPT in higher education: Considerations for academic integrity and student learning. Journal of Applied Learning and Teaching, 6(1).
Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston, MA: Pearson.
Tang, C., Mao, S., Xing, Z., & Naumann, S. (2022). Improving student creativity through digital technology products: A literature review. Thinking Skills and Creativity, Article 101032.
Tlili, A., Shehata, B., Adarkwah, M. A., Bozkurt, A., Hickey, D. T., Huang, R., & Agyemang, B. (2023). What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education. Smart Learning Environments, 10(1), 15.
Tsai, M. L., Ong, C. W., & Chen, C. L. (2023). Exploring the use of large language models (LLMs) in chemical engineering education: Building core course problem models with Chat-GPT. Education for Chemical Engineers, 44, 71–95.
Tsankov, N. (2018). The transversal competence for problem-solving in cognitive learning. International Journal of Cognitive Research in Science, Engineering and Education, 6(3), 67.
Wang, X., Liu, Q., Pang, H., Tan, S. C., Lei, J., Wallace, M., & Li, L. (2023). What matters in AI-supported learning: A study of human-AI interactions in language learning using cluster analysis and epistemic network analysis. Computers and Education, 194.
Wodaj, H., & Belay, S. (2021). Effects of 7E instructional model with metacognitive scaffolding on students’ conceptual understanding in biology. Journal of Education in Science, Environment and Health, 7(1), 26–43.
Yılmaz, R. (2020). Enhancing community of inquiry and reflective thinking skills of undergraduates through using learning analytics-based process feedback. Journal of Computer Assisted Learning, 36(6), 909–921.
