Master's Thesis: Behavioral Detection of Cheating in Online Examination
Matus Korman
D Master thesis
Computer and Systems Sciences
Department of Business Administration and Social Sciences
Division of Information Systems Sciences
I would like to thank everyone who contributed to, opposed, assisted with, or
otherwise helped me in carrying out the study, as well as in writing this thesis, a
result of the study.
My thanks go to Dan Harnesk, PhD (supervisor), Sören Samuelsson, PhD, and
John Lindström, PhD, for the valuable advice and research guidance I was given;
to Hugo Quisbert, PhD, Artjom Vassiljev and Viola Veiderpass for constructive
opposition; to Lars Furberg for the ideas that helped me navigate to the chosen
research problem and for the interesting discussions we had; to Neil Costigan, PhD,
for his inspiring work and presentations; to professor Ann Hägerfors for managing
issues related to my study; and to my family for their mental support and advice.
My further thanks go to Amir Molavi, Onur Yirmibesoglu, Marko Niemimaa, Elina
Laaksonen, Nebojsa Mihajlovski, Vladimir Kichatov, Ali Fakhr, Darya Plankina,
Anna Selischeva, Sana Rouis, Svante Edzén, Peter Anttu, and others who con-
tributed to my flow of thought through discussions, or supported me in various
other ways.
Thanks also to the contributions of all of you, the study has been carried out the
way it has, and I feel I have learned valuable knowledge and gained practice that
will be of use in the future.
Abstract
The need for and use of online or computer-based examination seems to be growing,
while this form of examination gives students a broader spectrum of opportunities,
including opportunities for cheating, compared to non-computerized forms of examination.
The times are changing: there are many different reasons for examination dishonesty,
many ways of performing it, and many ways of coping with it. Given an equilibrium
at this level, new ways of violation deserve new ways of prevention, or at least of
detection.
Contents

1 Introduction
  1.1 Topic
  1.2 Research goals and delimitation
  1.3 Significance of the study
  1.4 Document structure
2 Background
  2.1 Examination cheating
    2.1.1 What's wrong with cheating?
    2.1.2 Why do students cheat?
    2.1.3 The mission: preventing cheating
    2.1.4 How do students cheat?
    2.1.5 Detecting cheating as a means of prevention
    2.1.6 Cheating review summary
  2.2 Specifics of distance operation
3 Conceptual framework
  3.1 Cue leakage theory
  3.2 Pattern recognition theory
  3.3 Anomaly detection
  3.4 Behaviometrics
    3.4.1 Biometrics in general
    3.4.2 Specifics of behaviometrics
    3.4.3 Keystroke dynamics
    3.4.4 Mouse dynamics
    3.4.5 Linguistic dynamics
    3.4.6 'Special purpose' behaviometrics
  3.5 Vision of a behavioral cheating detection approach
    3.5.1 The angle of attack
    3.5.2 Behavioral characteristics as the cheating detection unifier
    3.5.3 The detection mechanism
4 Methodology
  4.1 My setting and the research method
  4.2 Validity of a research design
  4.3 Reliability and validity of a measure
  4.4 Research design and research process
    4.4.1 Empirical inputs
    4.4.2 Observations
    4.4.3 Questionnaire
    4.4.4 Analysis
Appendices
List of Tables

3.1 Meta-functions of a computer mediated communication text analysis framework
3.2 Text analysis linguistic features 1
3.3 Text analysis linguistic features 2
Chapter 1
Introduction
a mutually successive manner. Distance examination is used, in different forms, to
validate the level of knowledge, skills or abilities of students/examinees. The most
common distance examination method seems to be online examination, which uses
a network-enabled computer environment (e.g., the Internet) to set up two-way
communication.
Although this depends on the specific environment, the major concerns in distance
education compared to on-site education mostly relate to finding, achieving and
maintaining effective means of teaching/tutoring, learning, student support and
administration (Holmberg, 1995; Keegan, 1996; Bates, 2005; Kim et al., 2008), whereas
the problems of assuring fairness and trust often seem more challenging in online
examination than with traditional/conventional examination means (Rowe, 2004).
Public trust and fairness in education, including examination, are important
attributes (Rumyantseva, 2005; Heyneman, 2002), yet seemingly tricky to achieve and
maintain (Herberling, 2002). The technology that, on the one hand, enables distributed
and asynchronous education opens up, on the other hand, a broad range of cheating
possibilities within an examination process. Controlling, or at least perceiving, largely
unknown and distant examination environments as a way to detect and prevent
examination dishonesty seems non-trivial. Partly as a result, distance education is
often less accepted than conventional (on-site) education (Columbaro & Monaghan,
2009; Bourne et al., 2005). In a more general context, Allen & Seaman (2003) show
that online education was perceived as inferior to conventional education, although
beliefs about the near future (three years later) showed an optimistic turn in the
balance. Around six years later, Columbaro & Monaghan (2009) showed that such
beliefs had been, and might tend to be, too optimistic: according to their study,
more than 95% of employers in several different fields would prefer a traditional
degree to an online one.
Examination cheating, and academic dishonesty in general, seem to have been an
educational problem for a long time (Cizek, 1999). According to UC Berkeley
(2009), cheating can be defined as “fraud, deceit, or dishonesty in an academic
assignment, or using or attempting to use materials, or assisting others in using
materials, that are prohibited or inappropriate in the context of the academic
assignment in question” (no page numbering). Students often tend to shortcut
their way to achieving grades and maintaining a sense of personal integrity, rather
than investing an adequate amount of effort and time (Diekhoff et al., 1996).
Academic cheating is prevalent and, at the same time, seems to have a growing
tendency (Cizek, 1999; Dick et al., 2003; McCabe et al., 2006; Wehman, 2009;
Howell et al., 2009). A study in McCabe et al. (2006) shows that cheating was
reported by 56% of business students and 47% of non-business students. An earlier
study by McCabe (also mentioned in the paper) shows that 66% of all students
reported at least one serious cheating incident in the past year; among engineering
students the number was 72%, and business students led with 84%. According to
a survey carried out in the United States, around 94% of students reported cheating
in some form, around 65% reported test cheating, and more than 50% reported
plagiarism. According to Stumber-McEwen et al. (2009), there is a wealth of studies
on the prevalence of cheating available; however, their quantitative results vary
greatly based on the type of survey and the specific survey conditions. As with
on-site examination, cheating also occurs in distance examination (Underwood,
2006; Wehman, 2009).
Different sources perceive the prevalence of cheating among on-site and online
examined students differently (Stumber-McEwen et al., 2009; Herberling, 2002;
Watson & Sottile, 2010). Assuming that an online examination environment tends
to be less cheat-constraining and less perceivable by examiners than an on-site one,
students may generally tend to cheat more at a distance, as Rowe (2004) also believes.
Following an information security approach (Whitman & Mattord, 2008), the
occurrence of online examination cheating as an undesirable activity is a form of
risk, and the higher the severity and probability of cheating, the greater the importance
of controlling that risk. The ultimate goal of risk control, applied in this context to
the educational field, is to effectively reduce risk related to the educational process.
Effectively reducing the risk of online examination cheating is a problem.
There are multiple approaches to controlling online examination cheating (Olt,
2002), many of them suitable in one way or another. The approach most relevant
to the concerns of this thesis is the ‘police approach’: monitoring for and reacting to
suspicion or detection, along with deterrence-based demotivation of cheating. This
approach is somewhat analogous to a feedback control system (Åström & Murray,
2008, chap. 1): within such a system, one needs first to perceive the examination
environment and detect anomalies in order to be able to take effective control
actions. Perceiving a distant online examination environment and effectively
detecting cheating is a problem.
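To make the control-loop view concrete, the following is a minimal sketch (this author's illustration, not part of the cited works) of flagging sessions whose measured behavior deviates strongly from a baseline; the chosen feature (typing rate), the data and the threshold are purely hypothetical.

```python
# Minimal sketch: flag examination sessions whose typing rate deviates
# strongly from a baseline built on presumed-honest sessions (z-score rule).
from statistics import mean, stdev

def is_anomalous(baseline, observed, threshold=3.0):
    """True if `observed` lies more than `threshold` standard deviations
    from the mean of `baseline`."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Hypothetical keystrokes-per-minute rates from earlier sessions.
normal_rates = [182, 175, 190, 168, 185, 177, 181]
print(is_anomalous(normal_rates, 179))  # -> False (typical rate)
print(is_anomalous(normal_rates, 60))   # -> True (suspicious deviation)
```

A real detector would of course combine many features and a more robust model, but the loop is the same: measure, compare against a learned norm, and act on the deviation.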
1.1 Topic
The topic of this thesis is to explore and verify possibilities of detecting specific types
of online examination cheating based on behavioral measures of human-computer
interaction. More specifically, the focus lies on utilizing behaviometrics (behavioral
biometrics) for the analysis of keystroke, mouse and linguistic dynamics.
The primary motivation for this study is to enable or help faculties to both
(1) fight the prevalent and rather invisible online examination cheating, and to (2)
indirectly increase the acceptance of online grades.
In terms of content, this study focuses on the use of behaviometrics (work with
keystroke, mouse and linguistic dynamics), based on information technology and
machine learning (software, pattern recognition, anomaly detection, visualization),
for detecting examination cheating (an educational concern).
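As a concrete illustration of what keystroke dynamics measures, the sketch below derives two classic timing features, dwell time (how long a key is held down) and flight time (the gap between releasing one key and pressing the next), from a stream of key events. The event format and the function name are this author's illustrative assumptions, not the API of any existing tool.

```python
# Illustrative sketch: extract dwell and flight times from a stream of
# (timestamp_ms, key, event) tuples, where event is "down" or "up".
def keystroke_features(events):
    down_at = {}            # key -> press timestamp
    dwell, flight = [], []
    last_up = None
    for t, key, kind in events:
        if kind == "down":
            if last_up is not None:
                flight.append(t - last_up)       # release-to-press latency
            down_at[key] = t
        elif kind == "up" and key in down_at:
            dwell.append(t - down_at.pop(key))   # press-to-release duration
            last_up = t
    return dwell, flight

events = [
    (0, "h", "down"), (95, "h", "up"),
    (140, "i", "down"), (230, "i", "up"),
]
dwell, flight = keystroke_features(events)
print(dwell)   # -> [95, 90]
print(flight)  # -> [45]
```

Statistics over such features (means, variances, digraph latencies) form the behavioral profile against which later typing can be compared.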
2. Histogrammatically displayed amount of stress
Being able to provide the above about the target population (described below)
effectively and in a highly automated way is the longer-term research vision. The
goal of this study, however, is to approach this vision with a focus on the first and
the third points.
The target population to which the research goal relates is distance students, a
great part of whom might be employed adults (Paulsen & Rekkedal, 2001), mostly
aged between 25 and 40 years. The rest of the target group might be graduate
students, mostly aged between 20 and 30 years. The age ranges used are assumptions,
and they constitute a part of the study's delimitation.
The following are the research questions whose answers I expect to contribute to
achieving the research goal:
RQ-1 What are the behavioral signs of tasks carried out when cheating that
manifest themselves in the keystroke, mouse and linguistic dynamics of the
user's computer interaction during a computer-based examination?
RQ-2 How distinct is normal behavior from cheating behavior, and how distinct
are different types of cheating behavior from each other?
The following are delimitation statements for this study: (1) A small number of
participants in online examination simulations (observations) are selected based on
convenience, instead of careful alignment with the target population. (2) No special
equipment such as skin humidity, body temperature or heartbeat sensors is used
within the study. (3) Automation of the whole cheating detection process, from
gathering inputs to seeing indications of the type and amount of cheating, is not a
part of the study.
1.4 Document structure
After having introduced the topic and stated the research goal, questions and
delimitation statements in the introduction chapter, the document describes the
problem background and parts of the state of the art in the background chapter.
The conceptual framework chapter contains a description of the core theories and
concepts applied in the study. The research method and its details are described
next, in the method chapter. Observations useful to know about before the analysis
are described in the correspondingly named chapter. The findings of the study are
summarized in the results and findings chapter. Finally, the whole research is
summarized and concluded in the conclusion chapter, where different questions are
additionally discussed from the author's points of view.
Chapter 2
Background
This chapter summarizes some cheating-related background and the state of the art
in relation to the research problem and the approaches chosen to solve it.
A behavior may be defined as cheating if [at least] one of the two following
questions can be answered in the affirmative:
• Does the behavior violate the rules that have been set for the assessment
task?
• Does the behavior violate the accepted standard of student behavior at
the institution?
Although the second question asked by Dick et al. uses the term ‘accepted standard
of student behavior’, whose practical interpretation looks rather informal and fuzzy,
the definition seems to reflect the general perception of cheating rather well,
fuzziness included. As the authors note afterwards regarding the above,
in both cases, this assumes that the accepted rules and standards have been
clearly laid out for students. (Dick et al., 2003, p. 172)
In reality, however, this might not be the case in many academic environments.
Another problem with the definition is that technically breaking the rules or such a
standard might also be inadvertent (unintentional), or too trivial, so that it becomes
perceived as poor learning behavior rather than cheating.
Severity is an important parameter of cheating, especially when responding to
cheating or otherwise handling it. Dick et al. (2003) propose a number of factors to
consider regarding the severity (seriousness) of cheating:
• The presence of direct harm to some other person caused by the cheating behavior.
• The value, relative to the course, of the assessment task on which the cheating occurred.
Cheating is an important issue that needs to be considered for two main reasons.
The first reason is that students who cheat are likely to have not achieved
competence in a variety of skills that will be necessary for them to use in their
profession. Graduating incompetent professionals is likely to cause:
• Damage to the reputation of the institution as employers realise that the
graduates from an institution are sometimes or are often incompetent.
• Damage to the reputation of the degree for the same reason.
The second reason that cheating is an important issue for academics is the harm
it causes to individual students. [...]
Besides that, cheating can pose a greater risk to those who cheat and are
detected:
The student learns little when the opportunity to learn is ignored, the gratifica-
tion of creating something that he or she distinctly owns is lost, and if discovered
by others, the career of the student could be ruined depending upon the con-
text and seriousness of the offense (Whitley & Keith-Spiegel, 2001). (Wehman,
2009, p. 12)
Figure 2.1: Model of Ajzen’s (1991) Theory of Planned Behavior extended by Stone et al.
(2009)
to the question is beyond the limits of the thesis focus, this part provides a more
general and somewhat more near-the-surface answer instead.
From a pragmatic perspective, and according to Ajzen's (1991) Theory of Planned
Behavior (TPB) as extended by Stone et al. (2009) (outlined in figure 2.1), people
intend to cheat and perform it according to three components: (1) beliefs about
cheating and its outcomes, (2) perceived normative acceptability of cheating, and
(3) the ability (or difficulty) to cheat and remain undetected (thus unpunished).
Although the theory describes an ‘internal cheating control mechanism’, it does not
explain what the incentives for considering a cheating behavior are at all. For the
needs of this study, let us simply assume the following:
For a deeper insight towards more ‘under the hood’ relations between student goals,
motivation and expectancy, one can refer to Covington (2000) and Eccles & Wigfield
(2002).
From a different perspective, in Lawrence Hinman's words:
People with integrity not only refrain from cheating, but don’t want to cheat.
[...] People with integrity have a sense of wholeness, of who they are, that
eliminates the desire to pretend – through cheating, through plagiarizing, and
the like – that they are someone else. For them, signing their name to something
signifies that it is theirs. They would not want to pass something off as their
own. (Hinman, 1997, no page numbering)

People with integrity also have a clear vision of what is right and what is wrong.
Their world is not the murky world of thoughtless and easygoing relativism, but
a world that is sharply illuminated by the light of their vision of goodness. And
added to this clarity of vision is the strength of will to act on the basis of that
vision. They see what is right, and they stand up for it, even when the personal
cost is high. (Hinman, 1997, no page numbering)

Figure 2.2: Model of student cheating decision based on internal (personal) and external
factors, based on Dick et al. (2003)
Dick et al. (2003) identified four factors on which a student's decision whether to
cheat may be based: sensitivity, the ability to interpret a moral situation; judgement,
the ability to determine whether a certain action is correct; [self-]motivation, the
influence of internal values; and character, the ability to resist pressures to perform
an immoral act.
As an extension to the previous model, Dick et al. provide a model of the student
cheating decision based on internal factors (the ‘personal domain’) and external
ones, as shown in figure 2.2. Technology is in this context seen as the enabler of
different possibilities, cheating among others. Societal context refers to, e.g., the
influence of a student's peer group, family, media, role models, culture, etc.
Situational context may include, e.g., a heavy or irrelevant course load, inadequate
teaching, difficult assignments, lack of environment control by the examiners or
proctors, some sort of dependence on passing the examination, etc. Demographic
factors include age, gender, marital status, socioeconomic status, ethnicity and
religiosity. (Dick et al., 2003)
Diekhoff et al. (1996), O'Leary (1999) and McCabe et al. (2006) discuss relationships
between cheating and cheater properties such as age, gender, and cultural,
educational or professional background. For instance, in environments where
words are perceived as ‘belonging to society’ more than ‘belonging to the individual’,
cheating tends to be perceived as more acceptable and is hence more commonplace
(O'Leary, 1999).
The importance of performing well on examinations, and hence the increased
fear-based pressure to cheat evoked by conditions with a high student population
and grading that strongly affects an individual's future career, also tends to result
in higher cheating rates among students (Howell et al., 2009). In contrast,
dominantly intrinsically motivated students (those with a dominant mastery goal
orientation) show less cheating behavior than their dominantly performance-goal-
oriented or dominantly neutral peers (Rettinger & Kramer, 2008).
According to Whitley & Keith-Spiegel (2002), there are five norms that students
usually do not perceive as academically dishonest: (1) students may study from old
tests without explicit permission (as long as the tests are not stolen); (2) taking
shortcuts such as reading condensed books, listing unread sources in the bibliography,
and faking lab reports is permissible; (3) unauthorized collaboration with others is
fine, especially when helping friends; (4) some forms of plagiarism, such as omitting
sources and using direct quotations without citation, are acceptable; (5) conning
teachers by faking excuses for missing deadlines and the like is permissible. Such
misconceptions make students more inclined towards the respective forms of cheating
without realizing their seriousness.
On top of that, Wehman (2009) has identified that fear of negative teacher
evaluations, and student morals and habits formed years earlier, are topics related
to the cheating problem.
Students often know that they are conducting an immoral activity when cheating.
As summarized by Whitley & Keith-Spiegel (2002), and in correspondence with the
TPB, the theory of cognitive dissonance (Aronson, 1969) and neutralization theory
(Harris & Dumas, 2009), students' justifications for academic dishonesty (seemingly
applicable to any kind of consciously immoral activity in general) can include denial
of injury (‘it doesn't hurt anyone’), denial of personal responsibility (‘I got sick and
couldn't read the stuff’), denial of personal risk (‘they can't punish me anyhow’),
selective morality (‘I only cheat to pass the classes’, or ‘friends come first, they
needed help’), trivializing (minimizing seriousness) (‘the assignment has little
weight in the final grade’), a necessary act (‘if I don't do well, my parents will kill
me’), and dishonesty as a norm (‘everyone does it’).
Another argument placing cheating in a more acceptable light is that cheating, and
more specifically plagiarism, versus the collaborative spreading of knowledge, seem
to be somewhat conflicting notions with fuzzy borders:
There is a certain ambiguity about when ‘collaborating in learning commu-
nity to extend knowledge and understanding’ stops and ‘submitting only your
own work’ starts. (Le Heron, 2001, p. 3?)
Extensively interviewing six first-year master's students from three different
programs at a university, Love & Simmons (1998) identified a set of factors correlated
with plagiarism behavior, divided into several groups based on the character of the
factors: mediation character (inhibiting vs. contributing), factor type (internal vs.
external), and emotional effect (positive vs. negative). These are summarized in
table 2.1. The set of factors is further extended by the theoretical summaries of Olt
(2007) and Megehee & Spake (2008), as summarized in table 2.2 according to this
author's understanding. Although the authors focus on plagiarism behavior, the
results seem to have partial relevance to cheating in general.

Table 2.1: Factors correlated to plagiarism behavior according to Love & Simmons (1998)

Inhibiting factors:
• Internal, positive effect: personal confidence; positive professional ethics;
fairness to authors; desire to work or learn; fairness to others
• Internal, negative effect: fear of detection consequences; guilt
• External: professors' knowledge; probability of being caught; time pressure;
cheating perceived as dangerous; type of work required; need for knowledge in the future
Contributing factors:
• Internal: negative personal attitudes; lack of awareness; lack of competence
• External: {grade, time, task} pressure; professor leniency
As an addition to the tables, Iyer & Eastman (2008) found that students' perceptions
of low social desirability are directly correlated with the amount of their cheating
behavior.
In the form of an extended application of the TPB, figure 2.3 graphically summarizes
the causes of cheating, with the expected benefits as one of the groups of cheating
factors.
Table 2.2: Factors correlated to plagiarism behavior, summarized from Olt (2007) and
Megehee & Spake (2008)

Inhibiting factors:
• Internal: academic achievement; age
Contributing factors:
• Internal: difficulty seeing marks of plagiarism; disorganization; cryptomnesia;
fear of failure; procrastination and laziness; sense of alienation; thrill seeking;
social activities; cheating rationalization; absenteeism
• External: unrealistic assignments; ambivalence of faculty and administration;
benefits outweigh risks; competition (jobs and graduate school); devaluing of the
assignment by the instructor; ethical lapses; information overload; institution's
subscriptions to market ideologies; instructor bad example; prominent bad examples;
opportunity; peer observation; social networking; instructors' failure to keep pace
with technological advances; instructors' failure to rotate curriculum; instructors'
lenience; lack of trust between student and instructor; previous cheating experience
Factors with unclassified mediation:
• Internal: cultural background; gender; marital status; major
• External: student perception of instructor; testing environment
Figure 2.3: Model of cheating causation (inspired by Whitley & Keith-Spiegel, 2002)
Within an analogy between cheaters in the educational field and attackers in
the field of information security: as there are different types of attackers, there
might similarly be different types of cheaters. According to Whitman & Mattord
(2007), attackers have different motivations for intruding, such as personal and social
status, the thrill of doing it, revenge, financial gain, ideology, industrial espionage,
etc. Attempting to draw the analogy, cheaters might also cheat for different reasons,
such as a notion of personal gain (grades or other academic credit, personal or
social status), providing oneself with an additional (although forbidden) layer of
protection against failure, accommodating oneself to a social environment, or simply
out of a habit of cheating.
Although students are mostly believed to cheat for grades (Cizek, 1999), views
and experiences on this may slightly differ; e.g., cheaters may mostly just want to
pass a course or an examination (Le Heron, 2001).
To sum up this section, it seems that there will be no existential emergency for
cheating intentions among students, at least as long as we use the kinds of school
systems we use today. That could mean that, for a very long time into the future,
we will have to keep combating cheating in one way or another. Besides, there are
a number of correlates of cheating, which might make cheating a clue or a signal
pointing toward other educational issues to improve at an institution.
• The police approach seeking to detect and punish cheating in reaction to it.
This approach is based on punishment and deterrence (as described and dis-
cussed by e.g. Carlsmith et al., 2002) – in other words, the ‘big brother’ style.
Inspired by the risk management terminology of Whitman & Mattord (2008), all of
the approaches can be seen as a form of cheating avoidance, the last one perhaps
also being partially mitigative.
Similarly, Olt (2002) has identified four basic strategies for minimizing academic
dishonesty in online assessment. For the sake of clarity, I have assigned names to
these (in italics):
technical and operational means of perceiving and/or controlling the exami-
nation environment.
• Computer-adaptive testing and randomized testing. Instead of having the same
variant of the test for each examinee, the rest of the test varies based on how
one has answered the questions so far.
Following Cizek (1999), Rowe (2004) and Deubel (2003), there are a few more
cheating-fighting ideas, e.g.:
• Planning for unexpected matters, which can occur when using information
technology, or simply in examination operation in general. For instance, a student
computer may crash, or may be taken down intentionally. Similarly, students
may ask to use a bathroom or to have a drink or a snack, either innocently or in
an attempt to realize fraudulent intentions such as cheating.
• Entrapment, such as trying to plant fake tests in locations where curious people
searching for exam questions or answers are likely to find them. This is an analogy
to ‘honeypots’ in network security, as discussed in Whitman & Mattord (2007).
Applied to education, however, this method seems to lie beyond the border of
professional ethics.
In conclusion, there seem to be quite a number of different means to fight
cheating but, as seems generally valid, no silver bullets that simply ‘fix it all alone’.
According to what has been summarized, an educational institution needs to employ
a broad range of approaches and methods to be effective in this process. Omitting
one or more approaches, e.g. by focusing on detection, reaction and deterrence only,
while not cheat-proofing the environment and/or building a culture of integrity,
might not work very well, especially in the long run. Although this study primarily
aims at ‘the police approach’, this section was also meant to note that this approach
needs complementary support, since by itself it is too incomplete to be relied on
alone.
• Using physical resources to cheat. This can occur in the form of reading one's
own or others' crib, desk or hand notes, papers, books, pieces of clothing or tissues,
looking at other students' work, or using steganographic methods (e.g. ultraviolet
light) to extract notes or other data hidden accordingly.
• Using electronic resources to cheat. For example, using resources such as notes,
papers, e-books, web sites, old student work or old answer sheets from a computer
network, computer, telephone or other electronic medium that one is not allowed
to use.
• Impersonation, which means using someone else to take parts of, or the whole
of, an examination instead of the authentic person.
• Plagiarism, which means using parts of someone else's work without giving
adequate credit.
To sum up, a number of different cheating categories have been identified across
the existing literature. Some of the categories cover tens or perhaps hundreds of
specific cheating methods. Information about these, together with fairly advanced
cheating tactics, can be found in Cizek (1999, chap. 3). For the purpose of this
study, however, describing them in detail seems of marginal importance, since
new technologies are being invented, and cheaters keep modifying the existing
ways to cheat and finding new ones all the time.
Methods used to cheat on tests are like snowflakes: There is an infinite number
of possibilities. The possibilities are, however, related to the type of testing
being considered. (Cizek, 1999, p. 37)
Many forms of ‘exam-time’ cheating seem to have a common denominator:
obtaining information from disallowed sources (by reading, hearing, etc.) in order
to give correct answers without having learned the subject matter, or letting someone
else answer instead of the authentic person. The remaining types of cheating seem
to require a longer time, or conditions other than those of the exam, to set up, and
are hence of marginal interest for this study.
• Checking identity, i.e. that it is the authentic person who is being examined.
• Checking for forbidden tools such as crib notes, electronic devices, etc.
• Plagiarism detection systems and Internet searches, which try to detect collusion
between students, cut-and-paste plagiarism, and the usage of paper mills (databases
of old papers) by, e.g., searching in those databases and searching the Internet
for similar texts among everything freely accessible and indexed by search
engines (such as Google).
TermPaperMania”), inconsistently embedded links (URLs) and other forms of direct
and apparent plagiarism evidence (Harris, 2009).
Additionally, University of Alberta Libraries (2009) identifies a clue: if a submitted paper exceeds the student’s research or writing capabilities, has an anomalous tone (too professional, journalistic or scholarly), or simply somehow largely exceeds expectations of the student, it might signal plagiarism or some other form of cheating.
Within cheating detection based on personal vigilance, Dick et al. identify techniques such as careful scrutiny, eye inspection, hand analysis, observation, and pattern spotting. Three comparisons commonly made are
(1) across the students looking for similarities of submissions, (2) within an
individual assessment looking for changes in style or unusual ideas, (3) with
previous work by the same student looking for dramatic changes in quality.
(Dick et al., 2003, p. 181)
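As an illustration of comparison (1), similarity across submissions can be approximated by the set overlap of word n-grams. The Jaccard-based sketch below, including its similarity threshold, is an illustrative assumption of mine, not a method prescribed by Dick et al.:

```python
def shingles(text, k=3):
    """Split text into overlapping k-word shingles (case-insensitive)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard_similarity(a, b, k=3):
    """Jaccard similarity of the shingle sets of two submissions (0..1)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def flag_similar_pairs(submissions, threshold=0.5):
    """Return pairs of student ids whose submissions exceed the threshold."""
    ids = list(submissions)
    return [(ids[i], ids[j])
            for i in range(len(ids))
            for j in range(i + 1, len(ids))
            if jaccard_similarity(submissions[ids[i]], submissions[ids[j]]) >= threshold]
```

A pair flagged this way is only an indicator for human follow-up, in line with the caution about probabilistic evidence discussed below.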
As an important note, also relevant to this study, Cizek (1999) points out the difficulty and pitfalls of taking probabilistic evidence as sufficient to prove cheating. Although the class of statistical cheating detection methods seems to be the most promising regarding power and availability, the methods may function rather as an indicator and deterrent than as a tool providing strong evidence alone. Another fact is that Cizek focused on statistical methods of analyzing examination answers, which do not take any measures during the examination process itself; this underlies such assumptions as e.g. that the methods cannot detect the use of cheat sheets (crib notes), impersonation, electronic communication, etc. In contrast, this study hopes to show the opposite.
• Broadening the perceived goal context by e.g. making students understand why and how it is beneficial for them all to (1) learn the study matter properly, (2) not get caught cheating, because of its probable consequences, and (3) not contribute to the spreading of the cheating culture. This can also be a goal of an academic integrity program.
• Increasing the risk (penalty and probability) of being caught cheating, e.g. by strengthening the consequences of detected cheating and increasing cheating detection capabilities.
Cheating itself can occur in a number of forms. Thanks in part to the generally desired and deeply valued student inventiveness, the forms of cheating effectively change over time, which makes it both costly and inefficient to address the detection and prevention of narrow groups of cheating forms one by one. Moreover, doing so can leave the counter-cheaters at best a couple of steps behind the cheaters. Regarding cheating detection, there are efforts to develop more effective methods capable of detecting a broader and more general range of cheating forms, e.g. through applying automated statistical analysis to different measures of human behavior.
Last but not least, both the detection and the prevention of cheating face hindrances and limits of different kinds – ranging from misalignment between counter-cheating and administrative interests, through fear of reporting cheating, up to the political unsuitability of e.g. certain cheating detection methods.
Figure 2.4: Graphical overview of cheating and counter-cheating relations
Figure 2.5: Overview of a cheating and counter-cheating process
2.2 Specifics of distance operation
There is no doubt about the great accessibility advantages, and the freedom in the choice of study tempo, that the concept of distance work provides. On the other hand, upon some reflection, the distance mode of operation could affect at least the following aspects compared to the conventional one:
• The study/examination environment and the student’s perception of it. A difference between the on-site and distance study/examination environments seems apparent. On-site students can attend school sessions together with peers in an environment with a strongly academic feel, walk or travel to school, attend lectures seeing peers and lecturers, and often feel like part of a student community sharing similar goals together with others who are physically near. One can have lunch and talk to peers, study together and cooperate on assignments face to face, etc. Distance students attend school sessions from behind a computer screen, seeing and hearing peers and lecturers through a videoconferencing tool, reading course materials from a remote learning management system and rather seldom having a computer-mediated peer discussion (Paulsen, 2001), perhaps physically alone for most of the time. Independently of whether one is in some ways superior or inferior to the other, there are certainly many differences between how an on-site student and a distance student can perceive and feel about their studies. A similar difference seems to apply to the examination process. Sitting in a controlled room with adequate surveillance certainly feels different from sitting in one’s office or living room with a microphone and a web camera with a constant and limited angle of sight switched on.
environment and mode of operation (Rowe, 2004). In a conventional examination, an examiner can often see parts of the classroom from different angles and also hear what is happening. Although this could be possible within a distance examination as well, it could require rather special surveillance equipment for students, which comes at a cost to obtain and operate. Yet a different type of problem is the analytical capacity of such detection systems – does the system just record data (e.g. voice, video, keystrokes, etc.) and leave the actual detection among tens or hundreds of students up to a human, or can it operate automatically?
• Indirectly, the extent to which employers accept distance degrees. Public trust in and employer acceptance of distance degrees seem to be smaller compared to conventional degrees (Columbaro & Monaghan, 2009; Bourne et al., 2005; Allen & Seaman, 2003). Although it might be tricky to identify the reasons for this mistrust, some of them could presumably be related to different assumptions about the quality limits of distance education, cheating in distance assessment, or simply doubts about a nonstandard and unconventional way of studying.
The intention with these lines is not to mark one of the two environments as superior or inferior to the other. It is to signify that an environment may have practically beneficial advantages while, at the same time, having practical disadvantages, some of them in the form of threats.
A different and more friendly view of the concept of distance education is that it best suits adults in need of additional or continued education who cannot afford an interruption of their job (Paulsen & Rekkedal, 2001). Moreover, compulsory time-bound sessions have been shown to dramatically reduce applications from this type of student (ibid.).
Regarding statistics and comparisons of cheating between on-site and distance students, there are a couple of studies showing varied results (Stumber-McEwen et al., 2009; Herberling, 2002; Watson & Sottile, 2010). Some of them state that distance students cheat more; some state the opposite. Be that as it may, according to the results presented, distance students cheat just as their on-site counterparts do – and that seems to be a good reason to find ways of reducing the problem.
Chapter 3
Conceptual framework
Figure 3.1: A classification example
keyboard should well have different keystroke dynamics and/or text diction than the same student rewriting text from a book written by someone else with different language habits.
1. Supervised pattern recognition, which operates based on a priori known classification information. Such classifiers can either be designed with a model of the classification problem, or they can be trained with training feature vectors before they classify inputs.

2. Unsupervised pattern recognition, which is simply given input patterns; those are subsequently clustered into groups based on similarities within the set of input patterns.
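To make the distinction concrete, the two modes can be sketched in a few lines of Python. The nearest-centroid classifier and the naive k-means-style loop below are illustrative stand-ins chosen by me, not methods prescribed by the literature cited here:

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def train_supervised(labeled):
    """Supervised: learn one centroid per a priori known class label."""
    classes = {}
    for label, vec in labeled:
        classes.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in classes.items()}

def classify(model, vec):
    """Assign the input to the class with the nearest centroid."""
    return min(model, key=lambda label: distance(model[label], vec))

def cluster_unsupervised(vectors, k=2, rounds=10):
    """Unsupervised: k-means-style grouping, no labels given in advance."""
    centers = vectors[:k]                      # naive initialization
    for _ in range(rounds):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[min(range(k), key=lambda i: distance(centers[i], v))].append(v)
        centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
    return [min(range(k), key=lambda i: distance(centers[i], v)) for v in vectors]
```

The supervised variant needs labeled training vectors before it can classify; the unsupervised variant only discovers groups, leaving their interpretation to the analyst.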
According to Huang (2006), Thomason (1990) and Jain et al. (2000), there are
five approaches to pattern recognition: (1) template matching (the simplest one),
(2) decision-theoretic (Jain et al., 2000), (3) syntactic-structural (Thomason, 1990),
(4) functional (Huang, 2006), and (5) neural network based.
• Collective anomaly – if a collection of related data instances is anomalous with
respect to the rest of the data in the whole set.
Regarding the techniques of anomaly detection, three modes have been identified (ibid.), partially resembling or inheriting from the classification of machine learning algorithms1. The applicability of those modes increases from the first down to the third one.
According to Chandola et al. (2009), techniques of anomaly detection vary based
on specific application, which is further related to a specific notion of anomaly and
a specific nature of input data. Those techniques can be (1) classification based,
(2) clustering based, (3) nearest neighbor based, (4) statistical, or (5) spectral. More
specifically, those techniques commonly include statistical profiling using histograms, artificial neural networks, support vector machines, rule-based systems, parametric and nonparametric statistical modeling, Bayesian networks, clustering-based techniques, nearest neighbor based techniques, information theoretic techniques, spectral analysis, regression, and mixture models (ibid.).
The output of an anomaly detection technique can either be a label denoting
whether a given data instance is normal or anomalous, or a score providing a finer
resolution of the same (ibid).
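As an illustration of the label-versus-score distinction, a minimal point-anomaly detector can emit both kinds of output. The z-score technique and the threshold of two standard deviations below are illustrative assumptions of mine, not choices taken from Chandola et al.:

```python
def anomaly_scores(data):
    """Finer output: score each value by its distance from the mean,
    measured in standard deviations (a simple z-score)."""
    n = len(data)
    mean = sum(data) / n
    std = (sum((x - mean) ** 2 for x in data) / n) ** 0.5 or 1.0
    return [abs(x - mean) / std for x in data]

def anomaly_labels(data, threshold=2.0):
    """Coarser output: a normal/anomalous label per data instance."""
    return ["anomalous" if s > threshold else "normal" for s in anomaly_scores(data)]
```

A scoring output lets the operator pick the cut-off afterwards, whereas a labeling output bakes the cut-off into the technique itself.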
Within a more specific application of anomaly detection, Stakhanova et al. (2010) describe a framework for intrusion detection using a fusion of specification-based and anomaly-based approaches.
3.4 Behaviometrics
Behavioral biometrics, or perhaps more precisely behaviometrics, refers to biometrics (or rather just metrics) using behavioral traits of subjects, such as handwriting, gait, voice characteristics, keystroke dynamics and mouse dynamics of humans, communication or control behavior of hard systems, and many others (Yampolskiy & Govindaraju, 2008).
1 Machine learning algorithm classification: (1) supervised learning, (2) semi-supervised learning, (3) unsupervised learning, (4) reinforcement learning (learning how to act given an observation), (5) transduction (learning to predict), and (6) learning to learn. According to Wikipedia: http://en.wikipedia.org/wiki/Machine_learning [Accessed: 2010-04-02]
This section will first describe common aspects of biometrics in general, then the
specifics of behaviometrics, and finally continue describing selected behaviometric
methods, which are of interest for this study.
• What a person has such as a key, file, magnetic card, integrated chip card, or
some other authentication token.
• What a person is, which in fact more precisely means what a person seems to
be based on physiological characteristics such as a fingerprint, iris or retinal
pattern, DNA etc.
• What a person produces including how a person produces it (or behaves) such as
voice, signature pattern, gait, keystroke dynamics, and other types of behavior.
The following are the properties desirable for a biometric method working with
a set of personal characteristics, inspired by Jain et al. (1999):
• Uniqueness meaning that no two different persons are equal in terms of the
characteristics.
• Collectability as the quantitative measurability of the characteristics, often
including its cost (not necessarily monetary).
• Acceptability as the extent to which people (including the public) are willing
to accept the use of the method.
Biometric methods and systems usually rely on three types of usage operation
according to Jain et al. (2004):
• Enrollment, which measures a subject for the first time, extracts features from
the measurement, creates a biometric profile containing the measurement-
based features and stores the profile in a database.
Although not found in the literature, biometric methods can also identify patterns within the subject measures, either dependently or independently of a biometric profile. An example of such a special application is automated stress measurement (Vizer et al., 2009).
Within the operations mentioned above, four main groups of errors can occur,
according to Peacock et al. (2004), Gamboa & Fred (2004) and Jain et al. (2004):
• Failure to capture (FTC), also called failure to acquire, when a system fails to take subject measures, e.g. an iris scanner fails to scan a person’s iris well enough.
• False rejection (FR), also called false non match (FNM) or false negative, which is a type 1 error. It is the case when an authentic subject gets rejected (evaluated as a non-authentic subject). In a security application, this does not directly pose a security risk; however, it can do so indirectly. Frequent false rejections are highly annoying, and under such conditions people tend to start ignoring the importance of the respective system alerts, or circumventing such systems.
• False acceptance (FA), also called false match (FM), impostor pass (IP) or false
positive, which is a type 2 error. It is the case when a non-authentic subject
gets accepted (evaluated as an authentic subject). In a security application,
this is what directly poses security risk (compared to type 1 error).
• False rejection rate (FRR), also called false non match rate (FNMR). It is the
statistical probability that a false rejection will occur in a recognition operation
of a biometric system.
• False acceptance rate (FAR), also called false match rate (FMR) or impostor pass rate (IPR). It is the statistical probability that a false acceptance will occur in a recognition operation of a biometric system.
• Equal error rate (EER), sometimes also called crossover rate. It is the error rate at the operating point where the false rejection rate and the false acceptance rate are equal to each other.
• Average error rate (AER), which is not used very commonly, combines FRR and FAR into one scalar value and can even serve as an approximation of the EER.
• Failure to acquire rate (FTA), describing the percentage of cases for which the
system lacks sufficient power or ability to classify a subject.
• Failure to enroll rate (FTR), describing the percentage of users lacking enough
quality in their input samples to enroll in the system.
• Cost to a user to enroll (CUE), which means the number of units to submit
to the system before enrolling as a valid user. The units can be keystrokes
or fingerprint scans or something else, based on the type of biometric system
used.
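These rates can be computed directly from matching scores. The sketch below assumes that a higher score means a better match and approximates the EER by sweeping candidate thresholds; the score values in the test are hypothetical, and real systems evaluate far larger samples:

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Error rates at a given threshold: accept when score >= threshold."""
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return far, frr

def approximate_eer(genuine_scores, impostor_scores):
    """Sweep observed scores as thresholds and return the operating point
    where FAR and FRR are closest, plus the average of the two there
    (an AER that approximates the EER, as noted above)."""
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    best = min(candidates,
               key=lambda t: abs(far_frr(genuine_scores, impostor_scores, t)[0]
                                 - far_frr(genuine_scores, impostor_scores, t)[1]))
    far, frr = far_frr(genuine_scores, impostor_scores, best)
    return best, (far + frr) / 2
```

Sweeping the threshold in this way traces out exactly the FAR/FRR trade-off curve that figure 3.2 depicts.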
Figure 3.2 describes the FRR and FAR parameters and their distributions graphically. In the left diagram, one can see the impostor and genuine subject distributions, and the matching score threshold the matching mechanism uses, on a two-dimensional scale of matching score and probability. Those parameters largely determine the error rates of the system (false rejection and false acceptance rate), the typical relation of
Figure 3.2: Biometric system error rates (inspired by Jain et al., 2004). In terms of Detection
Theory (Abdi, 2007), the impostors are noise, while genuines mean signal.
Figure 3.3: A typical architecture of a biometric system (inspired by Jain et al., 2004)
which (also called the receiver operating characteristic curve – ROC) is drawn in the right diagram. The point where the impostor and genuine subject distribution curves cross over each other signifies the equal error rate (EER).
Simplified, a typical biometric system design has at least the following compo-
nents (inspired by Jain et al., 2004):
• Sensor, which measures the subject.
• Matching module, which matches the input features with the profile in the
database (if any available).
• Decision module, which makes a decision whether or not to accept the sub-
ject (in authentication mode), or the identity of the subject or an error (in
identification mode).
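A hypothetical skeleton of such a design, reduced to enrollment, matching and decision in authentication mode, might look as follows; the similarity function and the threshold value are illustrative choices of mine, not taken from Jain et al.:

```python
class BiometricSystem:
    """Minimal sketch of the sensor -> matcher -> decision pipeline."""

    def __init__(self, threshold=0.8):
        self.database = {}            # subject id -> enrolled profile
        self.threshold = threshold

    def extract_features(self, raw_measurement):
        """Stand-in feature extractor; real ones are sensor-specific."""
        return list(raw_measurement)

    def enroll(self, subject_id, raw_measurement):
        """Enrollment: store a profile built from the first measurement."""
        self.database[subject_id] = self.extract_features(raw_measurement)

    def matching_score(self, features, profile):
        """Similarity in (0, 1]; 1 means identical to the stored profile."""
        dist = sum((a - b) ** 2 for a, b in zip(features, profile)) ** 0.5
        return 1.0 / (1.0 + dist)

    def authenticate(self, subject_id, raw_measurement):
        """Decision module in authentication mode: accept or reject a claim."""
        profile = self.database.get(subject_id)
        if profile is None:
            return False
        score = self.matching_score(self.extract_features(raw_measurement), profile)
        return score >= self.threshold
```

In identification mode, the decision module would instead compare the input against every stored profile and return the best-matching identity or an error.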
Figure 3.4: Fusion of biometric systems: (a) at capture, (b) at feature extraction, (c) at
matching, and (d) at decision. Inspired by Jain et al. (2004)
strong privacy concerns of the public. The concept of privacy is also discussed by
Peacock et al. (2004), Moskovitch et al. (2009) and Yampolskiy & Govindaraju
(2008). From a more technical and cryptography point of view, privacy and secrecy
of biometrics in biometric secrecy systems are discussed by Ignatenko & Willems
(2009).
From a biometrics-wide point of view, Doddington et al. (1998) formulated a
classification of four types of speakers analogized to animals by characteristics of
their recognizability:
• Sheep, who match well against themselves and poorly against others. They
make up most of the population.
Later, Yager & Dunstone (2010) also took a look at user classification with regard to how well biometrics performs for different users, or how well different users can perform on biometrics, and extended the previous classification in the following way (also described in figure 3.5):
• Chameleons, who rarely get false rejections, but are likely to cause false acceptances toward others.
• Doves, who match well against themselves and poorly against others. That makes them the ‘positively ideal’ users of biometrics.
• Worms, who match poorly against themselves, but well against others.
To connect the new classification to the old one, the dove ideal is equal to the sheep one, while both lambs and wolves just have high impostor ranks. It is important to note that impostor rank covers both the likelihood to impersonate and the likelihood to get impersonated. I perceive this concept as important to realize, since a recognition system is usually used to recognize all kinds of subjects having different recognition properties.
Figure 3.5: The biometric menagerie according to Yager & Dunstone (2010); the diagram plots the four user groups (worms, chameleons, phantoms, and doves) against genuine rank (∼ (1 - FRR)).
4. Motor skill based (focusing on muscle usage traits, which rely on the function of the brain, nervous system, skeleton, joints, etc.) such as inputs from keyboard, mouse, etc.
• Login-time recognition, which does its work at the beginning of a usage session,
or perhaps also as an isolated periodic, sporadic, or event-based activity later.
• Continuous recognition, which works all the time during a usage session based
on how one interacts with a system (e.g. a computer).
• Rule obedience, as the amount of socially less acceptable behavior (e.g. per time unit). Examples of such behavior might be examination cheating, abuse of language, or parking a car in an unsuitable spot.
The following are some of the key advantages usually more easily achieved by behaviometrics compared to physiological biometrics, according to Shanmugapriya & Padmavathi (2009), Wood et al. (2008), Jain et al. (2004) and Yampolskiy & Govindaraju (2007, 2008):
• Price for the system and its operation, codetermined by its dependence on
uncommon equipment (e.g. in context of daily computer usage).
From a different angle, some of the major drawbacks of using behaviometrics com-
pared to physiological biometrics follow:
• Time requirements, since behaviometrics incorporate timing and it takes a while before such a system can effectively recognize a subject.
Yampolskiy & Govindaraju (2008) identified five areas, which may benefit from
progress in the field of behaviometrics: (1) opponent modeling in game theory and
related fields (also applicable in the military), (2) user modeling for marketing and
customization or optimization purposes, (3) criminal profiling for investigation pur-
poses, (4) jury profiling for juridical predictions, and (5) plan recognition for under-
standing the goals of an intelligent agent.
within and across longer time spans of interaction. Tappert et al. (2009) present a behaviometric solution based on long-text input keystroke dynamics. As an important compromise to consider in the design of keystroke dynamics recognition systems, Gunetti & Picardi (2005) and Hempstalk (2008) found that, in their current state, those systems either require large quantities of typing before accepting or rejecting a subject, or they are susceptible to small fluctuations in the typing patterns.
Shanmugapriya & Padmavathi (2009) categorized the use of keystroke dynam-
ics in the following ways: (1) Static at login (the case of password hardening), (2)
periodic dynamic, (3) continuous dynamic, (4) keyword-specific, and (5) application-
specific. In the context of keystroke dynamics, the terms ‘static’ and ‘dynamic’ are sometimes replaced by the terms ‘fixed/structured text’ and ‘free text’, since some researchers believe that the former terms may be misleading (Gunetti & Picardi, 2005).
The behaviometric recognition is also realized in different ways across different studies and systems. According to Shanmugapriya & Padmavathi (2009), the most common approaches are either statistical or based on artificial neural networks (ANN). Other methods include hidden Markov models, Bayesian classifiers, Gaussian classifiers, Gaussian mixture modeling, rhythm-based algorithms, k-nearest neighbor algorithms (k-NN), distance-based algorithms (using the Euclidean, Hamming, Manhattan, Chebyshev, or some other distance measure), and support vector machines (SVM) (Giot et al., 2009; Jagadeesan & Hsiao, 2009; Hosseinzadeh & Krishnan, 2008). Toward the ‘more exotic sounding’ ones, Hempstalk (2008) names an application of a modified LZ78 compression algorithm used for input log prediction, and Karnan & Akila (2009) use genetic algorithms (GA) and particle swarm optimization (PSO) in order to gain better recognition accuracy.
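As a minimal illustration of the distance-based family mentioned above, the following sketch identifies a subject by the Manhattan distance between averaged key hold-time vectors. The timing values and the averaged-profile scheme are hypothetical simplifications of mine; published systems use richer features (e.g. digraph latencies) and normalization:

```python
def manhattan(a, b):
    """Manhattan (city block) distance between two timing vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def build_profile(samples):
    """Average several typing samples (e.g. key hold times in ms) into a profile."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def identify(profiles, sample):
    """Attribute a new sample to the enrolled subject with the closest profile."""
    return min(profiles, key=lambda subject: manhattan(profiles[subject], sample))
```

Turning this into an authenticator only requires comparing the winning distance against a threshold instead of returning the identity.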
From a development perspective, Hosseinzadeh & Krishnan (2008) proposed a protocol for the development of behaviometric technology, specifically keystroke dynamics, which tries to cover various problematic aspects encountered within previous work in the area. Those aspects include (1) feature design, (2) data collection, (3) error reporting, and (4) data acquisition, all seen as working in a cycle (1, 2, 3, 4).
As stated earlier and shown by the results of Wood et al. (2008), keystroke
dynamics change over time. The results show a progressive decline in both identifi-
cation and authentication using this behaviometric method during a period of four
weeks without updating the reference profile for the users.
Keystroke dynamics are influenced by factors such as stress (Vizer et al., 2009), alertness, fatigue, mood, illness, injury, time of day, activities simultaneous to writing, etc. (Gunetti & Picardi, 2005; Hempstalk, 2008). Moreover, apart from the less deterministic environmental effects, a simple change of keyboard can change the typing dynamics (Gunetti & Picardi, 2005; Villani et al., 2006).
Figure 3.6: An example process of mouse dynamics analysis (inspired by Ahmed & Traoré,
2007). This general model is also applicable to keystroke dynamics and basically any other
behaviometric method or technology. Compared to what is apprehensible from the typical
architecture of a biometric system shown in figure 3.3, this process incorporates usage session
identification (for the subsequent analysis steps to see a broader behavioral context) and
noise reduction (to cut information of lesser significance to the recognition process).
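The two additional steps named in the caption, session identification and noise reduction, can be sketched roughly as follows; the idle-gap heuristic and the moving-average filter are illustrative assumptions of mine, not the specific techniques of Ahmed & Traoré:

```python
def split_sessions(event_times, idle_gap=30.0):
    """Session identification: a pause longer than idle_gap (seconds)
    between consecutive input events starts a new usage session."""
    if not event_times:
        return []
    sessions, current = [], [event_times[0]]
    for t in event_times[1:]:
        if t - current[-1] > idle_gap:
            sessions.append(current)
            current = []
        current.append(t)
    sessions.append(current)
    return sessions

def smooth(values, window=3):
    """Noise reduction: simple moving average over a behavioral measurement
    (e.g. mouse speed samples), damping spurious spikes."""
    half = window // 2
    return [sum(values[max(0, i - half):i + half + 1])
            / len(values[max(0, i - half):i + half + 1])
            for i in range(len(values))]
```

Segmenting by idle gaps gives the later analysis steps a broader behavioral context per session, and smoothing cuts information of lesser significance, mirroring the two boxes added in the figure.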
Finally, the application of keystroke dynamics can effectively improve immunity toward security threats stemming from e.g. (1) shoulder surfing, (2) spyware, (3) social engineering, (4) login guessing, (5) brute force password attacks, or (6) dictionary password attacks, according to Shanmugapriya & Padmavathi (2009). It is potentially usable against many kinds of keyboard-based computer usage impersonation and as a basis for an intrusion detection system (IDS) (Gunetti & Picardi, 2005). Keystroke dynamics and many other behaviometric methods can be used to minimize the risks of the attacks mentioned and, on top of that, of a more serious matter called identity theft (financial, criminal, business/commercial, or identity cloning) (Moskovitch et al., 2009; Jagadeesan & Hsiao, 2009).
According to Ahmed & Traoré (2007), mouse dynamics have mostly been used to aid graphical user interface (GUI) design. Most security-related research in mouse dynamics focuses on continuous authentication and identification, according to Bours & Fullu (2009), who prototyped a mouse dynamics login system. A similar experiment was carried out by Aksarı & Artuner (2009). Beyond the scope of GUI design and information security, Zavadskas et al. (2008) and Kaklauskas et al. (2009) used mouse dynamics for emotional state analysis, and Vizer et al. (2009) used them for stress measurement (both applications are discussed in a later section). Both of those applications are somewhat closer to psychological application.
Meta-function    Information type    Analysis type
Ideational       Topics              Topical analysis
                 Events              Event detection
                 Opinions            Sentiment analysis
                 Emotions            Affect analysis
Textual          Style               Authorship analysis; Deception detection; Power cues
                 Genres              Genre analysis
                 Vernaculars         Semantic networks
Interpersonal    Interaction         Social networks; Conversation streams

Table 3.2: Text analysis linguistic features categorized by Abbasi & Chen (2008).
Type             Feature
Quantity         # of words
                 # of verbs
                 # of modifiers (adjective or adverb)
                 # of function words (prepositions, articles, conjunctions)
                 # of sentences
Complexity       Average sentence length (# of words / # of sentences)
                 Average word length (# of chars / # of words)
                 Pausality (# of punctuation / # of sentences)
                 Passive verb ratio (# of passive verbs / # of verbs)
                 Modal verb ratio (# of modal verbs / # of verbs)
Non-immediacy    You reference ratio
                 Self reference ratio
                 Group reference ratio (# of 1st person plural pronouns / # of words)
                 Other reference ratio (# of 3rd person pronouns / # of words)
Expressiveness   Emotiveness (# of modifiers / (# of nouns + # of verbs))
Diversity        Lexical diversity (# of unique words / # of words)
                 Redundancy (# of function words / # of sentences)
                 Content word diversity (# of unique non-function words / # of non-function words)
Informality      Typo ratio (# of misspelled words / # of words)
Specificity      Affect ratio
                 Sensory ratio
                 Temporal immediate ratio
                 Temporal non-immediate ratio
                 Spatial close ratio
                 Spatial far ratio

Table 3.3: Linguistic features (inspired by Adkins et al. (2004) and Zhou et al. (2003))
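Several of the features in the table can be computed with straightforward text processing. The sketch below covers a handful of them; the regex-based word and sentence splitting is deliberately naive and only illustrative, not the tokenization used by the cited authors:

```python
import re

def linguistic_features(text):
    """Compute a few of the table's features with naive tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    punctuation = re.findall(r"[.,;:!?]", text)
    return {
        "word_count": len(words),
        # Complexity features from the table:
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "pausality": len(punctuation) / len(sentences),
        # Diversity feature from the table:
        "lexical_diversity": len({w.lower() for w in words}) / len(words),
    }
```

Features such as the passive verb ratio or emotiveness would additionally require part-of-speech tagging, which is why purely lexical measures like the ones above are the cheapest to extract.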
unconventional.
Stress measurement
Vizer et al. (2009) carried out an exploratory study in automated stress detection using keystroke and linguistic dynamics. Although the study is directed toward the aging population and the assessment of individuals’ cognitive status, some concepts and findings seem to be of broader applicability.
According to Vizer et al., a solution purely based on the analysis of keystroke
dynamics and linguistic features
(1) unobtrusively gathers data, (2) facilitates the process of gathering baseline
data, (3) allows data to be captured continuously over a length of time, (4)
leverages behaviors in which the individual is already engaged, (5) requires no
extra equipment, (6) can automatically adjust to the unique characteristics of
each individual, and therefore (7) allows for early detection of changes (Vizer
et al., 2009, p. 871).
Moreover, each of the emotions has at least two important attributes: (1) arousal (intensity) and (2) valence (‘direction’, e.g. in terms of being positive or negative) (Zimmermann et al., 2003; Picard, 1997).
The study of Zavadskas et al. (2008) focuses on analyzing the emotional state of computer users with regard to their work performance and productivity. A number of parameters were measured, including mouse pressure (on the buttons and the mouse itself) using force sensors, electrogalvanic skin conductance, palm skin temperature, behaviometric parameters related to mouse movement and clicks, amplitude of hand tremble, idle time, and the use of the scroll wheel.
Kaklauskas et al. (2009) used the same platform for analyzing the emotional state of students during the examination process, and Zimmermann et al. (2003) did an experiment measuring mood using keyboard and mouse dynamics.
Deception detection
The concept of deception detection is largely based on concepts of Interpersonal
Deception Theory (IDT) (Buller & Burgoon, 1996), Cue Leakage Theory (Ekman,
1985; DePaulo et al., 2003), Reality Monitoring (Johnson & Raye, 1981), McCor-
nack’s Information Manipulation Theory (IMT) in (Fuller et al., 2006), Media Rich-
ness Theory and Media Synchronicity Theory (Dennis & Valacich, 1999), and a few
more (Zhou, 2005; Zhou et al., 2004; Fuller et al., 2006). Although e.g. IDT holds that around 90% of deceit cues have a nonverbal character, such as facial, gaze, gesture and other expressions, and most research within the field of deception detection has been directed toward face-to-face (FtF) dynamics, there is also some research on detecting deceit using linguistic features in computer-mediated communication (CMC) (Adkins et al., 2004; Fuller et al., 2006; DePaulo et al., 2003; Zhou, 2005; Zhou et al., 2003, 2004; Lee et al., 2009).
Deceptive communication has long been a problem for military, govern-
ment, and business organizations. The Internet has provided another
way to communicate deceptively; a way that offers greater anonymity
and leaner media for disguising intent. (Adkins et al., 2004, p. 122)
In the context of CMC, deception detection is tightly bound to linguistic analysis as a tool for extracting various cues signaling deception. Since the CMC-specific deception detection concepts are seen as most relevant for this study, concepts specific to FtF or other areas of deception detection are omitted here.
DePaulo et al. (2003) compiled an extensive summary of text-based cues of deception in CMC. Moreover, Zhou (2005) mentioned nonverbal cues for automated CMC deception detection, such as voice-related and keyboard-related behavior, eye movement, facial expressions, body postures, etc. She hypothesized a number of relations between deceit and linguistics within instant messaging, together with listing a number of cues; however, many of them seem to be mostly related to (if not dependent on) interactive communication.
Figure 3.7: Deterrence mechanism of cheating detection linked to Ajzen’s (1991) theory of
planned behavior extended by Stone et al. (2009), and the model of student cheating decision
from Dick et al. (2003). For description of the models, see 2.1.2.
3.5.2 Behavioral characteristics as the cheating detection unifier
First of all, it seems important to specify what behavior means in this context. I see it as a set of actions performed by a system during a non-zero time interval. Following this definition, behavior is not only related to what we deliberately (and consciously) say, how we decide, etc. It is also how we say it, how we write what we write, what our word selection is, etc., part of which always has unconscious and habitual roots. Indeed, the field of behaviometrics (see 3.4) is based on this; were it not so, one would hardly be able to effectively authenticate people based on their behavioral traits, since it would be trivial for anyone to fake.
Following the concept of cue leakage, an activity is, among other things, reflected by perceivable behavioral cues. More specifically, a student cheating on an examination performs a set of activities signifying or being typical for a specific kind of cheating, and those activities get reflected in some of the behavioral cues the student leaves in different kinds of his/her behavior. Considering an online computer-based examination, a student writes his/her exam using at least a keyboard and/or mouse.
Comparing this approach to the approach of using examination proctors to detect when students read from crib notes or other unauthorized resources, or talk to each other, I see the following advantages: it is (1) more automatable, (2) operationally cheaper, and (3) more broadly applicable (both to detect the usage of a full range of cheating methods, and to detect them in an audiovisually unperceived environment). On the other hand, and at the same time, I see it as (1) less definite (i.e. if a proctor sees a student reading from a crib note, it is a very strong cheating indication, while if a student’s behavior merely shows a likelihood of cheating, the indication is much weaker, because there can be a number of other factors affecting it that the detection mechanism ignores), and (2) dependent on information technology.
1. Not only do people have their own habits and dynamics of motor behavior, much of it is also rooted in neuropsychology and, as such, unconsciously influenced (Stelmach & Requin, 1980; Kelso, 1982). These often differ slightly from individual to individual, and hence, what could contextually be considered anomalous behavior for one student might be normal for another, and vice versa.
2. There are many factors influencing behavior (Vizer et al., 2009; Hempstalk, 2008; Gunetti & Picardi, 2005), while most of them remain unknown to an analyst or a cheating detection system. Those unknown factors cause largely unavoidable error in the conclusions of a detection process. This is also a reason why relying on probabilistic cheating detection methods based on statistical analysis (and classification) alone is not perceived as sufficient to trigger actions with personal consequences (Cizek, 1999).
An approach to overcoming the first problem is to profile a student’s behavior for signs of both normal and suspicious behavior before a cheating analysis is performed. This is what is commonly used in behaviometrics, and biometrics in general, for authentication and identification purposes (Jain et al., 1999, 2004). In practice, the second problem seems largely beyond control to me. Perhaps the solution lies in the usage – not relying on such methods alone, and watching out for their indicatory outputs being misinterpreted as proofs by those who use them. To overcome the third problem, I see the following solutions: either (1) limiting the perceived relevance of cheating detection results to a specific range of problems/questions, or (2) extending the cheating detection so that it takes into account both relevant examination information and the specific context in which the examined student operates, in order to increase the overall relevance of the cheating indication.
Figure 3.8 outlines a model of a cheating detection method the study aims for.
While a student is writing an examination, his/her human-computer interaction
behavior is being recorded (measured). Either directly or after the examination,
it can be analyzed. The analysis consists of several steps as follows: (1) feature
extraction based on models of behavior on a molecular level, which also incorporates
noise reduction, (2) anomaly detection, which compares the actual inputs to the a
priori created and known profiles of the student, and (3) classification of the anomaly
trying to indicate whether and how the student is cheating.
The anomaly detection is semi-supervised, since it only learns from profiled normal behavior. The output of the anomaly detection process is the amount of behavioral anomaly relative to the profiled normal behavior, in the form of a multidimensional vector. The type of anomaly is contextual (conditional) according to the classification of Chandola et al. (2009).
The classification process classifies behavioral anomaly according to both built-in
generalized models of behavior and profiled suspicious behavior. Thus, the classifi-
cation is supervised.
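The division of labor described above – semi-supervised anomaly detection against a profiled normal, followed by supervised classification of the anomaly vector – can be sketched as follows. The feature names, the z-score anomaly measure, the nearest-pattern classifier, and the threshold are illustrative assumptions, not the implementation of the study.

```python
import math

def anomaly_vector(features, profile):
    """Semi-supervised anomaly detection: compare an observed feature
    vector to the student's profiled normal behavior (per-feature mean
    and standard deviation) and return a multidimensional anomaly
    vector of z-scores."""
    return {name: (value - profile[name][0]) / profile[name][1]
            for name, value in features.items()}

def classify(anomaly, suspicious_profiles):
    """Supervised classification: label the anomaly vector by its
    nearest profiled suspicious-behavior pattern (Euclidean distance),
    or 'normal' if no pattern is close enough."""
    def dist(pattern):
        return math.sqrt(sum((anomaly[k] - pattern[k]) ** 2 for k in anomaly))
    label, best = min(((name, dist(p)) for name, p in suspicious_profiles.items()),
                      key=lambda x: x[1])
    return label if best < 2.0 else "normal"  # 2.0: illustrative threshold

# Profiled normal behavior: feature -> (mean, std), e.g. from enrollment.
profile = {"key_rate": (5.0, 1.0), "silence_ratio": (0.2, 0.05)}
observed = {"key_rate": 2.8, "silence_ratio": 0.35}
a = anomaly_vector(observed, profile)
patterns = {"copying": {"key_rate": -2.0, "silence_ratio": 3.0}}
print(classify(a, patterns))  # prints "copying"
```

The example deliberately keeps the anomaly output as a vector rather than a single score, matching the multidimensional output described above.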
The precision of a method like the one described here would seemingly diverge over time unless at least the normal behavior profile was updated from time to time (Wood et al., 2008). For that reason too, the system should be able to run at least in the following modes: (1) enrollment of a student, as the process of profiling his/her behavior, (2) recognition of possible cheating based on both profiled behavior and generalized models, and (3) profile adjustment, which can run manually or automatically, e.g. after each recognition, based on segments of near-normal behavior.

Figure 3.8: Model of the cheating detection approach

Discussing the operational perspective in more detail is
beyond the scope of the thesis.
Finally, according to the classification of Yampolskiy & Govindaraju (2009), this approach, as a biometric, would fall into four out of the five categories identified: authorship-based (linguistic dynamics), direct human-computer behavior based (keystroke and mouse dynamics), motor-skill based (keystroke and mouse dynamics), and purely behavioral (linguistic dynamics).
Chapter 4
Methodology
This chapter describes the research process and the methodology to gather and
analyze data within this study.
Given the research goals (see 1.2), this study has a dominantly descriptive char-
acter, trying to characterize/describe a phenomenon (specific meanings in behavioral
dynamics in relation to a specific activity on which the behavior manifests). Ac-
cording to Leedy & Ormrod (2005), descriptive research involves identifying charac-
teristics of the observed phenomenon, or exploring possible correlations among two
or more phenomena, while the situation is examined as it is, without changing or
modifying the situation under investigation. Moreover, descriptive research is not
intended to determine cause-and-effect relationships (ibid).
A real-world phenomenon such as the specifics of a person’s human-computer interaction behavior, and their dependence on specific activities the person performs at the same time, is fairly complex, both within the boundaries of the phenomenon itself and in relation to the surrounding environment. Such behavioral specifics depend on a broad range of factors (situational, personal, technological, societal, etc.), and on top of that, the factors work together and are dynamically interrelated. A way to explore the relations within such a phenomenon is to simulate situations in which the phenomenon is expected to occur. Within such a simulation, however, practical problems arise: how to validly and reliably simulate such situations, gather data (observations, measurements, etc.), and analyze and interpret those in order to meet the research goals?
For the first issue (the simulation), there are two parameters that are to some degree mutually antagonistic: control over the situation, in terms of both influence and measurability/perceivability, and ecological validity, in terms of the genuineness of the situation, its resemblance to reality, or simply its non-artificiality (Clark-Carter, 2009). The problem here is to choose a research method and design which maximizes control and minimizes compromises to ecological validity.
The subsequent issues (the data, analysis, etc.), will be discussed and covered
gradually in this chapter.
Those two approaches are not categorically distinct, since the process of qualitative research involves quantitative methods, and quantitative research always involves some interpretation by the researcher, which has a deeply qualitative character. Since the major research concerns of this study are related to human behavior and its relation to cognition, and hence psychology, the research methods and approaches are mostly discussed from the perspectives of this field. According to Clark-Carter (2009), the quantitative approach is generally related to experimenting, measuring, asking questions, observing, and statistically analyzing. Even if the inputs are textual, they usually need to be assigned numerical values before a statistical analysis. The qualitative approach mostly differs in the analysis process, since the input data are often collected as text, and as such they are also analyzed.
Compared to the quantitative approach, the qualitative one is generally related to exploring, describing and interpreting the experiences of participants (Smith, 2008). On the philosophical plane, the quantitative approach is influenced by positivism, which among other things assumes that a subject can always be objectively described by a system of measurable variables and their deterministic interactions. This applies especially to behaviorism, which adopted a radically positivist view. Cognitivism, as the major replacement of the behaviorist trend, also contains some underlying positivism, according to Ashworth (2008). The qualitative approach leans somewhat more towards humanism as opposed to naturalism, and constructivist, interpretivist, and critical theorist views are more common within it. Constructivism, as the epistemological opposite of positivism, is in short based on the assumption that knowledge is constructed within a mind instead of being observed from reality. Interpretivism further extends this with the assumption that all knowledge is a matter of interpretation as a form of construction (Ashworth, 2008). Critical theory, which also builds on interpretation, is defined as “the examination and critique of society and culture, drawing from knowledge across the social sciences and humanities”1 . Critical theory is based on values and holds that knowledge is “generated through ideological critiques of power, privilege and oppression”2 , as rooted in feminist and advocacy research.
According to the character of the research problem, I have chosen the quantitative approach as the dominant one, yet not the only one. In a measurement- and determinism-based contextual validation of concepts (models) which are products of interpretation and introspection, I accept positivism in a context-aware cognitivist approach in the lowest, quantitative layer of the study, seeing animal and human behavior as a co-product of cognition and mental state. Within the more abstract, qualitative layer, I use the constructivist viewpoint, holding that our knowledge, as a result of individual mental construction on top of individual perception and cognition, is individually possessed. Looking deeper into epistemology for thoughts on the justification of knowledge: knowledge can seem valid in certain contexts and invalid in others, while the resolution of this problem might lie in concepts either not taken into account or not eliminated within a specific line of reasoning. Having rejected the positivist notion of ultimate reality, I see the value of this study’s findings through
1 According to Wikipedia: http://en.wikipedia.org/wiki/Critical_theory [Accessed 2010-04-01]
2 According to anonymous presentation slides: http://www.docstoc.com/docs/8558617/Research-Philosophy/ [Accessed 2010-04-05]
a coherentist viewpoint (Kuukkanen, 2007). In this view, the findings present a tiny drop in the sea of concepts, which, linked to other findings, support and/or oppose some of those, and get supported and/or opposed by them. In the long run, the findings might either help us converge to a more powerful model of reality, or be rejected/corrected should they prove erroneous or otherwise invalid. I feel, however, no ability to judge the external validity in an absolute sense.
Clark-Carter (2009) mentions modeling, artificial intelligence, experiment, inter-
view, questionnaire, observation, content analysis, meta-analysis, and case study as
quantitative methods of psychological research. Among qualitative methods, there are at least phenomenology, interpretative phenomenological analysis, grounded theory study, narrative study, conversation analysis, discourse analysis, focus group study, and cooperative inquiry, all described in Smith (2008).
To achieve the research goals of this study, I have chosen observational study as the dominant research method, since the primary concern is rather covert human behavior and its causal relations to cognition. Covert behavior is behavior which cannot be observed directly, such as physiological responses, as characterized by Clark-Carter (2009). As a classification, Clark-Carter recognizes three types of behavior: (1) overt non-verbal, (2) verbal, and (3) covert. The subject to be observed and the concepts to be described are related to distinguishing characteristics in the dynamics of human-computer interaction behavior (criterion variables) in relation to specific tasks or activities performed simultaneously under specific conditions (predictor variables), where the tasks primarily include writing, reading, listening, and different types of cognition. Because of perceived difficulties in controlling extraneous influences, more than a single observation is used. Three systematic continuous real-time observations are complemented by other means of data collection, such as a questionnaire, which is largely a subject of qualitative interpretation. Using several different methods focusing on the same area of research is referred to as triangulation (ibid), which is also used in this study.
can allow the results of a study to be generalised to other people – whether
they are representative of the group from whom they come, and whether they
are representative of a wider range of people. (Clark-Carter, 2009, p. 40)
As quoted, the concerns are mostly task, setting and time with regard to differing conditions; aspects of the participants; and generalizability to other groups. Clark-Carter also mentions two main ways to improve the external validity of a research design: replication and sampling (the selection of participants).
or synonymically as
dependability, stability, consistency, reproducibility, predictability, and lack of
distortion (Kerlinger & Lee, 2000, p. 642).
Do measure or characterize what the authors claim, and that the inter-
pretations do follow from them. The structure of a piece of research
determines the conclusions that can be drawn from it and, most impor-
tantly, the conclusions that should not be drawn from it. (Sapsford &
Jupp, 1996, p. 1)
If an item is unreliable, then it must also lack validity, but a reliable item is
not necessarily also valid. It could produce the same or similar responses on all
occasions, but not be measuring what it is supposed to measure. (Bell, 2005,
p. 117-118)
The concept of validity can be further divided into several types: (1) face validity, as the perception of validity that the people being measured and the people administering the measures have of the measures; (2) construct validity, as the extent to which some theoretical construct is assessed well; (3) content validity, as the degree to which a measure covers the full range of behavior related to what is being measured; and (4) criterion validity, as the extent to which a measure fulfills certain criteria – mostly in terms of concurrency and predictability (Clark-Carter, 2009; Kerlinger & Lee, 2000).
Figure 4.1: Research process overview
Based on the keyboard input events, and primarily on plain text, it is also possible to extract typed linguistic features, which also fall under the category of automatedly gathered data. The data are recorded using standard computer hardware (keyboard and mouse) and custom software, which records the input events from the hardware together with timestamps, with a nominal precision of tens of milliseconds.
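A recording of the kind described – hardware input events with millisecond timestamps – can be represented minimally as follows. The field names and structure are hypothetical, since the recorder's internals are not specified at this level of detail.

```python
import time
from dataclasses import dataclass, field

@dataclass
class InputEvent:
    """One keyboard or mouse event with a millisecond timestamp."""
    t_ms: int    # timestamp; nominal precision: tens of milliseconds
    device: str  # "keyboard" or "mouse"
    action: str  # e.g. "down", "up", "move"
    detail: str  # key code or "x,y" coordinates

@dataclass
class Recording:
    events: list = field(default_factory=list)

    def record(self, device, action, detail):
        """Append a live event stamped with the current wall-clock time."""
        self.events.append(InputEvent(int(time.time() * 1000), device, action, detail))

    def latencies_ms(self):
        """Inter-event latencies, from which dynamics features derive."""
        ts = [e.t_ms for e in self.events]
        return [b - a for a, b in zip(ts, ts[1:])]

# A short synthetic recording (timestamps in ms):
r = Recording([InputEvent(0, "keyboard", "down", "H"),
               InputEvent(80, "keyboard", "up", "H"),
               InputEvent(210, "keyboard", "down", "i")])
print(r.latencies_ms())  # [80, 130]
```

Keeping raw events with timestamps, rather than derived features, preserves the trivial reconstructibility of the input event flow mentioned in 4.4.1.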
The manually gathered data are not completely specified. At least the following are focused on: (1) observer feelings about the environment, (2) observer notes about the weather, (3) observer notes about the lighting conditions, (4) observer notes about the room temperature, and (5) observer notes about any significant events or anomalies during the observations.
The participants are anonymous in terms of omitting the association of the gath-
ered inputs with the personally identifiable data of the participants, such as name,
nickname, or personal number.
For the manually gathered data, only manual remarks are taken (in the ‘pen and paper’ fashion). To reliably gather the data describing molecular behavior, however, automated recording able to record the data as specified above was chosen. The reasons for the choice were perceived needs for (1) relatively high time accuracy, implying the need for relatively high time resolution, (2) reliable continuous gathering of data without losses in the form of leave-outs, (3) minimal obtrusiveness during the gathering process, (4) high efficiency and automation during the gathering process, (5) efficient storage and transfer, and (6) efficient and trivial reconstructibility of the input event flow. With regard to those needs, a custom-software-based input event recording method proved the most suitable of the recording methods realistic for the study.
4.4.2 Observations
The observations are the only process of obtaining empirical inputs for the study. Despite initially more ambitious plans, I have chosen three single-participant observations instead of one or more multi-participant ones, mainly because of practical limits being faced. This was done at the cost of a reduced chance of locating possible effects of external factors on participant behavior. Each of the observations happens in a different place and at a different time, while the tasks to perform are of the same types and themselves nearly the same.
Figure 4.2: The observation design used in the study
Design
• a complete observer, since the observer only observes the participants, and the observer’s own behavior does not enter into the observations,
• ecological, since the context and setting in which the behavior occurs are of interest, and meanings together with intentions also play a role, and finally
The design of the observation is outlined in figure 4.2. Continuous real-time sampling is seen as the most suitable for the observation, because it enables recording most of the behavior while being both technologically inexpensive and unobtrusive with regard to the data of interest and the recording method used (see 4.4.1).
Each participant within the group has to be observed during each level of the predictor variable, as described in table 4.1. The levels of the predictor variable are simply instructions as to what the participants should perform – delivered in either an automated or a manual manner. With respect to the subject phenomenon of the study, the effects of specific predictor variable levels on the observed behavior are assumed to be contemporaneous. Significant carry-over effects are not expected and therefore no artificial delays are introduced between changes of the predictor variable level. Although order effects are not expected either, there is a countermeasure against them in the form of a varied order of predictor variable levels for each participant. In addition, there is a separate ‘copying’ template (a different text to copy) for each predictor variable level that involves copying within a single observation (those differ from one observation to another). Later within each observation, two levels are repeated in order to observe the behavior with increased familiarity with the text being copied.
Level    Level name                                  Level description
PV:AW    Authentic writing                           Writing a text and drawing a diagram as formulated or constructed by oneself (not reading or hearing it)
PV:VCC   Verbatim copying from computer screen       Rewriting a text and redrawing a diagram 1:1 (without changes) – from the computer screen
PV:VCP   Verbatim copying from paper                 Rewriting a text 1:1 (without changes) – from a physical paper
PV:VCL   Verbatim copying by listening               Listening to a text and rewriting it using a computer (without deliberate reformulation)
PV:RCC   Reformulative copying from computer screen  Rewriting a text with own reformulation – from the computer screen
Process
The process of each of the observations is outlined in figure 4.3 and will be carried out in the following sequence:
2. Letting the participant install the required data-gathering software on his or her computer and thus set up the observational environment.
3. Starting the observation process by starting the automated data collection and
recording.
• PV:AW
• PV:VCC
• PV:VCP
6. Reading a text to all participants, which they have to rewrite using their computers (corresponding to PV:VCL).
7. Letting the participant finish the observation by performing the task under predictor variable level PV:RCC.
Figure 4.3: The observation process (including questionnaire)
Sampling
For the participant selection, a variant of nonprobability sampling between purposive and convenience sampling (Leedy & Ormrod, 2005, p. 206) has been chosen. In purposive sampling, people are chosen for a specific purpose – in this case, being believed to belong to the target group. Convenience sampling takes people as they are readily available (ibid).
The sample consists of three purposively selected participants. Although the whole target population consists of millions of people, with around 3.9 million in the United States alone in 2007 (Allen & Seaman, 2008), the sample size is small because of limiting practical research conditions. As argued by Leedy & Ormrod (2005), a sample size of 400 people would be adequate for a descriptive study. Unfortunately, I perceive even a number close to this as beyond the research possibilities of this study.
Since the behavioral patterns dependent on the tasks one performs simultaneously with interacting with a computer while being examined are expected to be largely general within the study’s target group, there are no special requirements regarding sample variety or size beyond what is mentioned above.
Environment
The observational environment is a room of a flat or a shared corridor, such as a living room. Because of the limited control over the student examination environment in the conditions of the intended (‘live’) use, no special care is taken regarding the room selection except for (1) silence in the room, (2) comfortable lighting conditions and (3) a comfortable temperature.
4.4.3 Questionnaire
Within the observations, each participant has been asked to fill in an electronic questionnaire while already being observed. The questionnaire is further described in appendix C.
Figure 4.4: Data flow and control relations of the data gathering and analysis processes
4.4.4 Analysis
As mentioned earlier, the analysis has two stages – the statistical and the triangulative one. In the former, quantitative data (keystroke and mouse events) taken within the observations are translated into composite constructs having subjectively more directly applicable and meaningful parameters (see appendix A), in order to describe the behavior and its dynamics. In the latter, the outputs of the former stage are compared and analyzed together with the interpretations of qualitative empirical inputs, in order to identify possible relations between actions, subject-related factors and behavior.
Figure 4.4 provides an overview of the data flow and control relations of both the data gathering and analysis processes. First, the data are gathered from the participants, their behavior and the environment. Subsequently, the major part of the data is handled in the automated process branch (in figure 4.4). Before analysis results are produced, the automatedly gathered data are statistically analyzed and visualized, and activity-behavior relationships are identified and/or verified using a triangulative analysis. Both the manual data gathering and the triangulative analysis are carried out manually.
Triangulation in the context of this work is used to identify and/or verify action-behavior relationships. Inputs to the triangulative analysis have both a quantitative and a qualitative character (as described by figure 4.4). The following are the triangulation input categories:
• Keystroke, mouse and linguistic dynamics together with single keyboard and
mouse interaction events of the participant behavior
• Context of the participant tasks including timing
Visualization
Different parameters and their changes over time are visualized by custom software written in the Java (J2SE) environment.
Statistical analysis
Similarly to visualization, the statistical analysis is done using custom software. The complexity of the statistical measures is fairly low, since they only include the mean and standard deviation of behavioral features.
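The per-feature mean and standard deviation described above amount to a few lines; the feature names here are placeholders standing in for the measures of appendix A.

```python
from statistics import mean, stdev

def feature_stats(samples):
    """Per-feature mean and standard deviation over a series of samples.
    Each sample maps feature name -> measured value; feature names are
    illustrative placeholders."""
    features = samples[0].keys()
    return {f: (mean(s[f] for s in samples), stdev(s[f] for s in samples))
            for f in features}

# Two hypothetical sample windows of behavioral features:
windows = [{"key_rate": 4.0, "silence_ratio": 0.1},
           {"key_rate": 6.0, "silence_ratio": 0.3}]
print(feature_stats(windows))
```

Note that `statistics.stdev` computes the sample (n-1) standard deviation; whether sample or population deviation was used in the study's software is not specified.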
Chapter 5
In this chapter, the analysis process, as well as the three observations which were the source of empirical data for the study, are briefly described. The observation descriptions include brief facts about the observation process and conditions, and a brief qualitative representation of the quantitative parameters extracted from the observations.
5.1 Analysis
This section describes the process of analysis, which can be divided into three layers or parts, according to both time sequence and level of abstraction. The first, quantitative molecular level, focuses on the automated extraction of properties of the recorded behavior, such as single key latency and the other ones described by the measures listed in appendix A, and on the selection of time-interval-based sampling parameters for an appropriate visualization. The second, qualitative molecular level, focuses on the transformation of the numeric and plotted (graphically visualized) quantitative measures into a qualitative description, one by one. Finally, the third, qualitative molar level, focuses on identifying possible relations between the manual observations and the results of the previous two levels/parts of the analysis, not only within the analysis of a single observation session, but also across the observation sessions.
The ultimate goal of the analysis was to identify seemingly general or individual behavioral cues appearing in the participants’ behavior when performing specific tasks while being observed. The validity of the statement that a behavior tends to leave specific cues usable for its identification (Ekman, 1985), which can be the subject of computer-based analysis (Zhou et al., 2003, 2004; Lee et al., 2009), was taken for granted within this study.
Abbasi & Chen (2008), Zhou et al. (2003), and Adkins et al. (2004). The analysis of
mouse dynamics used was somewhat inspired by Ahmed & Traoré (2007) in terms of mouse operation units (mouse move, drag & drop, and point & click), as well as their angular measurement. To a large extent, though, it was designed within the study. The keystroke dynamics analysis contained largely custom measurement features designed within the study, besides well-known ones such as the timings of single key and digraph uses (see, e.g., Gunetti & Picardi, 2005).
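A digraph-timing computation of the kind referenced can be sketched as follows, using one common definition of digraph duration (the interval between the key-downs of the digraph's two keys); this is an illustrative sketch, not the study's own feature code.

```python
def digraph_durations(keydown_events):
    """Compute digraph durations from a list of (timestamp_ms, key)
    key-down events. A digraph's duration is taken here as the time
    between the key-down of its first and second key (one common
    definition; see Gunetti & Picardi, 2005)."""
    out = {}
    for (t1, k1), (t2, k2) in zip(keydown_events, keydown_events[1:]):
        out.setdefault(k1 + k2, []).append(t2 - t1)
    return out

# Key-down events for typing "th", "e", then "th" again:
events = [(0, "t"), (95, "h"), (180, "e"), (400, "t"), (510, "h")]
print(digraph_durations(events))
# {'th': [95, 110], 'he': [85], 'et': [220]}
```

Collecting every occurrence per digraph (rather than a single value) is what makes per-digraph means and standard deviations possible later in the analysis.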
When putting the parameters together over time, the sample duration and overlay (the sampling parameters) were chosen for a visually optimal resolution of the time-based changes of the measures. Sample duration means the time interval from the first event within the sample until the last one. Sample overlay means the amount of mutual overlap between two consecutive samples. The following sampling settings were chosen based on the readability of the plots of the parameters measured:

• Keystroke, mouse and silence dynamics analysis. Sample duration: 4000 milliseconds; sample overlay: 0.5.
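The sampling scheme – fixed-duration windows with a fractional overlay – can be sketched as follows, using the settings quoted above (4000 ms duration, 0.5 overlay); the event representation as (timestamp, key) tuples is an assumption for illustration.

```python
def samples(events, duration_ms=4000, overlay=0.5):
    """Split timestamped events into fixed-duration windows; consecutive
    windows overlap by the given fraction (0.5 -> each window starts
    halfway through the previous one). Requires overlay < 1."""
    if not events:
        return []
    step = duration_ms * (1 - overlay)
    t0, t_end = events[0][0], events[-1][0]
    out, start = [], t0
    while start <= t_end:
        out.append([e for e in events if start <= e[0] < start + duration_ms])
        start += step
    return out

evs = [(0, "a"), (1500, "b"), (3000, "c"), (5000, "d")]
for w in samples(evs):
    print([k for _, k in w])
# ['a', 'b', 'c']
# ['c', 'd']
# ['d']
```

An overlay of 0.5 means each event is counted in two windows, which smooths plotted feature curves at the cost of correlated adjacent samples.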
for the occurrence of possible temporary deviations or other specific phenomena, linking those to their possible causes.
5.2 Observations
Within the analysis process, there are around a hundred different parameters of human-computer interaction dynamics (see appendix A), falling into four major groups – keystroke dynamics, mouse dynamics, silence dynamics, and linguistic dynamics. The first three groups are intertwined and mutually time-bound (within the analysis). Because of the huge amount of data representing those different parameters, this chapter only describes a few of them, and does so in a qualitative way.
According to the analysis results, part of the computer interaction dynamics (some of its features) had a similar or the same tendency in all observations. Another part of the features was distinct within one session, as the tasks were performed by a single participant. In the rest of the features, much difference was not visually identified, which might either mean that the difference simply did not exist, or that it was too small for the observer to discern from a visual reading of the graphs.
Since one of the potential threats to examination is impersonation, it seems relevant to note that each of the participants showed visibly different typing dynamics features (e.g. key flight, break consistency, key downtime, key rate, mouse speed and acceleration at different angles and in general, mouse speed center, silence ratio, linguistic writing diction, etc.). This observation allows for the conclusion that the people examined would be identifiable, and impersonation detectable, to some extent at least (see Jain et al., 2004, on biometrics in general).
Among session-specific highlights, dynamics feature designations are used, which
are characterized in appendix A.
5.3 Observation 1
The first observation took place in the evening at the participant’s home. The light in the room was getting shady, the temperature comfortable, although the participant did not sit in a comfortable position, and according to the participant, the keyboard was positioned slightly too high. Because of prior technical difficulties resulting in observation data loss, this observation was repeated with a computer and keyboard different from the participant’s own, all of which might have affected the results. The environment was silent and there were only a few distractions, such as one phone call and the participant reading a message on the phone.
connecting and naming them.
- copying: higher peaks
MMSm, MMSsd (speed mean and standard deviation)
- copying: slightly higher peaks
MMACsd (acceleration standard deviation)
- copying: higher peaks
MMCm (curvature mean)
- making: higher positive peaks
- copying: higher negative peaks
* PLAIN MOVES *
MMLsd (move length standard deviation)
- copying: slightly higher peaks
MMDm, MMDsd (move duration mean and standard deviation)
- making: slightly higher, slightly higher peaks
* DRAG MOVES *
MMSm, MMSsd (speed mean and standard deviation)
- making: slightly higher base, lower peaks
- copying: slightly lower base, higher peaks
MMSCsd (speed center standard deviation)
- making: higher peaks
MMlmSPsd (last max speed position standard deviation)
- making: higher peaks
MMACsd (acceleration standard deviation)
- copying: more regular peaks that are slightly higher
MMCm (curvature mean)
- copying: slightly more negative
* CLICKS *
MCCCsd (click count standard deviation)
- copying: higher peaks
MCCRm (click rate mean)
- copying: higher peaks
MCCRsd (click rate standard deviation)
- copying: higher peaks
* DRAGS *
- not apparent
* SILENCE *
S# (silence count):
- copying: slightly less varied
- hearing: slightly more varied than copying
- reformulating: more varied than hearing - about as
much as formulating
SLm, SLsd (latency mean and standard deviation)
- copying, reformulating: slightly higher and higher
peaks, more dense
SR (silence ratio)
- copying: lower
- hearing: higher peaks, but not as high as formulating
* LINGUISTICS *
LWPSm, LWPSsd (words per sentence mean and standard deviation)
- formulating: slightly higher
LWLm, LWLsd (word length mean and standard deviation)
- copying, reformulating: higher
- formulating, hearing: lower
LAr (article ratio)
- copying: slightly higher
LQR (quantifier ratio)
- formulating: higher
LCWR (capital words ratio)
- formulating: higher
5.4 Observation 2
The second observation also took place in the evening, in a small flat, however not belonging to the participant. The room lighting was artificial and, according to the participant, sufficiently bright. The temperature was slightly above the comfortable level, and the participant expressed slightly negative emotions when realizing the effort required to complete parts of the observation, and felt slightly bored in the beginning. This observation is not complete, since (1) the participant cognitively simplified one task (making up and drawing a diagram), and (2) the last question (rewriting text with reformulation) was omitted due to a misunderstanding.
than formulating on average
- hearing: closer to copying than formulating
WDLATm (word delimiter latency mean)
- copying: higher than formulating
- hearing: much like formulating
WDLATsd (word delimiter latency standard deviation)
- copying: higher, also peaks higher
- hearing: much like formulating
NWLATm (next word latency mean)
- copying: slightly higher than formulating
- hearing: between formulating and copying, somewhat
more like copying
NWLATsd (next word latency standard deviation)
- copying: more dense than formulating (fewer 0-values);
otherwise difficult to say (maybe slightly higher also)
- hearing: much like copying
* SINGLE KEYS *
SKDTm (single key downtime mean)
- formulating: peaky, less uniform than copying
- copying: visibly more uniform and with smaller peaks
SKDTsd (single key downtime standard deviation)
- formulating: medium density, some peaks
- copying: high density
- hearing: low density
SKRm (key rate mean)
- formulating: slightly lower
- copying: slightly higher
- hearing: between
SKRsd (single key rate standard deviation)
- just density (formulating, hearing ~= copying)
* DIGRAPHS *
DDm (digraph duration mean)
- copying, hearing: slightly more uniform
- about the same values
DDsd (digraph duration standard deviation)
- copying, hearing: slightly more uniform
(this all might be the density issue)
DKRm (digraph key rate mean)
- all much like DDm (digraph duration mean)
DKRsd (digraph key rate standard deviation)
- formulating, copying: quite much zero
- hearing: higher than the rest
- copying and hearing more dense than formulating
DKFLm (digraph key flight mean)
- copying more dense than formulating
- copying has peaks more often than both
formulating and hearing
DKFLsd (digraph key flight standard deviation)
- much like DKRsd (digraph key rate standard deviation)
W# (word count), K# (single key count), D# (digraph count)
- formulating: more dispersed
- copying: more fluent
- hearing: between formulating and copying
* CLICK MOVES *
MMDIm (move distance mean)
- making: lower peaks; arc-like distributed
- copying: higher peaks, higher in general;
arc^(-1)-like distributed
MMDIsd (move distance standard deviation)
- much like MMDIm (move distance mean)
MMAm (move angle mean)
- making: varies more uniformly
- copying: more dispersed
MMAsd (move angle standard deviation)
- making: more arc-like
MMLm (move length mean)
- making: more arc-like
MMLsd (move length standard deviation)
- making: more arc-like
MMDm (move duration mean)
- making: less varied (less zero-values), quite
uniform distribution
- copying: more varied (from 0 to peaks, which are
a bit higher than with making), quite
uniformly distributed, too
MMDsd (move duration standard deviation)
- similar to MMDm
MMmSm (move max speed mean)
- making: higher, less varied, arc-like
- copying: shorter in the middle, more varied,
especially in the ends, arc^(-1)-like
MMmSsd (move max speed standard deviation)
- highly varied; more difficult to see differences
except the arc-like distribution with making
MMSm, MMSsd (move speed mean and standard deviation)
- similar to MMmSm and MMmSsd
MMSCm, MMSCsd (speed center mean and standard deviation)
- no visible differences
MMACm, MMACsd (acceleration mean and standard deviation)
- making: much less varied (uniform) than copying; arc-like
- copying: varying from 0-values to peaks
* PLAIN MOVES *
- much like click moves, but a little less apparent
* DRAG MOVES *
MMLm (move length mean)
- making: higher, more peaky
MMLsd (move length standard deviation)
- making: higher
MMmSm, MMmSsd (max speed mean and standard deviation)
- much like MMLm, MMLsd
MMSm, MMSsd (speed mean and standard deviation)
- much like MMLm, MMLsd
MMACm, MMACsd (acceleration mean and standard deviation)
- making: higher
MMaCm (acceleration center mean)
- making: lower
MMaCsd (acceleration center standard deviation)
- making: a little more dense
* CLICKS *
MC# (click count)
- copying: more uniformly distributed
* DRAGS *
MDDm (drag duration mean)
- making: slightly shorter
MDTTm (tailing time mean)
- making: shorter, varying in both negative and
positive directions from zero
- copying: longer, all negative
* SILENCE *
S# (silence count)
- more varied (jumping to 0)
SDsd (duration standard deviation)
- slightly more dense
SR (silence ratio)
- making: slightly more (boosted sampling time to 40
seconds compared to other measures)
* LINGUISTICS *
LPR (paragraph ratio)
- making: slightly more paragraphs
LWLm, LWLsd (word length mean and standard deviation)
- making: slightly shorter
LDiv (lexical diversity)
- making: slightly lower
LAR (article ratio)
- making: slightly lower
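The linguistic measures in the listing above (LWLm/LWLsd, LDiv, LAR) are simple lexical statistics and can be approximated from raw text. The sketch below uses assumed definitions (type/token ratio for lexical diversity, a fixed article set); the thesis' exact formulas may differ.

```python
import re
from statistics import mean, pstdev

ARTICLES = {"a", "an", "the"}  # assumed article set

def linguistic_features(text):
    """Rough analogues of LWLm/LWLsd (word length mean and std. dev.),
    LDiv (lexical diversity as a type/token ratio) and LAR (article
    ratio)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    lengths = [len(w) for w in words]
    return {
        "LWLm": mean(lengths),
        "LWLsd": pstdev(lengths),
        "LDiv": len(set(words)) / len(words),
        "LAR": sum(w in ARTICLES for w in words) / len(words),
    }

feats = linguistic_features("The quick brown fox jumps over the lazy dog.")
```

Because these features depend only on the produced text, not on its timing, they complement the keystroke and mouse dynamics measures above.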
5.5 Observation 3
The third observation was conducted remotely. The light was natural and
bright, the room temperature was perceived as slightly high, and the
participant had been under a significant physical load around two hours prior
to the observation. The participant was connected from a distant location,
which prevented any direct manual observation. As a partial substitute, the
observer asked questions in a telephone call. The copying-by-listening part of
this observation is not available.
WKRm, WKRsd (word key rate mean and standard deviation)
- formulating: higher peaks
- copying: lower peaks, less varied
- reformulating: much like formulating
WDLATm, WDLATsd (word delimiter latency mean and std. dev.)
- formulating: more varied, less dense
- copying: opposite, does not jump much to 0-values
NWLATm, NWLATsd (next word latency mean and std. dev.)
- much like WDLATm, WDLATsd
* SINGLE KEYS *
SKRm, SKRsd (single key rate mean and standard deviation)
- formulating: more 0-jumpy, slightly less dense
* CLICK MOVES *
MMDIm (move distance mean)
- making: slightly more peaky; more arc-like distribution
- copying: rather arc^(-1)-like distribution
MMAm (move angle mean)
- making: more arc-like; less varied
- copying: more varied across all angles
MMLm (move length mean)
- making: increasing; higher than with copying in the end
MMmSm, MMmSsd (move max speed mean and standard deviation)
- making: arc-like
MMlmSPm, MMlmSPsd (last max speed position mean and std. dev.)
- making: lower, but with slightly higher peaks
* PLAIN MOVES *
- in general much less apparent changes compared to click move
- arc-like distribution difference between making and copying lost
MMAsd (move angle standard deviation)
- making: slightly higher
MMSm, MMSsd (move speed mean and standard deviation)
- making: slightly lower
MMACm, MMACsd (acceleration mean and standard deviation)
- making: slightly lower (as Sm, Ssd)
* DRAG MOVES *
MMDIsd (move distance standard deviation)
- making: more varying
MMaCm, MMaCsd (absolute curvature mean and standard deviation)
- making: higher, higher peaks
* CLICKS *
MC# (click count)
- making: slightly less
* DRAGS *
MDDm, MDDsd (drag duration mean and standard deviation)
- making: lower
MDMLm, MDMLsd (drag move latency mean and standard deviation)
- making: lower
* SILENCE *
S# (silence count)
- copying: slightly higher
SDm, SDsd (silence duration mean and standard deviation)
- copying: significantly shorter
SLATm, SLATsd (silence latency mean and standard deviation)
- copying: very slightly higher
SR (silence ratio)
- copying: significantly smaller
* LINGUISTICS *
LPNR (punctuation ratio)
- copying: slightly smaller
Chapter 6
This chapter qualitatively presents the main findings of the study. They are
based on the observations and aim to answer the study's research questions.
than merely indicating that there is an anomaly from authentic writing for a specific
person.
Logically, better resolution (more independent parameters available about the
behavior) gives better possibilities for valid detection, classification, and
thus indication of an anomaly. The analysis results have shown how difficult
and time-consuming it is to personally inspect tens or hundreds of different
interaction dynamics parameters, which casts a shadow over this approach. The
easier way seems to lead through automation. The process can be automated by
means of anomaly detection (Chandola et al., 2009) and pattern recognition
(Theodoridis & Koutroumbas, 2006), both of which offer a range of approaches
of varying suitability.
Since prototyping the whole cheating detection process in an automated way was
beyond the scope of the study, automated anomaly detection and classification
did not take place. Instead, these parts were performed manually by reading
and processing input event maps and histograms of the measured parameters,
which was the point at which quantitative descriptive data were converted
into qualitative ones.
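As an illustration only, not the implementation used in the study, the manual histogram reading described above could be automated with even a very simple anomaly detection rule: flagging analysis windows whose feature value deviates strongly from a personal profile. All numbers and names below are invented for the sketch.

```python
from statistics import mean, pstdev

def flag_anomalies(profile_samples, session_samples, k=3.0):
    """Flag session windows whose feature value lies more than k standard
    deviations from the person's profile, a bare-bones stand-in for the
    anomaly-detection techniques surveyed by Chandola et al. (2009)."""
    mu, sigma = mean(profile_samples), pstdev(profile_samples)
    if sigma == 0:
        return [False] * len(session_samples)
    return [abs(x - mu) / sigma > k for x in session_samples]

# Invented per-window digraph-duration means (seconds) for one person:
profile = [0.21, 0.19, 0.22, 0.20, 0.21, 0.18, 0.20, 0.22, 0.19, 0.21]
session = [0.20, 0.21, 0.08, 0.22]   # third window is unusually fast
flags = flag_anomalies(profile, session)  # flags only the third window
```

A real system would combine many such features and use richer detectors than a single z-score threshold, but the sketch shows how the manual inspection step could, in principle, be mechanized.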
Chapter 7
This chapter presents a brief conclusive summary of the study, followed by a
discussion of aspects related to behaviometrics and technology, cue leakage,
and the cheating-related background. It also discusses future research
directions perceived as beneficial to the topic.
7.1 Conclusion
The study found and described keyboard- and mouse-related behavioral
tendencies under different modes of writing, imposed by tasks that involved
activities alongside the writing itself. It thereby answered the first
research question (see section 1.2). The second research question is answered
by the provided description of the significance of the behavioral measure
differences (see section 5.2).
This work starts from the very fundamentals by rather broadly summarizing the
phenomenon of cheating, including its consequences, forms, causes, and the
means of perceiving and combating it. From this point the work continues
toward concepts of automated behavioral analysis, further to its core – the
research method, observations and findings – and ends with conclusions and
discussion linking the core, conceptually and pragmatically, to its
background. Compared to the research goals, the scope might arguably seem too
broad, although I believe it is important and valuable to connect the core of
the study to as much of its background as possible, to make relevant
connections easier to realize and to inspire thoughts leading to further
validation or exploration of the field.
The results of the study contain descriptions of behavioral changes imposed by
performing a sequence of specifically assigned tasks. The descriptions are
highly qualitative in character, with a resolution seemingly too low to be
directly applicable to developing an effective automated cheating detection
solution. Instead, as the results of an approach prototype, their greater
potential may be to inspire and encourage further research in the area.
7.2 Cheating detection and prevention approach discussion
This section discusses the chosen approach and its relation to both behaviometrics
and the phenomenon of examination cheating.
1. Impersonation, which would enable a person to write an exam for someone
else without this being detected by the analysis.
Property Remarks
Universality The behavior of anyone who uses a keyboard and mouse to write
a computer-based examination can be measured and analyzed. I
have not identified any principal-level exceptions to this. Exceptions
can occur on the technological level, where they may stem from
incompatibility between a specific technological solution and the
platform (operating system environment) a person (student) uses.
Uniqueness In case of person authentication (impersonation detection), this
parameter is inherited from behavioral biometrics in general. In case
of detecting the authentic person's cheating, each specific behavior
appeared to have its own characteristics, distinct from the person's
other behaviors.
Permanence The invariance of specific behavioral characteristics for people
has not been a concern within the study. Supposedly behavior changes
slightly over time; a countermeasure against deviation from the
profile is to update it regularly with examination analyses.
Collectability Collectability requires technology-driven recording of input
events, plus the time needed to write an examination session or a
task (part of a session). Besides, running a technological solution
for automated cheating detection requires some time for administering
the records stored in the system's database, etc.
Performance Implemented in Java 6 runtime environment without specific atten-
tion to computation speed optimizations, quantitative analysis of a
1-hour examination takes around 1-3 seconds on a 2.5 GHz Intel Core
2 Duo CPU (T9300), dependent on sampling settings. Operating
memory requirements ranged from 0.5 GB to 2 GB, dependent
on sampling settings. This indicates that the quantitative part of the
analysis can be performed on a common workstation, taking almost
negligible time.
Acceptability Acceptability has not been a concern within this study. Supposedly
there might be privacy concerns with respect to (1) gained ability
for future identification of the person based on computer interaction
style, as well as (2) potential ability to extract personality features
from computer interaction style. Besides, people might feel uneasy
being aware that their computer inputs are being recorded during an
examination session.
Circumvention Circumvention possibilities have not been a concern within the study.
Table 7.1: Biometric properties of the approach (with regards to Jain et al. (1999))
Figure 7.1: The cheating prevention approach
A potentially helpful part of cheating detection, stress detection, was not taken
into account for this study, because of its perceived difficulty, as understood from
Vizer et al. (2009).
The analysis of linguistic dynamics used in this study was limited to analysis
of lexical units, without reaching further to syntax and semantic relations of the
language (English in the case of this study). Presumably, the resolution
provided by an in-depth linguistic analysis, based on e.g. the theory of
systemic functional linguistics (Eggins, 2004; Fawcett, 2008), would
outperform the one used, although the difficulty of performing such an
analysis did not allow for its application in this study.
figure 3.7, including the extension of Stone et al. (2009). In my opinion,
the studied approach has high potential for the way of cheating prevention
described above, although validating this statement is left to another study.
Unfortunately, one problem related to combating cheating in general remains
untouched by this approach. There are conditions under which faculty can
become liable for student harm, including malicious false accusation, use of
the names of individuals not involved in a given cheating case, or violation
of a student's right to due process by ignoring the institution's procedures
for resolving accusations of academic dishonesty – as mentioned by Whitley &
Keith-Spiegel (2002) in Wehman (2009). This is also a motivational problem
addressing the perceived behavioral control of the theory of planned behavior
(this time related to combating cheating). It is further strengthened by the
misclassification problems relevant to the studied approach: triggering cases
in full reliance on this approach alone might get a faculty into trouble in
certain conceivable situations.
In the end, I believe that the most valuable effect of the approach is the
preemptive one, while the actual strength to prove already committed cheating
is less critical, given that most students would be ashamed and would strive
to avoid even being detected, let alone being punished within an official
cheating case. This approach might well be no exception to what Dick et al.
(2003) stated: “an ounce of prevention is worth a pound of cure” (p. 182).
Handling cheating detection without automation seems obtrusive, since one
needs to search for cheating, which imposes a certain level of suspicion
toward the people examined. Suspicion as an emotion has a negative character,
and spending much time with it might affect the forming of personality on the
individual level as well as the forming of culture on the social level.
Besides, it also seems difficult to judge cheating equally across
examinations from one student to another (Cizek, 1999). Human perception has
limits that vary with different factors, to which technology such as
computers seems immune. Therefore, although the strength of qualitative
analysis tends to be greater in humans, the strength of quantitative
analysis, as well as the stability and consistency of routine tasks, tends to
be higher in machines. In my own experience, leaving quantitatively difficult
routine tasks to humans often leads to low and unstable performance compared
to machines. Applied to cheating detection or pattern recognition, this is
consistent with e.g. the threshold theory of psychophysics (McNicol, 2004,
chap. 7), which states that a stimulus needs to be significant enough in
order to be taken into account within our perception. For humans, this
threshold seems to depend on a range of factors and changes over time.
Therefore, I see a strong need for automating the somewhat trivial, yet
computationally demanding routine tasks, to achieve both better effectiveness
and efficiency.
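The threshold-theory point above has a standard quantitative counterpart in signal detection theory (cf. McNicol, 2004): a detector's ability to separate cheating from authentic behavior can be summarized by the sensitivity index d′. The sketch below uses invented hit and false-alarm rates purely for illustration.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate): how well
    a detector separates 'cheating' from 'authentic' distributions."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

# Invented rates: the detector flags 84 % of cheating windows but also
# 16 % of authentic ones.
sensitivity = d_prime(0.84, 0.16)  # roughly 2.0
```

A machine detector evaluated this way has a fixed, reproducible d′, whereas a human inspector's effective threshold drifts with fatigue and context, which is exactly the argument for automation made above.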
7.3 Research approach discussion
In this study, a rather limited set of inputs was used – those one could
capture through the keyboard and mouse of a computer. I believe that adding
more, such as voice/speech or video recognition through commonly available
hardware such as microphones and web cameras, would improve the indication
effectiveness, not to mention less commonly accessible measurements such as
the electrocardiogram or electroencephalogram, which would increase the
detection resolution even more. In spite of this, delimiting the study to
keyboard and mouse inputs is perceived as reasonable because of the current
common availability of this equipment compared to the rest of the equipment
mentioned.
The study involved quite extensive development and use of information
technology, because reliable machine-based recording of the user inputs,
together with some automated quantitative analysis support, seemed to be by
far the best approach to performing the study. At the time of writing, the
author is not aware of any more effective and practically applicable
approach.
As a short reflection, the validity of this study's research design and
findings is heavily limited by the population size, the purposiveness of the
sampling, and the time-requirement optimizations of the observation process.
If I were to carry out the study again, I would primarily attempt to lessen
those limits. Table 7.2 briefly discusses four properties of measure validity
as categorized by Clark-Carter (2009) and Kerlinger & Lee (2000).
In the end, it must be admitted that the applicability of this study's
findings to combating cheating is rather indirect. There seems to be much
exploration still to be done, followed by appropriate application of
technology and more, in order to develop a well-functioning behavioral
anomaly indication approach or mechanism. There are no silver bullets, and
this seems especially true in the field of behavioral analysis. Hopefully,
this small set of findings is another drop in the sea, furthering progress
toward the goal mentioned above.
Type of validity Remarks
Face validity No doubts about the validity of the measures were noted
during the observations. Personally, I believe additional
measures can only help – those in which no patterns were
identified were simply ignored for the session instance,
while the others were taken into account. There were
measures that appeared relevant and also measures in which
I did not identify any patterns across the different tasks
performed by the participants. In the context of the
analysis, however, excluding a parameter could limit the
analytic capabilities for other samples, in which the
parameter might distinguish the behavior while performing
some tasks. There was no list of cues to look for – the
cues are encoded in combinations of behavioral dynamics
measures and time, which is why every measure with some
independence from the measures already taken into account
is potentially useful.
Construct validity Since no theory discussing specific behavioral measures was
used in the study or known to me, I’m not able to judge
the construct validity of the measures used.
Content validity The completeness of the measures toward the measured
phenomenon was certainly limited. Although a strong effort
was made to maximize the number of different and
independent measures in order to capture most of the
phenomenon, it was practically impossible to measure the
phenomenon fully.
Criterion validity Concurrency, or how much the measures show the same
results as other measures of the same phenomenon taken at
the same time, is a parameter I cannot judge, since no
alternative measures were taken for validation. I hope and
believe that the concurrency of the measures is high,
close to 100 %. Predictability, or how much the measures
reflect past or future states, is limited by a broad range
of factors and by maturation, a continuous change or drift
of a person's behavioral dynamics. Therefore, it is surely
below 100 %, although hopefully still high enough, since
the successful use of today's behaviometric technology for
authentication and identification implies a rather low
amount of such change (Wood et al., 2008).
Table 7.2: Discussion of measure validity with regards to Clark-Carter (2009) and Kerlinger
& Lee (2000)
human-computer interaction behavior could be studied, which seems to be a
broad and demanding area requiring an experimental approach. Studying those
phenomena comes close to measuring the emotional state itself (Zavadskas et
al., 2008; Kaklauskas et al., 2009). Secondly, (5) possibilities of
intentional circumvention of the behavioral measures could be studied, with
regard to masquerading driven by e.g. impersonation of a person, or sabotage
of the approach's effectiveness. Thirdly, (6) properties of behavior under
typical examination conditions could be studied and described more
thoroughly, e.g. in terms of the general properties of behavior identified by
Yampolskiy & Govindaraju (2008): speed, correctness, redundancy, consistency
and rule obedience. Finally, on the organizational level, (7) the
organizational limitations, appetite for, and scope of applicability of the
behavioral cheating detection approach could be studied. Those might include
requirements for maintenance, privacy aspects, regulations, psychological
hygiene issues, etc. Since the use of the automated cheating detection
approach and technological solutions rests on the humans in the organization,
organizational aspects deserve attention to assure the effectiveness of such
use.
Bibliography
Abbasi, A., Chen, H., 2008. CyberGate: A Design Framework And System For
Text Analysis of Computer-Mediated Communication. MIS Quarterly, 32(4), pp.
811-837.
Abdi, H., 2007. Signal Detection Theory (SDT). In Salkind, N., ed., 2007. Encyclo-
pedia of Measurement and Statistics. Thousand Oaks, Canada: Sage.
Adkins, M., Twitchell, D.P., Burgoon, J.K., Nunamaker, J.F., 2004. Advances in
Automated Deception Detection in Text-Based Computer-Mediated Communica-
tion. Proceedings of SPIE, Bellingham: SPIE, Vol. 5423, pp. 122-129.
Ahmed, A.A.E., Traoré, I., 2007. A New Biometric Technology Based on Mouse
Dynamics. IEEE Transactions on Dependable and Secure Computing, 4(3), pp.
165-179.
Ajufor, N., Amalraj, A., Diaz, R., Islam, M., Lampe, M., 2008. Refinement of a
Mouse Movement Biometric System. In Proceedings of Student-Faculty Research
Day, CSIS, Pace University. New York City, USA, 2 May 2008.
Ajzen, I., 1991. The Theory of Planned Behavior. Organizational Behavior and Hu-
man Decision Processes, 50(2), pp. 179-211.
Allen, E.I., Seaman, J., 2003. Seizing the Opportunity: The Quality and Extent of
Online Education in the United States, 2002 and 2003. Needham: Sloan Consor-
tium.
Allen, E.I., Seaman, J., 2005. Growing By Degrees: Online Education in the United
States, 2005. Needham: Sloan Consortium.
Allen, E.I., Seaman, J., 2007. Online Nation: Five years of growth in online learning.
Needham: Sloan Consortium.
Allen, E.I., Seaman, J., 2008. Staying the Course: Online Education in the United
States, 2008. Needham: Sloan Consortium.
Anderson, R.J., 2008. Security Engineering: A Guide to Building Dependable Dis-
tributed Systems. 2nd edition. Indianapolis: Wiley Publishing, Inc.
Anolli, L., Balconi, M., Ciceri, R., 2001. Deceptive Miscommunication Theory
(DeMiT): A New Model for the Analysis of Deceptive Communication. In Anolli,
L., Ciceri, R., Riva, G., eds., 2001. Say not to Say: New perspectives on miscom-
munication, IOS Press.
Argamon, S., Whitelaw, C., Chase, P., Hota, S.R., Garg, N., Levitan, S., 2007. Stylis-
tic Text Classification using Functional Lexical Features. Journal of the American
Society for Information Science and Technology, 58(6), pp. 802-822.
Åström, K.J., Murray, R.M., 2008. Feedback Systems: An Introduction for Scientists
and Engineers. Princeton: Princeton University Press.
Bandura, A., 2002. Selective Moral Disengagement in the Exercise of Moral Agency.
Journal of Moral Education, 31(2), pp. 101-119.
Bates, A.W.T., 1995. Creating the Future: Developing a vision in open and distance
learning. In F. Lockwood, ed. 1995. Open and Distance Learning Today. London:
Routledge. Ch. 5.
Bates, A., 2005. Technology, E-learning and Distance Education. 2nd ed. London:
Routledge
Bell, J., 2005. Doing Your Research Project: A Guide for First-Time Researchers in
Education, Health and Social Science, 4th ed. Berkshire: McGraw-Hill Education,
2005.
Bourne, J., Harris, D., Mayadas, F., 2005. Online Engineering Education: Learning
Anywhere, Anytime. Journal for Asynchronous Learning Networks, 9(1), pp. 15-
41.
Bours, P., Fullu, C.J., 2009. A Login System Using Mouse Dynamics. In Proceed-
ings of the Fifth International Conference on Intelligent Information Hiding and
Multimedia Signal Processing, Kyoto, Japan, 12-14 September 2009.
Carlsmith, K.M., Darley, J.M., Robinson, P.H., 2002. Why do we punish? Deter-
rence and just deserts as motives for punishment. Journal of personality and social
psychology, 83(2) pp. 284-299.
Chandola, V., Banerjee, A., Kumar, V., 2009. Anomaly Detection: A Survey. ACM
Computing Surveys, 41(3), article 15.
Cizek, G.J. 1999. Cheating on Tests: How to Do It, Detect It and, Prevent It.
Mahwah: Lawrence Erlbaum Associates Inc.
Cizek, G.J., 2003. Detecting and Preventing Classroom Cheating: Promoting In-
tegrity in Assessment. Thousand Oaks: Dorwin Press, Inc.
Covington, M.V., 2000. Goal Theory, Motivation and School Achievement: An In-
tegrative Review. Annual Review of Psychology, 51(1), pp. 170-200.
Crossley, S.A., Louwerse, M.M., McCarthy, P.M., McNamara, D.S., 2007. A Linguis-
tic Analysis of Simplified and Authentic Texts. The Modern Language Journal,
91(1), pp. 15-30.
Dennis, A.R., Valacich, J.S., 1999. Rethinking Media Richness: Towards a Theory of
Media Synchronicity. In Proceedings of the 32nd Hawaii International Conference
on System Sciences, Maui, Hawaii, 5-8 January 1999.
DePaulo, B.M., Malone, B.E., Lindsay, J.J., Muhlenbruck, L., Charlton, K., Cooper,
H., 2003. Cues to Deception. Psychological Bulletin, 129(1), pp. 74-118.
Deubel, P., 2003. Learning from Reflections - Issues in Building Quality Online
Courses. Online Journal of Distance Learning Administration, [Online]. 6 (3),
Available at: http://www.westga.edu/~distance/ojdla/browsearticles.php
[Accessed 2010-03-02].
Dick, M., Sheard, J., Bareiss, C., Carter, J., Joyce, D., Harding, T., Laxer, C. 2003.
Addressing student cheating: Definitions and solutions. ACM SIGCSE Bulletin,
35(2), pp. 172-184.
Diekhoff, G.M., LaBeff, E.E., Clark, R.E., Williams, L.E., Francis, B., Haines, V.J.,
1996. College Cheating: Ten Years Later. Research in Higher Education 37(4),
pp. 487-503.
Doddington, G., Liggett, W., Martin, A., Przybocki, M., Reynolds, D., 1998.
SHEEP, GOATS, LAMBS and WOLVES: A Statistical Analysis of Speaker Per-
formance in the NIST 1998 Speaker Recognition Evaluation. Proceedings of In-
ternational Conference on Spoken Language Processing, 1998.
Eccles, J.S., Wigfield, A., 2002. Motivational Beliefs, Values and Goals. Annual
Review of Psychology, 53(1), pp. 109-132.
Ekman, P., 1985. Telling Lies: Cues to Deceit in the Marketplace, Politics, and
Marriage. New York: W. W. Norton & Company Inc.
Faucher, D., Caves, S., 2009. Academic Dishonesty: Innovative cheating techniques
and the detection and prevention of them. Teaching and Learning in Nursing,
4(2), pp. 37-41.
Fawcett, R.P., 2008. Invitation to Systemic Functional Linguistics through the Cardiff
Grammar, 3rd ed. London: Equinox Publishing Ltd.
Fuller, C., Burgoon, J.K., Twitchell, D.P., Biros, D.P., Adkins, M., 2006. An Analy-
sis of Text-Based Deception Detection Tools. In Proceedings of the Twelfth
Americas Conference on Information Systems, Acapulco, Mexico, 4-6 August 2006.
Furnell, S., Evangelatos, K., 2007. Public awareness and perceptions of biometrics.
Computer Fraud and Security, 2007(1), pp. 8-13.
Gamboa, H., Fred, A., 2004. A Behavioural Biometric System Based on Human
Computer Interaction. In Proceedings of SPIE, 2004.
Giot, R., El-Abed, M., Rosenberger, C., 2009. Keystroke Dynamics with Low Con-
straints SVM Based Passphrase Enrollment. In Proceedings of IEEE Third Inter-
national Conference on Biometrics: Theory, Applications and Systems, Washing-
ton, USA, 28-30 September 2009.
Graesser, A.C., McNamara, D.S., Louwerse, M.M., Cai, Z., 2004. Coh-Metrix: Anal-
ysis of text on cohesion and language. Behavior Research Methods, Instruments
& Computers, 36(2), pp. 193-202.
Gunetti, D., Picardi, C., 2005. Keystroke Analysis of Free Text. ACM Transactions
on Information and System Security, 8(3), pp. 312-347.
Harris, R., 2009. Anti-Plagiarism Strategies for Research Papers. [Online] Available
at: http://www.virtualsalt.com/antiplag.htm [Accessed: 2010-03-11].
Hawkridge, D., 1995. The Big Bang Theory in Distance Education. In F. Lockwood,
ed. 1995. Open and Distance Learning Today. London: Routledge. Ch. 1.
Hempstalk, K., 2008. You are what you type? In Proceedings of New Zealand
Computer Science Research Student Conference, Christchurch, New Zealand, 14-
18 April 2008.
Herberling, M., 2002. Maintaining Academic Integrity in On-Line Education. Online
Journal of Distance Learning Administration, [Online]. 5 (1), Available at: http:
//www.westga.edu/~distance/ojdla/browsearticles.php [Accessed 2010-02-
28].
Heyneman, S.P., 2002. Education and Corruption. Annual Meeting of the Associa-
tion for the Study of Higher Education, 20 November 2002, Sacramento, Califor-
nia.
Hinman, L.M., 1997. Cultivating Integrity to Combat Plagiarism. [Online]. San
Diego: San Diego Union-Tribune. Available at: http://ethics.sandiego.edu/
lmh/op-ed/combat-plagiarism/index.asp [Accessed: 2010-03-09].
Holmberg, B., 1995. Theory and practice of distance education. 2nd ed. London:
Routledge
Hosseinzadeh, D., Krishnan, S., 2009. Gaussian Mixture Modeling of Keystroke
Patterns for Biometric Applications. IEEE Transactions on Systems, Man, and
Cybernetics – Part C: Applications and Reviews, 38(6), pp. 816-826.
Howell, S.L., Williams, P.B., Lindsey, N.K., 2003. Thirty-two Trends Affecting Dis-
tance Education: An Informed Foundation for Strategic Planning. Online Jour-
nal of Distance Learning Administration, [Online]. 6 (3), Available at: http:
//www.westga.edu/~distance/ojdla/browsearticles.php [Accessed 2010-02-
28].
Howell, S.L., Sorensen, D., Tippets, H.R., 2009. The New (and Old) News about
Cheating for Distance Educators. Online Journal of Distance Learning Admin-
istration, [Online]. 12 (3), Available at: http://www.westga.edu/~distance/
ojdla/browsearticles.php [Accessed 2010-03-02].
Huang, J.K., 2006. A Functional Approach to Pattern Recognition Theory. In Proceedings of IEEE International Conference on Granular Computing, 10-12 May 2006, IEEE Computer Society, pp. 700-703.
Ignatenko, T., Willems, M.J., 2009. Biometric Systems: Privacy and Secrecy As-
pects. IEEE Transactions on Information Forensics and Security, 4(4), pp. 956-
973.
Ilonen, J., 2003. Keystroke dynamics. [Lecture paper] Lappeenranta: Lappeenranta
University of Technology.
Irele, M.E., 2005. Can Distance Education be Mainstreamed? Online Journal of
Distance Learning Administration, [Online]. 8 (2), Available at: http://www.
westga.edu/~distance/ojdla/browsearticles.php [Accessed 2010-02-26].
Iyer, R., Eastman, J.K., 2008. The Impact of Unethical Reasoning on Academic
Dishonesty: Exploring the Moderating Effect of Social Desirability. Marketing
Education Review, 18(2), pp. 21-33.
Jagadeesan, H., Hsiao, M.S., 2009. A Novel Approach to Design of User Re-
Authentication Systems. In Proceedings of 3rd IEEE International Conference on
Biometrics: Theory, Applications and Systems, Washington, USA, 28-30 Septem-
ber 2009.
Jain, A.K., Bolle, R., Pankanti, S., 1999. Introduction to Biometrics. In Jain, A.K.,
Bolle, R., Pankanti, S., eds., 1999. Biometrics: Personal Identification in Net-
worked Society. Norwell: Kluwer Academic Publishers.
Jain, A.K., Duin R.P.W., Mao, J., 2000. Statistical Pattern Recognition: A Review.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1), pp. 4-37.
Jain, A.K., Ross, A., Prabhakar, S., 2004. An Introduction to Biometric Recognition.
IEEE Transactions on Circuits and Systems for Video Technology, Special Issue
on Image- and Video-Based Biometrics, 14(1), pp. ?-?.
Johnson, M.K., Raye, C.L., 1981. Reality Monitoring. Psychological Review, 88(1), pp. 67-85.
Kaklauskas, A., Krutinis, M., Seniut, M., 2009. Biometric Mouse Intelligent System
for Student’s Emotional and Examination Process Analysis. In Proceedings of
Ninth IEEE International Conference on Advanced Learning Technologies, Riga,
Latvia, 15-17 July 2009.
Karnan, M., Akila, M., 2009. Identity Authentication based on Keystroke Dynamics
using Genetic Algorithm and Particle Swarm Optimization. In Proceedings of 2nd
IEEE International Conference on Computer Science and Information Technology, Beijing, China, 8-11 August 2009.
Keegan, D., 1996. Foundations of distance education. 3rd ed. London: Routledge.
Kelso, J.A.S., ed., 1982. Human Motor Behavior: An Introduction. Mahwah, USA:
Lawrence Erlbaum Associates, Inc.
Kerlinger, F.N., Lee, H.B., 2000. Foundations of Behavioral Research, 4th ed. New
York, USA: Thomson Learning.
Kim, N., Smith, M.J., Maeng, K., 2008. Assessment in Online Distance Education: A
Comparison of Three Online Programs at a University. Online Journal of Distance
Learning Administration, [Online]. 11 (1), Available at: http://www.westga.
edu/~distance/ojdla/browsearticles.php [Accessed 2010-03-11].
Koul, B.N., 1995. Trends, Directions and Needs: A view from developing countries.
In F. Lockwood, ed. 1995. Open and Distance Learning Today. London: Routledge.
Ch. 3.
Kuukkanen, J.-M., 2007. Kuhn, the correspondence theory of truth and coherentist
epistemology. Studies in History and Philosophy of Science, 38(1), pp. 555-566.
Le Heron, J., 2001. Plagiarism, learning dishonesty or just plain cheating: The
context and countermeasures in Information Systems teaching. Australian Journal
of Educational Technology, 17(3) pp. 244-264.
Lee, C., Welker, R.B., Odom, M.D., 2009. Features of Computer-Mediated, Text-
Based Messages that Support Automatable, Linguistic-Based Indicators for De-
ception Detection. Journal of Information Systems, 23(1), pp. 5-24.
Leedy, P.D., Ormrod, J.E., 2005. Practical research: planning and design. 8th ed.
Upper Saddle River: Pearson Prentice Hall.
Love, P.G., Simmons, J., 1998. Factors influencing cheating and plagiarism among
graduate students in a college of education. College Student Journal, 35(4), pp.
539-551.
Mason, R., 1995. Using Electronic Networking for Assessment. In F. Lockwood, ed.
1995. Open and Distance Learning Today. London: Routledge. Ch. 20.
McCabe, D.L., Pavela, G., 2004. Ten (Updated) Principles of Academic Integrity.
Change, 36(3), pp. 10-16.
McNicol, D., 2004. A primer of signal detection theory. Mahwah, USA: Lawrence
Erlbaum Associates, Inc.
Megehee, C.M., Spake, D.F., 2008. The Impact of Perceived Peer Behavior, Probable
Detection and Punishment Severity on Student Cheating Behavior. Marketing
Education Review, 18(2), pp. 5-19.
Monrose, F., Rubin, A.D., 2000. Keystroke dynamics as a biometric for authentica-
tion. Future Generation Computer Systems, 16(1), pp. 351-359.
Moore, M., Kearsley, G., 1996. Distance Education: A Systems View. 1st ed. Belmont: Wadsworth Publishing Company.
Moskovitch, R., Feher, C., Messerman, A., Kirschnick, N., Mustafić, T., Camtepe,
A., Löhlein, B., Heister, U., Möller, S., Rokach, L., Elovici, Y., 2009. Identity
Theft, Computers and Behavioral Biometrics. IEEE Intelligence and Security In-
formatics, Richardson, USA, 8-11 June 2009.
O’Leary, D.P., 1999. 12 Professional Ethics. [Online]. College Park: University of
Maryland, Department of Computer Science. Available at: http://www.cs.umd.
edu/%7Eoleary/gradstudy/node13.html [Accessed: 2010-03-09].
Olt, M.R., 2002. Ethics and Distance Education: Strategies for Minimizing Aca-
demic Dishonesty in Online Assessment. Online Journal of Distance Learning Ad-
ministration, [Online]. 5 (3), Available at: http://www.westga.edu/~distance/
ojdla/browsearticles.php [Accessed 2010-02-28].
Parker, A., 2003. Motivation and Incentives for Distance Faculty. Online Journal
of Distance Learning Administration, [Online]. 6 (3), Available at: http://www.
westga.edu/~distance/ojdla/browsearticles.php [Accessed 2010-02-28].
Paulsen, M.F., Rekkedal, T., 2001. Voksne kan og vill lære på Internett. In Paulsen,
M.F., ed., 2001. Nettbasert utdanning: Erfaringer og visjoner. Bekkestua: NKI
Forlaget.
Paulsen, M.F., 2001. Studenters syn på nettbasert utdanning. In Paulsen, M.F., ed.,
2001. Nettbasert utdanning: Erfaringer og visjoner. Bekkestua: NKI Forlaget.
Peacock, A., Ke, X., Wilkerson, M., 2004. Typing Patterns: A Key to User Iden-
tification. IEEE Security & Privacy Magazine, pp. 40-47. September/October,
2004.
Rettinger, D.A., Kramer, Y., 2008. Situational and Personal Causes of Student
Cheating. Research in Higher Education, 50(3), pp. 293-313.
Reushle, S., Dorman, M., Evans, P., Kirkwood, J., McDonald, J., Worden, J., 1999.
Critical Elements: Designing for online teaching. Proceedings of ASCILITE99
Responding to Diversity: 16th Annual Conference, QUT, Brisbane, 5-8 December.
Reushle, S., McDonald, J., 2004. Online learning: Transcending the physical. In
Logan Campus, Griffith University: ETL Conference, 2004. Brisbane, Australia,
04-05 November 2004.
Rybnik, M., Panasiuk, P., Saeed, K., 2009. User Authentication with Keystroke
Dynamics using Fixed Text. International Conference on Biometrics and Kansei
Engineering, Cieszyn, Poland, 25-28 June 2009.
Sapsford, R. and Jupp, V., 1996. Data Collection and Analysis. London: Sage.
Shanmugapriya, D., Padmavathi, G., 2009. A Survey of Biometric Keystroke Dynamics: Approaches, Security and Challenges. International Journal of Computer Science and Information Security, 5(1), pp. 115-119.
Shen, J., Bieber, M., Cheng, K., Hiltz, S.R., 2004. Traditional In-class Examination
vs. Collaborative Online Examination in Asynchronous Learning Networks: Field
Evaluation Results. Proceedings of the Tenth Americas Conference on Information
Systems, New York, August 2004.
Shen, C., Cai, Z., Guan, X., Sha, H., Du, J., 2009. Feature Analysis in Mouse
Dynamic in Identity Authentication and Monitoring. In Proceedings of IEEE In-
ternational Conference on Communications, Dresden, Germany, 14-18 June 2009.
Shon, P.C.H., 2006. How College Students Cheat On In-Class Examinations: Cre-
ativity, Strain, and Techniques of Innovation. Plagiary: Cross-Disciplinary Studies
in Plagiarism, Fabrication, and Falsification, 1(10): pp. 1-20.
Smith, J.A., 2008. Qualitative Psychology: A Practical Guide to Research Methods.
London: SAGE Publications Ltd.
Stakhanova, N., Basu, S., Wong, J., 2010. On the symbiosis of specification-based
and anomaly-based detection. Computers & Security, 29(1), pp. 253-268.
Stelmach, G.E., Requin, J., eds., 1980. Tutorials in Motor Behavior. Amsterdam:
North-Holland Publishing Company.
Stone, T.H., Jawahar, I.M., Kisamore, J.L., 2009. Using the theory of planned be-
havior and cheating justifications to predict academic misconduct. Career Devel-
opment International, 14(3), pp. 221-241.
Stuber-McEwen, D., Wiseley, P., Hoggatt, S., 2009. Point, Click and Cheat:
Frequency and Type of Academic Dishonesty in the Virtual Classroom. On-
line Journal of Distance Learning Administration, [Online]. 12 (3), Available
at: http://www.westga.edu/~distance/ojdla/browsearticles.php [Accessed
2010-03-02].
Tappert, C.C., Villani, M., Cha, S., 2009. Keystroke Biometric Identification and
Authentication on Long-Text Input. In Wang, L., Geng, X., eds. 2009. Behav-
ioral Biometrics for Human Identification: Intelligent Applications, Hershey: IGI
Global, pp. 342-367.
Theodoridis, S., Koutroumbas, K., 2006. Pattern Recognition. 3rd ed. San Diego,
USA: Academic Press, Elsevier.
Thomason, M.G., 1990. Introduction and Overview. In Bunke, H., Sanfeliu, A., eds.,
1990. Syntactic and structural pattern recognition: theory and applications (Series
in computer science; vol. 7). Singapore: World Scientific Publishing Co. Pte. Ltd.
Thorpe, M., 1995. The Challenge Facing Course Design. In F. Lockwood, ed. 1995.
Open and Distance Learning Today. London: Routledge. Ch. 17.
Thorpe, J., Van Oorschot, P.C., Somayaji, A., 2005. Pass-thoughts: Authenticat-
ing with Our Minds. In Proceedings of New Security Paradigms Workshop, Lake
Arrowhead, USA, 20-23 September 2005, pp. 45-56.
Usick, B., 2004. Preventing Plagiarism: A New Three-R Model. Paper presented at the 3rd Annual UTS Teaching and Learning Symposium, Winnipeg, Canada, 06 February 2004.
Villani, M., Tappert, C., Ngo, G., Simone, J., Fort, H.S., Cha, S., 2006. Keystroke
Biometric Recognition Studies on Long-Text Input under Ideal and Application-
Oriented Conditions. In Proceedings of Student/Faculty Research Day, CSIS, Pace
University. New York City, USA, 5 May 2006.
Vizer, L.M., Zhou, L., Sears, A., 2009. Automated stress detection using keystroke
and linguistic features: An exploratory study. International Journal of Human-
Computer Studies, 67(10), pp. 870-886.
Watson, G., Sottile, J., 2010. Cheating in the Digital Age: Do students cheat
more in online courses? Online Journal of Distance Learning Administra-
tion, [Online]. 13 (1), Available at: http://www.westga.edu/~distance/ojdla/
browsearticles.php [Accessed 2010-03-11].
Wehman, P., 2009. Faculty Prescriptions for Academic Integrity: An Urban Campus Perspective. Ph.D. thesis. Pittsburgh: University of Pittsburgh.
Whitley, B.E., Keith-Spiegel, P., 2001. Introduction to the Special Issue. Ethics and
Behavior, 11(3), pp. 217-218.
Whitman, M., Mattord, H., 2007. Guide to Network Defense and Countermeasures.
2nd ed. Boston: Course Technology, Cengage Learning.
Whitman, M., Mattord, H., 2008. Management of Information Security. 2nd ed.
Boston: Course Technology, Cengage Learning.
Wood, E., Zelaya, J., Saari, E., King, K., Gupta, M., Howard, N., Ismat, S., Kane,
M.A., Naumowicz, M., Varela, D., Villani, M., 2008. Longitudinal Keystroke Bio-
metric Studies on Long-Text Input. Proceedings of Student-Faculty Research Day,
CSIS, Pace University, May 2, 2008.
Yager, N., Dunstone, T., 2010. The Biometric Menagerie. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 32(2), pp. 220-230.
Yampolskiy, R.V., Govindaraju, V., 2007. Direct and Indirect Human Computer
Interaction Based Biometrics. Journal of Computers, 2(10), pp. 76-88.
Zavadskas, E., Kaklauskas, A., Seniut, M., Dzemyda, G., Ivanikovas, S., Stankevic,
V., Simkevičius, C., Jaruševičius, A., 2008. Web-Based Biometric Mouse Intelli-
gent System for Analysis of Emotional State and Labour Productivity. In Pro-
ceedings of The 25th International Symposium on Automation and Robotics in
Construction, Vilnius, Lithuania, 26-29 June 2008.
Zimmermann, P., Guttormsen, S., Danuser, B., Gomez, P., 2003. Affective Comput-
ing – A Rationale for Measuring Mood with Mouse and Keyboard. International
Journal of Occupational Safety and Ergonomics, 9(4), pp. 539-551.
Zhou, L., Twitchell, D.P., Qin, T., Burgoon, J.K., Nunamaker, J.F., 2003. An
Exploratory Study into Deception Detection in Text-based Computer-Mediated
Communication. In Proceedings of the 36th Hawaii International Conference on
System Sciences, Waikoloa Village: Island of Hawaii, 6-9 January 2003.
Zhou, L., Burgoon, J.K., Nunamaker, J.F., Twitchell, D., 2004. Automated
Linguistics-Based Cues for Detecting Deception in Text-based Asynchronous
Computer-Mediated Communication. Group Decision and Negotiation, 13, pp. 81-106.
Appendix A
Subjects of automated
observation
The automated path of the empirical inputs is depicted in figure A.2. The inputs are captured by the sensor module on the computer that the user (participant) is using, and are subsequently analyzed and features extracted from them.
Parts of the feature names describe the character of the features. A few terms
used further in this appendix are explained in table A.1.
When working with samples for the graphical representation of the features, two additional parameters play a role:
• Sample duration, which determines how long (in time) each sample used in the analysis of a whole session is.
• Sample overlap, which determines how large a part of two consecutive samples overlaps within the analysis of a whole session.
Adjusting these parameters affects the analysis and visualization of the features across the whole session.
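The effect of these two parameters can be sketched as a sliding-window split of a session's timestamped events. The function name, event format, and values below are illustrative assumptions, not the implementation used in the study:

```python
def sliding_windows(events, sample_duration, sample_overlap):
    """Split timestamped events into (possibly overlapping) samples.

    events: list of (timestamp, payload) tuples, sorted by timestamp.
    sample_duration: window length, in the same time unit as the timestamps.
    sample_overlap: fraction (0 <= overlap < 1) shared by consecutive windows.
    """
    if not events:
        return []
    # The window start advances by the non-overlapping part of the duration.
    step = sample_duration * (1.0 - sample_overlap)
    start, end = events[0][0], events[-1][0]
    windows = []
    while start <= end:
        window = [e for e in events if start <= e[0] < start + sample_duration]
        windows.append(window)
        start += step
    return windows

# Example: 10-second samples with 50% overlap over a 25-second session.
events = [(0, "a"), (4, "b"), (9, "c"), (12, "d"), (18, "e"), (24, "f")]
samples = sliding_windows(events, sample_duration=10, sample_overlap=0.5)
```

With 50% overlap each event typically falls into two consecutive samples, which smooths the feature curves plotted across a session at the cost of correlated samples.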
Term Description
Duration The time during which the composite is being entered.
Latency The time before the occurrence of a specific event or composite.
Rate The frequency of occurrence.
Downtime The time while a key or button is pressed.
Flight The time from releasing former key or button until pressing the
next one. It can have a negative value.
Distance Length of the shortest way from source point to destination point.
Length Length of the trajectory the mouse pointer has gone when going
from source point to destination point.
Center A point in the interval at which the value integrated from 0 up to that point equals half of the value integrated across the whole definition set (the interval from 0 to the end). The definition set is session- or sample-relative time.
Ratio The ratio of occurrences of a specific composite to all occurrences of the same type, or the ratio of the summed duration of a specific composite to the duration of all others of the same type.
Tailing time The time from button release until the end of the mouse move.
Digraph Sequence of two keys within a word.
Multikey Multiple key L2 composite.
Single key Single key L2 composite.
Word Word L2 composite.
Mean The average value across all samples in a specific population of size N: (1/N) Σ_{i=1}^{N} value_i.
Standard deviation The square root of the value variance across all samples in a specific population of size N: sqrt((1/N) Σ_{i=1}^{N} (value_i − mean)²).
Feature designation Feature name (description)
KA Keyboard activity
GDTm Any key downtime mean
GDTsd Any key downtime standard deviation
GKRm Any key key rate mean
GKRsd Any key key rate standard deviation
MK# Multikey count
MKDm Multikey duration mean
MKDsd Multikey duration standard deviation
MKDTm Multikey downtime mean
MKDTsd Multikey downtime standard deviation
MKKFLm Multikey key flight mean
MKKFLsd Multikey key flight standard deviation
MKKRm Multikey key rate mean
MKKRsd Multikey key rate standard deviation
W# Word count
WLˆ Word length maximum
WLm Word length mean
WLsd Word length standard deviation
WDˆ Word duration maximum
WDm Word duration mean
WDsd Word duration standard deviation
WKDm Word key duration mean
WKDsd Word key duration standard deviation
WKFLm Word key flight mean
WKFLsd Word key flight standard deviation
WDLATm Word delimiter latency mean
WDLATsd Word delimiter latency standard deviation
NWLATm Next word latency mean
NWLATsd Next word latency standard deviation
SK# Single key count
SKDTm Single key downtime mean
SKDTsd Single key downtime standard deviation
SKRm Single key rate mean
SKRsd Single key rate standard deviation
D# Digraph count
DDm Digraph duration mean
DDsd Digraph duration standard deviation
DKRm Digraph key rate mean
DKRsd Digraph key rate standard deviation
DKFLm Digraph key flight mean
DKFLsd Digraph key flight standard deviation
DD1m Digraph key 1 duration mean
DD1sd Digraph key 1 duration standard deviation
DD2m Digraph key 2 duration mean
DD2sd Digraph key 2 duration standard deviation
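Several of the keyboard features above are derived from raw key press/release timestamps. As a minimal illustrative sketch (the event format and function name are assumptions, not the thesis implementation), the per-digraph timing quantities can be computed as:

```python
def digraph_features(e1_down, e1_up, e2_down, e2_up):
    """Timing features for a two-key sequence; all times in milliseconds."""
    downtime1 = e1_up - e1_down   # DD1: how long key 1 was held
    downtime2 = e2_up - e2_down   # DD2: how long key 2 was held
    flight = e2_down - e1_up      # DKFL: negative when the presses overlap
    duration = e2_up - e1_down    # DD: total digraph duration
    return {"downtime1": downtime1, "downtime2": downtime2,
            "flight": flight, "duration": duration}

# Overlapping presses (a fast "th"): key 2 goes down before key 1 is released,
# so the flight time is negative, as noted in table A.1.
f = digraph_features(e1_down=0, e1_up=120, e2_down=95, e2_up=210)
```

The table's mean and standard-deviation features (DD1m, DKFLsd, ...) would then be aggregates of these per-digraph values over a sample or session.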
Feature designation Feature name (description)
MM# Mouse move count
MMDIm Mouse move distance mean
MMDIsd Mouse move distance standard deviation
MMAm Mouse move angle mean
MMAsd Mouse move angle standard deviation
MMLm Mouse move length mean
MMLsd Mouse move length standard deviation
MMDm Mouse move duration mean
MMDsd Mouse move duration standard deviation
MMmSm Mouse move maximal speed mean
MMmSsd Mouse move maximal speed standard deviation
MMSm Mouse move speed mean
MMSsd Mouse move speed standard deviation
MMSCm Mouse move speed center mean
MMSCsd Mouse move speed center standard deviation
MMlmSPm Mouse move last maximal speed position mean
MMlmSPsd Mouse move last maximal speed position standard deviation
MMACm Mouse move acceleration mean
MMACsd Mouse move acceleration standard deviation
MMACCm Mouse move acceleration center mean
MMACCsd Mouse move acceleration center standard deviation
MMaCm Mouse move absolute curvature mean
MMaCsd Mouse move absolute curvature standard deviation
MMCm Mouse move curvature mean
MMCsd Mouse move curvature standard deviation
MC# Mouse click count
MCCCm Mouse click click count mean
MCCCsd Mouse click click count standard deviation
MCDTm Mouse click downtime mean
MCDTsd Mouse click downtime standard deviation
MCFLm Mouse click flight time mean
MCFLsd Mouse click flight time standard deviation
MCCRm Mouse click click rate mean
MCCRsd Mouse click click rate standard deviation
MS# Mouse scroll count
MSSCm Mouse scroll scroll count mean
MSSCsd Mouse scroll scroll count standard deviation
MSSRm Mouse scroll scroll rate mean
MSSRsd Mouse scroll scroll rate standard deviation
MD# Mouse drag count
MDDm Mouse drag duration mean
MDDsd Mouse drag duration standard deviation
MDMLATm Mouse drag move latency mean
MDMLATsd Mouse drag move latency standard deviation
MDTTm Mouse drag tailing time mean
MDTTsd Mouse drag tailing time standard deviation
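The distinction between mouse move distance (straight-line) and length (trajectory actually traveled), and the mean speed derived from them, can be sketched as follows. The sampled point format is an assumption for illustration:

```python
import math

def mouse_move_features(points):
    """points: list of (t, x, y) pointer samples for one mouse move."""
    (t0, x0, y0), (tn, xn, yn) = points[0], points[-1]
    # MMDI: straight-line distance from source point to destination point.
    distance = math.hypot(xn - x0, yn - y0)
    # MML: length of the trajectory, summed over consecutive segments.
    length = sum(math.hypot(x2 - x1, y2 - y1)
                 for (_, x1, y1), (_, x2, y2) in zip(points, points[1:]))
    duration = tn - t0                                # MMD
    speed = length / duration if duration else 0.0    # MMS: mean speed
    return distance, length, duration, speed

# An L-shaped move: the trajectory length exceeds the straight-line distance.
pts = [(0.0, 0, 0), (0.5, 3, 0), (1.0, 3, 4)]
d, l, dur, s = mouse_move_features(pts)
```

A length/distance ratio close to 1 indicates a nearly straight pointer movement; larger ratios indicate curved or hesitant trajectories.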
Feature designation Feature name (description)
S# Silence count
SDm Silence duration mean
SDsd Silence duration standard deviation
SLATm Silence latency mean
SLATsd Silence latency standard deviation
STC Silence time center
SR Silence ratio
Appendix B
Appendix C
Each participant of the observation sessions was asked to fill in the questionnaire and to perform the tasks described below in this appendix.
C.1 Questionnaire
Please try to answer my questions in an essay-like text. There are no
specific demands for formulation or diction besides that I’d like you to
avoid writing in bullets. Please try to reflect on what the questions are
asking and formulate the answers into sentences and even paragraphs if
you wish.
Please try to answer as much ’from the heart’ as possible, and use neutral
answers only if you think they truly reflect your feelings. If you don’t
feel comfortable about answering a question, please omit it and skip to
some further question. :)
Did you drink tea or coffee before the meeting? If so, how much and how long
ago?
Did you eat lunch or something smaller before the meeting?
Have you been traveling (walking, riding a bicycle) or physically exercising in the past few minutes?
Have you felt busy or relaxed recently (today)?
Have you experienced anything unusual that could affect your mood in some way
today? If so, can you describe it a little?
Feel free to add any other remarks about how you feel or what has happened to you these days – something quite positive, negative, or both?
How do you feel today (bad, good, happy, sad, sleepy, ...)? Feel free to describe
as much as you wish.
How much light do you have in the room (too little, just right, too much)?
How is the light quality in your room? Do you have sunlight, fluorescent lamps or
good old light bulbs?
How do you feel about the temperature in your room (colder, comfortable, warmer)?
What do you think or feel about this session so far (somewhat long and boring, artificial, indifferent, relaxing, or something else)?
How do you feel about the atmosphere where you are in general? Feel free to specify.
How long have you been using the computer you are using now (approximately)?
Do you like the comfort your keyboard provides you?
How do you feel about the comfort of your mouse?
If you have anything else to highlight about your equipment, situation or feelings,
feel free to share it – indicate, mention or describe. :)
Please answer the questions below using a couple of sentences for each.
You can be more verbose if you like and if reflecting on the questions
makes you feel happy!
If you think about your past study time, what have been your favorite courses
and why? Why did you like what you did about them?
What are your hobbies or simply activities you like to do in your free time? Why
do you like them and what makes them interesting to you?
Try to imagine yourself in a couple of years from now. What would you like to
do or work with? Where would you like to live? What would you like to have? Do
you have any specific ambitions you want to fulfill one day? Feel free to share it.
Figure C.1: Example free diagram
The system delivers functionality and information to clients across the public In-
ternet through one or more Web servers. Larger systems may use multiple Web
servers and multiple application servers to deliver this functionality, all protected
by a demilitarized zone. The application must exchange data with the client. A
percentage of this data will be sensitive in nature.
Now, please open a painting program again, and try to ’copy’ the following diagram (paint it as similarly as possible to the one here):
Now, please pick any book of yours and try to copy some paragraph, or
a few sentences (around 10 or more).
Figure C.2: Diagram to copy (redraw)