Lecture 12 - Evaluation

The document discusses evaluation methods like usability testing and experiments. It explains that usability testing involves observing typical users performing typical tasks with a product in a controlled setting. Data is collected through video recordings and interaction logs to measure performance times, errors, and user satisfaction. Experiments test hypotheses about the relationship between variables, while usability testing aims to check that a system is usable. The document outlines factors to consider for usability testing, such as representative users and tasks, controlled conditions, and collecting data on metrics like completion times and error rates. It recommends testing with 5-10 participants.

Lecture 12 (Part A)

EVALUATION
The aims
• Explain the key concepts and terms used in evaluation
• Introduce different types of evaluation methods.
• Show how different evaluation methods are used for
different purposes at different stages of the design
process and in different contexts of use.
• Show how evaluators mix and modify methods to meet
the demands of evaluating novel systems.
• Discuss some of the challenges that evaluators have
to consider when doing evaluation.
• Illustrate how methods discussed in Chapters 7 and 8
are used in evaluation and describe some methods
that are specific to evaluation.
www.id-book.com 2
Why, what, where and when to evaluate
Iterative design & evaluation is a continuous process that examines:
• Why: to check users' requirements and that they can use the product and like it.
• What: a conceptual model, early prototypes of a new system and, later, more complete prototypes.
• Where: in natural and laboratory settings.
• When: throughout design; finished products can be evaluated to collect information to inform new products.
Bruce Tognazzini tells you why you need to
evaluate
“Iterative design, with its repeating cycle of
design and testing, is the only validated
methodology in existence that will consistently
produce successful results. If you don’t have
user-testing as an integral part of your design
process you are going to throw buckets of
money down the drain.”

See AskTog.com for topical discussions about design and evaluation.

Types of evaluation
• Controlled settings involving users, e.g., usability testing & experiments in laboratories and living labs.
• Natural settings involving users, e.g., field studies and in-the-wild studies to see how the product is used in the real world.
• Settings not involving users, e.g., methods that predict, analyze & model aspects of the interface, such as analytics.
Living labs
• People’s use of technology in their everyday
lives can be evaluated in living labs.
• Such evaluations are too difficult to do in a
usability lab.
• E.g., the Aware Home was embedded with a
complex network of sensors and audio/video
recording devices (Abowd et al., 2000).

Usability testing & field studies can complement each other

Evaluation case studies
• Experiment to investigate a computer game

• In the wild field study of skiers

• Crowdsourcing

Challenge & engagement in a
collaborative immersive game
• Physiological measures
were used.
• Players were more engaged when playing
against another person than when playing
against a computer.
• What precautionary measures did the evaluators
take?

Challenge & engagement in a
collaborative immersive game

What does this data tell you?

Why study skiers in the wild ?

e-skiing system components

What did we learn from the case
studies?
• How to observe users in natural settings.
• Unexpected findings resulting from in the wild
studies.
• Having to develop different data collection and
analysis techniques to evaluate user experience
goals such as challenge and engagement.
• The ability to run experiments on the Internet that
are quick and inexpensive using crowdsourcing.
• How to recruit a large number of participants using
Mechanical Turk.

Evaluation methods

Method           Controlled settings   Natural settings   Without users
Observing                x                     x
Asking users             x                     x
Asking experts           x                                       x
Testing                  x
Modeling                                                         x
The language of evaluation
• Analytics
• Analytical evaluation
• Biases
• Controlled experiment
• Crowdsourcing
• Ecological validity
• Expert review or crit
• Field study
• Formative evaluation
• Heuristic evaluation
• Informed consent form
• In the wild evaluation
• Living laboratory
• Predictive evaluation
• Reliability
• Scope
• Summative evaluation
• Usability laboratory
• Usability testing
• User studies
• Users or participants
• Validity
Participants’ rights and getting their
consent
• Participants need to be told why the
evaluation is being done, what they will be
asked to do and their rights.
• Informed consent forms provide this
information.
• The design of the informed consent form, the
evaluation process, data analysis and data
storage methods are typically approved by a
higher authority, e.g., an Institutional Review Board.
Things to consider when
interpreting data
• Reliability: does the method produce the
same results on separate occasions?
• Validity: does the method measure what it is
intended to measure?
• Ecological validity: does the environment of
the evaluation distort the results?
• Biases: Are there biases that distort the
results?
• Scope: How generalizable are the results?

Key points
• Evaluation and design are very closely integrated.
• Some of the same data gathering methods are used in
evaluation as for establishing requirements and
identifying users’ needs, e.g. observation, interviews,
and questionnaires.
• Evaluations can be done in controlled settings such as
laboratories, less controlled field settings, or where
users are not present.
• Usability testing and experiments enable the evaluator
to have a high level of control over what gets tested,
whereas evaluators typically impose little or no control
on participants in field studies.

Lecture 12 (Part B)
EVALUATION
The aims:
• Explain how to do usability testing
• Outline the basics of experimental design
• Describe how to do field studies
Usability testing
• Involves recording performance of typical users
doing typical tasks.
• Controlled settings.
• Users are observed and timed.
• Data is recorded on video & key presses are
logged.
• The data is used to calculate performance times,
and to identify & explain errors.
• User satisfaction is evaluated using
questionnaires & interviews.
• Field observations may be used to provide
contextual understanding.
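The recording and logging described above can be sketched in code. The following is a minimal, hypothetical interaction logger (all class and event names are invented for illustration, not from any real usability tool) that timestamps events during a session so that completion time and error count can be derived afterwards:

```python
import time

class InteractionLog:
    """Minimal sketch of an interaction logger for a usability test.

    Records timestamped events (key presses, errors, task start/end)
    so completion times and error counts can be computed afterwards.
    """

    def __init__(self):
        self.events = []  # list of (timestamp, event_type, detail)

    def record(self, event_type, detail="", timestamp=None):
        # Use the wall clock unless a timestamp is supplied explicitly
        t = timestamp if timestamp is not None else time.time()
        self.events.append((t, event_type, detail))

    def task_duration(self):
        # Time from the first task_start to the first task_end
        start = next(t for t, e, _ in self.events if e == "task_start")
        end = next(t for t, e, _ in self.events if e == "task_end")
        return end - start

    def error_count(self):
        return sum(1 for _, e, _ in self.events if e == "error")

# Usage with synthetic timestamps (seconds into the session):
log = InteractionLog()
log.record("task_start", timestamp=0.0)
log.record("keypress", "search box", timestamp=2.1)
log.record("error", "opened wrong menu", timestamp=5.4)
log.record("task_end", timestamp=30.0)
print(log.task_duration(), log.error_count())  # 30.0 1
```

In practice the timestamps would come from the video or keystroke log rather than being entered by hand; the point is only that performance measures fall out of a simple event stream.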
Experiments & usability testing
• Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more variables.
• Usability testing is applied experimentation.
• Developers check that the system is usable by the intended user population for their tasks.
Usability testing & research

Usability testing:
• Improve products
• Few participants
• Results inform design
• Usually not completely replicable
• Conditions controlled as much as possible
• Procedure planned
• Results reported to developers

Experiments for research:
• Discover knowledge
• Many participants
• Results validated statistically
• Must be replicable
• Strongly controlled conditions
• Experimental design
• Scientific report to scientific community
Usability testing
• Goals & questions focus on how well users perform tasks with the product.
• Comparison of products or prototypes is common.
• Focus is on time to complete task & number & type of errors.
• Data collected by video & interaction logging.
• Testing is central.
• User satisfaction questionnaires & interviews provide data about users' opinions.
Testing conditions
• Usability lab or other controlled space.
• Emphasis on:
– selecting representative users;
– developing representative tasks.
• 5-10 users typically selected.
• Tasks usually last around 30 minutes.
• Test conditions are the same for every
participant.
• Informed consent form explains procedures and
deals with ethical issues.
Types of data
• Time to complete a task.
• Time to complete a task after a specified time away from the product.
• Number and type of errors per task.
• Number of errors per unit of time.
• Number of times online help and manuals accessed.
• Number of users making an error.
• Number of users successfully completing a task.
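To make these metrics concrete, here is a short sketch that computes several of them from per-participant results. The numbers are invented for illustration; a real test would pull them from the session logs:

```python
# Hypothetical per-participant results for one task:
# (completed the task?, time on task in seconds, number of errors)
results = [
    (True, 185.0, 2),
    (True, 240.5, 0),
    (False, 310.0, 5),
    (True, 150.2, 1),
    (False, 290.8, 4),
]

n = len(results)
# "Number of users successfully completing a task"
users_completing = sum(1 for done, _, _ in results if done)
# "Number of users making an error"
users_making_errors = sum(1 for _, _, errs in results if errs > 0)
# Mean time to complete the task across all participants
mean_time = sum(t for _, t, _ in results) / n
# "Number of errors per unit of time" (here, errors per minute)
errors_per_minute = [errs / (t / 60) for _, t, errs in results]

print(users_completing, users_making_errors)  # 3 4
```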
How many participants is enough
for user testing?
• The number is a practical issue.
• Depends on:
– schedule for testing;
– availability of participants;
– cost of running tests.
• Typically 5-10 participants.
• Some experts argue that testing should
continue until no new insights are gained.
Usability lab with observers
watching a user & assistant

Portable equipment for use in the
field

Portable equipment for use in the
field

Mobile head-mounted eye tracker

Usability testing the iPad
• 7 participants with 3+ months experience with iPhones
• Signed an informed consent form explaining:
– what the participant would be asked to do;
– the length of time needed for the study;
– the compensation that would be offered for participating;
– participants’ right to withdraw from the study at any time;
– a promise that the person's identity would not be disclosed; and
– an agreement that the data collected would be confidential and
would be available only to the evaluators.
• Then they were asked to explore the iPad
• Next they were asked to perform randomly assigned specified
tasks

Examples of the tasks

Example of the equipment

Problems and actions
• Problems detected:
– Accessing the Web was difficult
– Lack of affordance and feedback
– Getting lost
– Knowing where to tap
• Actions by evaluators:
– Reported to developers
– Made available to public on nngroup.com
• Accessibility for all users is important

Experiments
• Test hypotheses that predict the relationship between two or more variables.
• The independent variable is manipulated by the researcher.
• The dependent variable is influenced by the independent variable.
• Typical experimental designs have one or two independent variables.
• Validated statistically & replicable.

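As a sketch of how data from such an experiment might be analyzed, the following computes Welch's t statistic for completion times under two conditions. The data values are invented; a real analysis would also compute degrees of freedom and a p-value (e.g., with scipy.stats.ttest_ind, equal_var=False):

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples
    (does not assume equal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical task completion times in seconds.
# Independent variable: interface condition (A vs. B).
# Dependent variable: time to complete the task.
condition_a = [32.1, 28.4, 35.0, 30.2, 29.8]  # new design
condition_b = [40.3, 38.9, 42.1, 37.5, 41.0]  # old design

t = welch_t(condition_a, condition_b)
print(round(t, 2))  # negative here: condition A was faster on average
```

A large-magnitude t suggests the difference between conditions is unlikely to be due to chance alone, which is what the hypothesis test formalizes.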
Experimental designs
• Different participants - single group of
participants is allocated randomly to the
experimental conditions.
• Same participants - all participants appear
in both conditions.
• Matched participants - participants are
matched in pairs, e.g., based on expertise,
gender, etc.
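The first two designs can be set up mechanically. This sketch (function names are invented for illustration) shows random allocation for a different-participants design and order counterbalancing for a same-participants design with two conditions:

```python
import random

def allocate_between_subjects(participants, conditions, seed=42):
    """Different-participants design: randomly assign each participant
    to exactly one condition, keeping group sizes balanced."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

def counterbalanced_orders(condition_a, condition_b):
    """Same-participants design with two conditions: alternate the
    order across participants to counter ordering effects."""
    return [[condition_a, condition_b], [condition_b, condition_a]]

# Usage: six participants, two hypothetical interface conditions
allocation = allocate_between_subjects(
    ["P1", "P2", "P3", "P4", "P5", "P6"], ["menu UI", "gesture UI"])
orders = counterbalanced_orders("menu UI", "gesture UI")
```

Half the participants would follow orders[0] and half orders[1], so any practice or fatigue effect is spread evenly across the two conditions.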
Different, same, matched participant design

Different participants
• Advantages: no order effects.
• Disadvantages: many subjects needed; individual differences are a problem.

Same participants
• Advantages: few individuals needed; no individual differences.
• Disadvantages: counterbalancing needed because of ordering effects.

Matched participants
• Advantages: same as different participants, but individual differences are reduced.
• Disadvantages: cannot be sure of perfect matching on all differences.

Field studies
• Field studies are done in natural settings.
• “In the wild” is a term for prototypes being used
freely in natural settings.
• Aim to understand what users do naturally and
how technology impacts them.
• Field studies are used in product design to:
– identify opportunities for new technology;
– determine design requirements;
– decide how best to introduce new technology;
– evaluate technology in use.
Technology for context-aware field
data collection

An in the wild study:
UbiFit Garden

Data collection & analysis
• Observation & interviews
– Notes, pictures, recordings
– Video
– Logging
• Analysis
– Data are categorized
– Categories can be provided by theory
• Grounded theory
• Activity theory

Data presentation
• The aim is to show how the products are
being appropriated and integrated into
their surroundings.
• Typical presentation forms include:
– Vignettes,
– Excerpts,
– Critical incidents,
– Patterns, and
– Narratives.
Key points
• Usability testing takes place in controlled usability labs or temporary labs.
• Usability testing focuses on performance measures, e.g., how long users take and how many errors they make when completing a set of predefined tasks. Indirect observation (video and keystroke logging), user satisfaction questionnaires, and interviews are also collected.
• Affordable, remote testing systems are more portable than usability labs. Many also contain mobile eye-tracking and other devices.
• Experiments test a hypothesis by manipulating certain variables while keeping others constant.
• The experimenter controls independent variable(s) in order to measure dependent variable(s).
• Field studies are evaluation studies that are carried out in natural settings to discover how people interact with technology in the real world.
• Field studies that involve the deployment of prototypes or technologies in natural settings may also be referred to as 'in the wild'.
• Sometimes the findings of a field study are unexpected, especially for in the wild studies that explore how novel technologies are used by participants in their own homes, places of work, or outside.
