Eye Movements As Implicit Relevance Feedback: CHI 2008 Proceedings Works in Progress April 5-10, 2008 Florence, Italy
CHI 2008 Proceedings · Works In Progress April 5-10, 2008 · Florence, Italy
The typical relevance feedback scenario is as follows: The user tells the search engine which of the results for a given query were relevant or irrelevant (i.e., explicit relevance feedback). The user feedback can then be used to find an improved query and enhance the ranking of the search results. However, since most users are reluctant to provide relevance feedback explicitly, much research is done on determining such relevance feedback implicitly from user behavior (cf. [4]).

In this paper, we focus on our first steps in the process of automatically predicting relevance from eye movements. We describe an algorithm that detects and differentiates reading and skimming behavior. Based on the output of that algorithm we define and investigate the quality of a new eye-movement measure for determining the relevance of documents. The identified correlations between eye movement measures and explicit relevance feedback will be used for creating prediction methods in future research.

Closely Related Work
The idea to use gaze data for estimating relevance is not new and has been approached, e.g., in [2, 6, 8]. However, these studies share a main drawback with regard to a practical application: None of them applied a preprocessing method like our reading and skimming detection algorithm. Instead, they operated on the raw gaze data to detect fixations, etc. Yet, if such a gaze analysis system is to be put into practice, a preprocessing step is necessary: For example, when looking at a document one is not necessarily engaged in it; one might be thinking about something else while staring at the document. But when one is reading or skimming a document, the probability is much higher that one is indeed engaged in the document. Therefore, having the goal of a practical application in mind, our relevance prediction method will include such a preprocessing filter.

Reading Physiology
A lot of research has been done during the last one hundred years concerning eye movements while reading. The results most important for reading and skimming detection are as follows (see [7] for a comprehensive overview): When reading silently the eye shows a very characteristic behavior composed of fixations and saccades. A fixation is a time interval of about 200-250 ms on average when the eye is steadily gazing at one point. A saccade is a rapid eye movement from one fixation to the next. The mean left-to-right saccade size during reading is 7-9 letter spaces. It depends on the font size and is relatively invariant concerning the distance between the eyes and the text. Approximately 10-15% of the eye movements during reading are regressions, i.e., movements to the left along the current line or to a previously read line.

Algorithm for Reading Detection
This knowledge about eye movement behavior during reading can be exploited in order to detect whether a person is reading or skimming. The following algorithm has been tuned for a Tobii 1750 desk-mounted eye tracker which has a data generation frequency of 50 Hz and an accuracy of around 40 pixels at a resolution of 1280x1024. We use this kind of eye tracking device since we have the goal of a practical application in mind and since we believe that such devices might become widespread in the future. As it does not require any head-mounted part and works (currently) after a quick calibration, this kind of device has the potential to be used in normal office environments.
The general idea of the algorithm is as follows: First, fixations are detected. Second, the transitions from one fixation to the next are classified resulting in so-called features. Third, scores associated with the features are accumulated. Finally, it is determined whether thresholds for “reading” and “skimming” behavior are exceeded. If this is the case, the respective most plausible behavior is detected.

The idea of the algorithm is related to that of [3]. However, some major modifications have been introduced, primarily concerning the detection of fixations, the accumulation strategy, and the differentiation between reading and skimming behavior. In the following, we describe the steps of the algorithm in detail.

Fixation Detection
The fixation detection works in two steps.
1. A new fixation is detected if 4 successive nearby gaze locations from the eye tracker are accumulated (compare figure 1, locations 1-4). Four gaze locations at 50 Hz correspond to a duration between 80 and 100 ms. This is the minimum fixation duration according to the literature (see above). Gaze points are considered nearby when they fit together in a rectangle of 30x30 pixels.
2. For any further gaze location generated by the eye tracker, it is checked whether it fits in a 50x50 pixel rectangle together with all gaze locations already belonging to the current fixation. If yes, then the new gaze location will be assigned to the current fixation (figure 1, locations 5-7, 10). If no, then it is either ignored as an outlier (e.g., in case one of the next three gaze locations belongs to the current fixation, locations 8, 9), or it is used together with three nearby gaze points as initiator of a new fixation (according to step 1, locations 11-14). The slightly larger rectangle in this step allows tolerating noise from the eye tracker and very small eye movements like microsaccades and drifts.

If there are at least 4 successive gaze locations that cannot be merged with the current fixation, the fixation has ended. Then it is propagated to the next processing step of the reading detection algorithm. In this way, up to 3 gaze location “outliers” can be ignored, which might occur from time to time due to eye tracker inaccuracies or light reflections in the user's glasses (if any). Blinking of the eyes is treated as the end of a fixation. If a fixation has ended, all the contributing gaze locations are averaged to get one specific fixation coordinate.

figure 1. Gaze locations produced by the eye tracker (illustrated by the circles; numbers indicate sequence) are agglomerated resulting in fixation frames. Labels in the figure: a fixation with drift to the lower right (50 pixel frame), outliers, and a new fixation (30 pixel frame).

Classification of Fixation Transitions
Each transition from one fixation to the next is classified according to its length and direction. This results in features that occur more or less often during reading or skimming (e.g., read forward, skim forward, regression, reset jump). A list of all possible features is given in figure 2.
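
The two fixation detection steps and the outlier rule described under Fixation Detection can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the study: the function and parameter names are hypothetical, a blink is assumed to arrive as a `None` sample, and the unmerged gaze locations are assumed to be kept in a sliding window of four from which a new fixation may start.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fits_in_box(points: List[Point], size: float) -> bool:
    """True if all points fit together in a size x size pixel rectangle."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs) - min(xs) <= size and max(ys) - min(ys) <= size


def centroid(points: List[Point]) -> Point:
    """Average of the contributing gaze locations."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def detect_fixations(samples: List[Optional[Point]],
                     start_box: float = 30.0,
                     merge_box: float = 50.0,
                     min_samples: int = 4) -> List[Point]:
    """Turn raw gaze samples into fixation coordinates (hypothetical helper)."""
    fixations: List[Point] = []
    current: List[Point] = []  # gaze locations of the fixation in progress
    pending: List[Point] = []  # recent locations not assigned to a fixation
    for sample in samples:
        if sample is None:  # assumption: a blink is reported as None
            if current:
                fixations.append(centroid(current))
            current, pending = [], []
            continue
        # Step 2: merge the sample if it fits the larger rectangle.
        if current and fits_in_box(current + [sample], merge_box):
            current.append(sample)
            pending = []  # the unmerged locations before it were outliers
            continue
        pending = (pending + [sample])[-min_samples:]
        if len(pending) == min_samples:
            # Four successive unmergeable locations end the current fixation.
            if current:
                fixations.append(centroid(current))
                current = []
            # Step 1: four nearby locations start a new fixation.
            if fits_in_box(pending, start_box):
                current, pending = pending, []
    if current:
        fixations.append(centroid(current))
    return fixations


gaze = [(100, 100), (102, 101), (101, 103), (103, 102),   # starts a fixation
        (110, 108), (115, 112), (120, 118),               # drift, merged
        (300, 300), (310, 290),                           # two outliers
        (118, 116),                                       # back in the frame
        (400, 100), (402, 101), (401, 99), (403, 100)]    # new fixation
fixations = detect_fixations(gaze)
```

With this input the two outliers are dropped, the first fixation averages eight gaze locations, and the last four locations form a second fixation.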
Distance and direction       Feature             Reading detector   Skimming detector
in letter spaces                                 score s_r          score s_s
0 < x <= 11                  Read forward        10                 5
11 < x <= 21                 Skim forward        5                  10
21 < x <= 30                 Long skim jump      -5                 8
-6 <= x < 0                  Short regression    -8                 -8
-16 <= x < -6                Long regression     -5                 -3
x < -16 and y according      Reset jump          5 and line         5 and line
to line spacing                                  delimiter          delimiter
All other movements          Unrelated move      Line delimiter     Line delimiter

figure 2. The transitions from one fixation to the next are classified resulting in features. Detector-specific scores are associated with each feature.

Because the length of a typical saccade during reading depends on the font size, this transition classification method is based on letter space distances and not on absolute pixel distances. Information about the font size of the currently fixated text is received by the screen OCR tool OCRopus [5]: it gets a small screen shot around the current fixation as input and returns the font size of the nearest text line as output.

Reading and Skimming Detection
The detection of reading or skimming behavior is done on the basis of feature sequences that are separated by reset jump features or unrelated moves (compare figure 2). This is similar to the method described by [1]. To differentiate between reading and skimming behavior, two independent detectors analyze these sequences of features and accumulate the associated scores. Because the distribution of the features during reading and skimming behavior is different, the reading detector r uses different scores s_r than the skimming detector s, which uses the scores s_s. The concrete scores are motivated by the literature (Rayner [7], Campbell and Maglio [3]).

For each feature sequence represented by a multiset D_F of contained features and for each detector d ∈ {r, s}, it is tested whether there is enough evidence for reading or skimming behavior, respectively. That is done by comparing the accumulated scores to detector-specific thresholds t_d, i.e., testing whether

$$\sum_{f \in D_F} s_d(f) > t_d$$

for each d (we use t_r = 30 and t_s = 20). If only one of the detectors has accumulated enough evidence, then the appropriate behavior is detected. Otherwise, if both detectors have found enough evidence for a text row, the more plausible behavior is determined simply by comparing the accumulated scores of the detectors with each other.

In figure 3 an exemplary result of the reading and skimming detection algorithm is shown. The circles represent the fixations, while their diameters correspond to the fixation durations. The classification result of the fixation transitions (i.e., detected features) is shown by the abbreviations on the connecting lines (R: read forward; S: skim forward; L: short regression; Reset: reset jump). Dashed lines mean that the feature sequence is more characteristic for skimming behavior. Likewise, solid lines stand for reading behavior.
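
The transition classification of figure 2 and the threshold test of the two detectors can be sketched as follows. The sketch rests on several assumptions that the text does not settle: all names are hypothetical, the vertical condition for a reset jump is modeled as a jump of roughly one line (between 0.5 and 1.5 line spacings), the score of 5 that figure 2 assigns to a reset jump is left out because the text does not say which sequence it counts toward, and a tie between the detectors is resolved in favor of reading.

```python
from typing import List, Optional

# Scores from figure 2 as (reading score s_r, skimming score s_s).
SCORES = {
    "read_forward": (10, 5),
    "skim_forward": (5, 10),
    "long_skim_jump": (-5, 8),
    "short_regression": (-8, -8),
    "long_regression": (-5, -3),
}
DELIMITERS = {"reset_jump", "unrelated_move"}  # both separate sequences
T_READ, T_SKIM = 30, 20                        # thresholds t_r and t_s


def classify_transition(dx: float, dy_lines: float = 0.0) -> str:
    """dx: horizontal distance in letter spaces (negative means leftward).
    dy_lines: downward distance in units of line spacing (assumed input)."""
    if 0 < dx <= 11:
        return "read_forward"
    if 11 < dx <= 21:
        return "skim_forward"
    if 21 < dx <= 30:
        return "long_skim_jump"
    if -6 <= dx < 0:
        return "short_regression"
    if -16 <= dx < -6:
        return "long_regression"
    if dx < -16 and 0.5 <= dy_lines <= 1.5:  # tolerance is an assumption
        return "reset_jump"
    return "unrelated_move"


def split_sequences(features: List[str]) -> List[List[str]]:
    """Split the feature stream at reset jumps and unrelated moves."""
    sequences, current = [], []
    for feature in features:
        if feature in DELIMITERS:
            if current:
                sequences.append(current)
            current = []
        else:
            current.append(feature)
    if current:
        sequences.append(current)
    return sequences


def detect(sequence: List[str]) -> Optional[str]:
    """Return 'reading', 'skimming', or None for one feature sequence."""
    read = sum(SCORES[f][0] for f in sequence)
    skim = sum(SCORES[f][1] for f in sequence)
    read_ok, skim_ok = read > T_READ, skim > T_SKIM
    if read_ok and skim_ok:
        return "reading" if read >= skim else "skimming"  # tie: assumption
    if read_ok:
        return "reading"
    if skim_ok:
        return "skimming"
    return None


# One read row, a jump back to the next line, then one skimmed row.
moves = [(8, 0), (9, 0), (7, 0), (10, 0), (-45, 1.0), (15, 0), (18, 0), (14, 0)]
features = [classify_transition(dx, dy) for dx, dy in moves]
results = [detect(seq) for seq in split_sequences(features)]
```

In the example the first row accumulates a reading score of 40 and a skimming score of 20, so only the reading threshold is exceeded; the second row accumulates 15 and 30, so only the skimming threshold is exceeded.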
figure 3. Visualization of the result of the reading and skimming detection algorithm.
Towards Eye Movement Measures to Predict Relevance
The next step towards a method to predict relevance from eye movements is to find specific eye movement measures that are correlated to the users’ explicit relevance feedback. Therefore, we designed a study where 19 participants had to explicitly rate 16 one-screen-long text documents according to their relevance to a given task. The rating scale had 4 categories: relevant (selected 112 times) and irrelevant (90 times) in the extremes and two intermediary categories (together 64 times, ignored in the following due to space constraints). The participants’ eye movements while viewing the documents were analyzed and filtered by the reading detection algorithm.

Based on those filtered eye movements (i.e., movements that really belong to reading or skimming), we calculated, among others, the read-to-skimmed ratio. It is computed as the ratio of the length of all read lines to the length of all read or skimmed lines. Thus, it contains information about whether and to which extent different reading velocities have been applied on a page.

The upper diagram of figure 4 shows the distribution of the explicit relevance ratings (only the categories relevant and irrelevant) over 20%-intervals of the read-to-skimmed ratio. For example, of all the pages that got a read-to-skimmed ratio between 0% and 20%, around 8% were rated relevant and 26% were rated irrelevant. It has to be noted that for the upper diagram the eye movement data was merged across participants. Yet, it is well-known that there are individual differences in eye movements (e.g., see [7]). Therefore, for the lower diagram of figure 4, we applied a simple individual normalization: First, for each participant the individual minimum and maximum of the read-to-skimmed ratio was determined (some people read more quickly on average than others). Next, the absolute values of the read-to-skimmed ratio were normalized with respect to the individual [min, max]-intervals. That resulted in a percentage for each absolute value stating its relative position in the individual interval. For example, if a participant only produced values between 60% and 100% for the read-to-skimmed ratio (i.e., a slow reader), then the individual [min, max]-interval would be [60, 100]. The specific read-to-skimmed
value of 60% for that participant would count as 0% with regard to the individual interval. The lower diagram of figure 4 shows the rating distribution over the personal [min, max]-intervals.

figure 4. The read-to-skimmed ratio as eye movement measure.

Conclusion and Future Work
The preliminary analysis of our experiment indicates that the algorithm for reading and skimming detection is useful as a preprocessing step. Our eye movement measure, the read-to-skimmed ratio, seems to discriminate well with respect to explicit relevance feedback. Next steps will be to analyze further measures based on the reading detection algorithm and to design binary classification tests to predict relevance from eye movements. Finally, we aim at applying such a classification test to automatically generate relevance feedback in practical search applications.

Acknowledgements
We thank Daniel Keysers and Christian Kofler for providing a customized version of OCRopus. This work was supported by the German Federal Ministry of Education, Science, Research and Technology (bmb+f), (Grant 01 IW F01, Project Mymory: Situated Documents in Personal Information Spaces).

Citations
[1] Beymer, D., and Russell, D.M. WebGazeAnalyzer: a system for capturing and analyzing web reading behavior using eye gaze. Proc. CHI '05. 2005.
[2] Brooks, P., Phang, K. Y., Bradley, R., Oard, D., White, R., and Guimbretière, F. Measuring the utility of gaze detection for task modeling: A preliminary study. IUI '06. 2006.
[3] Campbell, C.S., and Maglio, P.P. A robust algorithm for reading detection. Proc. PUI '01. 2001.
[4] Kelly, D., and Teevan, J. Implicit feedback for inferring user preference: a bibliography. In SIGIR Forum, 37 (2003), 18-28.
[5] OCRopus, an open source document analysis and OCR system (http://code.google.com/p/ocropus/).
[6] Puolamäki, K., Salojärvi, J., Savia, E., Simola, J., and Kaski, S. Combining eye movements and collaborative filtering for proactive information retrieval. Proc. SIGIR '05. ACM Press (2005), 146-153.
[7] Rayner, K. Eye movements in reading and information processing: 20 years of research. In Psychological Bulletin, 124 (1998), 372-422.
[8] Salojärvi, J., Puolamäki, K., Simola, J., Kovanen, L., Kojo, I., and Kaski, S. Inferring relevance from eye movements: Feature extraction. Tech. Rep. A82, Helsinki Univ. of Technology, Publications in Computer and Information Science, 2005.