Successful Classroom Deployment of a Social Document Annotation System
Citation: Zyto, Sacha, David Karger, Mark Ackerman, and Sanjoy Mahajan. “Successful
Classroom Deployment of a Social Document Annotation System.” Proceedings of the 2012 ACM
Annual Conference on Human Factors in Computing Systems - CHI ’12 (2012), May 5–10, 2012,
Austin, Texas, USA.
As Published: http://dx.doi.org/10.1145/2207676.2208326
Version: Author's final manuscript: final author's manuscript post peer review, without
publisher's formatting or copy editing
ABSTRACT
NB is an in-place collaborative document annotation website targeting students reading lecture notes and draft textbooks. Serving as a discussion forum in the document margins, NB lets users ask and answer questions about their reading material as they are reading. We describe the NB system and its evaluation in a real class environment, where students used it to submit their reading assignments, ask questions and get or provide feedback. We show that this tool has been successfully incorporated into numerous classes at several institutions. To understand how and why, we focus on a particularly successful class deployment where the instructor adapted his teaching style to take students’ comments into account. We analyze the annotation practices that were observed, including the way geographic locality was exploited in ways unavailable in traditional forums, and discuss general design implications for online annotation tools in academia.

Author Keywords
Hypertext; annotation; collaboration; forum; e-learning;

ACM Classification Keywords
H.5.2 Information Interfaces and Presentation (e.g. HCI): User Interfaces. - Graphical user interfaces.

General Terms
Design; Experimentation; Human Factors;

INTRODUCTION
Early hypertext research offered the promise of annotating texts for educational purposes with the detailed discussion necessary to understand complex material. The Web amplified that promise. But it has not been fulfilled.

There is at present no collaborative annotation tool in widespread use in education. Past work revealed significant barriers to their adoption. For example, Brush’s [3] study of an online annotation system reported that because students printed and read documents and comments offline, faculty had to force discussion by requiring replies to comments. It has been unclear whether the annotation systems were too limited, the technical ecology around them was too rudimentary, or the educational system was not adequately prepared. Perhaps in consequence, research on the topic has lain relatively fallow for the past decade.

In this paper, we offer evidence that the time may be ripe for a renewal of research and development on collaborative annotation systems. We report on NB, an annotation forum that has been successfully deployed and used in 55 classes at 10 universities. Students use NB to hold threaded discussions in the margins of online class material.

Our contribution is twofold. First, we provide evidence that the socio-technical environment of the classroom has evolved to the point where the barriers that were encountered by earlier annotation tools have lowered enough to be overcome by motivated teachers and students. While these changed circumstances do not yet hold everywhere, we will argue that they are common enough to be worth designing for.

Our second contribution is to assess specific features of NB that we believe contributed to its being adopted and valued by its users. Our design of NB’s “situated discussions,” contrasting with the traditional “linked hypertext” model, was motivated by the following design hypotheses:
• That the ability to comment in the margins, without leaving the document, would enable students to comment “in the flow” while reading, reducing the deterrent loss of context involved in commenting elsewhere;
• That the in-place display of comments in the margins would draw students’ attention to relevant comments while reading, and encourage them to respond;
• That the physical location of comments with their subject matter would provide a valuable organizational structure distinct from the chronological organization typical of discussion forums, helping students aggregate related threads and consider them together.

Taken together, we believed these characteristics would drive a virtuous cycle, encouraging more students to participate more heavily, thus providing more helpful material for other students, yielding additional incentive to participate.

Footnote 1: Zyto, Karger, and Ackerman designed and deployed NB, gathered its usage data, analyzed it and wrote up the results. Mahajan was an early, and to date the most successful, user of the NB system, and his class is the focus of our evaluation here. He was not involved in the data gathering or analysis, or in authoring this article.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CHI’12, May 5–10, 2012, Austin, Texas, USA.
Copyright 2012 ACM 978-1-4503-1015-4/12/05...$10.00.
Figure 1. NB document view. Left: Thumbnails, Center: Document (with pop-up annotation editor on the bottom), Right: Annotations
In this work, we give evidence supporting all of our hypotheses. We report substantial usage of NB in many classes. To understand how and why the tool was used, we examine one “best case” use of NB in which 91 students in a 1-semester class produced over 14000 annotations. Given that most of those comments had substantive content [8] and that the professor and students alike praised the system, this appears to be a successful classroom deployment of an annotation system. Since only limited successes have been previously reported in HCI, hypertext, or education literature, we assess the factors that led to this successful use and their implications for innovative educational uses and future textbooks.

MOTIVATION AND RELATED WORK
While there is relatively little current work, the past abounds with studies of collaborative discussion tools for education. Space limits us to projects we found most influential. It is accepted that students understand material better after discussing it [5, 6]. This suggests that discussion forums can be useful in an academic setting. Their use in this context can be traced back to the Plato system (1960) [4]. CSILE (1984) and its successor Knowledge Forum (1995) [10] explore mechanisms for encouraging students to achieve knowledge building and understanding at the group level.

These tools all support discussion of class reading materials, but the discussions occur in a separate environment. As we will argue below, this is a drawback: a reader might not be aware that a topic she is considering has been discussed, so might miss the opportunity to contribute to or benefit from the discussion. Actually navigating to the discussion causes loss of context, making it harder to follow the discussion or return to the material. A study of forum use in a class in 2002 [13] found that discussion threads tended to branch and lose coherence, with many leaves of the discussion rarely read, and observed that “the typical nonlinear branching structure of online discussion may be insufficient for the realization of truly conversational modes of learning.” This was 10 years ago, and one might believe that the current generation takes better to discussion forums. But an examination of MIT’s classroom discussion system, Stellar, showed that the 50 classes with the most posts in the Spring 2010 semester produced a total of 3275 posts (an average of 65.5 per class), with a maximum of 415 in any one class (see footnote 2). (At the same time at MIT, one 91-student class using NB generated over 14,000 posts.)

Improving on this “detached” situation, CaMILE [8] offered anchor-based discussions: its HTML documents can embed hyperlinks from discussion anchors, places where the authors thought a discussion could be appropriate. Although this does not offer readers the flexibility to discuss arbitrary points, it is a significant step towards overcoming the limitations of traditional online forums by trying to situate them nearer the context of the document being discussed. However, reading those annotations still requires navigating to a different context.

Footnote 2: An important caveat is that Stellar is not a particularly good discussion system. Recently, a forum tool called Piazza has begun to see widespread adoption; we have not yet had the opportunity to analyze its usage, which clearly outperforms that of Stellar.
The WebAnn project [3] let students discuss any part of a document. More significantly, it recorded annotations in-place in the document margins, allowing readers to see the document and the discussions on the same page. Setting the context this way meant that comments could omit lengthy explanations since they would be visible at the same time as that material. The expected consequence was that a wider audience would read and participate easily in the discussion. However, at the time of the WebAnn study (Spring 2001), some factors limited the benefits of the tool. Mainly, students unanimously printed the lecture material, and worked on the printout. They then returned to the online site only to record the annotations they had “planned out” on their printed copies. This introduced large lags between comments and replies that inhibited organic discussion, and meant that many comments arrived too late to benefit other students while they were reading.

As people have become more comfortable online, some of the obstacles impacting tools such as WebAnn may have shrunk. With this in mind, we deployed NB to assess the present-day (and future) appeal of a collaborative annotation system, and have produced evidence that in-margin discussions can now be an effective part of teaching. Deployed at roughly the same time, Van der Pol’s Annotation System [14] is another web-based annotation framework that has been successfully used in an academic context, and was used to quantify how both tool affordances and peer-feedback can facilitate students’ online learning conversations.

SYSTEM DESCRIPTION
NB is a web-based tool where users can read and annotate PDF documents using standard web browsers. After logging in, a student typically selects a document and starts reading. As shown in Figure 1, the document is augmented by annotations that the students and faculty have written, which appear as expandable discussions on the right-hand-side panel. Hovering someplace in the document highlights the annotations covering that place, whereas clicking somewhere on the document scrolls to the corresponding annotations. Annotations in NB are either anchored to a particular location in the document or are general comments on the document (see footnote 3). To add an annotation somewhere in the document, users click and drag to select a region they want to comment on. This region is highlighted and an in-place annotation editor pops up (bottom of Figure 1).

Users can choose whether their comment should be visible to everyone in the class (the default), or to the teaching staff only, or to themselves only. They can also choose whether the comment is anonymous (the default) or signed. Once a comment has been saved, its author can delete it or edit it as long as there hasn’t been a reply. The author can also change its properties (visibility, anonymity). Users can tag each other’s comments with the following tags: Favorite, Hidden, I agree, I disagree, and Reply requested.

Footnote 3: We have found that general comments are rarely used, and do not discuss them further.

Implementation Details
At the time of the study, the server side of NB was based on Python, a PDF library and a PostgreSQL database. Since then, NB has been re-implemented using the Django framework in order to improve portability and maintainability. NB uses a RESTful data API to exchange data between the client and server. This allows third parties to use the NB framework and implement their own UI. The NB server is open to use by any interested faculty at http://nb.mit.edu/.

Deployment
To date, NB has been used in 49 classes by 32 distinct faculty at 10 institutions including MIT, Harvard, California State, U. Edinburgh, KTH Sweden, Olin College, and Rochester Institute of Technology. The majority of classes are in the physical sciences but a few are in social sciences and humanities. Of the 32 faculty, 8 were using the tool for the first time this semester. Of those who started earlier, 9 faculty (28%) made use of the tool in multiple semesters (for a total of 18 re-uses), indicating that they have continued to adopt it after a semester’s experience of its usage. This seems a coarse indication that they believe that the tool is helping them meet their teaching goals. Informal positive feedback from many of the faculty has supported this indication.

The tool saw substantial student use in many classes. Table 1 shows the total number of comments submitted in the top 15 classes. 13 of these classes received more comments than the maximum (415) captured in any usage of Stellar, MIT’s forum tool. The top five each collected more comments than the top 50 classes using Stellar combined (3275).

Class                                   Comments   Per user
Approximation in Science & Eng.            14258        151
UI Design and Implementation (*)           10420         83
Math Methods for Business (*)               4436         61
Mathematics for CS (*)                      3562         23
Mathematics for CS (*)                      3270         34
UI Design and Implementation (*)            2703         61
Signals and Systems                         1996         39
Electricity and Magnetism                   1254         17
Mathematics for CS (*)                      1045         26
Pseudorandomness                             880         40
Dynamics                                     789         21
Adv. Quant. Research Methodology             570          9
Math Methods for Business (*)                530         12
Concepts in Multicore Computing              336         21
Moral Problems and the Good Life             233          8
Table 1. Usage of NB in other classes. Starred classes are re-uses by a faculty member who had already used NB.

USAGE ANALYSIS
Given that NB is seeing some adoption, we wished to investigate how and why NB is being adopted and used in the classroom. Due to space limitations, we focus the remainder of this article on the single most successful use of NB, in Approximations in Engineering, Spring Semester 2010 at MIT. The teacher was Sanjoy Mahajan, our fourth author. The reader might worry that we are skewing the data, but we believe this choice is justified for three reasons:
• Our objective here is to demonstrate, not that NB always works, but that NB can work in a real-world setting, which shows that the research direction is worth pursuing.
• Mahajan was made an author after his usage of NB; he had no special incentive to make NB succeed, aside from an interest in teaching well.
• Our data from many of the other high-usage classes is qualitatively similar, as we aim to report in an extended version of this article.

Approximations in Engineering had 91 undergraduate students. The thrice-weekly class lectures came from a pre-print version of Mahajan’s textbook. He assigned sections of the book, usually about 5 pages long, for each lecture. The previous four times he had taught the course, Mahajan required students to submit a paper-based “reading memo” (annotations taken on the sides of the lecture pages) at the beginning of each class. This method was popularized by Edwin Taylor [12]. Mahajan required students to make a “reasonable effort”, defined in the syllabus as follows: “For reasonable effort on a reading memo one comment is not enough unless it is unusually thoughtful, while ten is too many”.

NB replaced the previous paper-based annotation system. Mahajan left the reading memo model and instructions unchanged but modified the deadline: instead of requiring that annotations be delivered in class, he made the online annotations due 12 hours before class, intending to peruse them prior to lecturing (we discuss the consequences of this change in the Instructor Perspective section). There were no Teaching Assistants (TAs) for this class.

Method
Our analysis is based on log data, user questionnaires, and a small focus-group interview. The log data included user actions down to the action level and were kept in a standard log file. All annotations that users produced were stored, with the users’ consent. User questionnaires were administered at the end of the semester to both students and faculty.

In total, we obtained over 1.4 million user actions and, as mentioned, 14258 annotations from this class. These actions include page seen, comment created, time spent with NB both active and being “idle”, and so on. We also obtained, to be discussed later, questionnaires from students and interviews with the instructor. The questionnaires consisted of Likert scale ratings concerning satisfaction and how NB might have helped or hindered understanding. In addition, they included open-ended comments about each question where respondents could explain their ratings.

Analysis of the log data followed standard quantitative procedures. In addition, some of this data was analyzed by coding it for specific characteristics, such as being a substantive comment, on randomly selected samples of the data. The details of these codings and the samples are discussed in the Usage Analysis section below. The coding was done by the first author. The second and third authors reviewed the coding schemes and also the results.

Analysis of the open-ended comments on the user assessments was done by carefully reading the comments for themes and patterns, as is standard practice with qualitative data [9]. These themes were discussed by all the authors, and the comments were then re-read to examine the agreed-upon themes in more detail.

FINDINGS
In this section, we assess the usage of NB by examining the corpus of annotations and its creation process. We present evidence that substantial amounts of collaborative learning [8] occurred within NB. The annotations were primarily substantive content [8] regarding the course. Discussion threads were extensive. Students became active participants in questioning and interpreting the course material, with a large majority of questions by students answered by other students. Students interleaved annotation with reading, benefiting from the opportunity to see content and respond to content while in the midst of reading, instead of navigating to a different discussion site. Exploiting the geographic situatedness of annotations, students posted comments that addressed several distinct but co-located threads simultaneously.

Collaborative Learning
Assessing CaMILE [8], Guzdial and Turns identified 3 criteria that were deemed necessary to promote collaborative learning: broad participation, sustained discussion, and focus on class topic. We observed all three of these criteria. We cover each in turn.

Broad Participation
The 91 students created over 14000 annotations during the semester (averaging 153), while the instructor created 310. The average number of annotations authored per student per assignment was 3.67. This quantity increased over the course of the semester: a linear regression of this quantity over time shows it increasing from 2.73 to 4.2 per assignment, an increase of 1.57 (p < 10^-5). Although annotating was required, we take this increase over time as a sign of voluntary participation beyond the minimum requirement, suggesting that students found the tool useful.

The instructor also posted problem sets, on which no annotations were required. Nonetheless, 217 annotations were made on this material, in another demonstration of voluntary usage.

Sustained Discussion
Of the 14258 annotations, 3426 (21.4%) were isolated threads (single annotations with no reply), while the remaining 10832 (78.6%) were part of discussions (threads containing an initial annotation and at least one reply). For assignments, there were on average 13.9 discussions per page and 3.48 annotations per discussion. As shown in Figure 2, the thread length distribution exhibits a smooth decay, with over 400 discussions of length 5 or more, i.e. 1.4 lengthy discussions per page of material on average.

Focus on class topic
We read and categorized all 413 comments in 187 discussions for a typical 5-page reading assignment (a lecture on dimensional analysis, given in the middle of the term). We used
Total questions                              116
Resolved by student in same thread            59 (50.8%)
Resolved by student in different thread       14 (12%)
Resolved by faculty                           10 (8.6%)
Not resolved                                  33 (28%)
Table 3. Breakdown of questions asked and their resolution.
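
The shares in Table 3 can be recomputed from the counts. The following is a minimal sketch in Python, not part of NB or of the original analysis. It assumes that the unresolved count is whatever remains of the 116 questions once the three resolved rows are subtracted, which gives 33 and matches the 28% share listed. The function and variable names are illustrative.

```python
# Counts for the three "resolved" rows of Table 3 (names are illustrative).
TOTAL_QUESTIONS = 116
RESOLVED = {
    "student, same thread": 59,
    "student, different thread": 14,
    "faculty": 10,
}


def resolution_shares(total, resolved_counts):
    """Return {category: (count, percent)}, adding the unresolved remainder.

    Assumption: every question falls in exactly one category, so the
    unresolved count is total minus the sum of the resolved counts.
    """
    counts = dict(resolved_counts)
    counts["not resolved"] = total - sum(resolved_counts.values())
    return {name: (n, round(100 * n / total, 1)) for name, n in counts.items()}


shares = resolution_shares(TOTAL_QUESTIONS, RESOLVED)
for name, (count, percent) in shares.items():
    print(f"{name:28s}{count:4d}  {percent:5.1f}%")

# Share of all questions that were resolved by other students.
by_students = RESOLVED["student, same thread"] + RESOLVED["student, different thread"]
student_share = round(100 * by_students / TOTAL_QUESTIONS, 1)
print(f"resolved by students: {by_students} of {TOTAL_QUESTIONS} ({student_share}%)")
```

Rounding to one decimal gives 50.9% for the first row where the table lists 50.8%, which suggests the table values were truncated rather than rounded. Under the same assumption, students resolved 73 of the 116 questions.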