Collaborative Filtering and Inference Rules For Context-Aware Learning Object Recommendation

Learning objects strive for reusability in e-Learning to reduce cost and allow personalization of content. We argue that learning objects require adapted Information Retrieval systems. In the spirit of the Semantic Web, we discuss the semantic description, discovery, and composition of learning objects using Web-based MP3 objects as examples. As part of our project, we tag learning objects with both objective and subjective metadata. We study the application of collaborative filtering as prototyped in the RACOFI (Rule-Applying Collaborative Filtering) Composer system, which consists of two libraries and their associated engines: a collaborative filtering system and an inference rule system. We are currently developing RACOFI to generate context-aware recommendation lists. Context is handled by multidimensional predictions produced from a database-driven scalable collaborative filtering algorithm. Rules are then applied to the predictions to customize the recommendations according to user profiles. The prototype is available at inDiscover.net.

Uploaded by

Daniel Lemire

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

214 views11 pages

Collaborative Filtering and Inference Rules For Context-Aware Learning Object Recommendation

Uploaded by

Daniel Lemire

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 11

Collaborative Filtering and Inference Rules for Context-Aware

Learning Object Recommendation

Daniel Lemire∗, Harold Boley†, Sean McGrath‡, Marcel Ball§
September 6, 2005

Abstract similar services from learning institutions. For teach-

ers, these new expectations can be challenging. The de-
Learning objects strive for reusability in e-Learning to re- sign and production of on-line courses is expensive and
duce cost and allow personalization of content. We ar- time-consuming. When providing digital content, it is
gue that learning objects require adapted Information Re- no longer adequate to merely manually adjust or adapt
trieval systems. In the spirit of the Semantic Web, we the course content for students, students should also be
discuss the semantic description, discovery, and compo- empowered to navigate independently (Lundgren-Cayrol
sition of learning objects using Web-based MP3 objects et al. , 2001).
as examples. As part of our project, we tag learning The Web also faces similar challenges. As the Web
objects with both objective (e.g., title, date, and author) becomes ubiquitous, our needs become more sophisti-
and subjective (e.g., quality and relevance) metadata. We cated. For example, while we may have been satisfied
study the application of collaborative filtering as proto- in the past with weather reports for a given area, we now
typed in the RACOFI (Rule-Applying Collaborative Fil- want to be able to plan our vacations to other areas and
tering) Composer system, which consists of two libraries thus, coordinate data coming from weather reports, ho-
and their associated engines: a collaborative filtering sys- tels, and air travel companies. While we generally know
tem and an inference rule system. We developed RACOFI how to find web sites on a given topic using search en-
to generate context-aware recommendation lists. Con- gines, we still can’t easily find all air travel companies
text is handled by multidimensional predictions produced offering flights from Montréal to Rio next week for under
from a database-driven scalable collaborative filtering al- 1000 dollars. It follows that we can’t expect our com-
gorithm. Rules are then applied to the predictions to cus- puters to suggest travel packages automatically from data
tomize the recommendations according to user profiles. gathered over the Web. We observe that the Web is not
The prototype is available at inDiscover.net. a database (Mendelzon, 1998) in that there are no built
in common schemas or sophisticated data retrieval mech-
anism. Yet the Web is the most successful data man-
1 Introduction agement tool ever developed. We distinguish two differ-
ent future challenges for the Web: Information Retrieval
With the proliferation of the Internet, demand for on-line
and Composition. One approach to the solution to these
learning has grown rapidly. Often used to deliver in-
problems can be found in the Semantic Web (Berners-Lee,
expensive just-in-time information, students now expect
1998). Essentially, the Semantic Web adds to the current
∗ UQAM, Montreal, QC, Canada (Work started while at NRC) Web enough metadata so that the Web can be considered
† NRC, Fredericton, NB, Canada
‡ UNB, Fredericton, NB, Canada
machine-parseable (Koivunen & Miller, 2001). In theory,
§ UNB, Fredericton, NB, Canada it should render the Information Retrieval problem easier,
In International Journal of Interactive Technology and Smart Edu- and one approach to Composition can then be achieved
cation, Volume 2, Issue 3, August 2005. through Artificial Intelligence (AI) planning techniques.

1
1 INTRODUCTION

Just like a Web site, digital knowledge and training

material should be reusable. For example, once basic
arithmetic is achieved, individual reuse this knowledge all
their life; a document about arithmetic is also reusable in a
similar fashion. In this spirit, the notion of a learning ob-
ject (Downes, 2001; Gibbons et al. , 2002) was proposed
as a way to enable content reuse in e-Learning. Essen-
tially, a learning object is any component that can be used
and reused for learning. A map, a Web site, a piece of Figure 1: A few of the options a user has for finding
software, and a video stream are all examples of learn- learning objects.
ing objects. Thanks to projects such as eduSource, learn-
ing objects have become tangible (Bhavsar et al. , 2003;
Bhavsar et al. , 2004). For example, KnowledgeAgora1 each other as explicitly as Web sites. Google searching
offers a learning object taxonomy and a search engine. is based on the assumption that a Web page frequently
Users can sell, buy, exchange or offer learning objects. linked to must offer relevant content and returns such re-
In KnowledgeAgora, unlike the Web at large, Intellec- sults to the searcher. Naturally, course content does in-
tual Property, age range, and education level are explicitly clude links and relationships, possibly through Dublin
handled, which is important for vital for content reuse. Core, but learning objects won’t make up a graph the same
Object composition in education is already present in way the Web is a directed graph of Web pages.
fields such as adaptive testing (Raîche, 2000); however, Effectively, finding the right resource to express a given
learning objects belong to a more heterogeneous and dis- idea remains a challenge. When an instructor or stu-
tributed setting than educators are accustomed to. Few dent wants to find a particular type of learning object
examples of computer supported learning object compo- that is of interest to them, they need a way to specify
sition have been reported (Fiaidhi et al. , 2004; Fiaidhi, their interest to the system that interfaces with the learn-
2004). Learning objects have to be fine-grained enough ing object repository. Fig. 1 shows some of the com-
so that reusability is sensible. That is, we need to have mon methods of doing so: navigating through a tax-
many small objects and we can aggregate to form larger onomy, performing keyword searches, specifying their
objects, such as courses. The problem in locating learning tastes/interests through explicit ratings and using a col-
objects is much harder than simply finding the proper dig- lection of inference rules to filter objects.
italized textbook for a course. An example of the problem Our approach is to focus primarily on explicit ratings
we want to solve would be to find all resources related and inference rules as a means of helping users discover
to “Java inheritance” for 3rd year Computer Science stu- what they are looking for, to increase discoverability.
dents, such that the students are likely to find the content We found that taxonomies can have low user acceptance
of at least average interest, but such that it was rated above in instances where the categories tend to be subjective
average for accuracy by instructors. (such as a taxonomy of musical genres) and that keyword
One might think that information retrieval as accom- searches across a large set of objects tend to produce far
plished on the Web can be applied to learning objects. too many results to efficiently sift through. For exam-
After all, Google is very efficient at finding content as ple, in KnowledgeAgora, a keyword search for “Java” re-
long as requests can be expressed as a list of keywords. turned 58 learning objects shortly after it was launched
However, one should note that Google works well in part and more recently returned 361 records. In the future, it is
because the Web is made of links in a fundamental way, quite likely that such a reques would return thousands of
unlike learning objects. The analogy between the HTML objects. Such information overload is a serious challenge
Web and learning objects is not perfect. Learning objects to content reuse, even for one’s own work, and although
are not necessarily text-based and they are not linked to we can borrow some of the Web techniques, we also need
new, adapted search solutions. However, even if we can
1 http://www.knowledgeagora.com/ find an appropriate object for a concept and a given stu-

2
2 THE SEMANTIC WAY

dent profile, we still have work to do, as a course is not make mistakes or even enter misleading data on purpose.
merely a set of objects. We have to find how to “fit” the For example, a resource target age might end up too often
object into the course. Hence, we are still far from gener- to be entered as 18-99 whether or not it is accurate. The
ating personalized courses automatically from databases. problem will only intensify as learning object repositories
This paper is organized in the following manner. We grow in size.
begin by reviewing the semantic approaches to object fil- Because we believe it will be hardly possible to have
tering and composition, we then present ratings as sub- a (single) common ontology to define learning object se-
jective measures and collaborative filtering as a method mantics, because metadata records are complex, and be-
to leverage them. The RACOFI architecture (Anderson cause learning objects will become even more numerous,
et al. , 2003) is presented as consisting of two compo- we will need to “predict” both the student and course cre-
nents: the S LOPE O NE collaborative filtering algorithm is ator needs, using heuristics. If we look back, we see that
presented as a good choice for learning object reposito- the same thing happened with the Web. More people find
ries, and the use of inference rules, to customize learning what they are looking for using Google than by using Ya-
object selection. We conclude with object composition as hoo’s or DMOZ’s category systems (taxonomies). It may
viewed from the RACOFI model. be telling that, in 2004, Google removed from its main
page a link to its copy of DMOZ’s taxonomy. Google’s
strength is that it leverages knowledge about the use made
2 The Semantic Way of a resource, that is, the number of links a Web site re-
ceived. We need a similar approach for learning objects.
People are already using learning objects with precise de- One vision of the semantic way is to imagine that each
scriptive educational metatags. For example, it is possi- learning object is coupled with an XML file describing
ble to find the author of any object in KnowledgeAgora. the object in an objective and exhaustive way. We rather
However, among the difficulties with such a semantic ap- think that each object will be described in different ways
proach is the need for common ontologies so that we can by different people, maybe using RDF files. In some
exchange objects across institutions. For example, if we cases, like the title of a book, we can expect everyone
are interested in all free learning objects available, we to agree, but as soon as the granularity of the descrip-
must all agree on what a “free” object is. Can you modify tion goes beyond a certain point, differences will appear.
a free object and redistribute it? If we want “rock” music, In fact, the IEEE LOM standard already contains subjec-
does “punk” music qualify? This problem cannot entirely tive fields such as “Interactivity Level”, “Semantic Den-
go away (Downes, 2003): we will always have to be sat- sity” and “Difficulty”. It is obvious that some people will
isfied with links between objects and possible semantics, disagree on the difficulty level of a given learning ob-
that is, not having a unique ontology. ject. Learning object repositories such as Merlot already
In order to enable detailed searches on sets of learn- have peer review systems for learning objects. Other re-
ing objects, metadata specifications and standards have searchers look at integrating learning object reviews from
been proposed including IMS’s LRM (Instructional Man- different judges (Nesbit et al. , 2002; Han et al. , 2003) as
agement System’s Learning Resource Metadata) and part of communities such as eLera2 .
IEEE LOM (IEEE Learning Object Metadata). These Once we accept subjectivity as a key ingredient for de-
standards are rather detailed: SCORM contains about scribing learning objects, it is not hard to imagine each
60 fields upon which a user can complete a search in- student and course creator might be allowed to annotate
cluding date, author, and keywords. Even if we suppose learning objects. For example, we can ask students to re-
that users are willing to take the time needed to fill out view and rate a reference document included in an on-line
many of these fields to make their searches narrow, and course. We could use the results to determine automati-
that the object has been carefully tagged, we can still end cally which students are most likely to benefit from this
up with thousands of records (Downes, 2002). It is not document (Recker et al. , 2003; Weatherley et al. , 2002;
clear that adding more and more fields will lead to bet-
ter results: content creators may omit some information, 2 http://elera.net/

3
3 WHAT ARE RATINGS AND WHAT CAN WE DO WITH THEM?

Downes et al. , 2004). Hence, from subjectivity, we get to seems appropriate to use similar measures with learning
personalized content, which benefits both the student and objects.
course creator.
In the learning object context, we propose to predict a
user’s opinions or a class of users’ overall opinions draw-
3 What Are Ratings and What Can ing on other users in the system. Hence, if we know what
students with a given profile will think of a document, we
We Do with Them? can decide whether or not to include it in a course. We
can also use knowledge about how an object was used in
In Information Retrieval, we often make the following as-
the past to help composition: “Dublin Core”, IEEE LOM
sumptions.
and IMS’s LRM allow us to tag relationships between ob-
1. Semantics is at least partly computer-parseable, jects. Ideally, we could even hope to configure courses
which most often means that content is text-based automatically, the same way we hope to see agents able to
or described by text. surf the web for us to organize our vacations.

2. Users can express their queries in a convenient The term collaborative filtering has its origins with
computer-parseable way such as by a range of dates Goldberg et al. (Goldberg et al. , 1992). It describes
or a few keywords. all techniques leveraging incomplete information about
tastes and opinions of a set of users. We distinguish
3. It is acceptable for the result of the query to be an three types of applications for collaborative filtering: gold
unbounded list of (partially) sorted objects. balling, black balling and white balling. In the first of
the three, we are simply looking for what the user will
The Semantic Web and the various metadata standard ini- prefer (Karypis, 2000). Black balling tries to virtually
tiatives for learning objects try to solve the problems oc- remove objects whereas white balling puts the user in
curring when the first of these assumptions is not entirely charge and allows for thresholds. As an example of white
met. However, even when the semantics is computer- balling, one could search for a document such that a given
parseable, perhaps because it has been carefully described type of student will find it relevant and having an at least
using XML, other assumptions may not be true. Are users average graphical design. In general, we’d like to use both
willing to express their requests using up to 60 fields? white balling and objective meta-data filtering to find, for
For quite some time already, universities have asked example, all images representing an eye that have some
students to rate their courses, professors and other ele- subjective qualities.
ments. If we imagine each course as being a learning ob-
ject, we will have an objective description of the course The most widespread type of algorithms in collabora-
given by its title, number, list of pre-requisites and so tive filtering looks for neighbors to the current user in
on, while, on the other hand, we have evaluations made a set of users and do a weighted average of their opin-
by past students of this same course. All of this data is ions (Breese et al. , 1998; Pennock & Horvitz, 1999;
metadata, “data about the course”, but some of it is more Resnick et al. , 1994; Weiss & Indurkhya, 2001; Her-
subjective than others. So, because we already use subjec- locker et al. , 1999). This approach works well in the case
tive metadata in traditional learning, we believe we should where we have enough users and relatively few objects
also use it in e-Learning, and even go further. rated on a single attribute. As the number of users grows,
When given the option, users will rate what is inter- computational cost increases and other algorithms may
esting to them: sites such as RatingZone, Amazon and become preferable for on-line applications. One way is to
Epinions are good examples. At the very least, we can precompute relations across objects (Lemire & Maclach-
measure implicit ratings: did a user return to this web site lan, 2005; Lemire, 2005; Sarwar et al. , 2001; Linden
and how long did s/he remain on the site. Teaching institu- et al. , 2003). Not much work has been done for the more
tions already measure how much interest courses receive difficult case where each object is rated on multiple di-
by counting the number of registered students and thus, it mensions (using several attributes at once).

4
5 SLOPE ONE COLLABORATIVE FILTERING ALGORITHMS

ity, and production. Objective metadata includes title, au-

thor, release date, and a link allowing the user to listen to
the music. As the user enters more ratings, recommenda-
tions typically become more useful. Since we use several
attributes, the rule engine applies corresponding weights
to each attribute in order to make a recommendation and
the user can control these weights. Special rules are also
used, for example, to increase predicted ratings to all ob-
jects from a given author if the current user liked at least
one of the objects from this author. Overall, we found that
a rule-based approach made it easy to adapt the system to
user expectations.
Since the initial release of RACOFI Music, both the
Figure 2: RACOFI architecture diagram.
collaborative filtering and rule engines have evolved into
RACOFI Composer, a framework designed for creating
4 RACOFI RACOFI powered systems, with one of the key features
being the support of multi-dimensional ratings and rec-
In 2002-2003, the National Research Council of Canada ommendations. The framework is able to “plugin” to ex-
(NRC) and Knowledge Pool Canada collaborated on the isting repositories so that users will be able to rate objects
Sifter Project3 in order to produce tools for the retrieval along any number of dimensions and get multiple lists of
of learning objects in large databases. One of the results filtered recommendations.
of this joint work was RACOFI (Rule-Applying Collabo-
rative Filtering), which is a collaboration between Harold
Boley and Daniel Lemire with 5 students and other re- 5 S LOPE O NE Collaborative Filter-
searchers from the NRC. ing Algorithms
RACOFI’s goals are to process both objective and
subjective metadata describing a given object. Hence, We are not particularly interested in using very sophisti-
RACOFI is built with two software agents: a collaborative cated algorithms well fitted to modest data sets. Rather,
filtering library called COFI4 and a rule engine called OO we believe the challenge is to accommodate very large
jDREW5 . Objective metadata can be processed efficiently distributed data sets. Moreover, we believe that simplic-
using rules written in XML, specifically in RuleML (Bo- ity is a key feature for widespread use. Ideally, all ag-
ley, 2003)6 . The RuleML syntax is useful because it is gregated values should be easily understood by the aver-
very general and interoperable: it makes it possible to age engineer or IT specialist. We follow closely the work
build generic Web services with little programming effort. done in the context of Amazon rating systems (Linden
COFI generates predictions from which OO jDREW then et al. , 2003), which is based on Item-to-Item Similarity
generates recommendations (see Fig. 2). measures. In other words, we are interested in comparing
In order to validate this model, we used it in a Canadian learning objects two by two. However, unlike Amazon,
Music recommender site called RACOFI Music7 . A regis- we will not be content to merely discover association rules
tered user can browse objects, rate them and propose new of the type “the course creator who used this learning ob-
ones. Ratings can be done along 5 attributes on a scale ject also used this other one”. Amazon’s goal is simple:
from 0 to 10: overall impression, lyrics, music, original- sell books, DVDs, CDs, etc. Compared to this, the de-
3 http://sifter.elg.ca/ sign of a course is a more complex task with numerous
4 http://savannah.nongnu.org/projects/cofi/ required outcomes.
5 http://www.jdrew.org/oojdrew/ Collaborative filtering for learning objects implies
6 http://www.ruleml.org/
7 http://racofi.elg.ca • having to manage large sets of objects;

5
5 SLOPE ONE COLLABORATIVE FILTERING ALGORITHMS

• for each object, we have several attributes (multidi- with slope one) where b is an average difference. In gen-
mensional evaluation); eral terms, this means that for any rating given by a user,
we have a way to predict ratings on all other objects using
• an ever growing number of users having used the average differences in ratings, and those can be precom-
system or network; puted. Suppose now that Joe has rated Objects A, B, and
C and we must predict how he would have rated object
• many different functions and needs depending on the
D. Because Joe has 3 ratings, A, B, and C, we have 3 dif-
context.
ferent predictions for how he would have rated object D.
Each one of these characteristics poses scalability issues. Hence, we must weight these 3 predictions and one easily
Having thousands and thousands of objects is a challenge accessible weight is the number of people who rated both
not often addressed in the literature (Linden et al. , 2003). objects as in this example: if 12 people rated both object
To see the problem, imagine a database made of 100,000 A and D, 5 people rated both object B and D, and 321
learning objects each of which can be rated on a single people rated both object C and D, a reasonable formula
12pA,D +5pB,D +321pC,D
subjective attribute. Suppose that we want to analyze re- is 12+5+321 where pA,D , pB,D , pC,D are the
lationships between objects. Observe that the number of 3 predicted ratings for D given the rating on A, B, C re-
different pairs of objects is such that even by allocating spectively. In (Lemire & Maclachlan, 2005), this algo-
only 32 bits per pair of object, we still need over 37 MB rithm is called W EIGHTED S LOPE O NE. At the imple-
in storage. It is easily seen that if each object can be rated mentation level, for each pair of objects, we want to keep
on various attributes, the number of relationships becomes track of how many users rated both, and the sum of the
very large. differences in ratings. As previously noted, these aggre-
To offer fast query times, we propose to precompute gates are simple and easily updated when new data comes
relations among objects as in (Lemire, 2005; Lemire & in. In practice, not all pairs of objects are significant: we
Maclachlan, 2005; Sarwar et al. , 2001). The problem would only store pairs for which the count, i.e., number
then becomes almost solely a storage issue. Because of users who rated both, is significant (Ng et al. , 2001).
the problem is essentially multidimensional, multidimen- Because S LOPE O NE may appear simplistic, we need
sional database techniques can be used (Codd et al. , to benchmark it to make sure it has a reasonable predic-
1993; Lemire, 2002; Kaser & Lemire, 2003) for highly ef- tion accuracy. We used the EachMovie and Movielens
ficient storage including Iceberg Cubes (Ng et al. , 2001). data sets made available by Compaq Research and the
We now describe the S LOPE O NE algorithm (Lemire Grouplens Research Group respectively. EachMovie con-
& Maclachlan, 2005). Imagine that we have two objects sists of movie ratings collected by a web site run over a
1 and 2. Suppose that we have simple unidimensional few years: ratings range from 0.0 to 1.0 in increments
ratings by users Joe, Jill, and Stephen as follows. of 0.2, whereas Movielens has ratings from 1 to 5 in in-
crements of 1. EachMovie and Movielens are relatively
Object 1 Object 2 modest data sets with only slightly more than 1600 and
Joe 2/10 unrated 3000 movies respectively. We selected enough users to
Jill 3/10 7/10 have 50,000 ratings as a training set and tested the ac-
Stephen 4/10 6/10 curacy of the predictions over a test set made of at least
100,000 ratings (Lemire & Maclachlan, 2005). In our
We see that user Joe did not rate “Object 2”. How can experiments, using the mean absolute error (MAE), we
we predict his rating in the simplest possible way? One found W EIGHTED S LOPE O NE to be about 2-5% better
approach is to observe that on average “Object 2” was than the similar but slightly more expensive I TEM -BASED
rated 3/10 higher than “Object 1” based on users Jill and algorithm (Sarwar et al. , 2001). While other algorithms
Stephen, hence we can predict that Joe will rate “Object are more accurate, we feel that given its simplicity and
2” 5/10. We call the algorithm S LOPE O NE because we relative ease of implementation W EIGHTED S LOPE O NE
take an actual rating x and add or subtract something to it, is a good candidate for large learning object data sets.
so that our predictor is of the form x + b (a linear function The approach that we have taken to implement this

6
6 INFERENCE RULES

function predict(user u, object k):

INPUT: user u is modelled by a map from object IDs to rating values.
INPUT: object k is an object ID.
WILL RETURN: predicted rating by u on object k.
denominator ← 0
numerator ← 0
for each object j rated by u do
1. get the count and sum from the dev table
2. calculate the average difference
3. denominator = denominator + count
4. numerator = numerator + count × (average + the value u rated j)
end for
return numerator / denominator

Figure 4: Algorithm used to implement the W EIGHTED

S LOPE O NE scheme.

is; “how many users who rated object i also rated object
j”, which is similar to one of the features found on Ama-
zon8 . In other words, the aggregates we precompute for
Figure 3: RACOFI Composer relational schema. the W EIGHTED S LOPE O NE algorithm have the potential
to be reused for other applications.

W EIGHTED S LOPE O NE algorithm involves the use of

a relational database to store relationships between ob-
6 Inference Rules
ject pairs. These stored relationships are the number of
One shortcoming of using recommendations based soley
users who have rated both objects and the sum of the rat-
on the collaborative filtering predictions is that the rec-
ing differences between them, which we will be referring
ommendations are limited by the subjective metadata that
to as count and sum respectively. We chose to imple-
users have rated. By combining the collaborative filtering
ment the algorithm this way as it allowed us the great-
algorithm with an inference rule system we can use not
est amount of flexibility, portability and scalability in that
only the subjective predicitions, but objective metadata
our database schema (see Fig. 3) supports the simple ad-
about the objects, such as title, author and release date
dition/subtraction of objects, users and new attributes (or
together with information from the user’s profile to make
dimensions) to rate the objects against, as well as being
the recommendations narrower and more personalized.
able to support very small and very large data sets. An-
On the one hand, we argue that it might be prefer-
other benefit of this approach is that it keeps redundancy
able in a distributed and heterogeneous setting to use rule
to a minimum.
sets rather than to supplement directly the collaborative
Using the model outlined by the schema in Fig. 3, we filtering algorithms with content-specific information us-
have what is called a dev (short for deviation) table for ing “bag of words” models, for example (Balabanovic &
each attribute stored in the database (this is indicated in Shoham, 1997; Melville et al. , 2002). It is still possible
Fig. 3 by the # symbol, which becomes the values of all to mine the rules, by e.g. Bayes models, but by feeding
attributeIDs in the attribute table). As can be seen in Al- them in to a rule engine, system administrators, users and
gorithm 4, the implementation is simple. Besides using instructors can remain in control. Further, we can address
these dev tables for the W EIGHTED S LOPE O NE algo- diverse content and are not forced to deal with domain-
rithm, we are able to draw several other interesting con- specific methodologies as part of the system architecture.
clusions by analyzing the count and sum fields for specific
object pairs. One simple conclusion that can be drawn 8 http://www.amazon.com/

7
7 OBJECT COMPOSITION

On the other hand, filtering rules can help make the sys- ‘common-artist’ bonus, and Mary’s not indicating that,
tem more scalable: we can quickly eliminate or predict the rule above could be refined as:
objects of interest using rules and thus, save memory and
run-time cost. If a user u has the ’common-artist’ bonus indi-
One of our main uses of the inference rule system is to cated in their profile
enhance the predictions, based upon objective similarities and the rating of attribute a for a rated object
of the learning objects, to achieve better recommendations ro, is rated highly (e.g. 5) by user u
for the users of the system. For example, our common and the author of ro is the same as the author of
artist bonus assumes that if a user rated objects from a object o,
particular author highly, they are more likely to rate other then increase the predicted rating of attribute a
objects by the same author highly. Based on this, we can of object o for user u.
add a rule to the system to increase the predicted rating
for an attribute of an object if the user rated that attribute Based upon the users’ profiles and on the first condi-
of another object with the same author highly: tion of the refined rule, the common-artist bonus would
apply only to users that had this specified in the profile,
If the rating of attribute a for a rated object ro like John, and not to other users, like Mary.
is high (e.g. 5) We can also augment our filtering rules to take advan-
and the author of ro is the same as object o, tage of the user profiles. For example, if we consider our
then increase the predicted rating of attribute a two users, John and Mary, and they have paid for access
of object o. to different parts of the system, we would not want to rec-
ommend to them types of content that they do not have
Another use for the inference rule system is to serve access to. Therefore, we could augment their user pro-
as a tool for filtering the recommendations in the system. files indicating which types of content they are and are
For example, we could add a filtering rule to the system not allowed to access, and introduce rules similar to the
so that if the predicted value for the attribute of interest is following one:
below a threshold then we would block that object:
If the object o is of type t
If the predicted rating of attribute a of object o and user u does not have access to objects of
is less than 4 type t
and we are building a recommendation list for then block object o for user u.
attribute a
then block object o.

Finally, we envision that users can be represented as a

7 Object Composition
set of specific rules which define their profile. Because
By our definition, learning objects must be reusable,
we can express these rules using the XML application
hence the context in which they are used can vary at least
RuleML (Boley, 2003), these profiles can be understood
slightly: a schema explaining how to set up one’s com-
by a wide range of systems regardless of the specific tech-
puter monitor can be used in a English/French translation
nology or purpose they have. Similarly, metadata about
course as well as in an engineering course.
an object can embed RuleML-represented rules that can
So, we can’t abstract out context and relations among
enforce some logic about the object.
objects in such a way that the problem is fundamen-
By using information stored in user profiles we can
tally multidimensional: An object is not simply ‘good’
augment our previous rules to make them more power-
or ‘bad’. In order to better understand how collaborative
ful. Our first rule could be changed to only apply if the
filtering can be used for object composition, we designed
the rules/facts attached to the user profile indicated that it
a Web site called inDiscover9 (see Fig. 5) allowing users
should apply to that user. So, if you consider two users,
John and Mary, with John’s profile indicating to give the 9 http://www.indiscover.net

8
REFERENCES

using inDiscover community building as a commercial

example.

9 Acknowledgments
We would like to thank Anna Maclachlan for her advice
and contribution to the design of the S LOPE O NE collab-
orative filtering algorithms. We would like to thank Su-
san O’Donnell, Murray Crease, and Jo Lumsden for their
Human-Interface expertise. We would also like to thank
our Group Leader Bruce Spencer for his support and the
NRC e-Learning Group for advising us with e-Learning
topics.
Figure 5: Example of an evaluation form for on-line
MP3s. The screen shot comes from the Web site inDis-
cover prepared as part of the RACOFI Composer project. References
Anderson, M., Ball, M., Boley, H., Greene, S., Howse,
to manage recommendation lists pointing to music files N., Lemire, D., & McGrath, S. 2003 (October).
(MP3 format). The composition of a list for a given con- RACOFI: A Rule-Applying Collaborative Filtering
text is based on having objects with higher predicted val- System. In: Proceedings of COLA’03. IEEE/WIC.
ues earlier in the recommendation list. Richer composi-
tion is possible in principle such as mood development for Balabanovic, M., & Shoham, Y. 1997. Fab: Content-
recommendation lists that are musical playlists. While the based, collaborative recommendation. Communica-
Web site is targeted to independent musicians as a tool to tion of the ACM, 40(3), 66–72.
help them increase their visibility, our long-term motive is
to learn how to compose objects into meaningful, context- Berners-Lee, Tim. 1998. A roadmap to the Semantic Web.
aware lists. In inDiscover, each user can have a recom-
mendation list for each type of context (mood/situation): Bhavsar, Virendra C., Boley, Harold, & Yang, Lu. 2003.
party, workout, relaxation, and so on. A Weighted-Tree Similarity Algorithm for Multi-
Agent Systems in e-Business Environments. Pages
53–72 of: Proc. Business Agents and the Semantic
8 Conclusions Web (BASeWEB) Workshop. NRC 45836.

We take for granted that learning object repositories are Bhavsar, Virendra C., Boley, Harold, Hirtle, David,
part of the future of e-Learning because of the great poten- Singh, Anurag, Sun, Zhongwei, & Yang, Lu. 2004.
tial for cost reduction and personalized instruction. How- A Match-Making System for Learners and Learning
ever, we see several challenges before learning objects can Objects. Learning & Leading with Technology.
fulfill their potential. We believe that collaborative fil-
tering and rules can do for learning objects what Google Boley, Harold. 2003. Object-Oriented RuleML: User-
did for the Web. The challenge will be to meet the real Level Roles, URI-Grounded Clauses, and Order-
(collaborative and inferential) needs expressed by course Sorted Terms. In: Proc. Rules and Rule Markup Lan-
creators and students. We are currently exploring the next guages for the Semantic Web (RuleML-2003). Sani-
steps in MP3 recommendation together with Bell Canada, bel Island, Florida, LNCS 2876, Springer-Verlag.

9
REFERENCES REFERENCES

Breese, J. S., Heckerman, D., & Kadie, C. 1998. Empiri- Kaser, Owen, & Lemire, Daniel. 2003 (November). At-
cal Analysis of Predictive Algorithms for Collabora- tribute Value Reordering for Efficient Hybrid OLAP.
tive Filtering. In: Fourteenth Conference on Uncer- In: DOLAP.
tainty in AI. Morgan Kaufmann.
Koivunen, Marja-Riitta, & Miller, Eric. 2001. W3C Se-
Codd, E. F., Codd, S., & Salley, C. 1993. Providing OLAP mantic Web Activity. In: Semantic Web Kick-off
(On-line Analytical Processing) to User-Analysts: Seminar.
An IT Mandate. Tech. rept. E. F. Codd & Associates.
Downes, S., Fournier, H., & Regoui, C. 2004 (May). Pro- Lemire, Daniel. 2002 (October). Wavelet-Based Relative
jecting Quality. MADLaT. Prefix Sum Methods for Range Sum Queries in Data
Cubes (Best Paper). In: CASCON’02.
Downes, Stephen. 2001. Learning Objects: Resources
For Distance Education Worldwide. International Lemire, Daniel. 2005. Scale and Translation Invariant
Review of Research in Open and Distance Learning. Collaborative Filtering Systems. Information Re-
trieval, 8(1), 129–150.
Downes, Stephen. 2002. Topic Representation and Learn-
ing Object Metadata. Lemire, Daniel, & Maclachlan, Anna. 2005. Slope One
Downes, Stephen. 2003 (August). Meaning, Use and Predictors for Online Rating-Based Collaborative
Metadata. Filtering. In: SIAM Data Mining.

Fiaidhi, J. 2004. RecoSearch: A Model for Collabora- Linden, Greg, Smith, Brent, & York, Jeremy. 2003. Ama-
tively Filtering Java Learning Objects. ITL, 1(7). zon.com Recommendations: Item-to-Item Collabo-
rative Filtering. IEEE Internet Computing, 7(1), 76–
Fiaidhi, J., Passi, K., & Mohammed, S. 2004. Developing 80.
a Framework for Learning Objects Search Engine.
In: Internet Computing’04. Lundgren-Cayrol, Karin, Paquette, Gilbert, Miara,
Gibbons, Andrew S., Nelson, Jon, & Richards, Robert. Alexis, andJacques Rivard, Fréderick Bergeron, &
2002. The Instructional Use of Learning Objects. Rosca, Ioan. 2001. Explor@ Advisory Agent: Trac-
Agency for Instructional Technology. Chap. The Na- ing the Student’s Trail. In: WebNet’01 Conference.
ture and Origin of Instructional Objects.
Melville, P., Mooney, R. J., & Nagarajan, R. 2002.
Goldberg, D., Nichols, D., B.M.Oki, & Terry, D. 1992. Content-Boosted Collaborative Filtering for Im-
Using Collaborative Filtering to Weave and Informa- proved Recommendations. Pages 187–1992 of:
tion Tapestry. CACM, 35(12), 61–70. AAAI/IAAI.
Han, K., Kumar, V., & Nesbit, J.C. 2003 (November). Mendelzon, Alberto O. 1998. The Web is not a Database.
Rating learning object quality with bayesian belief Pages 0– of: Workshop on Web Information and
networks. In: E-Learn. Data Management.
Herlocker, J., Konstan, J., Borchers, A., & Riedl, J. 1999.
An Algorithmic Framework for Performing Collab- Nesbit, J., Belfer, K., & Vargo, J. 2002. A Convergent
orative Filtering. In: Proc. of Research and Devel- Participation Model for Evaluation of Learning Ob-
opment in Information Retrieval. jects. Canadian Journal of Learning and Technol-
ogy, 28(3).
Karypis, G. 2000. Evaluation of Item-Based Top-N Rec-
ommendation Algorithms. Tech. rept. 00-046. Uni- Ng, Raymong, Wagner, Alan, & Yin, Yu. 2001. Iceberg-
versity of Minnesota, Department of Computer Sci- cube Computation with PC Clusters. In: ACM SIG-
ence. MOD 2001.

10
REFERENCES REFERENCES

Pennock, D. M., & Horvitz, E. 1999. Collaborative Fil-

tering by Personality Diagnosis: A Hybrid Memory-
and Model-Based Approach. In: IJCAI-99.
Raîche, Gilles. 2000. La distribution d’échantillonnage
de l’estimateur du niveau d’habileté en testing adap-
tatif en fonction de deux règles d’arrêt : selon
l’erreur-type et selon le nombre d’items administrés.
Ph.D. thesis, Université de Montréal.
Recker, M., Walker, A., & Lawless, K. 2003. What do
you recommend? Implementation and analyses of
collaborative filtering of Web resources for educa-
tion. Instructional Science, 31(4), 229–316.
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., &
Riedl, J. 1994. Grouplens: An open architecture for
collaborative filtering of netnews. Pages 175–186 of:
Proc. ACM Computer Supported Cooperative Work.
Sarwar, B. M., Karypis, G., Konstan, J. A., & Riedl, J.
2001. Item-based Collaborative Filtering Recom-
mender Algorithms. In: WWW10.
Weatherley, John, Sumner, Tamara, Khoo, Michael,
Wright, Michael, & Hoffmann, Marcel. 2002. Part-
nership reviewing: a cooperative approach for peer
review of complex educational resources. Pages
106–114 of: Proceedings of the second ACM/IEEE-
CS joint conference on Digital libraries. ACM Press.
Weiss, S.M., & Indurkhya, N. 2001. Lightweight Collab-
orative Filtering Method for Binary Encoded Data.
In: PKDD ’01.