Unit-18
18.6 Summary
18.7 Answers to Self Check Exercises
18.8 Keywords
18.9 References and Further Reading
18.0 OBJECTIVES
In the search for appropriate techniques of information retrieval,
various models have been developed. This Unit deals with the basics of information
retrieval techniques and the different types of models developed from time to
time. After reading this Unit, you will be able to:
• know the basics and types of factors involved in information retrieval;
• know the shift from conventional to modern information retrieval;
• understand the process of matching information need and retrieval of information
from databases, knowledge bases, information systems and libraries;
• be conversant with the development of information retrieval models; and
• acquaint yourself with research areas in the field of information retrieval.
Information Retrieval
18.1 INTRODUCTION
The pattern of information retrieval indicates the knowledge-seeking behaviour of
individuals and groups of individuals. In this context, a searcher seeks some
information from a vast store of knowledge. An analysis and diagnosis of this
state of mind provides guidelines for the organisation of knowledge in libraries,
information retrieval systems, databases, knowledge bases and similar
environments. Such guidelines aim at compatibility between the searcher's
approach and the organisation of knowledge in a database. The current human
environment, consisting of learning, problem-solving and decision-making
situations, calls for flexibility in knowledge structures at each instance.
Developments in computer and communication technologies have made it possible
to store vast amounts of information in compact form. The variety of software
developed has also given scope for quick retrieval of information from this
store. While the speed of retrieval is valuable, it could be enriched further if the
retrieved information is readily assimilable by the information seeker. It is in this
context that presenting retrieved information in user-friendly ways calls
for cognitive modelling of information retrieval.
Such development has given rise to a field 'Cognitive Science' which is an inter-
disciplinary field drawing inputs from the fields of Psychology, Behavioural Studies,
Computer Science, Artificial Intelligence and Information Science.
Information Retrieval Models and Their Applications
d) The relevance information (often supplied by the users of the system connecting
the information requests to the stored information items).
The first information retrieval system originated with the need to organize
information in libraries. Catalogues were created to facilitate the identification
and retrieval of items. The term 'item' is used to represent the smallest complete
unit that is processed and manipulated by the system. The definition of item
varies by how a specific source treats information. A complete document, such
as a book, newspaper or magazine could be an item. Each chapter, or article may
also be defined as an item. As sources vary and systems include more complex
processing, an item may address even lower levels of abstraction such as a
contiguous passage of text or a paragraph.
In addition to the complexities in generating a query, quite often the user is not
an expert in the area that is being searched and lacks the domain-specific vocabulary
that is unique to that particular subject area. The user starts the search process
with a general concept of the information required, but does not have a focused
definition of exactly what is needed. A limited knowledge of the vocabulary
associated with a particular area along with lack of focus on exactly what
information is needed leads to use of inaccurate and in some cases misleading
search terms. Even when the user is an expert in the area being searched, the
ability to select the proper search terms is constrained by lack of knowledge of
the author's vocabulary.
There are natural obstacles to the specification of the information a user needs.
They come from ambiguities inherent in languages, limits to the user's ability to
express what information is needed, and differences between the user's vocabulary
corpus and that of the authors of the items in the database. Languages suffer from
word ambiguities, such as homographs and acronyms, that allow the same word
to have multiple meanings. Many users have trouble generating a good search
statement. The typical user has neither significant experience with, nor even
the aptitude for, Boolean logic statements. The use of Boolean logic is a legacy
from the evolution of database management systems. Multimedia adds an
additional level of complexity to search specification.
Thus, an information retrieval system must provide tools to help to overcome the
search specification problems. In particular, the search tools must assist the user
automatically and through system interaction in developing a search specification
that represents the need of the user and the writing style of diverse authors.
18.2.1 Database
On the other hand, precisely because information can be stored in a free-form,
unformatted way, electronic means of storage also make it possible to retrieve
and manipulate information in ways that are not possible in a paper-based system,
as our increasing reliance on electronic storage and retrieval systems
makes clear. The difference in power and flexibility in retrieving information
can be illustrated by comparing searches done in print indexes and in their electronic
counterparts. A search in a print index restricts the user to chosen keywords or
index terms that can be combined in relatively limited ways. A search in an
electronic database generally gives the searcher more freedom and allows the
searcher to define more precisely the hierarchical and proximity relationships
the terms should have.
Such flexibility is necessary because of the widely varying sources of, and reasons
for, information needs, which are rooted in what Belkin, Oddy and Brooks
[1982] refer to as an Anomalous State of Knowledge (ASK). These ASKs may
represent the lack of a particular fact in the information seeker's knowledge
base, they may represent a much larger area of missing information, or they may
represent a lack of knowledge structure. A small need can often be answered
with a single fact, while a much larger or more unstructured gap within one's
knowledge base may potentially be answered in a variety of ways by a number
of facts or documents. Information bases, whether databases or collections, must
be structured and organised to meet the potential needs of the people who will
use them. The means of information retrieval must therefore also be selected or
created with an understanding of what information needs will have to be met,
and how people are likely to understand and use the system.
18.2.3 Structured Query Language (SQL)
Structured Query Language (SQL) is a query language used for accessing and
modifying information in a database. Some common SQL commands include
'insert', 'update' and 'delete'. Queries take the form of a command language
that lets a user select, insert, update, find out the location of data, and so forth.
There is also a programming interface. The language was first created at IBM
in the mid-1970s and was called SEQUEL, for "Structured English Query Language".
Since then, it has undergone a number of changes, with a lot of influence from
Oracle Corporation. Today, SQL is commonly used for Web database development
and management. Though SQL is now considered to be a standard language,
there are still a number of variations of it, such as mSQL and MySQL. Many
database products such as MS-Access, SQL Server and Oracle support SQL
with proprietary extensions to the standard language. Some Information Retrieval
Systems are limited to finding those facts or documents containing characteristics
specified by the query. Such systems are often referred to as database systems,
and their query language is SQL or some variant of it recognised by the system
under consideration. SQL allows precise specification of the values of the
attributes of the items to be retrieved. For example:
SELECT * FROM courses
WHERE student_name = 'Anjali Kapoor' AND department = 'Management Science';
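The query above can be tried out with Python's built-in sqlite3 module. What follows is only a minimal sketch: the table name courses comes from the example, while the column names (student_name, department, course), the helper find_courses and the sample rows are assumptions made for illustration.

```python
import sqlite3

def find_courses(rows, student_name, department):
    """Return the course titles recorded for one student in one department."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE courses (student_name TEXT, department TEXT, course TEXT)"
    )
    conn.executemany("INSERT INTO courses VALUES (?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT course FROM courses "
        "WHERE student_name = ? AND department = ? "
        "ORDER BY course",
        (student_name, department),
    )
    result = [row[0] for row in cursor]
    conn.close()
    return result

# Hypothetical sample data, invented for this sketch
SAMPLE_ROWS = [
    ("Anjali Kapoor", "Management Science", "Operations Research"),
    ("Anjali Kapoor", "Management Science", "Marketing"),
    ("Anjali Kapoor", "Library Science", "Cataloguing"),
    ("Ravi Sharma", "Management Science", "Marketing"),
]

print(find_courses(SAMPLE_ROWS, "Anjali Kapoor", "Management Science"))
# prints ['Marketing', 'Operations Research']
```

Only rows that satisfy both conditions exactly are returned, which is the precise attribute matching the paragraph describes.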
18.3 INFORMATION RETRIEVAL TECHNIQUES
The organisation of information and the development of various techniques to
retrieve information have been major areas of research. With the development
in computer technology, interest in this area has been renewed through greater
emphasis on the computerised information retrieval systems.
Retrieval techniques can be broadly categorised into exact match, best match
and partial match techniques; the last can be divided into individual and
network techniques. These can be further broken down to accommodate specific
techniques, such as cluster, probabilistic, vector-space and so on. The technique
most widely used is exact match retrieval. It is implemented as Boolean,
full-text or string searching. Its advantages are that it is easy to understand and
use and is available on most systems. But its disadvantages are many. It misses
many relevant texts that match the query only partially; it does not rank
documents; and it does not take into account the relative importance of concepts
either within the query or within the text. A researcher who is looking
for leads, or is at the beginning of a research problem, may be satisfied with
information retrieved using only this technique. In research of a multi-dimensional
and multi-disciplinary nature, sometimes covering fringe areas, a spreading
activation technique, or a citation strategy in which the seed document
is a highly cited one, will probably be more effective. Best match
refers to a comparison of two or more candidates, and the result need not be the
same as an exact match. You may refer to Unit 17 for further details.
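The contrast between exact match and best match can be sketched in a few lines of Python. This is an illustrative toy, not a description of any particular system: the function names, the whitespace tokenisation, the overlap-count score and the sample documents are all assumptions.

```python
def exact_match(query_terms, documents):
    """Boolean AND: keep only documents that contain every query term."""
    wanted = set(query_terms)
    return [doc_id for doc_id, text in documents.items()
            if wanted <= set(text.lower().split())]

def best_match(query_terms, documents):
    """Rank documents by the number of query terms they share with the query.

    Partial matches are kept; ties are broken by document id.
    """
    wanted = set(query_terms)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(wanted & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

# Hypothetical four-document collection
DOCS = {
    "d1": "fuzzy logic in information retrieval",
    "d2": "boolean retrieval systems",
    "d3": "information seeking behaviour",
    "d4": "library cataloguing rules",
}

query = ["information", "retrieval"]
print(exact_match(query, DOCS))  # prints ['d1']
print(best_match(query, DOCS))   # prints ['d1', 'd2', 'd3']
```

The exact match misses d2 and d3, which match the query only partially, and returns its single hit unranked; the best match keeps them and orders the output.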
The matching of queries to retrieval techniques has another aspect. The question
is whether it is preferable to choose specific retrieval techniques for specific
kinds of queries, on the basis of their appropriateness, or whether a single
retrieval technique can be applied to many kinds of questions.
The data retrieval model essentially handles data. For the purpose of our understanding,
data can be taken as unprocessed information or a preliminary phase of information.
A datum is an unbiased fact which can be used to form information. For example,
we can say that the population of the city of Jaipur is eighty (80) lakhs. This is
a datum. Thus, a census system is a data retrieval system. Similarly, the National
Sample Survey Organisation and the Central Statistical Organisation can be taken to
be numerical data systems. A data retrieval model calls for an organisational structure
based on various criteria such as properties, clusters and other entities.
There is a need for a taxonomic presentation of these aspects. Such a taxonomic
presentation must also be accessible from other types of associations. A searcher
of data comes with a specific retrieval need, so the expression
of the information need should be very precise. The data retrieval model
is therefore a simple model of information retrieval that needs specific matching
techniques, viz., a taxonomic structure of the various entities involved and their properties.
a) The so-called knowledge base, or store of an accumulated set of rules for converting
information into knowledge. It also incorporates a knowledge acquisition system.
b) The second aspect of the system is the inference engine. An inference engine is
capable of deriving appropriate information from the combination of rules for
deriving synthesised knowledge. This process of deriving is based on inferential
logic using quantitative and non-quantitative techniques.
c) A user interface, i.e., a conversational process in the model which is capable
of receiving information in the conversation mode and converting it into
database signals for interaction purposes. Thus, a knowledge retrieval model
is a sophisticated model of information processing, organisation and retrieval.
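The interplay of a knowledge base and an inference engine can be illustrated with a very small forward-chaining sketch. It covers only the purely logical (non-quantitative) case, and the rule format, the function name infer and the sample facts and rules are assumptions invented for this example.

```python
def infer(facts, rules):
    """Forward chaining: apply the rules until no new fact can be derived.

    `rules` is a list of (conditions, conclusion) pairs; a rule fires when
    all of its conditions are already known.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical knowledge base of two rules
RULES = [
    (("topic:retrieval", "level:introductory"), "suggest:textbook"),
    (("suggest:textbook", "language:english"), "suggest:english-textbook"),
]

# Facts as they might be gathered through the user interface
FACTS = {"topic:retrieval", "level:introductory", "language:english"}

derived = infer(FACTS, RULES) - FACTS
print(sorted(derived))
```

The second rule can fire only after the first has added its conclusion, which is the sense in which the engine derives synthesised knowledge from a combination of rules.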
18.5 MODELS BASED ON THEORIES AND TOOLS
Based on theories and methods/tools available in other disciplines, a number of
models have been developed in order to find satisfactory solutions for information
retrieval problems.
The query process uses the weights, along with any weights assigned to terms
in the query, to determine a scalar value (rank value) used in predicting the
likelihood that an item satisfies the query. The results are presented to the user
in order of the rank value, from highest to lowest.
If weights between 0.0 and 1.0 are assigned to the query terms, they may be
interpreted as the significance that users place on each term. The value 1.0
is taken as the strict interpretation of a Boolean query. The value 0.0 is
interpreted to mean that the user places little value on the term. Under these
assumptions, a term assigned a value of 0.0 should have no effect on the retrieved
set. Thus, writing A1 for term A with weight 1.0 and B0 for term B with weight 0.0,
the queries "A1 OR B0", "A1 AND B0" and "A1 NOT B0" each return the set of
items that contain term A.
This suggests that as the weight for term B goes from 0.0 to 1.0 the resultant
set changes from the set of all items that contain term A to the set normally
generated from the Boolean operation. The process can be visualised by use of
the Venn diagram shown in Figure 18.1.
Under the strict interpretation, "A1 OR B1" would include all items that are in
all the areas of the Venn diagram. "A1 OR B0" would be only those items in A
(i.e., the white and dark shaded areas), which is everything except the items in "B
NOT A" (the grey area). Thus, as the value of query term B goes from 0.0 to
1.0, items from "B NOT A" are proportionally added until at 1.0 all of the items
have been added.
Similarly, under the strict interpretation "A1 AND B1" would include all of the
items that are in the black dotted area. "A1 AND B0" will be all of the items
in A as described above. Thus, as the value of query term B goes from 1.0 to
0.0, items will be proportionally added from "A NOT B" (white area) until at 0.0
all of the items have been added.
Finally, the strict interpretation of "A1 NOT B1" is the white area, while "A1 NOT
B0" is all of A. Thus, as the value of B goes from 1.0 to 0.0, items are proportionally
added from "A AND B" (dark shaded area) until at 0.0 all of the items have
been added.
The final issue here is the determination of which items are to be added or
dropped in interpreting the weighted values.
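The proportional behaviour described above can be sketched as follows. This is a minimal illustration, not a standard algorithm: term A is fixed at weight 1.0, and the rule for choosing which items to add (here simply the first ones in sorted order) is an assumption standing in for whatever selection criterion a real system would apply. The function name weighted_boolean is hypothetical.

```python
def weighted_boolean(op, a_items, b_items, b_weight):
    """Evaluate "A1 <op> B<w>" for a weight w of term B between 0.0 and 1.0.

    `base` is the part of the result that is present at every weight;
    `pool` holds the items that are added in proportion to `fraction`.
    """
    if not 0.0 <= b_weight <= 1.0:
        raise ValueError("b_weight must lie between 0.0 and 1.0")
    a, b = set(a_items), set(b_items)
    if op == "OR":      # grows from A (w = 0.0) to A OR B (w = 1.0)
        base, pool, fraction = a, b - a, b_weight
    elif op == "AND":   # grows from A AND B (w = 1.0) to A (w = 0.0)
        base, pool, fraction = a & b, a - b, 1.0 - b_weight
    elif op == "NOT":   # grows from A NOT B (w = 1.0) to A (w = 0.0)
        base, pool, fraction = a - b, a & b, 1.0 - b_weight
    else:
        raise ValueError("op must be 'OR', 'AND' or 'NOT'")
    count = round(fraction * len(pool))  # nearest whole number of items
    # Assumption: items are added in sorted order; a real system would
    # need a principled criterion for choosing them.
    return base | set(sorted(pool)[:count])

A = {1, 2, 3, 4}
B = {3, 4, 5, 6, 7, 8}
print(sorted(weighted_boolean("OR", A, B, 0.5)))  # prints [1, 2, 3, 4, 5, 6]
```

At the two extreme weights the function reproduces the strict Boolean result and the set of all items containing A, as the text requires.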
Fuzzy logic supports the values true and false as well as other values in between.
The concept of fuzzy logic was introduced by Professor Lotfi A. Zadeh. The
basic objective of fuzzy logic is to develop a model that is close to
natural language processing. It is an appropriate tool for modelling the kind of
uncertainty associated with vagueness and imprecision.
Fuzzy retrieval provides the capability to locate spellings of words that are similar
to the entered search term. This function is primarily used to compensate for
errors in the spelling of words. Fuzzy retrieval increases recall at the expense of
precision. In the process of expanding a query term, fuzzy retrieval
includes other terms that have similar spellings, giving more weight to words in
the database that have a similar word length and similar character positions to the
entered term. A fuzzy search on the term 'computer' would automatically include
the following words from the information database: 'computer', 'compiter',
'computter', 'compute'. An additional enhancement may look up the proposed
alternative spelling and, if it is a valid word with a different meaning, include it
in the search with a low ranking or not include it at all (e.g., 'commuter').
Systems allow the specification of the maximum number of new terms that the
expansion includes in the query.
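Query-term expansion of this kind can be approximated with Python's standard difflib module. The sketch below is an assumption-laden stand-in: difflib's similarity ratio is not the length-and-position weighting described above, and the function name expand_term, the cutoff value and the vocabulary are invented for illustration.

```python
import difflib

def expand_term(term, vocabulary, max_terms=5, cutoff=0.8):
    """Return up to `max_terms` vocabulary words whose spelling is close to `term`.

    Words are returned best match first; `cutoff` is the minimum similarity
    (between 0 and 1) a word must reach to be included.
    """
    return difflib.get_close_matches(term, vocabulary, n=max_terms, cutoff=cutoff)

# Hypothetical vocabulary drawn from an information database
VOCABULARY = ["computer", "compiter", "computter", "compute", "commuter", "library"]

print(expand_term("computer", VOCABULARY, max_terms=10))
```

The max_terms parameter plays the role of the limit on the number of terms added by the expansion. Note that a purely spelling-based measure also admits 'commuter', a valid word with a different meaning, which is why the additional dictionary check mentioned above is useful.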
Fuzzy retrieval has its maximum utilisation in a system that accepts items that
have been optical character recognised. In the OCR process, a hardcopy item
is scanned into a binary image. The OCR process is a pattern recognition process
that segments the scanned-in image into meaningful sub-regions, often treating
the area that defines a single character as a segment. The OCR process then
determines the character and translates it into an internal computer encoding.
Depending upon the original quality of the hardcopy, this process introduces errors in
recognising characters. With decent quality input, systems achieve accuracy in the
90-99 per cent range. Since these are character errors throughout the text,
fuzzy retrieval allows location of items of interest by compensating for the erroneous
characters.
The set theoretical view of information retrieval is based on the recognition that
information requests are normally formulated by choosing collections or sets of
item identifiers, or keywords. The keyword sets in turn lead to the retrieval of
record subsets chosen from among the stored collection of records. The
fundamental data of retrieval theory are provided in this view by the relations
which exist between the set of item descriptions and the corresponding record
sets.
The Vector Space Model procedure can be divided into three stages. The first
stage is document indexing, where the content-bearing terms are extracted
from the document text. Many of the words in a document, such as the, is, are,
in, to and of, do not describe its content. These are called
non-significant words or stop words. In automatic indexing, these terms are
removed from the document vector, so that the document is represented only by
the content-bearing terms. In general, 40-50% of the total number of words in
a document are stop words. These can be removed with the help of a stop word
list. The second stage is the weighting of the indexed terms to enhance retrieval
of documents relevant to the user. The last stage ranks the documents with respect
to the query according to a similarity measure.
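The three stages can be sketched end to end in Python. This is a minimal illustration under stated assumptions: the stop word list is a tiny hypothetical one, the weight of a term is simply its frequency, the similarity measure is the cosine, and all function names and sample documents are invented for the example.

```python
import math
from collections import Counter

# A very small, hypothetical stop word list
STOP_WORDS = {"the", "is", "are", "in", "to", "of", "a", "an", "and", "for"}

def index_terms(text):
    """Stage 1: extract content-bearing terms by dropping stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]

def weight_terms(terms):
    """Stage 2: weight each indexed term; here the weight is the term frequency."""
    return Counter(terms)

def cosine(u, v):
    """Cosine similarity between two term-weight vectors."""
    dot = sum(u[term] * v[term] for term in u if term in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, documents):
    """Stage 3: order documents by similarity to the query, highest first."""
    q = weight_terms(index_terms(query))
    scores = {doc_id: cosine(q, weight_terms(index_terms(text)))
              for doc_id, text in documents.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical three-document collection
DOCS = {
    "d1": "the history of kannada literature",
    "d2": "criticism of kannada literature",
    "d3": "the rules of cataloguing",
}

for doc_id, score in rank("kannada literature criticism", DOCS):
    print(doc_id, round(score, 3))
```

Frequency weighting is the simplest possible choice; other weighting schemes can be substituted in weight_terms without changing the other two stages.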
The VSM contrasts with the Boolean Retrieval Model, in which retrieval is based
on a hundred per cent (exact) match. The VSM allows retrieval of the documents
most similar to the query without requiring an exact match. The VSM can be
explained in terms of a keyword-by-document matrix (A), in which the rows
correspond to the keywords (W) in the database and the columns correspond to
the documents (D). The matrix has the form:
        D1    D2    D3    D4   ...   Dn
W1     a11   a12   a13   a14   ...   a1n
W2     a21   a22   a23   a24   ...   a2n
...
Wm     am1   am2   am3   am4   ...   amn
where the entry aij relates keyword Wi to document Dj (for example, the number
of occurrences of Wi in Dj).
The VSM is a retrieval model which constitutes a fairly large class of retrieval
methods, each consisting of an indexing method and a retrieval function. The
indexing method generates description vectors, and the retrieval function generates
retrieval status values by comparing the query description vector with the document
description vectors. A conceptual diagram of the VSM is given in Figure 18.2. The
information seeker is assumed to have an information need, which he formulates as
a query. The query q and the document dj are indexed in two steps. First,
appropriate indexing features are spotted in the query q and in the document dj.
Secondly, these features are assigned weights to obtain the query description and
the document descriptions, which are sets of weighted indexing features. These are
called the query vector and the document description vectors. The query description and
the document descriptions are matched and a score is generated for every
query-document pair. These scores are called Retrieval Status Values (RSVs). For every query,
the documents are presented to the information seeker in descending order of
these RSVs.
[Figure 18.2: Conceptual diagram of the VSM. Information Need -> Query (q) -> Query Description; Database -> Documents (dj) -> Document Description]
The VSM relies on the premise that the meaning of a document can be derived
from the document's constituent terms. Each document in a collection is
represented by a document vector whose components record the single or multiple
occurrences of each term i in document d. Similarly, a query is represented by a query vector
which denotes the number of occurrences of terms in the query. Both the
document vector and the query vector provide the locations of the objects in the
term-document space. Every vector has two common one-dimensional measures:
its length, and its angle with respect to a fixed reference. The angle
between two vectors is the measure in degrees between those two vectors.
The document vector whose direction is closest to that of the query vector is the
best choice, yielding the document most closely related to the query. Closeness is measured
by the cosine of the angle between the two vectors. If the cosine of the angle is
1, then the angle between the document vector and the query vector measures
0 degrees, meaning that the document vector and the query vector point in the same
direction. A cosine of 0 would mean that the document is unrelated to the
query. Thus, a cosine close to 1 means that the document is
closely related to the query.
[Figure: document vector and query vector in term space; labels: Kannada, Literature, Criticism]
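The cosine measure can be written out explicitly. In the notation assumed here, d and q are the document and query vectors and d_i, q_i are their weights for term i; the numbers in the worked example are hypothetical term counts over two terms.

```latex
\cos\theta
  = \frac{\mathbf{d}\cdot\mathbf{q}}{\lVert\mathbf{d}\rVert\,\lVert\mathbf{q}\rVert}
  = \frac{\sum_{i} d_i\, q_i}{\sqrt{\sum_{i} d_i^{2}}\;\sqrt{\sum_{i} q_i^{2}}}

% Worked example with hypothetical counts
\mathbf{d} = (2,\, 1), \qquad \mathbf{q} = (1,\, 1)

\cos\theta = \frac{2\cdot 1 + 1\cdot 1}{\sqrt{5}\,\sqrt{2}}
           = \frac{3}{\sqrt{10}} \approx 0.949,
\qquad \theta \approx 18.4^{\circ}
```

A cosine of about 0.95 indicates a document closely related to the query, even though the two vectors differ in length.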
Probability theory can also be used to rank and order documents according to
their probability of relevance. Robertson [1978] shows that the order of documents
can be based on term values and on an 'Optimal Retrieval Function'. However, if
one attempts to rank documents in a Boolean environment, some
difficulties arise which are inherent to Boolean logic. Bookstein [1978]
suggested that the retrieved documents be ordered according to the number of
Boolean expressions present in the document that are true.
Doszkocs [1978] suggests that probability associations can be used to find
terms that are associated with other terms. The association procedures are
based on term occurrences and the frequency of these terms in the database.
William Cooper has formulated the Probability Ranking Principle as "If a reference
retrieval system's response to each request is a ranking of the documents in the
collections in order of decreasing probability of usefulness to the user who
submitted the request, where the probabilities are estimated as accurately as
possible on the basis of whatever data has been made available to the system
for this purpose, then the overall effectiveness of the system to its users will be
the best that is obtainable on the basis of that data". In this method the system
replies to a query by presenting the beginning of a list of documents that are
ranked in descending order of scores that either represent probabilities themselves
or could be mapped to probabilities by means of an order-preserving transformation.
These scores, often called Retrieval Status Values (RSVs), depend on document
descriptions consisting of appropriate statistical information about the indexing
features. Such scores may also depend on domain-dependent parameters that are
estimated by means of additional data, e.g., a thesaurus.
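The ranking step can be illustrated with a short sketch. It assumes that probability estimates are already available (how they are estimated is outside its scope) and uses the log-odds as one example of an order-preserving transformation; the function names and the sample estimates are hypothetical.

```python
import math

def rank_by_score(scores):
    """Order document ids by decreasing score; ties are broken by document id."""
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

def log_odds(p):
    """An order-preserving transformation of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

# Hypothetical estimated probabilities of relevance for one query
ESTIMATES = {"d1": 0.20, "d2": 0.85, "d3": 0.55}

# Retrieval status values obtained by transforming the probabilities
RSVS = {doc_id: log_odds(p) for doc_id, p in ESTIMATES.items()}

print(rank_by_score(ESTIMATES))  # prints ['d2', 'd3', 'd1']
print(rank_by_score(RSVS))       # prints ['d2', 'd3', 'd1']
```

Because the transformation preserves order, ranking by RSV gives the same list as ranking by the probabilities themselves.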
The linguistic model studies information retrieval from
the point of view of the properties of language. Information retrieval is supported
by features of natural language as well as artificial language. The various ways
of storing information are essentially based on natural language. Human
communication itself is full of natural language. In short, languages carry
three types of functions:
The logical structure of a language and the taxonomy of languages refer to the
relationship between vocabulary and concepts. The vocabulary generally refers
to the logical structure. In modern times, vocabulary control also includes
thesaural control and technical glossary control. The use of transformational grammar
and of parsing techniques speeds up the processing of language for
information retrieval. Besides this, an indexing language with coordinative control
provides a basic model for information retrieval. The use of associative mathematics
in search logic and in search expression formulation provides yet another type of
language control in information retrieval. This linguistic model forms an essential
base for information retrieval. In the social sciences, language plays an ambiguous
role because the terminology of the field is not as rigorous as in the
natural sciences.
The economic model of information retrieval centres round the measures of cost
effectiveness and cost efficiency of information retrieval. These two criteria are
based on the performance of information retrieval systems in relation to input cost
as well as the number of successful outputs. The provision of multiple
access points gives a chance to measure information transfer.
The field of information retrieval has developed several models of
information measurement based on the statistical and mathematical techniques used
for studies in bibliometrics and scientometrics, and these provide scope for correlating
economic benefits. However, because of the various intangible elements in information
retrieval which cannot be identified, the economic model does not yet provide a
holistic approach to information retrieval.
On the Internet at the current time there are three classes of mechanisms
to help find information: manually generated indexes or directories, automatically
generated indexes, and web crawlers (intelligent agents). Yahoo
(http://www.yahoo.com) is an example of the first case, where information sources
(home pages) are indexed manually into a hyperlinked hierarchy. The user can
navigate through the hierarchy by expanding the hyperlink on a particular topic
to see the more detailed sub-topics. At some point the user starts to see the end
items. Lycos (http://www.lycos.com) and AltaVista (http://www.altavista.com)
automatically go out to other Internet sites and return the text at the sites for
automatic indexing. Lycos returns home pages from each site for automatic
indexing, while AltaVista indexes all of the text at a site.
Web crawlers (WebCrawler, Open Text, Pathfinder) and intelligent agents (Coriolis
Group's NetSeeker) are tools that allow a user to define items of interest; they
then automatically go to various sites on the Internet searching for the desired
information. The Uniform Resource Locator (URL) hypertext links can map to
another item or to a specific location within an item.
Self Check Exercise
4) Write down the different Information Retrieval Models based on theories and
tools.
18.6 SUMMARY
An information retrieval system (IRS) is a system that is capable of storage,
retrieval and maintenance of information. Information in this context can be
composed of text (including numeric and date data), images, audio, video and
other multimedia objects. An IRS consists of a software program that helps a
user find the information the user needs. An IRS includes a database, an
information base and Structured Query Language (SQL). A database consists of
a set of records or files. An information base consists of a set of databases. SQL
is a standard interactive and programming language for getting information from
and updating a database. Many database products such as MS-Access, SQL
Server and Oracle support SQL with proprietary extensions to the standard
language. Queries take the form of a command language that lets the user select,
insert, update, find out the location of data, and so forth. Information retrieval
systems process users' queries as well as manipulate online databases using
SQL. Thus, it is a very important area of study. Retrieval models such as the Data
Retrieval Model, the Information Retrieval Model and the Knowledge Retrieval Model
are based on input and output. The Data Retrieval Model handles data, which can be
taken as unprocessed information or a preliminary phase of information. A census
system is a data retrieval system. The Information Retrieval Model handles data oriented
to a purpose. It combines several data into a relational structure. The Knowledge
Retrieval Model assimilates several types of information in order to facilitate
decision-making and problem solving. It is a sophisticated model of information
processing and organisation. The IR models based on theories and tools try to
develop efficient IR systems.
18.7 ANSWERS TO SELF CHECK EXERCISES
4) The different information retrieval models based on theories and tools are:
• Boolean Retrieval Model (Standard and Weighted)
• Fuzzy Logic Model
• Set Theoretic Model
• Vector Space Model
• Probabilistic Retrieval Model
• Linguistic Model
• Mathematical Model
• Psychological Model
• Economic Model
• Hypertext Linkage Model
18.8 KEYWORDS
Anomalous State of Knowledge (ASK) : Lack of a particular fact in the information
seeker's knowledge base; it may also represent a much larger area of missing
information, or a lack of knowledge structure.
Fuzzy Retrieval : The capability to locate spellings of words that are similar to
the entered search term. Fuzzy retrieval increases recall and decreases precision.
In the process of expanding a query term, fuzzy retrieval includes other terms
that have similar spellings. This function is primarily used to compensate for
errors in the spelling of words.
Retrieval Status Values (RSVs) : In the retrieval process, the query description and
the document descriptions are matched and a score is generated for every
query-document pair. These scores are called Retrieval Status Values (RSVs).
Weighting : The process of assigning an importance to an index term's use in an
item. The weight represents the degree to which the concept associated with the
index term is represented in the item. In a weighted indexing system, an attempt
is made to place a value on the index term's representation of its associated
concept in the document.
18.9 REFERENCES AND FURTHER READING
Bookstein, Abraham. (1978). On the perils of merging Boolean and weighted retrieval
systems. Journal of the American Society for Information Science, 29(3), 156-8.
Brookes, Bertram C. and Griffiths, Joseph. (1978). Frequency rank distributions.
Journal of the American Society for Information Science, 27(1), 13-17.
Carter, M.B. (1986). A methodology for the economic appraisal of management
information. International Journal of Information Management, 193-203.
Doszkocs, Tamas E. (1978). AID: an Associative Interactive Dictionary of online
searching. Online Review, 2(2), 163-73.
Gopinath, M.A. (1999). Information retrieval. In: MLIS-03 course materials. Unit
13. New Delhi: Indira Gandhi National Open University.
Gopinath, M.A. (1999). ISAR systems: operations and design. In: MLIS-03 course
materials. New Delhi: Indira Gandhi National Open University.
Gopinath, M.A. (1999). Objectives of information storage and retrieval systems.
In: MLIS-03 course materials. New Delhi: Indira Gandhi National Open University.
Henry, G. and Diodato, V. (1991). The rates of assignment of thesaurus terms in the
ERIC information retrieval system: an analysis of hierarchies and levels. Journal of
Documentation, 47(3), 276-283.
Sparck Jones, Karen. (1973). Linguistics and information science. New York:
Academic.
Kemp, Alister. (1988). Knowledge base retrieval system. London: Aslib.
Levitan, K.B. (1982). Information resources as 'Goods' in the life cycle of
information production. Journal of the American Society for Information Science, 33, 44-54.
McGill, Michael J. (1978). Knowledge and information spaces: implications for
retrieval systems. Journal of the American Society for Information Science, 27(4),
205-10.
Mock, T.I. and Vasarhelyi, M.A. (1980). A synthesis of the information economics
and lens model. Journal of Accounting Research, 477-505.
Murthy, S.G.K. and Biswas, R.N. (2004). A fuzzy logic based search technique for
digital libraries. DESIDOC Bulletin of Information Technology, 24(6), 3-10.
Rijsbergen, C.J. van. (1979). Probabilistic retrieval. In: Information retrieval,
edited by C.J. van Rijsbergen. 2nd Ed. London: Butterworths.
Robertson, Stephen E. (1978). On the nature of fuzz: a diatribe. Journal of the
American Society for Information Science, 29(6), 304-7.
Salton, Gerard. (1979). Mathematics and information retrieval. Journal of
Documentation, 35(1), 1-29.
Salton, G. and McGill, M.J. (1983). Introduction to modern information retrieval.
New York: McGraw-Hill.
Smith, L.C. (1976). Artificial Intelligence in Information Retrieval System.
Information Processing and Management, 12(3), 189-222.
Wikipedia.org. (2005). Information retrieval. <http://en.wikipedia.org/wiki/information_retrieval/>.