Written Report - Indexing

Indexing is the process of analyzing the content of an information resource to determine its main concepts, and expressing these concepts in concise index entries. An index provides a systematic guide to the contents of a document through a list of headings and subheadings arranged in alphabetical order, with references showing where each topic is discussed. Key aspects of indexing include exhaustivity, specificity, and consistency in assigning index terms. Indexes are arranged alphabetically, classified by topic, or in numerical order, and can index books, periodicals, newspapers, or audiovisual materials. The purpose is to help users efficiently find relevant information within large documents or collections.

Uploaded by

GS Library
Copyright
© All Rights Reserved

INDEXING

Concepts
The word index comes from Latin and meant, according to Harper (2017), “one who points out,
discloser, discoverer, informer, forefinger (because used in pointing), pointer, sign, title, inscription,
list”. Knight (1979, 17) wrote that the Latin word had the meaning “he who, or that which, points the
way”.

According to the British Standard (BS 3700: 1964), an index is “a systematic guide to the text of any reading
matter or to the contents of other collected documentary material, comprising a series of entries,
with headings arranged in alphabetical or other chosen order and with references to show where
each item indexed is located”.

According to Taylor (2009), Indexing is the process by which the content of an information resource
is analyzed, and the “aboutness” of that item is determined and expressed in a concise manner.
An index is made up of index entries (individual records in the index). An index entry has several
elements. The basic ones include:

• Index Heading – This is a term chosen to represent in the index the item or concept derived
from the material being indexed.
• Index Subheading – This refers to the heading that is subsumed under a heading to indicate
a modifying or subordinate relationship.
• Qualifier – This is a term added to a heading, but separated from it by punctuation
(preferably parentheses), in order to distinguish the heading from homographs in the same
index.
• Scope Note – This pertains to the explanation added to a heading to clarify the range of the
subject matter encompassed, or the usage of the heading within the index.
• Locator – This element leads the user directly to the part of the document or collection
containing the information to which the index heading refers.
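
The elements listed above can be pictured as fields of a single record. The following is a minimal sketch, not a standard data model: the class name `IndexEntry`, its field names, and the page numbers are assumptions made for illustration, and the display punctuation is only one possible convention.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndexEntry:
    """One index entry; the field names are illustrative assumptions."""
    heading: str                                         # index heading
    locators: List[str] = field(default_factory=list)   # e.g. page numbers
    subheading: Optional[str] = None                     # subordinate aspect
    qualifier: Optional[str] = None                      # distinguishes homographs
    scope_note: Optional[str] = None                     # clarifies usage of the heading

    def display(self) -> str:
        # Qualifier in parentheses, subheading after a colon, locators last.
        text = self.heading
        if self.qualifier:
            text += f" ({self.qualifier})"
        if self.subheading:
            text += f": {self.subheading}"
        if self.locators:
            text += ", " + ", ".join(self.locators)
        return text


# Hypothetical entries; the locators are made up for the example.
entries = [
    IndexEntry("Security", ["112"], qualifier="Psychology"),
    IndexEntry("Security", ["45", "47-49"], qualifier="Law", subheading="collateral"),
]
ordered = sorted(entries, key=lambda e: e.display().lower())
for entry in ordered:
    print(entry.display())
```

The qualifier keeps the two homographic headings apart, so the entry for Security (Law) files ahead of the entry for Security (Psychology).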

Principles of Indexing
Exhaustivity - This principle refers to the extent to which concepts or topics are made retrievable by
means of index terms. There are two identified basic degrees of Exhaustivity.
1. Depth indexing aims to extract all the main concepts dealt with in a document, recognizing
many subthemes and subtopics. This has been traditionally practiced for the subject analysis
of parts of items (e.g. journal articles, chapters in books, etc.).
2. Summarization identifies only a dominant, over-all subject of the item, recognizing only
concepts embodied in the main theme. This is usually observed in library cataloging subject
analysis.
Figure: a diagram illustrating that summarization leads to document retrieval and depth
indexing leads to information retrieval. The line that runs halfway between them suggests that
it is possible to have a subject analysis system that falls between the two extremes.

Specificity - This principle refers to the extent to which a concept or topic in a document is
identified by a precise term in the hierarchy of its genus-species relationship. If the heading used
is parallel to the concept contained in the item and represents this concept correctly, then the
level of specificity is high.

Consistency - This refers to the extent to which agreement exists on the terms to be used to index
a given document. It requires that items on the same subject be conceptually analyzed and
translated in the same way. There are two types of consistency.

1. Inter-indexer consistency refers to the agreement between or among indexers working as a
team.
2. Intra-indexer consistency refers to the extent to which one indexer is consistent with
himself/herself.
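
Consistency can be put into numbers by comparing the term sets that two indexers (or one indexer at two different times) assign to the same document. The report does not give a formula, so the sketch below assumes one common overlap ratio: terms assigned by both, divided by all distinct terms assigned by either. The function name and sample terms are hypothetical.

```python
def indexing_consistency(terms_a, terms_b):
    """Share of distinct terms that both indexing sessions agree on (0.0 to 1.0)."""
    a = {t.strip().lower() for t in terms_a}
    b = {t.strip().lower() for t in terms_b}
    if not a and not b:
        return 1.0  # assumption: two empty term sets count as full agreement
    return len(a & b) / len(a | b)


# Hypothetical terms assigned to the same document by two indexers.
indexer_1 = ["Indexing", "Thesauri", "Information retrieval"]
indexer_2 = ["indexing", "Subject headings", "Information retrieval"]

score = indexing_consistency(indexer_1, indexer_2)
print(f"Inter-indexer consistency: {score:.0%}")
```

The same function measures intra-indexer consistency when both lists come from the same person indexing the document on two occasions.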

Indexing Purpose and Uses


The functions and uses of an index are as follows.

• An index identifies potentially relevant information in the document or collection being
indexed.
• An index analyzes concepts treated in a document so as to produce suitable index headings
based on its terminology.
• An index indicates relationships among topics.
• An index groups together information on topics scattered by arrangement of the document
or collection.
• An index organizes headings and their modifying subheadings into index entries.
• An index directs users seeking information under terms not chosen as index headings to
headings that have been chosen, by means of See references.

Types of Index

1. By Arrangement
a. Alphabetical - This index is based on the orderly principles of letters in the alphabet and
is used for the arrangement of subject headings, cross references, and qualifying terms,
as well as main headings. It is more convenient to use since it follows an order that is
familiar to any user. However, problems of synonymy and scattering may arise. Scattering
means that subcategories of a subject are not drawn together under the generic term, but
are frequently cross-referenced from the non-preferred terms to the preferred ones.
b. Classified - The classified index has its contents arranged on the basis of relation among
concepts represented by headings (e.g. hierarchy, inclusion, chronology, and other
association). Classified indexes are often based on existing classification schemes (e.g.
DDC).
c. Concordance - An alphabetical index of all the principal words appearing in a single text
or in the multi-volume work of a single author with a pointer to the precise point at which
the word occurs.
The need for indexes was first felt when the English Bible was made available to ordinary
people. This paved the way for Alexander Cruden in 1737 to prepare The Concordance of
the Bible.
A concordance is used:
• to locate a partly or completely remembered passage
• to assemble subject matter
• to compare and analyze word meaning and usage
d. Numerical/Serial Order - Included in this group are patent-number indexes (e.g. The
Numerical Patent Index of Chemical Abstracts) and table indexes.

2. By Type/Form
a. Book Index - The book index, or back-of-the-book index, is an alphabetical list of words,
or group of words at the back of the book giving a page location of the subject or name
associated with each word or group of words. A book index is prepared in order to:
• reduce the frustration of information overload
• permit a browser in a bookstore to compare books prior to purchase
• collect the different ways of wording the same concept
• provide well-worded sub-entries (rather than long strings of unanalyzed page
references)
• guide a user directly to a specific aspect of a topic
• filter information for the reader
b. Periodical Index - The periodical index is based on the same principles and has the same
general objective as a book index but has a broader scope. Periodical indexes are open-
ended projects usually performed by a group of people. Each issue of a periodical may
deal with unrelated topics by several authors, written in different styles and aimed at
different users.
The following table summarizes the major distinctions between the book and periodical
indexes…
c. Newspaper Index - This index follows the same principles and objectives as the previous
index types, but it faces some problems of its own.
• A newspaper article may contain names, places, or even subjects that may not
occur again (problem in vocabulary control).
• Multiple editions that some newspapers tend to have may cause some stories to
be added, dropped, or shifted to other pages.
d. Audiovisual Materials Index –
• Multimedia Sources - textual labeling is needed along with image matching.
• Sound Databases - these usually rely on neural networks for retrieval, with indexes created
automatically. They usually feature sound browsers which allow fuzzy searches
on audio databases.
e. Card Index - a catalog or similar collection of information in which each item is entered
on a separate card and the cards are arranged in a particular order, typically alphabetical.
(e.g. Card Catalog)
f. Printed Index - Printed indexes (e.g. indexes in printed book or serial formats) contain
the indexer's markings on the items. They are constructed through the use
of bibliographic worksheets.
g. Microform Index - Many microform collections are accompanied by a paper guide that
aids in scholarship by acting as an index, often by author, title, or subject.
h. Computerized Index - The intervention of computers in indexing can either be automated
or computer-assisted. In automated indexing, the computer is left to construct the index
without human intervention. In computer-assisted indexing, humans are responsible for
the intellectual part of the task while the mundane work is done by computers (e.g.
sorting, organizing, etc.).

Indexes can also be categorized by the type of index headings they contain. These include the
following:

• Subject Index - This index provides access to the topics treated in documents and/or features
of documentary units (e.g. genre, format, etc.). Index subject headings are arranged
alphabetically or in other systematic order.
• Author Index - This index provides access to information on documents cited by the author's
name in the indexed document, or it lists documents distinguished by author's name in the
indexed collection.
• Name Index - The name index provides access to names contained in documents, whether of
persons, organizations, or other animate or inanimate objects which are identified by a
proper name.

Indexing Language
An indexing language is a system of naming or identifying subjects contained in a document. Like
languages used in daily living, it also serves as a tool for communication, a means of expressing feeling
or thoughts and is a method of combining a group of words or word-like symbols so that they can be
understood by daily users. In indexing, it is used for the representation of topics and features of a
documentary unit and for the retrieval of documentary units from an information-retrieval system.
1. Natural Language – also called a ‘derived-term system’ or ‘indexing by extraction’
- This type of language uses significant terms or words occurring in the text, as they are, as index
entries. Words extracted from the text for indexing purposes
are often called keywords.
o Natural language tends to improve recall because it provides more access points
but reduces precision.
o In natural language, redundancy is greater.
o Natural language uses more current terms.
o Natural language tends to be favored by subject specialists or the end user.
2. Controlled Vocabulary/Language - Controlled vocabulary makes use of authority lists that
enable an indexer to establish a standard description for each concept and use that
description each time it is appropriate. It serves several purposes.
o It controls synonyms by choosing one form as the standard term.
o It makes distinctions among homographs. E.g. Security (Law); Security
(Psychology)
o It establishes the size or scope (e.g. whether the word baseball would include
softball).
o It usually records hierarchical and affinitive/associative relations.
o It controls variant spellings.
- Controlled vocabulary uses several syndetic devices.
- There is a possibility that the controlled vocabulary may be inadequate. The indexer and
the searcher are limited to the terminology used, to the scope of each content (term or
notation), and to the structure of the existing system.
- Controlled vocabulary takes two (2) basic forms:
1) Authority List/Subject Heading List - This is an alphabetical list of subject headings
with cross references from non-preferred terms and headings to preferred ones, and
linking devices between related terms and headings. It often includes separate
sequences of standardized subheadings that may be combined with subject headings.
Rules for applying subheadings usually accompany the list. A subject heading is used
primarily to index textual, book-length documents, with one or two terms that
capture what the document is all about.
Examples of subject heading lists are the following.
- Library of Congress Subject Headings (LCSH) - LCSH is used in
conjunction with the Subject Cataloging Manual: Subject Headings, a
document that contains policies and practices of the Library of Congress.
LCSH is updated continuously. Electronic updates are available via
subscriptions to Cataloger's Desktop and through various bibliographic
utilities such as RLIN and OCLC.
- Sears List of Subject Headings - This list is intended for small collections
used by persons with general needs. Its main users are public and school
libraries. It is also continuously updated (updates are available in
electronic form).
- Medical Subject Headings (MeSH) - This list is used to provide subject
access points on every bibliographic record created at the National
Library of Medicine. In its printed version, MeSH comprises three (3)
volumes: a hierarchical listing, an alphabetically arranged volume
that includes scope notes, and a permuted alphabetical listing in
which every word of a phrase is brought into the lead position.
2) Thesaurus - The term thesaurus comes from Latin and means "treasure". It is
used to control the indexing vocabulary of one subject or field of interest, ranging from
Agriculture to Vocational Training and to the European Communities. It is a
controlled indexing vocabulary arranged in a known order and structured
so that equivalence, homographic, hierarchical, and associative relationships
among terms are displayed clearly and identified by standardized
relationship indicators that are employed reciprocally. More than the subject heading
list, a thesaurus is based on terms and concepts that appear in the actual text of
the documents being indexed. A thesaurus aims to promote consistency in the indexing
of documents, predominantly for post-coordinate information retrieval systems, and
to facilitate searching by linking lead-in terms with descriptors.
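
The way a controlled vocabulary steers synonyms and variant spellings toward one preferred heading can be sketched in a few lines. This is a minimal illustration, not an extract from LCSH, Sears, or MeSH: the `PREFERRED` and `RELATED` tables and both function names are hypothetical.

```python
# Hypothetical authority list: entry term (lower case) -> preferred heading.
PREFERRED = {
    "automobiles": "Cars",   # synonym control
    "motor cars": "Cars",
    "cars": "Cars",
    "colour": "Color",       # variant spelling control
    "color": "Color",
}

# Hypothetical associative links between preferred headings.
RELATED = {"Cars": ["Trucks"], "Trucks": ["Cars"]}


def translate(term):
    """Return the preferred heading for a term, or None if it is not listed."""
    key = " ".join(term.lower().split())  # normalize case and spacing
    return PREFERRED.get(key)


def see_reference(term):
    """Build a See / See also style reference line for a searcher's term."""
    key = " ".join(term.lower().split())
    preferred = translate(term)
    if preferred is None:
        return f"{term}: not in the controlled vocabulary"
    if preferred.lower() == key:
        related = RELATED.get(preferred, [])
        suffix = f" (See also {', '.join(related)})" if related else ""
        return f"{preferred}{suffix}"
    return f"{term}. See {preferred}"


print(see_reference("Automobiles"))
print(see_reference("Cars"))
```

A term missing from the table returns `None`, which mirrors the limitation noted above: indexer and searcher are both restricted to the terminology the list actually contains.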

Levels of Indexing

1. Concordance (First Level) - The first level is known as the concordance, which consists of all
references to all words in the original text arranged in alphabetical order.
2. Information Theoretic Level (Second Level) - The second level is the information theoretic level,
which calculates the likelihood of a word being chosen for indexing based on its frequency of
occurrence in a given text document.
3. Linguistic Level (Third Level) - The third level is the linguistic level, which attempts to explain how
meaningful words are extracted from large units of text. (Some indexers have proposed that opening
paragraphs, chapters, etc. are good sources for choosing indexing terms.)
4. Textual/Skeletal Framework (Fourth Level) - The textual or skeletal framework is the fourth level.
Here the text is prepared by the author in an organized manner and held together by a skeletal
structure. The onus therefore lies on the indexer to identify the skeleton and markers that will
determine the content of the given text.
5. Inferential Level (Fifth Level) - The fifth level of indexing theory is the inferential level. An indexer
should be able to make inferences about the relationships between words and phrases by
understanding the sentence structure.
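
The first two levels lend themselves to a small sketch: a concordance that records where each word occurs, and a frequency count that suggests candidate index terms. The stop-word list, function names, and sample sentences are assumptions for illustration, and raw frequency is used here only as a simple stand-in for the likelihood calculation the second level describes.

```python
import re
from collections import Counter, defaultdict

# Assumed stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "by"}


def build_concordance(lines):
    """Map each non-stop word to the line numbers where it occurs, alphabetically."""
    concordance = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z]+", line.lower()):
            if word not in STOP_WORDS and number not in concordance[word]:
                concordance[word].append(number)
    return dict(sorted(concordance.items()))


def candidate_terms(lines, top_n=3):
    """Rank non-stop words by how often they occur in the text."""
    counts = Counter(
        word
        for line in lines
        for word in re.findall(r"[a-z]+", line.lower())
        if word not in STOP_WORDS
    )
    return counts.most_common(top_n)


# Hypothetical three-line text.
text = [
    "An index is a guide to the text",
    "The index entry leads the user to the text",
    "A locator is part of an index entry",
]

concordance = build_concordance(text)
for word, line_numbers in concordance.items():
    print(word, line_numbers)
print(candidate_terms(text))
```

The later levels (linguistic, skeletal, inferential) depend on judgment about structure and meaning, which a word count like this does not capture.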

Indexing Process

FIRST STAGE (Familiarization) - The first stage in indexing a document is to gain a general idea of the
document by going through the title, preface, foreword, contents
pages and possibly the introduction. One can also flip through the
text and do some spot reading. This gives the indexer
sufficient familiarity with the document; hence this stage is
called the familiarization stage. The indexer wants to know
what the document is about by identifying concepts that are
conveyed by words and phrases in the document, examining
the title, abstract, preface, introduction, chapter headings,
major headings, sub-headings, etc. It is important that the
indexer takes into account the needs of the users.
SECOND STAGE (Analysis) - The second stage, which is the analysis, involves the indexer
using his/her intellectual judgment to identify the concepts the
book has treated. Sometimes the indexer may use the exact
term used by the author, or he/she might formulate an appropriate
term. These terms are intended to accurately describe the
whole document. The indexer at this stage is doing what is
referred to as subject analysis or concept analysis.

This is where the subject background of the indexer comes into
play, especially if he/she has a sufficient subject background in
the field of the document. At this stage, the terms identified by the indexer
are what he/she judges to be the terms that represent the
totality of the document. In a situation where the use of
terminology is controlled, the indexer cannot use these terms
directly as index terms or access points. Rather, the terms identified
have to be translated into the indexing language used by the
system, which is the language used by both the indexer and the
searcher in an information storage and retrieval process. This
language exercises some control over which terms are used as
index terms.
THIRD STAGE (Translation) - During this stage, the indexer assigns subject descriptors
chosen from the controlled language that the users of the
discipline are familiar with. This stage is called the translation
stage. However, in a setting where there is no need to exercise
control over the terminology of the system, such as a
back-of-the-book index or computerized indexes, this last stage may not
be necessary.

However, in determining indexing policies, certain features of
indexing have to be explained. These include depth of indexing,
specificity, exhaustivity, weighting, etc. Depth of indexing
involves selecting a large number of topics from a document,
that is, making as many of the important topics treated in a
document as possible into index terms for the document. Specificity involves
selecting only terms that are specific to the document, that is,
terms that entirely cover the document.

Indexing Process (Journal Articles)

1. Read carefully and understand the title.
2. Read the introduction down to the point where the author states the purpose of his article
and correlate it with the title. Absorb but do not necessarily attempt to index the introductory
material since this is usually a statement of known facts upon which the present study is
based.
3. Scan the body of the article, focusing on the Materials & Methods section and the Results section.
4. Note section headings, paragraph headings; italics, boldface; charts, plates, tables,
illustrations; laboratory methods, case reports, etc. Headings supplied by the author usually
herald the content of the section headed.
5. Select for indexing only those subjects actually discussed as opposed to those subjects merely
mentioned (and of little or no value in retrieval).
6. Read the summary or conclusions of the author to determine whether he achieved the aims
set forth in his stated purpose. Weigh conclusions based on the text but do not index
implications or suggested future applications. Do not index conclusive statements not
supported by discussion in the text.
7. Scan the abstract, if there, for items missed in indexing, being careful, however, to locate
actual discussions within the text of the article; ignore mere implications.
8. Scan the author's own indexing if supplied or the keywords supplied by the publisher to see
whether the concepts chosen are actually discussed in the text and if they have been indexed.
9. Scan the bibliographic references supplied by the author for clues and further corroboration.

Index Evaluation

All types of indexes that are produced should be evaluated to determine their effectiveness in terms
of how many documents that contain a particular term can be retrieved from a system and the
relevance of the documents retrieved.
The effectiveness of an indexing system is controlled by two parameters, called indexing exhaustivity
and term specificity.

• Indexing Exhaustivity
This means the degree to which the subject matter of a given document has been reflected
through the index entries. An exhaustive indexing system is thus supposed to represent the
contents of the input documents fully. To attain this objective, however, the system has to
select as many keywords as possible to represent the ideas put forward in the document. In a
non-exhaustive system, only a few keywords are chosen, which gives a broad representation
of the subject.
• Term Specificity
This refers to how broad or how specific the terms or keywords chosen in a given situation are.
The more specific the terms, the better the representation of the subject through the index
entry.

1. Recall Measure/Ratio
A quantitative measure used to determine the ability of an index to aid in retrieving
documents containing information on a particular request from the collection of documents
held in a library or an information center. It refers to the proportion of relevant
materials retrieved by a system, and can be represented thus:

$$\text{Recall} = \frac{\text{No. of relevant documents retrieved}}{\text{No. of relevant documents in the collection}}$$

2. Precision Measure/Ratio
A quantitative ratio of the number of relevant documents retrieved to the total number of
documents retrieved. Relevance or precision depends on the terminology of the text being
indexed and the specificity of the indexing language used.

$$\text{Precision} = \frac{\text{No. of relevant documents retrieved}}{\text{Total number of documents retrieved}}$$

These parameters are expressed in percentage terms, which means that both recall and
precision may vary between 0 and 100%.
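
The two ratios can be worked through on a small example. The sketch below follows the definitions given above; the function names and document identifiers are hypothetical.

```python
def recall(retrieved, relevant):
    """Relevant documents retrieved / relevant documents in the collection."""
    relevant = set(relevant)
    if not relevant:
        return 0.0  # assumption: report 0 when nothing in the collection is relevant
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Relevant documents retrieved / total documents retrieved."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0  # assumption: report 0 when nothing is retrieved
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical search: 5 relevant documents exist, 4 are retrieved, 3 of them relevant.
relevant_docs = {"d1", "d2", "d3", "d4", "d5"}
retrieved_docs = {"d1", "d2", "d3", "d8"}

r = recall(retrieved_docs, relevant_docs)
p = precision(retrieved_docs, relevant_docs)
print(f"Recall: {r:.0%}")
print(f"Precision: {p:.0%}")
```

Here three of the five relevant documents are found (recall 60%), and three of the four documents retrieved are relevant (precision 75%).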

There is therefore generally an inverse relationship between recall ratio and precision
ratio: the higher the recall ratio, the lower the precision ratio, and vice versa. The more
documents that are recalled, the less precise the indexing system tends to be, and the fewer
documents that are recalled, the more precise the indexing system tends to be. Thus, the indexer must
ensure a fair balance of recall ratio and precision ratio. A recall ratio of about 70% and a
precision ratio of about 60% may therefore be expected.

It should be noted that specificity and exhaustivity influence the recall and precision
ratios. When the indexing policy of a library or an indexing agency supports exhaustivity,
the result is a high recall of documents and a low precision; that is, most of the
documents recalled would not be relevant. By increasing the number of keywords during a
search process, we may choose subjects that are only very narrowly discussed
in the given documents. In other words, the system will retrieve a large number of non-relevant
documents, thereby increasing recall but reducing precision.

On the other hand, when an indexing agency supports specificity, the recall of documents
would be low, but the precision would be high, as only documents that are relevant to the user
would have been recalled. Specificity promotes low recall and high precision, while
exhaustivity promotes high recall and low precision. For example, if we are looking for
information on “Internet”, we can use related terms such as “net”, “information superhighway”,
“World Wide Web”, etc. for the search process. By doing this, we may retrieve more
information and increase the possibility of a higher number of relevant items. This
shows that by making the search more exhaustive, we tend to get a higher recall.
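
The "Internet" example can be simulated on a toy collection to show the trade-off. Everything in this sketch is made up for illustration: the six documents, their index terms, and the set judged relevant are assumptions, not data from any real system.

```python
# Hypothetical mini-collection: document id -> assigned index terms.
COLLECTION = {
    "d1": {"internet", "networks"},
    "d2": {"world wide web", "browsers"},
    "d3": {"information superhighway", "policy"},
    "d4": {"net", "fishing"},   # homograph: not about the Internet
    "d5": {"net", "internet"},
    "d6": {"libraries"},
}
RELEVANT = {"d1", "d2", "d3", "d5"}  # documents assumed to be about the Internet


def search(query_terms):
    """Retrieve every document indexed under at least one query term."""
    terms = set(query_terms)
    return {doc for doc, index_terms in COLLECTION.items() if index_terms & terms}


def recall_precision(retrieved):
    """Return (recall, precision) for a set of retrieved document ids."""
    hits = len(retrieved & RELEVANT)
    recall = hits / len(RELEVANT)
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


narrow = search(["internet"])
broad = search(["internet", "net", "information superhighway", "world wide web"])

print("narrow:", recall_precision(narrow))
print("broad:", recall_precision(broad))
```

The single-term search finds only half of the relevant documents but retrieves nothing irrelevant. The expanded search finds all of them, at the cost of also pulling in the unrelated document indexed under the homograph "net".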

Applications of Indexing
A. Book Indexing Procedure
1. Examine the text carefully.
2. Read the text several times, page by page, to be able to analyze the contents and
determine the indexable topics.
3. Select the topics to be indexed taking into consideration their significance to the central
theme of the book.
4. Name the topics that were chosen to be indexed and mark up page proofs.
5. Alphabetize the entries.
6. Edit the entries.
• Decide which entries should be the main headings and which should be the
subheadings.
• Decide whether certain entries will be treated as main entries or subentries.
• Main entries unmodified by subentries should not be followed by long rows of page
numbers.
• Subentries must be concise and informative.
• Make a final choice among synonymous terms.
• Provide adequate but not excessive cross-referencing.
Example:

Cars
   Chevrolet, 224
   Mazda, 146
   Volkswagen, 25
   See also Trucks

Trucks
   Dodge Ram, 219
   GMC (Jimmy), 143
   Mercedes-Benz, 144
   See also Cars

• Punctuation
• The inversion of a phrase used as a heading in a main entry is punctuated by
a comma.
• If the heading is followed immediately by page references, a comma is used
between the heading and the first numeral and between subsequent
numerals.
• If the heading is followed immediately by run-in subentries, a colon precedes
the first subheading. All subsequent subentries are preceded by semicolons.

For example:
payments, balance of: definition of, 16;
importance of, 19

7. Determine the design of the index after the compilation of the entries.
• Decide whether subentries will follow an indented or run-in style.
• Typography should be used to differentiate between types of headings and to
distinguish them from numerals indicating volumes, parts and pages.
• Typing, proofreading, and final review.
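
The alphabetizing step and the punctuation rules for run-in and indented subentries can be sketched as small formatting helpers. The function names are hypothetical, and letter-by-letter, case-insensitive sorting is an assumption, since the procedure above does not say which filing rule to use.

```python
def format_simple(heading, pages):
    """Heading followed directly by locators: commas throughout."""
    return f"{heading}, {', '.join(pages)}"


def format_run_in(heading, subentries):
    """Run-in style: a colon before the first subentry, semicolons between the rest."""
    parts = [f"{sub}, {', '.join(pages)}" for sub, pages in subentries]
    return f"{heading}: " + "; ".join(parts)


def format_indented(heading, subentries, indent="   "):
    """Indented style: each subentry on its own line under the heading."""
    lines = [heading]
    lines += [f"{indent}{sub}, {', '.join(pages)}" for sub, pages in subentries]
    return "\n".join(lines)


def alphabetize(headings):
    """Case-insensitive alphabetical order (an assumed filing rule)."""
    return sorted(headings, key=str.casefold)


run_in = format_run_in(
    "payments, balance of",
    [("definition of", ["16"]), ("importance of", ["19"])],
)
indented = format_indented(
    "Cars", [("Chevrolet", ["224"]), ("Mazda", ["146"]), ("Volkswagen", ["25"])]
)
print(run_in)
print(indented)
print(alphabetize(["Trucks", "cars", "Mazda"]))
```

The run-in call reproduces the "payments, balance of" example given in the punctuation rules, and the indented call reproduces the layout of the Cars example.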

B. Periodical Indexing
Periodicals play a critical role in information centres since they convey the most up-to-date
information on developments in the users' areas of expertise.
According to Matanji (2012), there are two types of periodical indexes: indexes to
a single journal and indexes to several journals. Very often, the editors of a journal will
issue an index at the end of the volume. The following guidelines apply to periodical indexing.
1. Always index names of persons honored by awards or prizes and those eulogized in
obituaries.
2. Every article that has permanent value should be indexed under all topics and issues
dealt with.
3. Editorials should be indexed under their topics as any other article but differentiated
from the others by the addition of (Ed.) or (E). The titles of editorials may be indexed
under a collective heading "Editorials".
4. Letters to the editor, if considered indexable, should be indexed by topic, not under a
caption that may have been assigned by the editor. It is advisable to index at least the
name of the person who criticized an article as well as the author's response.
e.g. Doe, John. "Effect of magnetic fields" 37-43
Errors (H. Smith) 75; correction 185
[author's entry]
Smith, Henry. "Effect of magnetic fields"
(John Doe pp. 37-43): errors 75
[letter writer's index entry]
5. Book reviews are indexed by the title of the book, followed by the name of the author,
the locator, and the designation (R) unless all book reviews are listed under the class
heading "Book Reviews" or in a separate index.
e.g. Guide to reference books, 10th ed. (Sheehy) 68 (R)
*The name of the reviewer should be included in the author name index,
e.g. Dixon, Geoffrey 68 (R), 92-96, 123

C. Multi-media and Image Indexing


• Indexing verbal audio and visual text is not very different from document indexing once the
speech or text images have been converted to text using appropriate recognition
software.
• When analyzing multimedia content, it is important to note that recognition software
can be used to retrieve information on the type of content as well as the content itself.
• Image indexing is a concept that is easy to understand in theory, but is very different
from standard text indexing, due to the nature of images. Image indexing includes
obtaining, editing, storing, and retrieving visual images (Cleveland & Cleveland,
2013).
In multi-media indexes, textual labeling is needed (index terms or
descriptive narrative) along with image matching. Thus, a search on words (e.g. battle,
attack, fight) might retrieve an image of a particular type of scene, and this in turn could
be used as input to find others like it.
Content-based indexing can be used to generate terms for the color, texture and basic
spatial attributes of images. Image searchers use textual descriptor search terms that
require human description-based indexing of the semantic attributes of images.
D. Web Indexing
‘Web indexing’ refers to:
• search engine indexing of the Web,
• creation of metadata,
• organization of Web links by category, and
• creation of a website index that looks and functions like a back-of-book index. It will
usually be alphabetically organized, give detailed access to information, and contain
index entries with subdivisions and cross references.

Web indexing means providing access points for online information materials that are
available through World Wide Web browsing software.
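
The first sense of Web indexing, search engine indexing, is commonly built on an inverted index that maps each word to the pages containing it. The sketch below is a simplified illustration of that idea, not a description of any particular search engine; the page URLs, page texts, and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical pages: URL -> page text.
PAGES = {
    "https://example.org/cataloging": "Cataloging rules and subject headings",
    "https://example.org/indexing": "Indexing rules for periodicals and books",
    "https://example.org/thesauri": "Thesauri and subject headings",
}


def build_inverted_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the URLs that contain every word of the query, in sorted order."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return []
    results = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(results)


web_index = build_inverted_index(PAGES)
print(lookup(web_index, "subject headings"))
print(lookup(web_index, "rules books"))
```

Each URL acts as the locator, in the same way a page number does in a back-of-book index.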

REFERENCES
https://www.slideshare.net/ImeAmorMortel/indexing-10954481
https://www.wattpad.com/243583202-library-science-reviewer-unit-14-indexing/page/15
https://www.mlsu.ac.in/econtents/413_Indexing%20techniques%20and%20process.pdf
https://www.lisedunetwork.com/indexing-principles-and-process/
https://digital.library.unt.edu/ark:/67531/metadc1164546/m2/1/high_res_d/Revisiting_Indexing_and_Abstracting_in_the_Digital_Era.pdf
https://www.nlm.nih.gov/bsd/indexing/training/TIP_010.html
