Author(s): Winfried Gödert, Jessica Hubrich, Matthias Nagelschmidt
ISBN(s): 9783110329704, 3110329700
Edition: Digital original
File Details: PDF, 7.75 MB
Year: 2014
Language: english
Winfried Gödert, Jessica Hubrich, Matthias Nagelschmidt
Semantic Knowledge Representation for Information Retrieval
This work has been published with the financial support of the Cologne University of Applied
Sciences.
ISBN 978-3-11-030477-0
e-ISBN 978-3-11-032970-4
Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
© 2014 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: Michael Peschke, Berlin
Cover image: bentaboe/iStock/Thinkstock
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
♾ Printed on acid-free paper
Printed in Germany
www.degruyter.com
Preface
An information seeker – in our context usually referred to as user or end user
of search interfaces of collections of information resources like online libraries,
domain-specific databases, or the World Wide Web – thinks of something he or
she wants to find in a collection. “Something” may be of a very specific or of a
very vague kind. Search operations are always designed with the intention to rec-
oncile as far as possible these individual conceptualizations of a person’s search
interests with the represented conceptualizations of stored indexing data. The
retrieval success highly depends on a suitable correspondence between these
two components. Information seekers commonly express their search interests in
words that they think best capture the intended meaning and thus promise the
best possible retrieval results. The words used either comply with semantically
controlled terms of an indexing language or constitute free-text tokens. They
reflect conceptual ideas whose meaning does not manifest itself in isolated con-
cepts as it includes a time-dependent context and semantic relations to other
concepts. Although recent trends explore semantic relations with statistical
and linguistic methods, there are reasons for cognitively analyzing the context
as well, to represent it adequately and thereby to provide additional support for
automated processes. This is particularly true for information systems that are
designed to facilitate knowledge exploration and searching in semantic context
by inference or reasoning processes.
Historically, there are at least two essential approaches for representing
semantic connections between entities of artificial languages: on the one hand
indexing languages that are used for representing the content of information
resources, on the other hand knowledge representation systems that are used for
machine-based knowledge exploration. Combining both approaches might sig-
nificantly improve the efficiency of subject-oriented search processes.
Within the framework of document indexing, extensive methods have been
developed for representing concepts as elements of controlled indexing languages
and using them as tools for retrieval processes. Indexing languages represent
common knowledge – or more precisely, extracts of common or specialist knowl-
edge – in a standardized manner and provide terminological building blocks for
subject indexing. As connectors between specific knowledge and corresponding
information spaces, they significantly improve thematic access to documents
described in the form of bibliographic data in a way that other systems cannot match.
Modeled conceptual structures reflect familiar knowledge contexts that are pri-
marily processed for cognitive interpretation. They point to connections informa-
tion seekers are possibly not aware of but that might nevertheless have a positive
impact on the success of the search process if proposed to them. Frequently, the
resources of interest are indexed by headings that are not the first wordings the
seeker thinks of, and positive retrieval results are obtained only by offering such
headings to the seeker as additional vocabulary. Traditionally, such relationships
are not regarded as tools for machine-supported analysis. Therefore, they are not
sufficiently formalized for automatic reasoning processes. Usually, attributes or
properties justifying a particular relation between two concepts are not stated
explicitly. Relational structures commonly make use of a rather small set of rela-
tionships that is not expressive and does not allow making precise, differentiated
statements about semantic connections. So far, it is mainly theoretical proposals
that give valuable hints for creating an adequate inventory of specified relation types;
there have been only very few attempts at practical realization.
In the context of artificial intelligence, systems for knowledge representa-
tion have been developed that focus on formal considerations and techniques for
modeling knowledge, neglecting issues of indexing and retrieval of documents.
They primarily aim at enabling machine processing and especially at drawing
inferences on the formalized knowledge level. Expert and diagnostic systems are
examples of this. Document indexing and retrieval are considered in the
context of special applications, if at all. In general, existing indexing languages
are not included; tools for knowledge representation are rather newly created or
recreated.
The conception of a Semantic Web marks a new step of development. It is
proposed that distributed data resources should be technically combined, and
it is envisioned that appropriate ontological representation and linking of distributed
resources could generate an additional semantic value from which thematic
search processes could benefit enormously. Indeed, some retrieval
tests have already provided empirical evidence that ontology-based search
processes perform better than keyword-based searches. However,
it is not yet clear how subject indexing and document retrieval can benefit from
these visionary and technological impulses and what appropriate strategies for
realization could look like. These questions are far from being trivial. This is
reflected by the fact that the focus has shifted from Semantic Web to Linked Data
applications. These intend to achieve added semantic value by merely connect-
ing existing data reservoirs, making them technically interoperable. Combining
cognitive and mechanical interpretation of semantic data for improving retrieval
efficiency and retrieval results lies outside the interest of such projects. Yet, a
semantic space that is cognitively and at the same time machine-interpretable
and that brings together different existing and newly created resources for the
benefit of knowledge acquisition and information retrieval is the most challeng-
ing idea connected with the Semantic Web. In such a space, information seekers
could formulate their cognitive interests and automated tools would subsequently
provide additional support that would lead to an improved search success.
When designing improved search environments, it is important to ensure
that content-descriptive terms of different systems are exchangeable, i.e., that semantic
entities are interoperable. Suitable models of semantic interoperability would
support both switching between different indexing languages and combining
entities of more than one indexing language to execute thematic queries. If
valid conclusions are to be reached on the conceptual level, it is essential
to consider not only mechanical interoperability and string matching but
also the semantic content of entities and the relational structure of the respective
indexing languages.
The final stage may be characterized as ontology-based indexing and
retrieval with respect to semantic interoperability in heterogeneous environments.
Combining the methodological approaches with the semantic representation
standards of the Semantic Web provides the opportunity to break away from
proprietary application contexts. Already developed knowledge structures can
be used for or shared with other applications in the sense of a content-oriented
semantic interoperability.
The main character of this book can be described as twofold. First, it gives a
state-of-the-art report with regard to the mentioned issues. It presents a frame-
work for interconnecting the described two strands of development and shows
how they can benefit from each other. In particular, it is discussed how document
retrieval and search results can be improved based on an expanded set of differ-
entiated semantic relation types that allow for drawing machine inferences along
the relational structure. Second, it contains proposals on the extent to which existing
indexing languages can be used and on the requirements that have to be met to develop
them further into knowledge representations that both permit the
conceptual interpretation of their elements and support formal inferences for
the design of advanced retrieval environments.
This part of the book is based on two projects that were conducted at the
Cologne University of Applied Sciences during the years 2006 to 2011: CrissCross
and Reseda. The CrissCross project was financially supported by the Deutsche
Forschungsgemeinschaft (German Research Foundation) and was executed in
cooperation with the German National Library. It aimed at creating a multilingual,
thesaurus-based and user-friendly research vocabulary that facilitates research
in heterogeneously indexed collections. To achieve this aim the subject headings
of the German subject headings authority file Schlagwortnormdatei (SWD) were
mapped to notations of the Dewey Decimal Classification, i.e., its German version
(DDC Deutsch). Within its framework, the German National Library also linked
SWD headings to their equivalents in the Library of Congress Subject Headings
(LCSH) and the French indexing vocabulary Rameau, thus contributing to the
MACS project. The results of the project became part of the Linked Data service of
the German National Library.
The experiences and expertise gained in the CrissCross project were utilized
within the second project, Reseda - Representational models for semantic data.
This project was made possible by the financial support of the Cologne Univer-
sity of Applied Sciences. Its focus was on designing, developing and improving
models and frameworks for the representation of semantic information in knowl-
edge organization systems. The project’s aim was to explore strategies for pre-
cisely specifying the semantic content and characteristics of concepts and the
semantic relations between these concepts in indexing languages and other
knowledge organization systems, thereby augmenting the semantic richness and
expressivity of these vocabularies for machine support within retrieval scenarios.
Many results of this project form the basis of this book.
The starting point and target of all considerations presented in this book are
processes of information retrieval for subject content, viz. automatic and cognitive
strategies to explore knowledge or to facilitate access to information.
An introductory chapter describes the problems and the objectives
for solutions; technical details of the subsequent discussion are not anticipated
there. From the perspective of the authors, the selected sample environment is
particularly well suited to this objective. For the subsequent discussion, however, it is
not of substantive importance. The focus of the considerations is always on general
problems and solutions. All of the examples in the book are designed to support
abstract considerations or to illustrate general methods. None of the displayed
methods is designed for a specific example of the sample environment alone.
After this introduction, the text is divided into three parts, each describing a
stage for the development of a concept that we call an “ontology-based model for
indexing and retrieval”. The first part reports state-of-the-art essentials of knowl-
edge organization, indexing principles, and paradigms of information retrieval.
Essential characteristics of semantic technologies for knowledge representation
are introduced in Chapter 3. The basic features of web-specific representation
languages for semantic content are sketched as far as they are of special interest
for our context. Besides XML, RDF, and OWL, application-specific representation
languages are described. Chapter 4 discusses different levels of semantic expres-
sivity in search processes and how the resulting requirements can be supported
by combining features of indexing results and retrieval environments. Limita-
tions indexing languages face in view of multilingual and heterogeneous infor-
mation spaces are also outlined.
Part B presents in its first chapter various approaches for handling hetero-
geneity in indexing and retrieval, including citation pearl growing, multilingual
indexing languages, and vocabulary linking. Design and outcomes of several
projects are presented. It is questioned whether these approaches can be seen as
possible solutions for a treatment of heterogeneity that human beings can interpret
and that at the same time promotes machine-supported inferences. The latter
aspect gives rise to continuing the discussion in Chapter 6 with a more detailed
analysis of the problems that must be taken into account if heterogeneity is to
be resolved by methods of semantic interoperability. It is clarified how semantic
interoperability should be understood for indexing and retrieval purposes and
how to combine this understanding with a model for conceptual knowledge
representation by entities and improved relational structures. Conditions under
which entities of different indexing languages can be viewed as semantically
interoperable are derived as requirements for the following discussion.
The third part presents in four chapters the components of our understanding of
a model for ontology-based indexing and retrieval by combining the established
methods of indexing and retrieval with the strengths of formal knowledge representation.
In more detail, the primarily cognitively interpretable terms and the
established relations between them are embedded into a formal framework of
semantic models, typed relations and inference procedures to develop enhanced
procedures for search-and-find scenarios. Within this framework, refining and
restructuring their relational inventories is indispensable. Based on first examples, we
show the potential of specified, logically valid semantic data being interpretable
both for cognitive and machine-supported information retrieval processes. We
devote special attention to the crucial task of enriching and restructuring existing
indexing languages viz. refining the relational inventory by means of abstraction
and generalization.
The presentation concludes with a short discussion of some open questions
and suggestions for further research.
Although the chapters are based on each other in content, it was the aim to
make each chapter as self-explanatory as possible. In doing so, duplication and
cross-references could not always be avoided. Sometimes the re-treatment of a
question under a changed point of view was required. The chosen cross-disciplinary
approach made it necessary in some places to use a terminology of our own.
The particularly important terminological definitions have been compiled in a
systematic glossary in the appendix.
Many colleagues have substantially supported our work and contributed to
our findings, especially through patient and continuous discussions. First, we would
like to mention the members of the Cologne staff of both projects, CrissCross and
Reseda: Anne Betz, Felix Boteram, Jan-Helge Jacobs, Tina Mengel, Katrin Müller
and Michael Panzer (né Preuss). We would like to thank them all; our work
would not have been successful without their help. Special thanks go to Jens Wille,
who set up a Web search environment for our experiments with typed relations
and thus made it possible to perform the first tests and to verify our statements. We
also benefited from many people whom we cannot all mention by name, especially
the members of our project partner institutions and other colleagues interested in
our work. We wish to thank them, too.
Winfried Gödert
Jessica Hubrich
Matthias Nagelschmidt
Table of Contents
Preface
1 Introduction: Envisioning Semantic Information Spaces
Part A Propaedeutics – Organizing, Representing, and Exploring Knowledge
2 Indexing and Knowledge Organization
2.1 Knowledge Organization Systems as Indexing Languages
2.1.1 Building Elements: Entities and Terms
2.1.2 Structural Elements: Intrasystem Relations
2.1.3 Result Elements: Indexates
2.2 Standards and Frameworks
2.2.1 ISO 25964: Thesauri and Interoperability with other Vocabularies
2.2.2 Functional Requirements for Subject Authority Data (FRSAD)
3 Semantic Technologies for Knowledge Representation
3.1 Web-based Representation Languages
3.1.1 XML
3.1.2 RDF/RDFS
3.1.3 OWL
3.2 Application-based Representation Languages
3.2.1 XTM
3.2.2 SKOS
4 Information Retrieval and Knowledge Exploration
4.1 Information Retrieval Essentials
4.1.1 Exact Match Paradigm
4.1.2 Partial Match Paradigm
4.2 Measuring Effectiveness in Information Retrieval
4.3 From Retrieving to Exploring
4.3.1 String-based Retrieval Processes
4.3.2 Conceptual Retrieval Process
4.3.3 Conceptual Exploration Processes
4.3.4 Topical Exploration Processes
4.4 From Homogeneous to Heterogeneous Information Spaces
Part B Status quo – Handling Heterogeneity in Indexing and Retrieval
5 Approaches to Handle Heterogeneity
5.1 Citation Pearl Growing
5.2 Modeling Multilingual Indexing Languages
5.3 Establishing Semantic Interoperability between Indexing Languages
5.3.1 Structural Models
5.3.2 Mapping Levels
5.3.3 Vocabulary Linking Projects
6 Problems with Establishing Semantic Interoperability
6.1 Conceptual Interoperability between Entities of Indexing Languages
6.1.1 Focused and Comprehensive Mapping
6.1.2 Conceptual Identity and Semantic Congruence
6.2 Equivalent Intersystem Relationships
6.2.1 Intersystem Relations Compared to Intrasystem Relations
6.2.2 Interoperability and Search Tactics
6.2.3 Specified Intersystem Relationships
6.2.4 Conceptual Interoperability between Indexing Results
6.2.5 Directedness of Intersystem Relationships
Part C Vision – Ontology-based Indexing and Retrieval
7 Formalization in Indexing Languages
7.1 Introduction and Objectives
7.2 Common Characteristics and Differences between Indexing Languages and Formal Knowledge Representation
7.3 Prerequisites for an Ontology-based Indexing
7.3.1 Semantic Relations and Inferred Document Sets
7.3.2 Facets and Inferences
8 Typification of Semantic Relations
8.1 Inventories of Typed Relations
8.2 Typed Relations and their Benefit for Indexing and Retrieval
8.3 Examples of the Benefit of Typed Relations for the Retrieval Process
8.3.1 Example 1: Aspect-oriented Specification of the Generic Hierarchy Relation
8.3.2 Example 2: Typed Relations of a Topic Map built from the ASIST Thesaurus
8.3.3 Example 3: Degrees of Determinacy
9 Inferences in Retrieval Processes
9.1 Inferences of Level 1
9.1.1 Hierarchical Relationships
9.1.2 Associative Relationships
9.1.3 Typification of the Synonymy / Equivalence Relationship
9.2 Inferences of Level 2 and of Higher Levels, Transitivity
9.2.1 Hierarchical Relationships
9.2.2 Unspecific Associative Relationships
9.2.3 Typification of Associative Relationships
9.3 Inferences by Combining Different Types of Relationships
9.3.1 Synonymy Relation with Hierarchical Relationships
9.3.2 Chronological Relation with Hierarchical Relationships
9.3.3 Transitions from Associative Relationships to a Hierarchical Structure
9.3.4 Transitions from a Hierarchical Structure to Associative Relationships
9.3.5 Transitivity for Combinations of Typed Associative Relationships
10 Semantic Interoperability and Inferences
10.1 Conditions for Entity-based Interoperability
10.2 Models of Semantic Interoperability
10.2.1 Ontological Spine and Satellite Ontologies
10.2.2 Degrees of Determinacy and Interoperability
10.2.3 Entity-based Interoperability and Facets
10.3 Perspective: Ontology-based Indexing and Retrieval
11 Remaining Research Questions
11.1 Questions of Modeling
11.2 Questions of Procedure
11.3 Questions of Technology and Implementation
Part D Appendices
Systematic Glossary
Abbreviations
List of figures
List of tables
References
Index
1 Introduction: Envisioning Semantic Information
Spaces
Indexing languages, interoperability, information retrieval, semantic technologies –
is it really worth examining the particular interaction of these rather different
subjects, as we do in this book? In this preliminary chapter we try to give
a first answer as to why we think it is. To this end, we pick up again the idea of a
semantic information space, which was already mentioned in the preface, and make
it more concrete by envisioning some examples. We will take a first naive look
at search situations and the impact of semantic knowledge representation, yet
without considering the conceptual or technical background. Thus in this first
look, information retrieval systems, indexing languages and semantic technolo-
gies are treated as a black box, which ideally provides a search environment that
can be somehow characterized as a semantic information space.
Examples in this book are heterogeneous and (amongst some others) taken
from the domains of chemistry, physics and biology, particularly ornithology.
Although neither the authors nor the subjects of this book are affiliated with these
disciplines, we will nevertheless occasionally revert to them, as they are clearly
outside our own profession and can to that extent be seen as a “neutral” domain,
which seems to carry a lower risk of misunderstanding than examples from
the less exact fields of the humanities or social sciences probably would.
However, no special skills in the natural sciences are needed, of course, to read
and understand the examples and to follow the argumentation. All examples are
trivial enough to be understood even without any substantial chemical, physical
or zoological knowledge.
When speaking of an “information space”, one could quite generally think of
two extremes: either a collection of information resources that are largely homogeneous
in form and content and centralized in one store, or a heterogeneous collection,
distributed over several repositories and organized independently from
each other – the first extreme is e.g. embodied by traditional library collections,
while the most prominent example for the latter is the World Wide Web. In the
following, both extremes and every possible specification between them shall be
understood as information spaces.
We begin our consideration with a relatively simply organized information
space. Figure 1.1 shows a situation that is reminiscent of a bibliographic database.
The document store contains a number of bibliographic records, which represent
two monographs written by the German chemist and Nobel Prize laureate
Otto Hahn and one book of correspondence from the physicist Lise Meitner to Otto
Hahn. To represent the authorship of Otto Hahn and Lise Meitner for each document
consistently, a name authority file is used, which contains personal name
authority records of both scientists that can be linked to the stored documents.
In doing so, one can easily search the information space e.g. for all documents
written by Otto Hahn – this search operation is often referred to as a collocation
search.
Fig. 1.1: Authority files in information spaces.
Another search operation can be described as a subject search. That would be a
search e.g. for all documents about “radioactivity”. To carry out subject searches,
the information space must somehow provide the information of what each doc-
ument is “about” – in the indexing context we also speak of the aboutness of a
document (cf. Ingwersen 1992, 50–54). In bibliographic databases this aboutness
is traditionally represented by one or more subject headings or thesaurus descrip-
tors. In order to provide a consistent representation, the subject headings can be
organized in a subject headings authority file, so that each subject heading has
its own authority record that can be linked to the appropriate document records
(cf. Fig. 1.1).
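The linking of document records to authority records described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record identifiers (`p1`, `s1`, ...), the class and function names, and the placeholder title of the second monograph are invented for the sketch and are not taken from any real catalog.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class AuthorityRecord:
    """One entry of an authority file (a personal name or a subject heading)."""
    record_id: str
    label: str


@dataclass
class DocumentRecord:
    """A bibliographic record that points to authority records by their ids."""
    title: str
    author_ids: List[str] = field(default_factory=list)
    subject_ids: List[str] = field(default_factory=list)


# Hypothetical authority files; all ids are invented for this sketch.
name_authority: Dict[str, AuthorityRecord] = {
    "p1": AuthorityRecord("p1", "Hahn, Otto"),
    "p2": AuthorityRecord("p2", "Meitner, Lise"),
}
subject_authority: Dict[str, AuthorityRecord] = {
    "s1": AuthorityRecord("s1", "radioactivity"),
}

# The document store: two monographs by Otto Hahn and one book of letters.
# The title of the second monograph is not given in the text (placeholder).
documents = [
    DocumentRecord("Applied radiochemistry", ["p1"], ["s1"]),
    DocumentRecord("Second monograph (title not given)", ["p1"]),
    DocumentRecord("Letters of Lise Meitner to Otto Hahn", ["p2"]),
]


def ids_for(label: str, authority_file: Dict[str, AuthorityRecord]) -> set:
    """Resolve a heading to the ids of all matching authority records."""
    return {r.record_id for r in authority_file.values() if r.label == label}


def collocation_search(author_label: str) -> List[str]:
    """Return the titles of all documents linked to the given name record."""
    wanted = ids_for(author_label, name_authority)
    return [d.title for d in documents if wanted.intersection(d.author_ids)]


def subject_search(subject_label: str) -> List[str]:
    """Return the titles of all documents linked to the given subject heading."""
    wanted = ids_for(subject_label, subject_authority)
    return [d.title for d in documents if wanted.intersection(d.subject_ids)]
```

Because documents store only the identifier of an authority record, a change to the preferred form of a name or heading has to be made in one place only, which is the consistency argument made in the text.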
There is nothing special about the situation described so far, and everybody who
has ever used an online catalog of a library should be familiar with it, as it corresponds
to the way bibliographic data has been organized for a long time and still
continues to be organized by documentary institutions and especially libraries.
However, knowledge representation begins beyond this situation.
In Figure 1.2 the authority files are replaced by a network-like structure. The
elements of Figure 1.1 that are now shaded grey seem to become more complex, as they
are somehow embedded in a meaningful context – later on in this book, we will
address these elements precisely and speak more abstractly of entities of a knowledge
representation. What we characterize here rather vaguely as a “meaningful
context” raises these entities from the keyword-based level in Figure 1.1 to a
conceptual level in Figure 1.2. We will examine this important step in the following
chapters and confine ourselves here to the observation that these concepts
primarily can be used for indexing the stored documents and thereby fulfill the
same basic descriptor function as simple keywords, but that they also open up a
broader context, as they are connected to other, somehow related concepts. In the
following, this situation will be referred to as a knowledge structure.
Fig. 1.2: Knowledge structures in information spaces.
Searching the information space in Figure 1.2 with a descriptor “radioactivity”
leads not only to the indexed monograph of Otto Hahn “Applied radiochemistry”,
but also to the related descriptors “activity level” and “radioisotope”. It becomes
apparent that an information seeker who is interested in “radioactivity” could
also be interested in certain levels of radioactivity or in concrete radioactive isotopes.
The same seems to apply to “nuclear fission” and “nuclear reaction” – it
is not unlikely that an information seeker with an interest in nuclear fission may
also be interested in other nuclear reactions. Beyond that, the knowledge structure
of Figure 1.2 also makes explicit a relationship between Otto Hahn and the rather
abstract concept “person”, as well as between Otto Hahn’s research colleague
Lise Meitner and “person”. For a human, there is no difficulty in the cognitive
interpretation of these relations – we can easily see that Otto Hahn and Lise
Meitner are persons, even if we have never heard their names before. By using semantic
technologies, this knowledge can be made machine-readable, so that a machine would
be able to infer (Glossary C3.2) that Otto Hahn is a person due to the fact that the
concept “Hahn, Otto” is related to the concept “person” in a specific way. Like-
wise the risk of confusing the person Otto Hahn with the homonymous research
vessel, which was launched in 1964 and named after the famous scientist, could
be avoided.
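A minimal Python sketch may make the step from keywords to a knowledge structure more tangible. It stores the relations mentioned above as typed statements and answers two questions: which descriptors are related to a given one, and whether an entity is an instance of a given concept. The relation names `related_to` and `instance_of`, the label "Otto Hahn (ship)" and the function names are assumptions made for this illustration; the book introduces its own relation inventory only in later chapters.

```python
from collections import defaultdict

# (subject, relation type, object) statements modeled on the example of
# Figure 1.2; relation names and the ship label are assumed for this sketch.
statements = [
    ("radioactivity", "related_to", "activity level"),
    ("radioactivity", "related_to", "radioisotope"),
    ("nuclear fission", "related_to", "nuclear reaction"),
    ("Hahn, Otto", "instance_of", "person"),
    ("Meitner, Lise", "instance_of", "person"),
    ("Otto Hahn (ship)", "instance_of", "research vessel"),
]

# Index the statements by (subject, relation type) for direct lookup.
index = defaultdict(set)
for subject, relation, obj in statements:
    index[(subject, relation)].add(obj)
    if relation == "related_to":
        # Unspecific associative links are read in both directions here.
        index[(obj, relation)].add(subject)


def related_descriptors(concept):
    """Descriptors that could be offered to a seeker as additional vocabulary."""
    return sorted(index.get((concept, "related_to"), set()))


def is_instance_of(entity, concept):
    """True if the structure states that the entity is an instance of the concept."""
    return concept in index.get((entity, "instance_of"), set())
```

With these statements, the person and the homonymous research vessel are kept apart because each label is tied to a different concept, not because a program understands the names.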
At this point we have already mentioned many aspects and reached the
core issues of this book. In the following, we will take a closer look at searches in
information spaces and the underlying information retrieval processes and thereby
give a first impression of the usefulness of relations like those described above.
We will also look at the interdependency between indexing and information
retrieval processes, introduce Knowledge Organization Systems (KOSs) as types
of knowledge structures that are designed to support indexing and retrieval and
finally address questions such as how it could be made explicit and recognizable for
a KOS that a document “Letters of Lise Meitner to Otto Hahn” is about letters that
Lise Meitner wrote to Otto Hahn and not vice versa.
Based on this, we will provide a more systematic discussion of the specific
types of relations and their functionality within and between knowledge structures
– later on we will speak of them as intra- and intersystem relations. Yet,
before that, some preliminary considerations will be provided, in order to facilitate
a better understanding of the mentioned issues.
Accordingly, we will address the functionality of intersystem relations, i.e.,
those relations that bridge two knowledge structures and thereby make
them interoperable to some degree. In this context, we will focus on the problems of
heterogeneity that may arise, e.g., from the use of different knowledge structures
for indexing purposes. This is illustrated in Figure 1.3, where single concepts of our
example knowledge structure are linked to other, actually existing structures,
namely the Library of Congress Subject Headings (LCSH), the International
Nuclear Information System / Energy Technology Data Exchange (INIS/ETDE), and
the YAGO project.
Fig. 1.3: Interoperability in information spaces.
These three structures, which were arbitrarily selected for this example, are quite
different in their organization, coverage and purpose. The LCSH can be characterized
as an authority file, INIS/ETDE is a thesaurus that has been developed
and used by the International Atomic Energy Agency (IAEA)1, and YAGO is an
ontology mainly built up with vocabulary from Wikipedia2. Since we haven’t
1 http://www.iaea.org/inis/products-services/thesaurus
2 http://www.mpi-inf.mpg.de/yago-naga/yago
Semantic Knowledge Representation for Information Retrieval Winfried Gödert

  • 5. Semantic Knowledge Representation for Information Retrieval Winfried Gödert Digital Instant Download Author(s): Winfried Gödert, Jessica Hubrich, Matthias Nagelschmidt ISBN(s): 9783110329704, 3110329700 Edition: Digital original File Details: PDF, 7.75 MB Year: 2014 Language: english
  • 7. Winfried Gödert, Jessica Hubrich, Matthias Nagelschmidt Semantic Knowledge Representation for Information Retrieval
  • 10. This work has been published with the financial support of the Cologne University of Applied Sciences.

ISBN 978-3-11-030477-0
e-ISBN 978-3-11-032970-4

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2014 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: Michael Peschke, Berlin
Cover image: bentaboe/iStock/Thinkstock
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
♾ Printed on acid-free paper
Printed in Germany
www.degruyter.com
  • 11. Preface

An information seeker – in our context usually referred to as a user or end user of search interfaces to collections of information resources such as online libraries, domain-specific databases, or the World Wide Web – thinks of something he or she wants to find in a collection. This “something” may be very specific or very vague. Search operations are always designed with the intention of reconciling, as far as possible, these individual conceptualizations of a person’s search interests with the represented conceptualizations of stored indexing data. Retrieval success depends heavily on a suitable correspondence between these two components.

Information seekers commonly express their search interests in words that they think best capture the intended meaning and thus promise the best possible retrieval results. The words used either comply with semantically controlled terms of an indexing language or constitute free-text tokens. They reflect conceptual ideas whose meaning does not manifest itself in isolated concepts, as it includes a time-dependent context and semantic relations to other concepts. Although recent trends explore semantic relations with statistical and linguistic methods, there are reasons for cognitively analyzing the context as well, in order to represent it adequately and thereby to provide additional support for automated processes. This is particularly true for information systems that are designed to facilitate knowledge exploration and searching in a semantic context by inference or reasoning processes.

Historically, there are at least two essential approaches to representing semantic connections between entities of artificial languages: on the one hand, indexing languages that are used for representing the content of information resources; on the other hand, knowledge representation systems that are used for machine-based knowledge exploration.
Combining both approaches might significantly improve the efficiency of subject-oriented search processes.

Within the framework of document indexing, extensive methods have been developed for representing concepts as elements of controlled indexing languages and using them as tools for retrieval processes. Indexing languages represent common knowledge – or, more precisely, extracts of common or specialist knowledge – in a standardized manner and provide terminological building blocks for subject indexing. As connectors between specific knowledge and corresponding information spaces, they significantly improve thematic access to documents described in the form of bibliographic data in a way other systems cannot match. Modeled conceptual structures reflect familiar knowledge contexts that are primarily processed for cognitive interpretation. They point to connections that information seekers may not be aware of but that might nevertheless have a positive impact on the success of the search process if proposed to them. Frequently, the resources of interest are indexed by headings that are not the first wordings the seeker thinks of, and positive retrieval results are obtained only by offering such headings to the seeker as additional vocabulary. Traditionally, such relationships are not regarded as tools for machine-supported analysis. Therefore, they are not sufficiently formalized for automatic reasoning processes. Usually, the attributes or properties justifying a particular relation between two concepts are not stated explicitly. Relational structures commonly make use of a rather small set of relationships that is not expressive and does not allow precise, differentiated statements about semantic connections. Until now, mainly theoretical proposals have given valuable hints for creating an adequate inventory of specified relation types; there have been only very few attempts at practical realization.

In the context of artificial intelligence, systems for knowledge representation have been developed that focus on formal considerations and techniques for modeling knowledge, neglecting issues of indexing and retrieval of documents. They primarily aim at enabling machine processing and especially at drawing inferences on the formalized knowledge level. Expert or diagnostic systems are examples. Document indexing and retrieval are considered in the context of special applications, if at all. In general, existing indexing languages are not included; tools for knowledge representation are instead newly created or recreated.

The conception of a Semantic Web marks a new step of development. It is proposed that distributed data resources should be technically combined, and it is envisioned that appropriate ontological representation and linking of distributed resources could generate an additional semantic value from which thematic search processes could benefit enormously.
As a matter of fact, some retrieval tests have already provided empirical evidence that ontology-based search processes lead to higher performance than keyword-based searches. However, it is not yet clear how subject indexing and document retrieval can benefit from these visionary and technological impulses and what appropriate strategies for realization could look like. These questions are far from trivial. This is reflected by the fact that the focus has shifted from Semantic Web to Linked Data applications. These intend to achieve added semantic value by merely connecting existing data reservoirs, making them technically interoperable. Combining cognitive and mechanical interpretation of semantic data to improve retrieval efficiency and retrieval results lies outside the interest of such projects. Yet a semantic space that is both cognitively and machine-interpretable and that brings together different existing and newly created resources for the benefit of knowledge acquisition and information retrieval is the most challenging idea connected with the Semantic Web. In such a space, information seekers could formulate their cognitive interests, and automated tools would subsequently provide additional support that would lead to improved search success.

When designing improved search environments, it is important to ensure that content-descriptive terms of different systems are exchangeable, i.e., that semantic entities are interoperable. Suitable models of semantic interoperability would support both switching between different indexing languages and combining entities of more than one indexing language to execute thematic queries. If valid conclusions on the conceptual level are to be reached, it would be essential to consider not only mechanical interoperability and string matching but also the semantic content of entities and the relational structure of the respective indexing languages.

The final stage may be characterized as ontology-based indexing and retrieval with respect to semantic interoperability in heterogeneous environments. Combining the methodological approaches with the semantic representation standards of the Semantic Web provides the opportunity to break away from proprietary application contexts. Knowledge structures that have already been developed can be used for or shared with other applications in the sense of a content-oriented semantic interoperability.

The main character of this book can be described as twofold. First, it gives a state-of-the-art report with regard to the issues mentioned. It presents a framework for interconnecting the two strands of development described above and shows how they can benefit from each other. In particular, it discusses how document retrieval and search results can be improved on the basis of an expanded set of differentiated semantic relation types that allow machine inferences to be drawn along the relational structure.
Secondly, it contains proposals on the extent to which existing indexing languages can be used and on the requirements that have to be met to develop them further towards knowledge representations that both preserve the conceptual interpretation of their elements and support formal inferences for the design of advanced retrieval environments.

This part of the book is based on two projects that were conducted at the Cologne University of Applied Sciences between 2006 and 2011: CrissCross and Reseda. The CrissCross project was financially supported by the Deutsche Forschungsgemeinschaft (German Research Foundation) and was carried out in cooperation with the German National Library. It aimed at creating a multilingual, thesaurus-based and user-friendly research vocabulary that facilitates research in heterogeneously indexed collections. To achieve this aim, the subject headings of the German subject headings authority file Schlagwortnormdatei (SWD) were mapped to notations of the Dewey Decimal Classification, i.e., its German version (DDC Deutsch). Within its framework, the German National Library also linked SWD headings to their equivalents in the Library of Congress Subject Headings (LCSH) and the French indexing vocabulary Rameau, thus contributing to the MACS project. The results of the project became part of the Linked Data service of the German National Library.

The experience and expertise gained in the CrissCross project were utilized within the second project, Reseda (Representational models for semantic data). This project was made possible by the financial support of the Cologne University of Applied Sciences. Its focus was on designing, developing and improving models and frameworks for the representation of semantic information in knowledge organization systems. The project’s aim was to explore strategies for precisely specifying the semantic content and characteristics of concepts and the semantic relations between these concepts in indexing languages and other knowledge organization systems, thereby augmenting the semantic richness and expressivity of these vocabularies for machine support within retrieval scenarios. Many results of this project form the basis of this book.

The starting point and the target of all considerations presented in this book are processes of information retrieval for subject content, viz. automatic and cognitive strategies to explore knowledge or to facilitate access to information.

An introductory chapter describes the problems and the objectives for solutions; technical details of the subsequent discussion are not anticipated there. From the perspective of the authors, the selected sample environment is particularly suited to this objective. For the subsequent discussion, however, it is not of substantive importance. The considerations always focus on general problems and solutions. All examples in the book are designed to support abstract considerations or to illustrate general methods. None of the methods shown is designed for a specific example of the sample environment alone.
After this introduction, the text is divided into three parts, each describing a stage in the development of a concept that we call an “ontology-based model for indexing and retrieval”. The first part reports state-of-the-art essentials of knowledge organization, indexing principles, and paradigms of information retrieval. Essential characteristics of semantic technologies for knowledge representation are introduced in Chapter 3. The basic features of web-specific representation languages for semantic content are sketched as far as they are of special interest for our context. Besides XML, RDF, and OWL, application-specific representation languages are described. Chapter 4 discusses different levels of semantic expressivity in search processes and how the resulting requirements can be supported by combining features of indexing results and retrieval environments. Limitations that indexing languages face in view of multilingual and heterogeneous information spaces are also outlined.

Part B presents in its first chapter various approaches to handling heterogeneity in indexing and retrieval, including citation pearl growing, multilingual indexing languages, and vocabulary linking. The design and outcomes of several projects are presented. It is questioned whether these approaches can be seen as possible solutions for a treatment of heterogeneity that human beings can interpret and that at the same time promotes machine-supported inferences. The latter aspect gives rise to continuing the discussion in Chapter 6 with a more detailed analysis of the problems that must be taken into account if heterogeneity is to be resolved by methods of semantic interoperability. It is clarified how semantic interoperability should be understood for indexing and retrieval purposes and how to combine this understanding with a model for conceptual knowledge representation by entities and improved relational structures. Conditions under which entities of different indexing languages can be viewed as semantically interoperable are derived as requirements for the following discussion.

The third part presents in four chapters the components of our understanding of a model for ontology-based indexing and retrieval, combining the established methods of indexing and retrieval with the strength of formal knowledge representation. In more detail, the primarily cognitively interpretable terms and the established relations between them are embedded into a formal framework of semantic models, typed relations and inference procedures in order to develop enhanced search-and-find scenarios. Within this frame, refining and restructuring their relational inventories is indispensable. Based on first examples, we show the potential of specified, logically valid semantic data that can be interpreted in both cognitive and machine-supported information retrieval processes. We devote special attention to the crucial task of enriching and restructuring existing indexing languages, viz. refining the relational inventory by means of abstraction and generalization.
The presentation concludes with a short discussion of some open questions and suggestions for further research.

Although the chapters build on each other in content, the aim was to make each chapter as self-explanatory as possible. As a result, duplication and cross-references could not always be avoided. Sometimes a question had to be treated again from a changed point of view. The chosen cross-disciplinary approach made it necessary in some places to use terminology of our own. The particularly important terminological definitions have been compiled in a systematic glossary in the appendix.

Many colleagues have substantially supported our work and contributed to our findings, especially through patient and continuous discussions. First, we would like to mention the members of the Cologne staff of the two projects CrissCross and Reseda: Anne Betz, Felix Boteram, Jan-Helge Jacobs, Tina Mengel, Katrin Müller and Michael Panzer (né Preuss). We would like to thank them all; our work would not have been successful without their help. Special thanks go to Jens Wille, who set up a Web search environment for our experiments with typed relations, which made it possible to perform the first tests and to verify our statements. We also benefited from many persons whom we cannot all mention by name, especially the members of our project partner institutions and other colleagues interested in our work. We wish to thank them, too.

Winfried Gödert
Jessica Hubrich
Matthias Nagelschmidt
  • 17. Table of Contents

Preface
1 Introduction: Envisioning Semantic Information Spaces

Part A Propaedeutics – Organizing, Representing, and Exploring Knowledge
2 Indexing and Knowledge Organization
2.1 Knowledge Organization Systems as Indexing Languages
2.1.1 Building Elements: Entities and Terms
2.1.2 Structural Elements: Intrasystem Relations
2.1.3 Result Elements: Indexates
2.2 Standards and Frameworks
2.2.1 ISO 25964: Thesauri and Interoperability with other Vocabularies
2.2.2 Functional Requirements for Subject Authority Data (FRSAD)
3 Semantic Technologies for Knowledge Representation
3.1 Web-based Representation Languages
3.1.1 XML
3.1.2 RDF/RDFS
3.1.3 OWL
3.2 Application-based Representation Languages
3.2.1 XTM
3.2.2 SKOS
4 Information Retrieval and Knowledge Exploration
4.1 Information Retrieval Essentials
4.1.1 Exact Match Paradigm
4.1.2 Partial Match Paradigm
4.2 Measuring Effectiveness in Information Retrieval
4.3 From Retrieving to Exploring
4.3.1 String-based Retrieval Processes
4.3.2 Conceptual Retrieval Process
4.3.3 Conceptual Exploration Processes
4.3.4 Topical Exploration Processes
4.4 From Homogeneous to Heterogeneous Information Spaces

Part B Status quo – Handling Heterogeneity in Indexing and Retrieval
5 Approaches to Handle Heterogeneity
5.1 Citation Pearl Growing
5.2 Modeling Multilingual Indexing Languages
5.3 Establishing Semantic Interoperability between Indexing Languages
5.3.1 Structural Models
5.3.2 Mapping Levels
5.3.3 Vocabulary Linking Projects
6 Problems with Establishing Semantic Interoperability
6.1 Conceptual Interoperability between Entities of Indexing Languages
6.1.1 Focused and Comprehensive Mapping
6.1.2 Conceptual Identity and Semantic Congruence
6.2 Equivalent Intersystem Relationships
6.2.1 Intersystem Relations Compared to Intrasystem Relations
6.2.2 Interoperability and Search Tactics
6.2.3 Specified Intersystem Relationships
6.2.4 Conceptual Interoperability between Indexing Results
6.2.5 Directedness of Intersystem Relationships

Part C Vision – Ontology-based Indexing and Retrieval
7 Formalization in Indexing Languages
7.1 Introduction and Objectives
7.2 Common Characteristics and Differences between Indexing Languages and Formal Knowledge Representation
7.3 Prerequisites for an Ontology-based Indexing
7.3.1 Semantic Relations and Inferred Document Sets
7.3.2 Facets and Inferences
8 Typification of Semantic Relations
8.1 Inventories of Typed Relations
8.2 Typed Relations and their Benefit for Indexing and Retrieval
8.3 Examples of the Benefit of Typed Relations for the Retrieval Process
8.3.1 Example 1: Aspect-oriented Specification of the Generic Hierarchy Relation
8.3.2 Example 2: Typed Relations of a Topic Map built from the ASIST Thesaurus
8.3.3 Example 3: Degrees of Determinacy
9 Inferences in Retrieval Processes
9.1 Inferences of Level 1
9.1.1 Hierarchical Relationships
9.1.2 Associative Relationships
9.1.3 Typification of the Synonymy / Equivalence Relationship
9.2 Inferences of Level 2 and of Higher Levels, Transitivity
9.2.1 Hierarchical Relationships
9.2.2 Unspecific Associative Relationships
9.2.3 Typification of Associative Relationships
9.3 Inferences by Combining Different Types of Relationships
9.3.1 Synonymy Relation with Hierarchical Relationships
9.3.2 Chronological Relation with Hierarchical Relationships
9.3.3 Transitions from Associative Relationships to a Hierarchical Structure
9.3.4 Transitions from a Hierarchical Structure to Associative Relationships
9.3.5 Transitivity for Combinations of Typed Associative Relationships
10 Semantic Interoperability and Inferences
10.1 Conditions for Entity-based Interoperability
10.2 Models of Semantic Interoperability
10.2.1 Ontological Spine and Satellite Ontologies
10.2.2 Degrees of Determinacy and Interoperability
10.2.3 Entity-based Interoperability and Facets
10.3 Perspective: Ontology-based Indexing and Retrieval
11 Remaining Research Questions
11.1 Questions of Modeling
11.2 Questions of Procedure
11.3 Questions of Technology and Implementation

Part D Appendices
Systematic Glossary
Abbreviations
List of figures
List of tables
References
Index
  • 21. 1 Introduction: Envisioning Semantic Information Spaces

Indexing languages, interoperability, information retrieval, semantic technologies – is it really worth examining the particular interaction of these rather different subjects, as we do in this book? In this preliminary chapter we try to give a first answer as to why we think it is. To this end, we pick up again the idea of a semantic information space, which was already mentioned in the preface, and make it more concrete by envisioning some examples. We will take a first naive look at search situations and the impact of semantic knowledge representation, without yet considering the conceptual or technical background. Thus, in this first look, information retrieval systems, indexing languages and semantic technologies are treated as a black box, which ideally provides a search environment that can be characterized as a semantic information space.

Examples in this book are heterogeneous and (amongst some others) taken from the domains of chemistry, physics and biology, particularly ornithology. Although neither the authors nor the subjects of this book are affiliated with these disciplines, we will nevertheless occasionally return to them, as they are clearly outside our own profession and can therefore be seen as a “neutral” domain, which seems to carry a lower risk of misunderstanding than examples from the less exact fields of the humanities or social sciences probably would. However, no special skills in the natural sciences are needed to read and understand the examples and to follow the argumentation. All examples are simple enough to be understood without any substantial chemical, physical or zoological knowledge.
When speaking of an “information space”, one could quite generally think of two extremes: either a collection of information resources that are largely homogeneous in form and content and centralized in one storage, or a heterogeneous collection, distributed over several repositories and organized independently of each other. The first extreme is embodied, for example, by traditional library collections, while the most prominent example of the latter is the World Wide Web. In the following, both extremes and every possible specification between them shall be understood as information spaces.

We begin our consideration with a relatively simply organized information space. Figure 1.1 shows a situation that is reminiscent of a bibliographic database. The document store contains a number of bibliographic records, which represent two monographs written by the German chemist and Nobel Prize laureate Otto Hahn and one book of correspondence from the physicist Lise Meitner to Otto Hahn. To represent the authorship of Otto Hahn and Lise Meitner consistently for each document, a name authority file is used, which contains personal name authority records of both scientists that can be linked to the stored documents. In doing so, one can easily search the information space, for example, for all documents written by Otto Hahn – this search operation is often referred to as a collocation search.

Fig. 1.1: Authority files in information spaces.

Another search operation can be described as a subject search. That would be a search, for example, for all documents about “radioactivity”. To carry out subject searches, the information space must somehow provide the information of what each document is “about” – in the indexing context we also speak of the aboutness of a document (cf. Ingwersen 1992, 50–54). In bibliographic databases this aboutness is traditionally represented by one or more subject headings or thesaurus descriptors. In order to provide a consistent representation, the subject headings can be organized in a subject headings authority file, so that each subject heading has its own authority record that can be linked to the appropriate document records (cf. Fig. 1.1).

There is nothing special about the situation described so far, and everybody who has ever used the online catalog of a library should be familiar with it, as it corresponds to the way bibliographic data has been organized for a long time and still continues to be organized by documentary institutions and especially libraries. However, knowledge representation begins beyond this situation.

In Figure 1.2 the authority files are replaced by a network-like structure. The now grey shaded elements of Figure 1.1 seem to become more complex, as they are somehow embedded in a meaningful context – later on in this book, we will address these elements precisely and speak more abstractly of entities of a knowledge representation. What we are characterizing here rather vaguely as a “meaningful context” raises these entities from the keyword-based level in Figure 1.1 to a conceptual level in Figure 1.2. We will examine this important step in the following chapters and confine ourselves here to the observation that these concepts can primarily be used for indexing the stored documents and thereby fulfill the same basic descriptor function as simple keywords, but that they also open up a broader context, as they are connected to other, somehow related concepts. In the following, this situation will be referred to as a knowledge structure.

Fig. 1.2: Knowledge structures in information spaces.

Searching the information space in Figure 1.2 with the descriptor “radioactivity” leads not only to the indexed monograph of Otto Hahn, “Applied radiochemistry”, but also to the related descriptors “activity level” and “radioisotope”. It becomes apparent that an information seeker who is interested in “radioactivity” could also be interested in certain levels of radioactivity or in concrete radioactive isotopes. The same seems to apply to “nuclear fission” and “nuclear reaction” – it is not unlikely that an information seeker with an interest in nuclear fission may also be interested in other nuclear reactions. Beyond that, the knowledge structure of Figure 1.2 also makes explicit a relationship between Otto Hahn and the rather abstract concept “person”, as well as between Otto Hahn’s research colleague Lise Meitner and “person”. For a human, there is no difficulty in the cognitive interpretation of these relations – we can easily see that Otto Hahn and Lise Meitner are persons, even if we have never heard their names before. By using semantic technologies, this knowledge can be made machine-readable, so that a machine would be able to infer (Glossary C3.2) that Otto Hahn is a person from the fact that the concept “Hahn, Otto” is related to the concept “person” in a specific way. Likewise, the risk of confusing the person Otto Hahn with the homonymous research vessel, which was launched in 1964 and named after the famous scientist, could be avoided.

At this point we have already mentioned many aspects and reached the core issues of this book. In the following, we will take a closer look at searches in information spaces and the underlying information retrieval processes and thereby give a first impression of the usefulness of relations like those described above. We will also look at the interdependency between indexing and information retrieval processes, introduce Knowledge Organization Systems (KOSs) as types of knowledge structures that are designed to support indexing and retrieval, and finally address questions such as how it could be made explicit and recognizable for a KOS that a document “Letters of Lise Meitner to Otto Hahn” is about letters that Lise Meitner wrote to Otto Hahn and not vice versa.
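The knowledge structure of Figure 1.2 can be approximated by a small set of typed relations. The following is only a minimal sketch, not the authors' model: the relation labels `related_to` and `instance_of`, the entry `Otto Hahn (ship)` and all function names are assumptions introduced for illustration, and only the concepts and the indexed monograph are taken from the text.

```python
from collections import defaultdict

# (subject, relation, object) triples. The relation labels "related_to" and
# "instance_of" are hypothetical; the text does not name its relation types here.
TRIPLES = [
    ("radioactivity", "related_to", "activity level"),
    ("radioactivity", "related_to", "radioisotope"),
    ("nuclear fission", "related_to", "nuclear reaction"),
    ("Hahn, Otto", "instance_of", "person"),
    ("Meitner, Lise", "instance_of", "person"),
    # Hypothetical entry for the homonymous research vessel.
    ("Otto Hahn (ship)", "instance_of", "research vessel"),
]

# Document title -> descriptors assigned during indexing.
INDEX = {"Applied radiochemistry": ["radioactivity"]}


def build_graph(triples):
    """Group objects by subject and relation; 'related_to' is stored in both directions."""
    graph = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        graph[subject][relation].add(obj)
        if relation == "related_to":
            graph[obj][relation].add(subject)
    return graph


def is_a(graph, entity, concept):
    """Return True if the entity is directly typed with the given concept."""
    return concept in graph[entity]["instance_of"]


def search(graph, index, descriptor):
    """Return the documents indexed with the descriptor and its related descriptors."""
    documents = sorted(title for title, descs in index.items() if descriptor in descs)
    suggestions = sorted(graph[descriptor]["related_to"])
    return documents, suggestions


GRAPH = build_graph(TRIPLES)
print(search(GRAPH, INDEX, "radioactivity"))
print(is_a(GRAPH, "Hahn, Otto", "person"))
print(is_a(GRAPH, "Otto Hahn (ship)", "person"))
```

In this sketch a search for "radioactivity" returns the indexed monograph together with the two related descriptors. The typing of "Hahn, Otto" as a person, but not of the ship entry, is what keeps the two homonymous entities apart.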
Based on this, we will provide a more systematic discussion of the specific types of relations and their functionality within and between knowledge structures – later on we will speak of them as intra- and intersystem relations. Before that, however, some preliminary considerations will be provided in order to facilitate a better understanding of the issues mentioned.

Accordingly, we will address the functionality of intersystem relations, i.e., those relations that bridge two knowledge structures and thereby make them in some way interoperable. In this context, we will focus on the problems of heterogeneity that may arise, for example, from the use of different knowledge structures for indexing purposes. This is indicated in Figure 1.3, where single concepts of our example knowledge structure are linked to other, actually existing structures, namely the Library of Congress Subject Headings (LCSH), the International Nuclear Information System / Energy Technology Data Exchange (INIS/ETDE), and the YAGO project.
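The linking of single concepts to external structures can be pictured as a table of intersystem relations. The sketch below makes several assumptions: the `Mapping` type, the `translate` function and all target identifiers are hypothetical placeholders, not actual LCSH, INIS/ETDE or YAGO records, and which concept is linked to which system is chosen arbitrarily for illustration.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    """One intersystem relation from a local concept to an entity of another structure."""
    source_concept: str
    target_system: str
    target_entity: str


# Placeholder identifiers only; they are not checked against the real vocabularies.
MAPPINGS = [
    Mapping("radioactivity", "LCSH", "lcsh:placeholder-radioactivity"),
    Mapping("radioactivity", "INIS/ETDE", "inis:placeholder-radioactivity"),
    Mapping("Hahn, Otto", "YAGO", "yago:placeholder-otto-hahn"),
]


def translate(concept, mappings=MAPPINGS):
    """Collect, per target system, the entities linked to a local concept."""
    result = {}
    for mapping in mappings:
        if mapping.source_concept == concept:
            result.setdefault(mapping.target_system, []).append(mapping.target_entity)
    return result


print(translate("radioactivity"))
print(translate("nuclear fission"))  # no links recorded in this sketch
```

A query formulated with a local concept could then be extended to collections indexed with the other vocabularies. Whether a linked entity really carries the same meaning is exactly the interoperability problem, and a lookup of this kind does not settle it.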
Fig. 1.3: Interoperability in information spaces.

These three structures, which were arbitrarily selected for this example, are quite different in their organization, coverage and purpose. The LCSH can be characterized as an authority file, INIS/ETDE is a thesaurus that has been developed and used by the International Atomic Energy Agency (IAEA)¹, and YAGO is an ontology built mainly with vocabulary from Wikipedia². Since we haven’t

¹ http://www.iaea.org/inis/products-services/thesaurus
² http://www.mpi-inf.mpg.de/yago-naga/yago