Semantic Knowledge Representation for Information Retrieval Winfried Gödert

Winfried Gödert, Jessica Hubrich, Matthias Nagelschmidt
Semantic Knowledge Representation for Information Retrieval

This work has been published with the financial support of the Cologne University of Applied
Sciences.
ISBN 978-3-11-030477-0
e-ISBN 978-3-11-032970-4
Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
© 2014 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: Michael Peschke, Berlin
Cover image: bentaboe/iStock/Thinkstock
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
♾ Printed on acid-free paper
Printed in Germany
www.degruyter.com

Preface
An information seeker – in our context usually referred to as user or end user
of search interfaces of collections of information resources like online libraries,
domain-specific databases, or the World Wide Web – thinks of something he or
she wants to find in a collection. “Something” may be of a very specific or of a
very vague kind. Search operations are always designed with the intention to rec-
oncile as far as possible these individual conceptualizations of a person’s search
interests with the represented conceptualizations of stored indexing data. The
retrieval success highly depends on a suitable correspondence between these
two components. Information seekers commonly express their search interests in
words that they think best capture the intended meaning and thus promise the
best possible retrieval results. The words used either comply with semantically
controlled terms of an indexing language or constitute free-text tokens. They
reflect conceptual ideas whose meaning does not manifest itself in isolated
concepts, as it includes a time-dependent context and semantic relations to other
concepts. Although recent trends explore semantic relations with statistical
and linguistic methods, there are reasons for cognitively analyzing the context
as well, to represent it adequately and thereby to provide additional support for
automated processes. This is particularly true for information systems that are
designed to facilitate knowledge exploration and searching in semantic context
by inference or reasoning processes.
Historically, there are at least two essential approaches for representing
semantic connections between entities of artificial languages: on the one hand
indexing languages that are used for representing the content of information
resources, on the other hand knowledge representation systems that are used for
machine-based knowledge exploration. Combining both approaches might sig-
nificantly improve the efficiency of subject-oriented search processes.
Within the framework of document indexing, extensive methods have been
developed for representing concepts as elements of controlled indexing languages
and using them as tools for retrieval processes. Indexing languages represent
common knowledge – or more precisely, extracts of common or specialist knowl-
edge – in a standardized manner and provide terminological building blocks for
subject indexing. As connectors between specific knowledge and corresponding
information spaces, they significantly improve thematic access to documents
described in form of bibliographic data in a way other systems cannot cope with.
Modeled conceptual structures reflect familiar knowledge contexts that are pri-
marily processed for cognitive interpretation. They point to connections informa-
tion seekers are possibly not aware of but that might nevertheless have a positive
impact on the success of the search process if proposed to them. Frequently, the
resources of interest are indexed by headings that are not the first wordings the
seeker thinks of, and positive retrieval results are obtained only when such
headings are offered to the seeker as additional vocabulary. Traditionally, such relationships
are not regarded as tools for machine-supported analysis. Therefore, they are not
sufficiently formalized for automatic reasoning processes. Usually, attributes or
properties justifying a particular relation between two concepts are not stated
explicitly. Relational structures commonly make use of a rather small set of
relationships that is not very expressive and does not allow precise, differentiated
statements about semantic connections. So far, it is mainly theoretical proposals
that give valuable hints for creating an adequate inventory of specified relation types;
there have been only very few attempts at practical realization.
In the context of artificial intelligence, systems for knowledge representa-
tion have been developed that focus on formal considerations and techniques for
modeling knowledge, neglecting issues of indexing and retrieval of documents.
They primarily aim at enabling machine processing and especially at drawing
inferences on the formalized knowledge level. Expert or diagnostic systems are
typical examples. Document indexing and retrieval are considered in the
context of special applications, if at all. In general, existing indexing languages
are not included; tools for knowledge representation are rather newly created or
recreated.
The conception of a Semantic Web marks a new step of development. It is
proposed that distributed data resources should be technically combined, and
it is envisioned that appropriate ontological representing and linking of distrib-
uted resources could generate an additional semantic value from which thematic
search processes could benefit enormously. In fact, some retrieval
tests have already provided empirical evidence that ontology-based search
processes perform better than keyword-based searches. However,
it is not yet clear how subject indexing and document retrieval can benefit from
these visionary and technological impulses or what appropriate strategies for
realization could look like. These questions are far from trivial. This is
reflected by the fact that the focus has shifted from Semantic Web to Linked Data
applications. These intend to achieve added semantic value by merely connect-
ing existing data reservoirs, making them technically interoperable. Combining
cognitive and mechanical interpretation of semantic data for improving retrieval
efficiency and retrieval results lies outside the interest of such projects. Yet, a
semantic space that is cognitively and at the same time machine-interpretable
and that brings together different existing and newly created resources for the
benefit of knowledge acquisition and information retrieval is the most challenging
idea connected with the Semantic Web. In such a space, information seekers
could formulate their cognitive interests and automated tools would subsequently
provide additional support that would lead to an improved search success.
When designing improved search environments, it is important to ensure
that content-descriptive terms of different systems are exchangeable, i.e., that semantic
entities are interoperable. Suitable models of semantic interoperability would
support both switching between different indexing languages and combining
entities of more than one indexing language to execute thematic queries. If
valid conclusions are to be reached on the conceptual level, it is essential
to consider not only mechanical interoperability and string matching but
also the semantic content of entities and the relational structure of the respective
indexing languages.
The final stage may be characterized as ontology-based indexing and
retrieval with respect to semantic interoperability in heterogeneous environments.
Combining the methodological approaches with the semantic representation
standards of the Semantic Web provides the opportunity to break away from
proprietary application contexts. Already developed knowledge structures can
be used for or shared with other applications in the sense of a content-oriented
semantic interoperability.
The main character of this book can be described as twofold. First, it gives a
state-of-the-art report with regard to the mentioned issues. It presents a frame-
work for interconnecting the described two strands of development and shows
how they can benefit from each other. In particular, it is discussed how document
retrieval and search results can be improved based on an expanded set of differ-
entiated semantic relation types that allow for drawing machine inferences along
the relational structure. Second, it contains proposals as to what extent existing
indexing languages can be used and what requirements have to be met to develop
them further towards knowledge representations that both allow a
conceptual interpretation of their elements and support formal inferences for
the design of advanced retrieval environments.
This part of the book is based on two projects that were conducted at the
Cologne University of Applied Sciences during the years 2006 to 2011: CrissCross
and Reseda. The CrissCross project was financially supported by the Deutsche
Forschungsgemeinschaft (German Research Foundation) and was executed in
cooperation with the German National Library. It aimed at creating a multilingual,
thesaurus-based and user-friendly research vocabulary that facilitates research
in heterogeneously indexed collections. To achieve this aim the subject headings
of the German subject headings authority file Schlagwortnormdatei (SWD) were
mapped to notations of the Dewey Decimal Classification, i.e., its German version
(DDC Deutsch). Within its framework, the German National Library also linked
SWD headings to their equivalents in the Library of Congress Subject Headings
(LCSH) and the French indexing vocabulary Rameau, thus contributing to the
MACS project. The results of the project became part of the Linked Data service of
the German National Library.
The experiences and expertise gained in the CrissCross project were utilized
within the second project, Reseda - Representational models for semantic data.
This project was made possible by the financial support of the Cologne Univer-
sity of Applied Sciences. Its focus was on designing, developing and improving
models and frameworks for the representation of semantic information in knowl-
edge organization systems. The project’s aim was to explore strategies for pre-
cisely specifying the semantic content and characteristics of concepts and the
semantic relations between these concepts in indexing languages and other
knowledge organization systems, thereby augmenting the semantic richness and
expressivity of these vocabularies for machine support within retrieval scenarios.
Many results of this project form the basis of this book.
The starting point and the target of all considerations presented in this book are
processes of information retrieval for subject content, viz. automatic and cognitive
strategies to explore knowledge or to facilitate access to information.
An introductory chapter describes the problems and the objectives
for solutions; technical details of the subsequent discussion are not anticipated there.
From the perspective of the authors, the selected sample environment is
particularly well suited to this objective. For the subsequent discussion, however, it is
not of substantive importance. The focus of the considerations is always on general
problems and solutions. All of the examples of the book are designed to support
abstract considerations or to illustrate general methods. None of the displayed
methods is designed for a specific example of the sample environment alone.
After this introduction, the text is divided into three parts, each describing a
stage for the development of a concept that we call an “ontology-based model for
indexing and retrieval”. The first part reports state-of-the-art essentials of knowl-
edge organization, indexing principles, and paradigms of information retrieval.
Essential characteristics of semantic technologies for knowledge representation
are introduced in Chapter 3. The basic features of web-specific representation
languages for semantic content are sketched as far as they are of special interest
for our context. Besides XML, RDF, and OWL, application-specific representation
languages are described. Chapter 4 discusses different levels of semantic expres-
sivity in search processes and how the resulting requirements can be supported
by combining features of indexing results and retrieval environments. Limita-
tions indexing languages face in view of multilingual and heterogeneous infor-
mation spaces are also outlined.
Part B presents in its first chapter various approaches for handling heterogeneity
in indexing and retrieval, including citation pearl growing, multilingual
indexing languages, and vocabulary linking. The design and outcomes of several
projects are presented. We ask whether these approaches can be seen as
possible solutions for a treatment of heterogeneity that human beings can interpret
and that at the same time promotes machine-supported inferences. The latter
aspect motivates continuing the discussion in Chapter 6 with a more detailed
analysis of the problems that must be taken into account if heterogeneity is to
be resolved by methods of semantic interoperability. It is clarified how semantic
interoperability should be understood for indexing and retrieval purposes and
how to combine this understanding with a model for conceptual knowledge
representation by entities and improved relational structures. Conditions under
which entities of different indexing languages can be viewed as semantically
interoperable are derived as requirements for the following discussion.
The third part presents in four chapters the components of our understanding of
a model for ontology-based indexing and retrieval by combining the established
methods of indexing and retrieval with the strength of formal knowledge repre-
sentation. In more detail, the primarily cognitively interpretable terms and the
established relations between them are embedded into a formal framework of
semantic models, typed relations and inference procedures to develop enhanced
procedures of search and find scenarios. Within this frame, refining and restructuring
their relational inventories is indispensable. Based on first examples, we
show the potential of specified, logically valid semantic data being interpretable
both for cognitive and machine-supported information retrieval processes. We
devote special attention to the crucial task of enriching and restructuring existing
indexing languages viz. refining the relational inventory by means of abstraction
and generalization.
The presentation concludes with a short discussion of some open questions
and suggestions for further research.
Although the chapters build on each other in content, our aim was to
make each chapter as self-explanatory as possible. As a result, duplication and
cross-references could not always be avoided. Sometimes a question had to be
treated again from a different point of view. The chosen cross-disciplinary
approach made it necessary in some places to use our own terminology.
The particularly important terminological definitions have been compiled in a
systematic glossary in the appendix.
Many colleagues have substantially supported our work and contributed to
our findings, especially through patient and continuous discussions. First, we would
like to mention the members of the Cologne staff of both projects, CrissCross and
Reseda: Anne Betz, Felix Boteram, Jan-Helge Jacobs, Tina Mengel, Katrin Müller
and Michael Panzer (né Preuss). We would like to thank them all; our work
would not have been successful without their help. Special thanks go to Jens Wille,
who set up a Web search environment for our experiments with typed relations
and thus allowed us to perform the first tests and to verify our statements. We
also benefited from many people whom we cannot all mention by name, especially
the members of our project partner institutions and other colleagues interested in
our work. We wish to thank them, too.
Winfried Gödert
Jessica Hubrich
Matthias Nagelschmidt

Table of Contents
Preface
1 Introduction: Envisioning Semantic Information Spaces
Part A Propaedeutics – Organizing, Representing, and Exploring Knowledge
2 Indexing and Knowledge Organization
2.1 Knowledge Organization Systems as Indexing Languages
2.1.1 Building Elements: Entities and Terms
2.1.2 Structural Elements: Intrasystem Relations
2.1.3 Result Elements: Indexates
2.2 Standards and Frameworks
2.2.1 ISO 25964: Thesauri and Interoperability with other Vocabularies
2.2.2 Functional Requirements for Subject Authority Data (FRSAD)
3 Semantic Technologies for Knowledge Representation
3.1 Web-based Representation Languages
3.1.1 XML
3.1.2 RDF/RDFS
3.1.3 OWL
3.2 Application-based Representation Languages
3.2.1 XTM
3.2.2 SKOS
4 Information Retrieval and Knowledge Exploration
4.1 Information Retrieval Essentials
4.1.1 Exact Match Paradigm
4.1.2 Partial Match Paradigm
4.2 Measuring Effectiveness in Information Retrieval
4.3 From Retrieving to Exploring
4.3.1 String-based Retrieval Processes
4.3.2 Conceptual Retrieval Process
4.3.3 Conceptual Exploration Processes
4.3.4 Topical Exploration Processes
4.4 From Homogeneous to Heterogeneous Information Spaces
Part B Status quo – Handling Heterogeneity in Indexing and Retrieval
5 Approaches to Handle Heterogeneity
5.1 Citation Pearl Growing
5.2 Modeling Multilingual Indexing Languages
5.3 Establishing Semantic Interoperability between Indexing Languages
5.3.1 Structural Models
5.3.2 Mapping Levels
5.3.3 Vocabulary Linking Projects
6 Problems with Establishing Semantic Interoperability
6.1 Conceptual Interoperability between Entities of Indexing Languages
6.1.1 Focused and Comprehensive Mapping
6.1.2 Conceptual Identity and Semantic Congruence
6.2 Equivalent Intersystem Relationships
6.2.1 Intersystem Relations Compared to Intrasystem Relations
6.2.2 Interoperability and Search Tactics
6.2.3 Specified Intersystem Relationships
6.2.4 Conceptual Interoperability between Indexing Results
6.2.5 Directedness of Intersystem Relationships
Part C Vision – Ontology-based Indexing and Retrieval
7 Formalization in Indexing Languages
7.1 Introduction and Objectives
7.2 Common Characteristics and Differences between Indexing Languages and Formal Knowledge Representation
7.3 Prerequisites for an Ontology-based Indexing
7.3.1 Semantic Relations and Inferred Document Sets
7.3.2 Facets and Inferences
8 Typification of Semantic Relations
8.1 Inventories of Typed Relations
8.2 Typed Relations and their Benefit for Indexing and Retrieval
8.3 Examples of the Benefit of Typed Relations for the Retrieval Process
8.3.1 Example 1: Aspect-oriented Specification of the Generic Hierarchy Relation
8.3.2 Example 2: Typed Relations of a Topic Map built from the ASIST Thesaurus
8.3.3 Example 3: Degrees of Determinacy
9 Inferences in Retrieval Processes
9.1 Inferences of Level 1
9.1.1 Hierarchical Relationships
9.1.2 Associative Relationships
9.1.3 Typification of the Synonymy / Equivalence Relationship
9.2 Inferences of Level 2 and of Higher Levels, Transitivity
9.2.1 Hierarchical Relationships
9.2.2 Unspecific Associative Relationships
9.2.3 Typification of Associative Relationships
9.3 Inferences by Combining Different Types of Relationships
9.3.1 Synonymy Relation with Hierarchical Relationships
9.3.2 Chronological Relation with Hierarchical Relationships
9.3.3 Transitions from Associative Relationships to a Hierarchical Structure
9.3.4 Transitions from a Hierarchical Structure to Associative Relationships
9.3.5 Transitivity for Combinations of Typed Associative Relationships
10 Semantic Interoperability and Inferences
10.1 Conditions for Entity-based Interoperability
10.2 Models of Semantic Interoperability
10.2.1 Ontological Spine and Satellite Ontologies
10.2.2 Degrees of Determinacy and Interoperability
10.2.3 Entity-based Interoperability and Facets
10.3 Perspective: Ontology-based Indexing and Retrieval
11 Remaining Research Questions
11.1 Questions of Modeling
11.2 Questions of Procedure
11.3 Questions of Technology and Implementation
Part D Appendices
Systematic Glossary
Abbreviations
List of figures
List of tables
References
Index

1 Introduction: Envisioning Semantic Information
Spaces
Indexing languages, interoperability, information retrieval, semantic technologies –
is it really worth examining the particular interaction of these rather differing
subjects, as we do in this book? In this preliminary chapter we try to give
a first answer as to why we think it is. To this end, we pick up the idea of a semantic
information space, which was already mentioned in the preface, and make
it more concrete by envisioning some examples. We will take a first naive look
at search situations and the impact of semantic knowledge representation, yet
without considering the conceptual or technical background. Thus in this first
look, information retrieval systems, indexing languages and semantic technolo-
gies are treated as a black box, which ideally provides a search environment that
can be somehow characterized as a semantic information space.
Examples in this book are heterogeneous and (amongst some others) taken
from the domains of chemistry, physics and biology, particularly ornithology.
Although neither the authors nor the subjects of this book are affiliated with these
disciplines, we will nevertheless occasionally revert to them: they are clearly
outside our own profession and can thus be seen as a “neutral” domain,
which seems to carry a lower risk of misunderstanding than examples from
the less exact fields of the humanities or social sciences would.
However, no special skills in the natural sciences are needed to read
and understand the examples and to follow the argumentation. All examples are
trivial enough to be understood even without any substantial chemical, physical
or zoological knowledge.
When speaking of an “information space”, one could quite generally think of
two extremes: either a collection of information resources that are largely homogeneous
in form and content and centralized in one store, or a heterogeneous collection,
distributed over several repositories and organized independently of
each other. The first extreme is embodied, e.g., by traditional library collections,
while the most prominent example of the latter is the World Wide Web. In the
following, both extremes and every possible specification between them shall be
understood as information spaces.
We begin our consideration with a relatively simply organized information
space. Figure 1.1 shows a situation that is reminiscent of a bibliographic database.
The document store contains a number of bibliographic records, which represent
two monographs written by the German chemist and Nobel Prize laureate
Otto Hahn and one book of correspondence from the physicist Lise Meitner to Otto
Hahn. To represent the authorship of Otto Hahn and Lise Meitner for each document
consistently, a name authority file is used, which contains personal name
authority records of both scientists that can be linked to the stored documents.
In this way, one can easily search the information space, e.g., for all documents
written by Otto Hahn; this search operation is often referred to as a collocation
search.
Fig. 1.1: Authority files in information spaces.
Another search operation can be described as a subject search. That would be a
search e.g. for all documents about “radioactivity”. To carry out subject searches,
the information space must somehow provide the information of what each doc-
ument is “about” – in the indexing context we also speak of the aboutness of a
document (cf. Ingwersen 1992, 50–54). In bibliographic databases this aboutness
is traditionally represented by one or more subject headings or thesaurus descrip-
tors. In order to provide a consistent representation, the subject headings can be
organized in a subject headings authority file, so that each subject heading has
its own authority record that can be linked to the appropriate document records
(cf. Fig. 1.1).
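The arrangement of Fig. 1.1 can be illustrated by a minimal Python sketch. It does not reproduce any real catalog system: the record identifiers (`n1`, `n2`, `s1`), the class and function names, and the placeholder title for the second monograph are assumptions made purely for illustration. Documents do not store names or subject strings themselves but only links to authority records, so that collocation and subject searches become simple lookups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorityRecord:
    record_id: str          # hypothetical identifier, not a real authority number
    preferred_label: str


@dataclass(frozen=True)
class DocumentRecord:
    title: str
    author_ids: tuple = ()   # links into the name authority file
    subject_ids: tuple = ()  # links into the subject headings authority file


# Name authority file and subject headings authority file (illustrative content).
NAMES = {
    "n1": AuthorityRecord("n1", "Hahn, Otto"),
    "n2": AuthorityRecord("n2", "Meitner, Lise"),
}
SUBJECTS = {
    "s1": AuthorityRecord("s1", "radioactivity"),
}

# Document store: two monographs by Otto Hahn and one book of correspondence.
DOCUMENTS = [
    DocumentRecord("Applied radiochemistry", ("n1",), ("s1",)),
    DocumentRecord("Second monograph by Otto Hahn (title not given)", ("n1",)),
    DocumentRecord("Letters of Lise Meitner to Otto Hahn", ("n2",)),
]


def find_record_id(authority_file, label):
    """Return the identifier of the record with the given preferred label."""
    for record in authority_file.values():
        if record.preferred_label == label:
            return record.record_id
    return None


def collocation_search(name_label):
    """All documents linked to the authority record of the given person."""
    record_id = find_record_id(NAMES, name_label)
    return [d.title for d in DOCUMENTS if record_id in d.author_ids]


def subject_search(subject_label):
    """All documents linked to the authority record of the given subject."""
    record_id = find_record_id(SUBJECTS, subject_label)
    return [d.title for d in DOCUMENTS if record_id in d.subject_ids]
```

Calling `collocation_search("Hahn, Otto")` returns the two monographs, while `subject_search("radioactivity")` returns only the document that was indexed with that heading. A label that has no authority record yields an empty list.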
There is nothing special about the situation described so far, and everybody who
has ever used an online catalog of a library should be familiar with it, as it corresponds
to the way bibliographic data has been organized for a long time and still
continues to be organized by documentary institutions and especially libraries.
However, knowledge representation begins beyond this situation.
In Figure 1.2 the authority files are replaced by a network-like structure. The
now grey shaded elements of Figure 1.1 seem to become more complex, as they
are somehow embedded in a meaningful context – later on in this book, we will
address these elements precisely and speak more abstractly of entities of a knowledge
representation. What we are characterizing here rather vaguely as a “meaningful
context” raises these entities from the keyword-based level in Figure 1.1 to a
conceptual level in Figure 1.2. We will examine this important step in the follow-
ing chapters and confine ourselves here to the determination that these concepts
primarily can be used for indexing the stored documents and thereby fulfill the
same basic descriptor function as simple keywords, but that they also open up a
broader context, as they are connected to other, somehow related concepts. In the
following, this situation will be referred to as a knowledge structure.
Fig. 1.2: Knowledge structures in information spaces.
Searching the information space in Figure 1.2 with a descriptor “radioactivity”
leads not only to the indexed monograph of Otto Hahn “Applied radiochemistry”,
but also to the related descriptors “activity level” and “radioisotope”. It becomes
apparent that an information seeker who is interested in “radioactivity” could
also be interested in certain levels of radioactivity or in concrete radioactive isotopes.
The same seems to apply to “nuclear fission” and “nuclear reaction”: it
is not unlikely that an information seeker with an interest in nuclear fission may
also be interested in other nuclear reactions. Beyond that, the knowledge structure
of Figure 1.2 also makes explicit a relationship between Otto Hahn and the rather
abstract concept “person”, as well as between Otto Hahn’s research colleague
Lise Meitner and “person”. As humans, we have no difficulty with the cognitive
interpretation of these relations: we can easily see that Otto Hahn and Lise
Meitner are persons, even if we have never heard their names before. By using semantic
technologies, this knowledge can be made machine-readable, so that a machine would
be able to infer (Glossary C3.2) that Otto Hahn is a person due to the fact that the
concept “Hahn, Otto” is related to the concept “person” in a specific way. Likewise,
the risk of confusing the person Otto Hahn with the homonymous research
vessel, which was launched in 1964 and named after the famous scientist, could
be avoided.
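A small Python sketch may make the step from Fig. 1.1 to Fig. 1.2 more tangible. It stores the knowledge structure as triples and derives both related descriptors and class membership from them. The relation names (`related_to`, `is_a`, `instance_of`), the label “Otto Hahn (ship)” and the function names are assumptions introduced here for illustration; the text above does not specify how the relations in Fig. 1.2 are typed.

```python
# Knowledge structure as (subject, relation, object) triples.
# Relation names are illustrative assumptions, not taken from Fig. 1.2.
TRIPLES = {
    ("radioactivity", "related_to", "activity level"),
    ("radioactivity", "related_to", "radioisotope"),
    ("nuclear fission", "is_a", "nuclear reaction"),
    ("Hahn, Otto", "instance_of", "person"),
    ("Meitner, Lise", "instance_of", "person"),
    ("Otto Hahn (ship)", "instance_of", "research vessel"),
}


def related_concepts(concept):
    """Concepts directly connected to `concept`, ignoring instance links."""
    found = set()
    for subject, relation, obj in TRIPLES:
        if relation == "instance_of":
            continue
        if subject == concept:
            found.add(obj)
        elif obj == concept:
            found.add(subject)
    return sorted(found)


def classes_of(entity):
    """Classes of `entity`: its direct classes plus their broader classes."""
    found = set()
    frontier = [o for s, r, o in TRIPLES if s == entity and r == "instance_of"]
    while frontier:
        cls = frontier.pop()
        if cls in found:
            continue
        found.add(cls)
        # Follow the generic hierarchy upwards from this class.
        frontier.extend(o for s, r, o in TRIPLES if s == cls and r == "is_a")
    return found


def is_person(entity):
    """Machine inference: is the entity related to the concept 'person'?"""
    return "person" in classes_of(entity)
```

With these triples, `related_concepts("radioactivity")` yields the two neighbouring descriptors, `is_person("Hahn, Otto")` holds, and `is_person("Otto Hahn (ship)")` does not, so the scientist and the research vessel remain distinguishable even though they share a name.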
At this point we have already mentioned many aspects and reached the
core issues of this book. In the following, we will take a closer look at searches in
information spaces and the underlying information retrieval processes and thereby
give a first impression of the usefulness of relations like those described above.
We will also look at the interdependency between indexing and information
retrieval processes, introduce Knowledge Organization Systems (KOSs) as types
of knowledge structures that are designed to support indexing and retrieval, and
finally address questions such as how it could be made explicit and recognizable for
a KOS that a document “Letters of Lise Meitner to Otto Hahn” is about letters that
Lise Meitner wrote to Otto Hahn and not vice versa.
Based on this, we will provide a more systematic discussion of the specific
types of relations and their functionality within and between knowledge
structures – later on we will speak of them as intra- and intersystem relations.
Yet, before that, some preliminary considerations will be provided in order to
facilitate a better understanding of the mentioned issues.
Accordingly, we will address the functionality of intersystem relations, i.e.,
those relations that bridge two knowledge structures and thereby make
them interoperable to some degree. In this context, we will focus on the problems
of heterogeneity that may arise, e.g., from the use of different knowledge structures
for indexing purposes. This is illustrated in Figure 1.3, where single concepts of our
example knowledge structure are linked to other, actually existing structures,
namely the Library of Congress Subject Headings (LCSH), the International
Nuclear Information System / Energy Technology Data Exchange (INIS/ETDE), and
the YAGO project.
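How an intersystem relation bridges two structures can be sketched as a simple crosswalk in Python. The target labels below are placeholders chosen for illustration; they are not taken from Figure 1.3 and have not been checked against LCSH, INIS/ETDE or YAGO.

```python
from collections import defaultdict

# (local concept, target system, label in the target system)
# All target labels are illustrative placeholders, not verified entries.
MAPPINGS = [
    ("nuclear fission", "LCSH", "Nuclear fission"),
    ("nuclear fission", "INIS/ETDE", "FISSION"),
    ("Hahn, Otto", "LCSH", "Hahn, Otto"),
    ("Hahn, Otto", "YAGO", "Otto_Hahn"),
]


def build_crosswalk(mappings):
    """Group the mappings by local concept, then by target system."""
    crosswalk = defaultdict(dict)
    for local, system, target in mappings:
        crosswalk[local][system] = target
    return crosswalk


def translate(concept, system, crosswalk):
    """Return the label used for a local concept in another system, if mapped."""
    return crosswalk.get(concept, {}).get(system)


if __name__ == "__main__":
    cw = build_crosswalk(MAPPINGS)
    for system in ("LCSH", "INIS/ETDE", "YAGO"):
        print(system, "->", translate("nuclear fission", system, cw))
```

A query formulated with the local concept can then be rewritten per collection; a missing mapping (here `None`) is exactly where heterogeneity between the structures becomes visible.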

Fig. 1.3: Interoperability in information spaces.
These three structures, which were arbitrarily selected for this example, are quite
different in their organization, coverage and purpose. The LCSH can be
characterized as an authority file, INIS/ETDE is a thesaurus that has been developed
and used by the International Atomic Energy Agency (IAEA)1, and YAGO is an
ontology mainly built up with vocabulary from Wikipedia2. Since we haven’t
1 http://www.iaea.org/inis/products-services/thesaurus
2 http://www.mpi-inf.mpg.de/yago-naga/yago


