4.1 Semantic Data and Web: Unit 4 Ontology
Data is organized based on binary models of objects, usually in groups of three parts: two objects
and their relationship. For example, if one wanted to represent a cup sitting on a table, the data
organization might look like this: CUP TABLE. The objects (cup and table) are interpreted with regard
to their relationship (sitting on). The data is organized linearly, telling the software that since CUP
comes first in the line, it is the object that acts. In other words, the position of the word tells the
software that the cup is on the table and not that the table is sitting on the cup. Databases designed
around this concept have greater applicability and are more easily integrated into other databases.
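The three-part organization described above can be sketched in a few lines of Python. This is only an illustration: the `Triple` type and the predicate name `sittingOn` are assumptions made for the example, not part of any standard.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: the acting object, the relationship, the other object."""
    subject: str
    predicate: str
    obj: str


# "The cup is sitting on the table": CUP comes first, so it is the object that acts.
cup_on_table = Triple("CUP", "sittingOn", "TABLE")

# Swapping the positions produces a different statement (the table on the cup).
table_on_cup = Triple("TABLE", "sittingOn", "CUP")


def describe(t: Triple) -> str:
    """Render a triple in reading order."""
    return f"{t.subject} {t.predicate} {t.obj}"
```

Because position carries meaning, `cup_on_table` and `table_on_cup` compare as unequal even though they mention the same two objects.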
The Semantic Web is a vision of an extension of the existing World Wide Web that provides
software programs with machine-interpretable metadata of the published information and data. In
other words, we add further data descriptors to otherwise existing content and data on the Web. As
a result, computers are able to make meaningful interpretations similar to the way humans process
information to achieve their goals.
What is an ontology?
An ontology is a way to describe the world. From one perspective, language is an ontology:
a set of labels that give meaning to real-world things.
An ontology is an explicit specification of a conceptualization. The term is borrowed from
philosophy, where an Ontology is a systematic account of Existence. For AI systems, what "exists" is
that which can be represented. When the knowledge of a domain is represented in a declarative
formalism, the set of objects that can be represented is called the universe of discourse. This set of
objects, and the describable relationships among them, are reflected in the representational
vocabulary with which a knowledge-based program represents knowledge. Thus, in the context of
AI, we can describe the ontology of a program by defining a set of representational terms. In such an
ontology, definitions associate the names of entities in the universe of discourse (e.g., classes,
relations, functions, or other objects) with human-readable text describing what the names mean,
and formal axioms that constrain the interpretation and well-formed use of these terms. Formally,
an ontology is the statement of a logical theory.
Example:
A general example may help. A bookseller may want to integrate data coming from different
publishers. The data can be imported into a common RDF model, e.g., by using converters to the
publishers’ databases. However, one database may use the term “author”, whereas the other may
use the term “creator”. To make the integration complete, an extra definition should be added to
the RDF data, describing the fact that the relationship described as “author” is the same as
“creator”. This extra piece of information is, in fact, a vocabulary (or an ontology), albeit an
extremely simple one.
In a more complex case the application may need a more detailed ontology as part of the extra
information. This may include a formal description of how authors are to be uniquely identified (e.g., in
a US setting, by referring to a unique social security number), how the terms used in this particular
application relate to other datasets on the Web (e.g., Wikipedia or geographic information), how the
term “author” (or “creator”) can be related to terms like “editors”, etc.
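The bookseller example can be sketched as follows. The book identifiers, names, and the `SAME_AS` table are hypothetical; the table plays the role of the "extra definition" stating that "creator" means the same as "author".

```python
# Hypothetical triples imported from two publishers' databases.
publisher_a = [("book1", "author", "Jane Doe")]
publisher_b = [("book2", "creator", "John Roe")]

# The extra definition: "creator" denotes the same relationship as "author".
SAME_AS = {"creator": "author"}


def integrate(*sources):
    """Merge all sources, rewriting equivalent terms to one common term."""
    merged = set()
    for source in sources:
        for subject, predicate, value in source:
            merged.add((subject, SAME_AS.get(predicate, predicate), value))
    return merged


def authors(triples):
    """List every author value in the merged data, in sorted order."""
    return sorted(value for _, predicate, value in triples if predicate == "author")
```

After integration, a single query on "author" finds the people from both databases.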
What is RDF?
In the Semantic Web we refer to the things in the world as resources; a resource can be
anything that someone might want to talk about.
Resource is the word used in the Semantic Web standards. The name of the base technology
in the Semantic Web (RDF) uses this word in an essential way. RDF stands for Resource
Description Framework.
In a web of information, anyone can contribute to our knowledge about a resource. It was
this aspect of the current Web that allowed it to grow at such an unprecedented rate. To
implement the Semantic Web, we need a model of data that allows information to be
distributed over the Web. RDF can represent data in a distributed way across the Web.
RDF Example:
<rdf:RDF
xmlns:mfg="http://www.WorkingOntologist.com/Examples/Chapter3/Manufacturing.rdf#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<mfg:Product
rdf:about="http://www.WorkingOntologist.com/Examples/Chapter3/Manufacturing.rdf#Product1">
<mfg:Available>23</mfg:Available>
<mfg:Division>Manufacturing support</mfg:Division>
<mfg:ProductLine>Paper machine</mfg:ProductLine>
<mfg:SKU>FB3524</mfg:SKU>
<mfg:ModelNo>ZX-3</mfg:ModelNo>
<mfg:ManufactureLocation>Sacramento</mfg:ManufactureLocation>
</mfg:Product>
<mfg:Product
rdf:about="http://www.WorkingOntologist.com/Examples/Chapter3/Manufacturing.rdf#Product2">
<mfg:SKU>KD5243</mfg:SKU>
<mfg:Division>Manufacturing support</mfg:Division>
<mfg:ManufactureLocation>Sacramento</mfg:ManufactureLocation>
<mfg:Available>4</mfg:Available>
<mfg:ModelNo>ZX-3P</mfg:ModelNo>
<mfg:ProductLine>Paper machine</mfg:ProductLine>
</mfg:Product>
</rdf:RDF>
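To see how such a document maps onto triples, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is not a full RDF/XML parser: it only handles the flat shape used in the example (typed nodes with rdf:about and literal-valued child elements), and the shortened `DOC` string is an assumption made to keep the example small.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MFG_NS = "http://www.WorkingOntologist.com/Examples/Chapter3/Manufacturing.rdf#"

# A shortened version of the manufacturing example.
DOC = f"""<rdf:RDF xmlns:mfg="{MFG_NS}" xmlns:rdf="{RDF_NS}">
  <mfg:Product rdf:about="{MFG_NS}Product1">
    <mfg:SKU>FB3524</mfg:SKU>
    <mfg:ModelNo>ZX-3</mfg:ModelNo>
  </mfg:Product>
  <mfg:Product rdf:about="{MFG_NS}Product2">
    <mfg:SKU>KD5243</mfg:SKU>
    <mfg:ModelNo>ZX-3P</mfg:ModelNo>
  </mfg:Product>
</rdf:RDF>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into a full URI."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def to_triples(xml_text):
    """Read the flat RDF/XML shape above as (subject, predicate, object) triples."""
    triples = []
    for node in ET.fromstring(xml_text):
        subject = node.get(f"{{{RDF_NS}}}about")
        # A typed node element such as mfg:Product implies an rdf:type triple.
        triples.append((subject, RDF_NS + "type", expand(node.tag)))
        for prop in node:
            triples.append((subject, expand(prop.tag), prop.text))
    return triples
```

Each product element yields one rdf:type triple plus one triple per child element, all sharing the product's URI as subject.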
Main tools:
An RDF parser reads text in a serialization format (such as RDF/XML or Turtle) and interprets it
as triples in the RDF data model. An RDF serializer does the reverse: it takes a set of triples and
creates a file that expresses that content in one of the serialization formats.
Parser: RDF can be in RDF/XML or Turtle format; the parser converts it into an RDF graph.
Serializer: does the opposite; from a graph it creates a serialized version of it.
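The parser/serializer relationship can be illustrated with a round trip. The sketch below writes triples in a simplified line format modeled on N-Triples and reads them back. It assumes every object is a plain literal without line breaks, so it is a toy, not a conforming N-Triples implementation.

```python
import re

# One statement per line: <subject> <predicate> "literal" .
LINE = re.compile(r'<([^>]*)> <([^>]*)> "((?:[^"\\]|\\.)*)" \.')


def serialize(triples):
    """Graph to text: one line per triple, in sorted order."""
    lines = []
    for subject, predicate, value in sorted(triples):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{predicate}> "{escaped}" .')
    return "\n".join(lines)


def parse(text):
    """Text to graph: the reverse of serialize."""
    triples = set()
    for line in text.splitlines():
        match = LINE.fullmatch(line.strip())
        if match:
            subject, predicate, value = match.groups()
            # Undo the backslash escaping applied by serialize.
            triples.add((subject, predicate, re.sub(r"\\(.)", r"\1", value)))
    return triples
```

Parsing the output of the serializer gives back the same set of triples, which is the property that lets different tools exchange a graph through files.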
RDF Converters: Sometimes the data source is not in RDF form and has to be converted.
Examples are relational databases and spreadsheets, but also microformats (special attributes in
HTML tags, e.g. for business cards or events) and RDFa (the same idea: RDF is embedded in HTML
attributes to make the HTML data machine-processable).
RDF Store
This is a database tuned for storing and retrieving triples. It should also be able to
merge information from multiple data sources (unlike relational databases).
The OWL Web Ontology Language is a language for defining and instantiating Web
ontologies. Ontology is a term borrowed from philosophy that refers to the science of
describing the kinds of entities in the world and how they are related.
An OWL ontology may include descriptions of classes, properties and their instances. Given
such an ontology, the OWL formal semantics specifies how to derive its logical
consequences, i.e. facts not literally present in the ontology, but entailed by the semantics.
These entailments may be based on a single document or multiple distributed documents
that have been combined using defined OWL mechanisms.
1. Namespaces
A standard initial component of an ontology includes a set of XML namespace declarations
enclosed in an opening rdf:RDF tag.
These provide a means to unambiguously interpret identifiers and make the rest of the
ontology presentation much more readable.
A typical OWL ontology begins with a namespace declaration similar to the following.
<rdf:RDF
xmlns ="http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"
xmlns:vin ="http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"
xml:base ="http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"
xmlns:food="http://www.w3.org/TR/2004/REC-owl-guide-20040210/food#"
xmlns:owl ="http://www.w3.org/2002/07/owl#"
xmlns:rdf ="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd ="http://www.w3.org/2001/XMLSchema#">
As an aid to writing lengthy URLs it can often be useful to provide a set of entity definitions
in a document type declaration (DOCTYPE) that precedes the ontology definitions.
The names defined by the namespace declarations only have significance as parts of XML
tags. Attribute values are not namespace sensitive. But in OWL we frequently reference
ontology identifiers using attribute values.
They can be written down in their fully expanded form, for example
"http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#merlot". Alternatively,
abbreviations can be defined using an ENTITY definition, for example:
<!DOCTYPE rdf:RDF [
<!ENTITY vin "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#" >
<!ENTITY food "http://www.w3.org/TR/2004/REC-owl-guide-20040210/food#" > ]>
After this pair of ENTITY declarations, we could write the value "&vin;merlot" and it would
expand to "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#merlot".
Perhaps more importantly, the rdf:RDF namespace declarations can then be simplified so
that changes made to the entity declarations will propagate through the ontology
consistently.
<rdf:RDF
xmlns ="&vin;"
xmlns:vin ="&vin;"
xml:base ="&vin;"
xmlns:food="&food;"
xmlns:owl ="http://www.w3.org/2002/07/owl#"
xmlns:rdf ="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd ="http://www.w3.org/2001/XMLSchema#">
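The entity mechanism can be checked with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree module and assumes that its underlying parser expands entities declared in the internal DOCTYPE subset, which is how it normally behaves. The tiny document and the use of rdf:Description are made up for the illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A minimal document with one ENTITY declaration and one use of it
# inside an attribute value.
DOC = """<!DOCTYPE rdf:RDF [
  <!ENTITY vin "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="&vin;merlot"/>
</rdf:RDF>"""

root = ET.fromstring(DOC)

# The parser has already replaced &vin; by the time the attribute is read.
about = root[0].get(f"{{{RDF_NS}}}about")
```

The application never sees the abbreviation, only the fully expanded URI, so changing the ENTITY declaration changes every identifier that uses it.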
2. Ontology Headers
Once namespaces are established we normally include a collection of assertions about the
ontology grouped under an owl:Ontology tag.
These tags support such critical housekeeping tasks as comments, version control and
inclusion of other ontologies.
<owl:Ontology rdf:about="">
<rdfs:comment>An example OWL ontology</rdfs:comment>
<owl:priorVersion rdf:resource="http://www.w3.org/TR/2003/PR-owl-guide-20031215/wine"/>
<owl:imports rdf:resource="http://www.w3.org/TR/2004/REC-owl-guide-20040210/food"/>
<rdfs:label>Wine Ontology</rdfs:label>
</owl:Ontology>
The owl:Ontology element is a place to collect much of the OWL meta-data for the
document.
The rdf:about attribute provides a name or reference for the ontology.
rdfs:comment provides the obvious needed capability to annotate an ontology.
owl:priorVersion is a standard tag intended to provide hooks for version control systems
working with ontologies.
owl:imports provides an include-style mechanism.
We also include an rdfs:label to support a natural language label for our ontology.
Classes
The most basic concepts in a domain should correspond to classes that are the roots of various
taxonomic trees.
Every individual in the OWL world is a member of the class owl:Thing. Thus each user-defined
class is implicitly a subclass of owl:Thing. Domain specific root classes are defined by simply
declaring a named class. OWL also defines the empty class, owl:Nothing.
For our sample wines domain, we create three root classes: Winery, Region, and ConsumableThing.
<owl:Class rdf:ID="Winery"/>
<owl:Class rdf:ID="Region"/>
<owl:Class rdf:ID="ConsumableThing"/>
The syntax rdf:ID="Region" is used to introduce a name as part of its definition.
The rdfs:subClassOf constructor relates a more specific class to a more general class. If X is a
subclass of Y, then every instance of X is also an instance of Y.
<owl:Class rdf:ID="PotableLiquid">
<rdfs:subClassOf rdf:resource="#ConsumableThing" />
...
</owl:Class>
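The subclass rule (every instance of X is also an instance of Y) can be sketched as a walk up the hierarchy. The individual `SampleDrink` is hypothetical, and the sketch assumes each class has at most one direct superclass, which OWL itself does not require.

```python
# Direct subclass links taken from the example above.
SUBCLASS_OF = {"PotableLiquid": "ConsumableThing"}

# Hypothetical individual with its asserted class.
TYPES = {"SampleDrink": "PotableLiquid"}


def classes_of(individual):
    """Return the asserted class followed by every inferred superclass."""
    result = []
    cls = TYPES.get(individual)
    # Follow the subclass chain upward, stopping at the root or on a cycle.
    while cls is not None and cls not in result:
        result.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return result
```

Only the membership in PotableLiquid is stated; the membership in ConsumableThing is derived from the subclass declaration.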
Individuals
<owl:Thing rdf:about="#CentralCoastRegion">
<rdf:type rdf:resource="#Region"/>
</owl:Thing>
Example:
To have a few more classes available for the properties introduced in later sections,
we define a branch of the Grape taxonomy, with an individual denoting the Cabernet Sauvignon
grape varietal. Grapes are defined in the food ontology:
<owl:Class rdf:ID="Grape">
...
</owl:Class>
<owl:Class rdf:ID="WineGrape">
<rdfs:subClassOf rdf:resource="&food;Grape" />
</owl:Class>
Properties
A property is a binary relation. Two types of properties are distinguished: datatype properties,
which are relations between instances of classes and RDF literals and XML Schema datatypes, and
object properties, which are relations between instances of two classes.
Note that the name object property is not intended to reflect a connection with the RDF term
rdf:object ([RDF]).
Properties, like classes, can be arranged in a hierarchy.
<owl:Class rdf:ID="WineColor">
<rdfs:subClassOf rdf:resource="#WineDescriptor" />
...
</owl:Class>
<owl:ObjectProperty rdf:ID="hasWineDescriptor">
<rdfs:domain rdf:resource="#Wine" />
<rdfs:range rdf:resource="#WineDescriptor" />
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="hasColor">
<rdfs:subPropertyOf rdf:resource="#hasWineDescriptor" />
<rdfs:range rdf:resource="#WineColor" />
...
</owl:ObjectProperty>
WineDescriptor properties relate wines to their color and components of their taste,
including sweetness, body, and flavor.
hasColor is a subproperty of the hasWineDescriptor property, with its range further
restricted to WineColor.
The rdfs:subPropertyOf relation in this case means that anything with a hasColor property
with value X also has a hasWineDescriptor property with value X.
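The meaning of rdfs:subPropertyOf can be sketched as a simple inference rule over triples. The individual `SampleWine` and the value `Red` are hypothetical and used only for illustration.

```python
# Subproperty links from the example above.
SUBPROPERTY_OF = {"hasColor": "hasWineDescriptor"}

# Hypothetical asserted fact.
facts = {("SampleWine", "hasColor", "Red")}


def infer_superproperties(triples):
    """Add (s, parent, o) for every (s, p, o) where p is a subproperty of parent."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat so that chains of subproperties are followed too
        changed = False
        for subject, prop, value in list(inferred):
            parent = SUBPROPERTY_OF.get(prop)
            if parent and (subject, parent, value) not in inferred:
                inferred.add((subject, parent, value))
                changed = True
    return inferred
```

The original hasColor statement is kept, and a hasWineDescriptor statement with the same value is added next to it.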
Next we introduce the locatedIn property, which relates things to the regions they are
located in.
<owl:ObjectProperty rdf:ID="locatedIn">
...
<rdfs:domain rdf:resource="http://www.w3.org/2002/07/owl#Thing" />
<rdfs:range rdf:resource="#Region" />
</owl:ObjectProperty>
Types of Properties
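1. TransitiveProperty
If a property P is specified as transitive, then for any x, y, and z: P(x,y) and P(y,z) implies P(x,z).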
Example:
<owl:ObjectProperty rdf:ID="locatedIn">
<rdf:type rdf:resource="&owl;TransitiveProperty" />
<rdfs:domain rdf:resource="&owl;Thing" />
<rdfs:range rdf:resource="#Region" />
</owl:ObjectProperty>
<Region rdf:ID="SantaCruzMountainsRegion">
<locatedIn rdf:resource="#CaliforniaRegion" />
</Region>
<Region rdf:ID="CaliforniaRegion">
<locatedIn rdf:resource="#USRegion" />
</Region>
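Because locatedIn is transitive, the two statements above entail that SantaCruzMountainsRegion is also located in USRegion. A minimal sketch of that inference, representing the property as a set of (x, y) pairs:

```python
def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are both present."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


# The two locatedIn statements from the example.
located_in = {
    ("SantaCruzMountainsRegion", "CaliforniaRegion"),
    ("CaliforniaRegion", "USRegion"),
}
```

Calling `transitive_closure(located_in)` keeps the two asserted pairs and adds the derived pair for SantaCruzMountainsRegion and USRegion.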
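2. SymmetricProperty
If a property P is tagged as symmetric, then for any x and y: P(x,y) iff P(y,x).
Example:
<owl:ObjectProperty rdf:ID="adjacentRegion">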
<rdf:type rdf:resource="&owl;SymmetricProperty" />
<rdfs:domain rdf:resource="#Region" />
<rdfs:range rdf:resource="#Region" />
</owl:ObjectProperty>
<Region rdf:ID="MendocinoRegion">
<locatedIn rdf:resource="#CaliforniaRegion" />
<adjacentRegion rdf:resource="#SonomaRegion" />
</Region>
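Since adjacentRegion is symmetric, stating that MendocinoRegion is adjacent to SonomaRegion also implies the reverse. As a small sketch over (x, y) pairs:

```python
def symmetric_closure(pairs):
    """For every (x, y) also include (y, x)."""
    return set(pairs) | {(b, a) for a, b in pairs}


# The single adjacentRegion statement from the example.
adjacent_region = {("MendocinoRegion", "SonomaRegion")}
```

Only one direction has to be written down in the ontology; the other follows from the property's declaration.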
3. FunctionalProperty
In our wine ontology, hasVintageYear is functional. A wine has a unique vintage year. That is, a given
individual Vintage can only be associated with a single year using the hasVintageYear property. It is
not a requirement of an owl:FunctionalProperty that all elements of the domain have values. See the
discussion of Vintage cardinality.
<owl:ObjectProperty rdf:ID="hasVintageYear">
<rdf:type rdf:resource="&owl;FunctionalProperty" />
<rdfs:domain rdf:resource="#Vintage" />
<rdfs:range rdf:resource="#VintageYear" />
</owl:ObjectProperty>
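A functional property allows at most one value per subject. The sketch below finds subjects that have been given more than one value. Note the assumption: an OWL reasoner would not simply report an error here, but would conclude that the values denote the same individual unless they are stated to be different. The individuals `Vintage1`, `Vintage2` and the year names are hypothetical.

```python
from collections import defaultdict


def functional_conflicts(pairs):
    """Map each subject to its values, keeping only subjects with several values."""
    values = defaultdict(set)
    for subject, value in pairs:
        values[subject].add(value)
    return {subject: vals for subject, vals in values.items() if len(vals) > 1}


# Hypothetical hasVintageYear statements; Vintage1 has two candidate years.
has_vintage_year = [
    ("Vintage1", "Year1998"),
    ("Vintage1", "Year1999"),
    ("Vintage2", "Year2000"),
]
```

A subject with no value at all is not reported, which matches the remark that not every element of the domain needs a value.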
4. inverseOf
If a property, P1, is tagged as the owl:inverseOf P2, then for all x and y:
P1(x,y) iff P2(y,x)
Note that the syntax for owl:inverseOf takes a property name as an argument. A iff B means (A
implies B) and (B implies A).
<owl:ObjectProperty rdf:ID="hasMaker">
<rdf:type rdf:resource="&owl;FunctionalProperty" />
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="producesWine">
<owl:inverseOf rdf:resource="#hasMaker" />
</owl:ObjectProperty>
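The inverse rule can be sketched over triples as follows. The individuals `SampleWine` and `SampleWinery` are hypothetical; only the property names come from the example.

```python
def add_inverse(triples, prop, inverse):
    """Add (o, inverse, s) for each (s, prop, o), and the other way around."""
    inferred = set(triples)
    for subject, predicate, value in triples:
        if predicate == prop:
            inferred.add((value, inverse, subject))
        elif predicate == inverse:
            inferred.add((value, prop, subject))
    return inferred


# Hypothetical asserted fact using hasMaker.
facts = {("SampleWine", "hasMaker", "SampleWinery")}
```

With `add_inverse(facts, "hasMaker", "producesWine")`, the statement that the winery produces the wine is derived without being written in the data.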