Semantic Web Wednesday
Data
Origin of Internet
Current Web
WWW
Static URI, HTML, HTTP
World Wide Web
(WWW)
An information system on the Internet
which allows documents to be connected
to other documents by hypertext links,
enabling the user to search for information
by moving from one document to another.
In 1989, Sir Tim Berners-Lee invented the
World Wide Web. Then, he gave it to the
world for free. Now, it’s up to all of us to
protect and enhance it.
Tim Berners-Lee: Father of the
Web
CONTRIBUTIONS:
HTML: (HyperText Markup Language) The markup (formatting)
Web Versions
Web 1.0 was the “read-only era” of static websites, where
information flowed one way: producers simply presented it
to users.
The term Web 2.0 was coined in 1999 and popularized in 2004;
it still holds good today. Web 2.0 is the “read-write-publish”
era of interactive websites, with well-known examples such as
Twitter and Facebook.
The WWW (World Wide Web) has grown to be
the largest repository of information, leading to an
Information Technology (IT) revolution.
The Semantic Web is the extension of the present Web 2.0
to Web 3.0, which would enable machines to understand
data and work on behalf of humans for more efficient
search results. The Web was originally designed to
be processed by humans, not by machines.
Today’s Web
Currently most Web content suits
human needs and is usable by humans
only.
Typical uses of the Web today include
seeking, publishing, and using information,
searching for people and products,
shopping, reviewing catalogues, etc.
Dynamic pages are generated based on
information from databases but without
the original information structure found in
the databases.
The Syntactic Web
A place where
computers do the presentation (easy) and
people do the linking and interpreting
(hard).
[Goble, 03]
What is the Problem?
Consider a typical web
page:
Markup consists of:
rendering
information (e.g.,
font size and
colour)
Hyper-links to
related content
Semantic content is
accessible to humans
but not (easily) to
computers…
[Davies, 03]
…Limitations of Today’s
Web
Simply because we are now
connecting almost everything we can:
smart TVs, microwaves, cameras,
plugs, and more.
Web 2.0 and Web 3.0
TOWARDS SEMANTIC WEB:
Web Limitations
World Wide Web:
Doubles in size every six months
Average WWW searches examine only about 25% of potentially
relevant sites and return a lot of unwanted information
Information on the web is not suitable for software agents
Semantic Web:
The Semantic Web is a vision: the idea of having
data on the Web defined and linked in a way that it can be
used by machines not just for display purposes, but for
automation, integration and reuse of data across various
applications.
The Semantic Web: Web of
Data
Extension of Web 2.0 to
Web 3.0
Present Web 2.0 + Ontology + Intelligent/Semantic Services
-> Future Web 3.0: Intelligent Web (Semantic Web)
Semantics incorporated, a smarter Web (machine-understandable).
Transition: WWW to Semantic Web
Serious Problems in information
•finding
•extracting
•representing
•interpreting
•and maintaining
WWW: Static URI, HTML, HTTP
Semantic Web: RDF, RDF(S), OWL
From the World Wide Web
to the
Web of Data
Third Generation: The Web of
Data
Data centred processing
Semantic Web Definitions:
Tim Berners-Lee, James Hendler, and Ora Lassila definition:
The Semantic Web is an extension of the current web in which
information is given well-defined meaning, better enabling computers
and people to work in co-operation. [Berners-Lee et al., 2001]
W3C and Tim Berners-Lee definition: The Semantic Web is a
vision: the idea of having data on the web defined and linked in a way
that it can be used by machines not just for display purposes, but for
automation, integration, and reuse of data across various applications.
Tim Berners-Lee definition: The Semantic Web will bring structure
to the meaningful content of web pages, creating an environment
where software agents roaming from page to page can readily carry
out sophisticated tasks for users [Shadbolt et al., 2006].
Why Semantic Web
To date, web search is typically
based on keyword searching.
The Semantic Web is proposed to
support more involved questions,
relationships and trust.
Instead of word matching, the web will be
able to show related items, revealing new
relationships.
For example: how does the weather affect
the stock market? Crime? Birth rate?
Vision and Goal
The Semantic Web: Web of
Data
Semantic Web enables connecting
new information with data and
knowledge stored in various places
and querying such linked data as one
distributed database
The Semantic Web of
Things:
Semantic Web: Resource
Integration
Semantic
annotation
Shared
ontology
Web resources /
services / DBs / etc.
Semantic Web: which resources to annotate?
Semantic annotation
Shared ontology
Web resources / services / DBs / etc.
(this is just a small part of the Semantic Web concern!)
Technological and business processes
External world resources
Multimedia resources
Web users (profiles, preferences)
Components
URI- A Uniform Resource Identifier (URI) is a string of a standardized form that
uniquely identifies a resource.
XML- The Extensible Markup Language (XML) layer, with XML namespaces and XML
Schema definitions, provides a general-purpose markup language for documents containing
structured information.
RDF- RDF is the core data representation format for the Semantic Web. RDF is a simple
metadata representation framework, using URIs to identify web-based resources and a
graph model for describing relationships between resources. It creates statements in the
form of triples, i.e. subject-predicate-object.
RDF SCHEMA- RDFS can be used to describe taxonomies of classes and properties and
to use them to create lightweight ontologies.
ONTOLOGIES- Ontologies provide the building blocks for expressing semantics in a
well-defined manner. An ontology is a formal conceptualization of a domain that is usable by a
computer. Detailed ontologies can be created with the Web Ontology Language (OWL).
SPARQL- For querying RDF data, and RDFS and OWL ontologies with knowledge bases, the
SPARQL Protocol and RDF Query Language (SPARQL) is available.
RIF- The Rule Interchange Format (RIF) is a standard for exchanging
rules among Web rule systems.
LOGIC LAYER- This layer is based on first-order predicate logic,
so that information on the web can be reasoned about accurately.
PROOF- This layer is meant to let machines check the reasoning behind derived statements,
in service of the Semantic Web's ultimate goal: smarter content that machines can understand.
TRUST- In this layer, the trustworthiness of information is evaluated subjectively by each
information consumer.
DIGITAL SIGNATURE (Crypto)- It helps to validate the integrity of metadata.
Semantic Web
Three Major components of Semantic
web.
RDF / XML
Ontology (OWL)
SPARQL
XML
• XML lets us create our own tags.
• These tags can be used by script
programs in sophisticated ways to perform
various tasks, but the script writer has to
know why the page writer has used each
tag.
• In short, XML allows you to add arbitrary
structure to documents but says
nothing about what the structure means.
• It has no built-in mechanism to convey the
meaning of a user’s new tags to other
users.
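A minimal sketch of the point above, assuming two hypothetical documents that record the same fact under different tag names. The tag names and the helper find_author are illustrative only; the script finds the author only where it already knows which tag the page writer chose.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents describing the same fact with different tag names.
doc_a = "<book><title>Weaving the Web</title><author>Tim Berners-Lee</author></book>"
doc_b = "<book><title>Weaving the Web</title><writer>Tim Berners-Lee</writer></book>"

def find_author(xml_text):
    """Return the text of the <author> element, or None if that tag is absent."""
    root = ET.fromstring(xml_text)
    node = root.find("author")
    return node.text if node is not None else None

author_a = find_author(doc_a)  # the script knows this tag, so the lookup succeeds
author_b = find_author(doc_b)  # <writer> carries no meaning the script can use
print(author_a, author_b)
```

Both documents are well-formed XML, yet nothing in XML itself says that author and writer mean the same thing.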
RDF
A standard of W3C
Describes relationships between
resources (such as documents)
Consisting of triples or sentences:
<subject, property, object>
<“Krishna”, composed, “The Magic Flute” >
RDFS has extended RDF with standard
“ontology vocabulary”:
Class, Property
Type, subClassOf
domain, range
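As a rough illustration of what the RDFS vocabulary adds, the sketch below stores triples as plain Python tuples and follows subClassOf links to infer extra types. The "ex:" names and both helper functions are assumptions made for this example, not part of any RDF library.

```python
# Triples are plain (subject, predicate, object) tuples; the "ex:" names are hypothetical.
triples = {
    ("ex:Opera", "rdfs:subClassOf", "ex:MusicalWork"),
    ("ex:MusicalWork", "rdfs:subClassOf", "ex:CreativeWork"),
    ("ex:TheMagicFlute", "rdf:type", "ex:Opera"),
}

def superclasses(cls, triples):
    """Collect every class reachable from cls by following rdfs:subClassOf."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found

def inferred_types(resource, triples):
    """Return the asserted types of resource plus all of their superclasses."""
    asserted = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls, triples)
    return result

types = inferred_types("ex:TheMagicFlute", triples)
print(sorted(types))
```

Only one type is stated explicitly; the other two follow from the class hierarchy, which is the kind of lightweight reasoning RDFS enables.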
RDF (cont.)
Resource Description Framework (RDF) is
the data model of the Semantic Web.
This means that data in Semantic Web tools is
represented in RDF.
RDF describes web resources in the form of
subject-predicate-object (s-p-o)
expressions.
For example, “The sky has the colour blue”
is expressed in RDF as a triple: a subject denoting
“the sky”, a predicate denoting “has” and
an object denoting “the colour blue”.
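The sky example can be sketched in a few lines, assuming plain strings stand in for the URIs a real RDF graph would use. The Triple type, the second statement about grass and the helper objects_of are illustrative additions.

```python
from collections import namedtuple

# A triple is just a subject, a predicate and an object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

graph = [
    Triple("the sky", "has", "the colour blue"),
    Triple("the grass", "has", "the colour green"),  # extra statement added for illustration
]

def objects_of(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.object for t in graph if t.subject == subject and t.predicate == predicate]

sky_facts = objects_of(graph, "the sky", "has")
print(sky_facts)
```

Because every statement has the same three-part shape, new facts can be added to the graph without changing any schema.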
RDF Tools
SPARQL
SPARQL (SPARQL Protocol and RDF Query
Language), a W3C recommendation, is a
pattern-matching query language.
SPARQL is used to retrieve and manipulate data stored in
Resource Description Framework (RDF) format.
It provides a mechanism to express constraints
and facts; the entities matching those
constraints are returned to the user.
SPARQL 1.0 is the first version of SPARQL;
SPARQL 1.1 extends it with additional features.
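To make "pattern matching" concrete, here is a toy matcher for triple patterns, written as a sketch and not as how any SPARQL engine is implemented. Terms starting with "?" are variables; the data, the "ex:" names and both functions are hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend binding so that pattern matches triple; return None on mismatch."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier, if any.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding

def query(patterns, triples):
    """Return all variable bindings that satisfy every triple pattern."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings

triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:carol", "ex:worksAt", "ex:acme"),
]

# Roughly: SELECT ?person ?friend
#          WHERE { ?person ex:knows ?friend . ?friend ex:worksAt ex:acme . }
results = query(
    [("?person", "ex:knows", "?friend"), ("?friend", "ex:worksAt", "ex:acme")],
    triples,
)
print(results)
```

The shared variable ?friend is what joins the two patterns: only bindings consistent with both constraints are returned.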
SPARQL, RDF and Ontology
SPARQL, RDF and
Ontology(contd..)
Key steps are as follows:
1) Create the OWL Ontology using an
Ontology editor such as Protégé or SWOOP.
2) Export the Ontology as RDF using Jena APIs
such as the Model API.
3) Import the Ontology into a triplestore.
SQL and SPARQL Comparison
S.No. | SQL | SPARQL
1. | SQL is based on tuple relational calculus. | SPARQL is based on triple (graph) pattern matching.
2. | SQL is designed to query relational data. | SPARQL is designed to query RDF data.
3. | SQL uses data types such as char, varchar, number and long. | SPARQL works with terms such as subject, predicate, object, URI and literal.
4. | In SQL, data is accessed from tables. | In SPARQL, data is accessed from RDF data files.
5. | The relational data model stores data in structured form. | RDF data is stored as a graph of triples (semi-structured form).
6. | Syntax: SELECT <column_list> FROM <table_list> WHERE <condition> | Syntax: SELECT <variable_list> WHERE { <graph_pattern> }
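The comparison can be tried side by side with a small sketch: the same hypothetical facts are stored once in an in-memory SQLite table and once as triples. The names, cities and the ex:livesIn predicate are invented for illustration, and the triple filter only imitates what a SPARQL engine would do.

```python
import sqlite3

# Hypothetical data: who lives in which city.
rows = [("Asha", "Delhi"), ("Ravi", "Mumbai"), ("Meera", "Delhi")]

# Relational view: one table, queried with SELECT ... FROM ... WHERE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, city TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
sql_names = [r[0] for r in conn.execute(
    "SELECT name FROM person WHERE city = ? ORDER BY name", ("Delhi",))]
conn.close()

# RDF-style view: the same facts as (subject, predicate, object) triples.
triples = [(name, "ex:livesIn", city) for name, city in rows]

# Roughly: SELECT ?name WHERE { ?name ex:livesIn "Delhi" }
triple_names = sorted(s for s, p, o in triples if p == "ex:livesIn" and o == "Delhi")

print(sql_names, triple_names)
```

The SQL query depends on the table and column names, while the triple pattern depends only on the predicate, which is why RDF data from different sources can be queried together more easily.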
Some SPARQL Tools
Jena Fuseki Server (Jena Toolkit)
ARQ (Jena Toolkit)
Twinkle
DBpedia (Virtuoso SPARQL)
Ontology
According to Gruber’s definition, “an Ontology is an explicit
specification of a conceptualization”, where explicit means that it
cannot be implicitly assumed and should be processable by
machines. The knowledge in Ontologies can be formalized using
certain key components: classes or concepts, relations,
instances, and formal axioms [Gruber, 1995].
Ontology Key Components:
Classes or Concepts
Instances
Relations
Formal Axioms
Creation: Designing and developing an Ontology from scratch or appending to an existing Ontology
Merging: Merging different Ontologies of the same type about the same subject into a single one that
unifies all of them
Ontology modeling: Models for constructing an Ontology, such as verbal, logic-based, structural, hybrid, etc.
Ontology comparison & ranking: Different parameters for Ontology comparison and ranking
Observation 1: Node 1 and Node 4 have the highest degree centrality
(18.182)
Observation 2: Node 1 has the highest betweenness centrality (33.333)
Observation 3: Node 1 has the highest closeness centrality (13.811)
Observation 4: Node 1 has the highest closeness centrality (18.964)
Observation 5: Node 4 has the highest PageRank centrality (17.681)
Observation 6: Nodes 1, 3 and 5 have the highest eccentricity centrality (12.821)
Results: This analysis may be used in various applications to identify
the most prominent nodes involved.
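For readers unfamiliar with these measures, the sketch below computes one of them, degree centrality, on a small hypothetical graph. It assumes the common normalisation degree / (n - 1); the graph analysed in the slides and the exact scaling behind the reported values are not known, so the numbers here are not comparable with the observations.

```python
from collections import defaultdict

# Hypothetical undirected edge list; the graph from the slides is not reproduced here.
edges = [(1, 2), (1, 3), (1, 4), (4, 5)]

def degree_centrality(edges):
    """Return degree / (n - 1) for each node of an undirected graph."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbours.items()}

centrality = degree_centrality(edges)
most_prominent = max(centrality, key=centrality.get)
print(most_prominent, centrality[most_prominent])
```

The node with the most direct connections scores highest, which matches the intuition of a "prominent" node used in the results.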
Net Analysis
Figure No. | Type of Analysis Performed | Highest Value | Inference
Figure 12 | Degree Centrality | 18.182 | Node 1, 4: important
Figure 17 | Eccentricity Centrality | 12.821 | Node 1, 3, 5: important
Software Agents
Software Agents can
collect web content from diverse sources.
Web Usage Mining
Web usage mining has been defined as the application of
data mining techniques to discover usage patterns from
Web data in order to understand and better serve the
needs of Web-based applications. Web usage mining
consists of three phases: preprocessing, pattern
discovery, and pattern analysis. The information should be
well structured so that knowledge can be extracted from
log files. Web usage mining may be applied to data such as
that contained in log files. A log file contains information
related to user queries on a website. Web usage mining
may be used to improve the website structure or to give
recommendations to visitors.
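The three phases can be sketched on a handful of made-up log lines. The log format (a common access-log layout), the sample entries and the function names are all assumptions for illustration; real usage mining would add steps such as session identification.

```python
import re
from collections import Counter

# Hypothetical access-log lines; the last one is deliberately malformed.
LOG_LINES = [
    '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:10:00:05 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2024:10:00:10 +0000] "GET /products HTTP/1.1" 200 900',
    '10.0.0.2 - - [01/Jan/2024:10:00:15 +0000] "GET /products HTTP/1.1" 200 900',
    '10.0.0.1 - - [01/Jan/2024:10:00:20 +0000] "GET /cart HTTP/1.1" 200 300',
    '10.0.0.3 - - [01/Jan/2024:10:00:25 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Jan/2024:10:00:30 +0000] "GET /missing HTTP/1.1" 404 0',
    'not a log line',
]

LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def preprocess(lines):
    """Phase 1: parse lines, dropping malformed entries and failed requests."""
    records = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("status") == "200":
            records.append((m.group("ip"), m.group("path")))
    return records

def discover_patterns(records):
    """Phase 2: count page visits and page-to-page transitions per visitor."""
    visits = Counter(path for _, path in records)
    transitions = Counter()
    last_page = {}
    for ip, path in records:
        if ip in last_page:
            transitions[(last_page[ip], path)] += 1
        last_page[ip] = path
    return visits, transitions

def analyse(visits, transitions, top=1):
    """Phase 3: keep only the most frequent pages and transitions."""
    return visits.most_common(top), transitions.most_common(top)

records = preprocess(LOG_LINES)
visits, transitions = discover_patterns(records)
top_pages, top_transitions = analyse(visits, transitions)
print(top_pages, top_transitions)
```

A frequent transition such as home to products is the kind of pattern that could drive a recommendation or a change to the site structure.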
Web Usage mining
Jena Framework
Jena was originally developed by researchers
in HP Labs, starting in Bristol, UK, in 2000.
Jena is a Java framework for building Semantic
Web applications. (Semantic Web Toolkit)
It provides extensive Java libraries to
help developers write code that handles
RDF, RDFS, RDFa, OWL and SPARQL.
Jena includes a rule-based inference engine to
perform reasoning based on OWL and RDFS
ontologies, and a variety of storage strategies
to store RDF triples in memory or on disk.
Jena Framework
The Jena Framework includes:
An RDF API
Reading and writing RDF in RDF/XML, N3
and N-Triples
An OWL API
In-memory and persistent storage
Query tools (RDQL, a query language
for RDF, and Fuseki Server for SPARQL)
Semantic Web and AI?
No human-level intelligence claims
As with today’s WWW
large, inconsistent, distributed
Requirements
scalable, robust, decentralised
tolerant, mediated
Semantic Web will make extensive use of current AI,
any advancement in AI will lead to a better Semantic Web
Current AI is already sufficient to go towards realizing the
semantic web vision
As with WWW, Semantic Web will (need to) adapt
fast
Semantic Web & Knowledge
Management
Organising knowledge in conceptual
spaces according to its meaning.
Enabling automated tools to check
for inconsistencies and extracting
new knowledge.
Replacing query-based search with
query answering.
Defining who may view certain parts
of information
Research Aspects toward
Semantic Web
Emerging technology (W3C working
groups are doing research in various
domains for implementing the Semantic
Web).
Many researchers in other countries are
working in the field of the Semantic Web,
but in India the proportion of people
doing so is very small.
There is a lot of scope for research in handling
Semantic Web data (managing
existing data, interoperability with
other data formats, handling large
data sets, and more).
Research Aspects toward
Semantic Web (Cont.)
Lots of research scope in data retrieval
strategies along with optimization.
How to apply AI concepts to Semantic Web
data in searching.
Integration of existing Ontologies of various
domains.
Semantic Web services and Web usage
mining.
Semantic Web query processing and
optimization.
Semantic Web and Linked Open Data.
Semantic Web
Major Communities
RDF
SPARQL
Ontology
Semantic Web Journals
Journal of Web Semantics (ELSEVIER)
Semantic Web journal (by IOS Press)
SWJ
Journal on Data Semantics
(SPRINGER)
International Journal on Semantic
Web and Information Systems
(IJSWIS) –IGI Global
International Journal of Metadata,
Semantics and Ontologies (INDERSCIENCE)
Semantic Web Journals
(Cont..)
Open Journal of Semantic Web
(OJSW)
International Journal of Web &
Semantic Technology (IJWesT)
Semantic Web Books
John Hebeler, Matthew Fisher, Ryan Blace,
Andrew Perez-Lopez, Wiley, 2009.
Dean Allemang, James Hendler, Elsevier,
Morgan Kaufmann Publication.