Application of Ontology in Semantic Information Retrieval
by Prof. Shahrul Azman from FTSM, UKM
Presentation for MyREN Seminar 2014
Berjaya Hotel, Kuala Lumpur
27 November 2014
Ontology Learning from Text
Ontology construction ‘Layer Cake’
Knowledge representation and knowledge management systems
Subtasks in ontology learning
Most Popular Ontology Learning Tools
The document discusses the basics of ontologies, including their origin in philosophy, definitions, types, benefits and application areas. Some key points are:
- An ontology is a formal specification of a conceptualization used to help humans and programs share knowledge. It establishes a shared vocabulary for exchanging information.
- Ontologies describe domain knowledge and provide an agreed-upon understanding of a domain through concepts and relations. They help solve problems of ambiguity and enable knowledge sharing.
- Ontologies benefit applications like information retrieval, digital libraries, knowledge engineering and natural language processing by facilitating semantic search and integration of data.
The document summarizes a seminar on ontology mapping presented by Samhati Soor. The seminar covered the need for ontology mapping due to the proliferation of ontologies, and the purpose of mapping ontologies to achieve interoperability and sharing knowledge. It defined ontologies and ontology mapping and discussed categories of mapping including between global and local ontologies, between local ontologies, and for merging ontologies. Tools for ontology mapping discussed included GLUE and SAM. Evaluation criteria and challenges of ontology mapping were also summarized along with conclusions and references.
Introduction to Ontology Concepts and Terminology, by Steven Miller
The document introduces an ontology tutorial that will cover basic concepts of the Semantic Web, Linked Data, and the Resource Description Framework data model as well as the ontology languages RDFS and OWL. The tutorial is intended for information professionals who want to gain an introductory understanding of ontologies, ontology concepts, and terminology. The tutorial will explain how to model and structure data as RDF triples and create basic RDFS ontologies.
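The RDF triple model and RDFS subclassing the tutorial covers can be sketched in plain Python. This is a minimal illustration only; the class names (`ex:Dog`, `ex:Mammal`, and so on) are hypothetical and not taken from the tutorial.

```python
# Illustrative sketch: RDF-style triples as (subject, predicate, object)
# tuples, with a fixed-point pass that propagates rdf:type along
# rdfs:subClassOf -- the core entailment a basic RDFS ontology provides.

triples = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Rex", "rdf:type", "ex:Dog"),
}

def infer_types(triples):
    """Propagate rdf:type along rdfs:subClassOf until nothing changes."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(inferred):
            if p != "rdf:type":
                continue
            for (cls, p2, superclass) in list(inferred):
                if p2 == "rdfs:subClassOf" and cls == o:
                    new = (s, "rdf:type", superclass)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

all_triples = infer_types(triples)
# Rex is now inferred to be a Mammal and an Animal, not just a Dog.
```

A real system would use an RDF library and a reasoner rather than hand-rolled set operations, but the triple structure is the same.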
The document provides an overview of ontology and its various aspects. It discusses the origin of the term ontology, which derives from Greek words meaning "being" and "science," so ontology is the study of being. It distinguishes between scientific and philosophical ontologies. Social ontology examines social entities. Perspectives on ontology include philosophy, library and information science, artificial intelligence, linguistics, and the semantic web. The goal of ontology is to encode knowledge to make it understandable to both people and machines. It provides motivations for developing ontologies such as enabling information integration and knowledge management. The document also discusses ontology languages, uniqueness of ontologies, purposes of ontologies, and provides references.
The document introduces ontology and describes what it is from both philosophical and computer science perspectives. An ontology in computers consists of a vocabulary to describe a domain, specifications of the meaning of terms, and constraints capturing additional knowledge about the domain. It then provides an example ontology and discusses applications of ontologies such as for the semantic web. It also discusses important considerations for building ontologies such as collaboration, versioning, and ease of use.
This document discusses evaluation methods for information retrieval systems. It begins by outlining different types of evaluation, including retrieval effectiveness, efficiency, and user-based evaluation. It then focuses on retrieval effectiveness, describing commonly used measures like precision, recall, and discounted cumulative gain. It discusses how these measures are calculated and their limitations. The document also introduces other evaluation metrics like R-precision, average precision, and normalized discounted cumulative gain that provide single value assessments of system performance.
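The measures the document describes — precision, recall, and (normalized) discounted cumulative gain — can be sketched as follows. This is a minimal illustration assuming binary relevance for precision/recall and graded relevance scores for DCG; the document's own formulations may differ in detail.

```python
import math

def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    return len(set(retrieved) & set(relevant)) / len(retrieved)

def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    return len(set(retrieved) & set(relevant)) / len(relevant)

def dcg(gains):
    """Discounted cumulative gain over graded relevance, in rank order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains):
    """DCG normalized by the ideal (descending-sorted) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
print(precision(retrieved, relevant))  # 0.5
print(recall(retrieved, relevant))     # 0.666...
```

Note the limitation the document mentions: precision and recall are single set-based numbers and ignore rank order, which is exactly what DCG-style measures add.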
Yang Yu is proposing research on improving machine learning based ontology mapping by automatically obtaining training samples from the web. The proposed system would parse two input ontologies to generate queries to search engines and collect documents to use as samples for each ontology class. These samples would then be used to train text classifiers, which would produce probabilistic mappings between classes in the two ontologies. The results would be evaluated by comparing to mappings from human experts. Current work involves exploring alternative text classification tools and ways to utilize the probabilistic mapping values generated by the classifiers.
The document discusses the agenda for a presentation on the Semantic Web. The agenda includes an overview of the World Wide Web, an introduction to the Semantic Web, tools and applications for the Semantic Web, Linking Open Data, the Social Semantic Web, and Open Government. Each section provides details on the topic covered.
1) The document discusses information retrieval and search engines. It describes how search engines work by indexing documents, building inverted indexes, and allowing users to search indexed terms.
2) It then focuses on Elasticsearch, describing it as a distributed, open source search and analytics engine that allows for real-time search, analytics, and storage of schema-free JSON documents.
3) The key concepts of Elasticsearch include clusters, nodes, indexes, types, shards, and documents. Clusters hold the data and provide search capabilities across nodes.
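The indexing-and-search pipeline described in point 1 can be sketched in a few lines. This is a toy illustration of an inverted index with AND queries, unrelated to Elasticsearch's actual implementation.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, *terms):
    """AND query: ids of documents containing every query term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

docs = {
    1: "ontology based information retrieval",
    2: "semantic web and information retrieval",
    3: "knowledge graphs for the semantic web",
}
index = build_inverted_index(docs)
print(search(index, "information", "retrieval"))  # {1, 2}
```

Production engines add tokenization, stemming, positional postings, and compressed on-disk structures, but the term-to-postings mapping is the same idea.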
This document provides a full syllabus with questions and answers related to the course "Information Retrieval" including definitions of key concepts, the historical development of the field, comparisons between information retrieval and web search, applications of IR, components of an IR system, and issues in IR systems. It also lists examples of open source search frameworks and performance measures for search engines.
This workshop presentation from Enterprise Knowledge team members Joe Hilger, Founder and COO, and Sara Nash, Technical Analyst, was delivered on June 8, 2020 as part of the Data Summit 2020 virtual conference. The 3-hour workshop provided an interdisciplinary group of participants with a definition of what a knowledge graph is, how it is implemented, and how it can be used to increase the value of your organization's data. This slide deck gives an overview of the KM concepts necessary for implementing knowledge graphs as a foundation for Enterprise Artificial Intelligence (AI). Hilger and Nash also outlined four use cases for knowledge graphs, including recommendation engines and natural language query on structured data.
The document discusses probabilistic retrieval models in information retrieval. It introduces three influential probabilistic models: (1) Maron and Kuhns' 1960 model which calculates the probability of relevance based on historical user data; (2) Salton's model which estimates the probability of term occurrence in relevant documents; (3) A model that ranks documents by the probability of relevance and considers retrieval as a decision between costs of retrieving non-relevant vs. not retrieving relevant documents. The document provides background on the development of probabilistic IR models and challenges of estimating probabilities for evaluation.
This document discusses machine learning algorithms for ranking problems. It introduces supervised learning to rank methods including pointwise, pairwise and listwise approaches. Pointwise methods predict relevance scores independently but don't consider order. Pairwise approaches consider relative order but have high computational costs. Listwise methods aim to optimize entire orderings but have complexity issues. Practical challenges include defining objective metrics, generating training labels, and handling new items with limited data. Semi-supervised learning and matrix factorization can help address labeling problems.
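The pointwise/pairwise distinction can be made concrete with a small sketch (illustrative only, not from the document): a pointwise method scores and sorts items independently, while a pairwise method trains on ordered pairs derived from the labels, which is where the quadratic cost comes from.

```python
def pointwise_rank(scores):
    """Pointwise view: sort items by independently predicted scores."""
    return sorted(scores, key=scores.get, reverse=True)

def pairwise_training_pairs(labels):
    """Pairwise view: every (preferred, dispreferred) pair of items
    whose relevance labels differ becomes one training example.
    The number of pairs grows quadratically with the item count."""
    items = list(labels)
    return [(a, b) for a in items for b in items if labels[a] > labels[b]]

labels = {"docA": 2, "docB": 0, "docC": 1}
print(pointwise_rank(labels))                 # ['docA', 'docC', 'docB']
print(len(pairwise_training_pairs(labels)))   # 3
```

A listwise method would instead optimize a loss over the whole ordering at once, which is harder still; this sketch only shows how the training data differs between the first two families.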
The document provides an introduction to Prof. Dr. Sören Auer and his background in knowledge graphs. It discusses his current role as a professor and director focusing on organizing research data using knowledge graphs. It also briefly outlines some of his past roles and major scientific contributions in the areas of technology platforms, funding acquisition, and strategic projects related to knowledge graphs.
INTRODUCTION TO INFORMATION RETRIEVAL
This lecture will introduce the information retrieval problem, introduce the terminology related to IR, and provide a history of IR. In particular, the history of the web and its impact on IR will be discussed. Special attention and emphasis will be given to the concept of relevance in IR and the critical role it has played in the development of the subject. The lecture will end with a conceptual explanation of the IR process, and its relationships with other domains as well as current research developments.
INFORMATION RETRIEVAL MODELS
This lecture will present the models used to rank documents according to their estimated relevance to user-given queries, where the most relevant documents are shown ahead of those less relevant. These models form the basis of many ranking algorithms used in past and present search applications. The lecture will describe IR models such as Boolean retrieval, vector space, probabilistic retrieval, language models, and logical models. Relevance feedback, a technique that implicitly or explicitly modifies user queries in light of the user's interaction with retrieval results, will also be discussed, as it is particularly relevant to web search and personalization.
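Of the models listed above, the vector space model is the easiest to sketch: represent query and document as term-frequency vectors and rank by cosine similarity. This toy version (assumed example documents, no tf-idf weighting) shows the core idea only.

```python
import math
from collections import Counter

def cosine_score(query, document):
    """Vector-space model in miniature: term-frequency vectors
    compared by cosine similarity."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

docs = [
    "boolean retrieval of documents",
    "vector space retrieval model",
    "a history of libraries",
]
query = "vector space model"
ranked = sorted(docs, key=lambda d: cosine_score(query, d), reverse=True)
print(ranked[0])  # vector space retrieval model
```

Real systems weight terms (e.g. tf-idf or BM25-style scoring) rather than using raw counts, but the ranking-by-similarity structure is the same.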
Basis of Information Retrieval, Part 1: Retrieval Tools, by Saroj Suwal
This document discusses various tools for retrieving literature, including catalogs, indexes, registers, and online databases. It describes the purpose and format of each tool. Catalogs provide access to collections and contain descriptive metadata. Indexes arrange information alphabetically and by subject but do not provide location details. Registers function like catalogs for museum collections. Bibliographic databases contain searchable references to published works. Secondary publications abstract and index primary documents to help users find relevant information.
This document discusses ontology-based data access. It begins by defining ontology as a representation of concepts and relationships that define a domain. It then provides examples of ontology elements like concepts, attributes, and relations. It describes how ontologies can be used to share understanding, enable knowledge reuse, and separate domain from operational knowledge. The document outlines the process for developing ontologies including scope, capture, encoding, integration, and evaluation. It discusses using ontologies to provide a user-oriented view of data and facilitate query access across data sources. The document concludes by discussing ongoing work on semantic query analysis and graphical ontology mapping tools.
The document provides an introduction to knowledge graphs. It discusses how knowledge graphs are being used by large enterprises and intelligent agents to capture concepts, entities, and relationships within domains to drive business, generate insights, and enhance relationships. The presentation will cover an overview of what knowledge graphs are, who uses them, why they are used, and how to use them. It then provides some examples of how knowledge graphs are applied, including in intelligent agents, semantic web, search engines, social networks, biology, enterprise knowledge management, and more.
A broad introduction to information retrieval and web search, used for teaching at the Yahoo Bangalore Summer School 2013. The slides are a mash-up of my own and other people's presentations.
Learning to Rank for Recommender Systems - ACM RecSys 2013 Tutorial, by Alexandros Karatzoglou
The slides from the Learning to Rank for Recommender Systems tutorial given at ACM RecSys 2013 in Hong Kong by Alexandros Karatzoglou, Linas Baltrunas and Yue Shi.
The document summarizes a technical seminar on web-based information retrieval systems. It discusses information retrieval architecture and approaches, including syntactical, statistical, and semantic methods. It also covers web search analysis techniques like web structure analysis, content analysis, and usage analysis. The document outlines the process of web crawling and types of crawlers. It discusses challenges of web structure, crawling and indexing, and searching. Finally, it concludes that as unstructured online information grows, information retrieval techniques must continue to improve to leverage this data.
The document provides an overview of knowledge graphs and introduces metaphactory, a knowledge graph platform. It discusses what knowledge graphs are, examples like Wikidata, and standards like RDF. It also outlines an agenda for a hands-on session on loading sample data into metaphactory and exploring a knowledge graph.
The Semantic Web #9 - Web Ontology Language (OWL), by Myungjin Lee
This is lecture note #9 for my class at the Graduate School of Yonsei University, Korea.
It describes Web Ontology Language (OWL) for authoring ontologies.
Knowledgebases differ from databases in three key ways:
1. Knowledgebases capture human knowledge and place it in a system that can solve complex problems using that knowledge, while databases simply store and organize data.
2. Knowledgebases are dynamic and can learn over time as new knowledge is added, whereas databases do not learn or change based on new information.
3. Knowledgebases can use the stored knowledge to provide answers, recommendations, and expert advice, while databases only retrieve and display stored data in response to queries.
Dewey Decimal Classification vs Library of Congress Classification, by Francheska Vonne Gali
A graphical comparison of DDC and LCC.
The Library of Congress System and the Dewey Decimal System are two popular classification systems in libraries.
Course: LIBSCI 22 - Organization of Information Resources II
Teacher: Sarah Angiela Ragay
The document discusses the Sears List of Subject Headings (SLSH), a controlled vocabulary used for subject cataloging in small to medium sized libraries. It provides an overview of the history and purpose of SLSH, describes some of its key features like new subject headings added in the 21st edition, and outlines its underlying principles of direct, specific, and consistent subject entries based on common usage. The structure of SLSH is also briefly explained as an alphabetical list of standard subject names for the entire range of knowledge.
These are the presentation slides for the joint conference of the 134th SIG conference on Information Fundamentals and Access Technologies (IFAT) and the 112th SIG conference on Document Communication (DC) of the Information Processing Society of Japan (IPSJ), held March 22, 2019, at Toyo University, Hakusan Campus.
Cite: Kei Kurakawa, Yuan Sun, and Satoko Ando, Applying a new subject classification scheme for a database by a data-driven correspondence, IPSJ SIG Technical Report, Vol.2019-IFAT-134/2019-DC-112, No.7, pp.1-10, (2019).
A guide and a process for creating OWL ontologies.
Semantic Web course
e-Lite group (https://elite.polito.it)
Politecnico di Torino, 2017
The document defines ontologies as explicit descriptions of a domain that define concepts, properties, attributes, and constraints. It discusses the history of categorization in philosophy and the development of knowledge models like semantic nets and conceptual graphs. The document outlines different methods for building ontologies and different types of ontologies. It also discusses ontology tools like Protege and TopBraid Composer and how ontologies are used on the semantic web through languages like OWL.
This document outlines a course on Knowledge Representation (KR) on the Web. The course aims to expose students to challenges of applying traditional KR techniques to the scale and heterogeneity of data on the Web. Students will learn about representing Web data through formal knowledge graphs and ontologies, integrating and reasoning over distributed datasets, and how characteristics such as volume, variety and veracity impact KR approaches. The course involves lectures, literature reviews, and milestone projects where students publish papers on building semantic systems, modeling Web data, ontology matching, and reasoning over large knowledge graphs.
Pattern-based Acquisition of Scientific Entities from Scholarly Article Titles, by Jennifer D'Souza
We describe a rule-based approach for the automatic acquisition of salient scientific entities from Computational Linguistics (CL) scholarly article titles. Two observations motivated the approach: (i) noting salient aspects of an article’s contribution in its title; and (ii) pattern regularities capturing the salient terms that could be expressed in a set of rules. Only those lexico-syntactic patterns were selected that were easily recognizable, occurred frequently, and positionally indicated a scientific entity type. The rules were developed on a collection of 50,237 CL titles covering all articles in the ACL Anthology. In total, 19,799 research problems, 18,111 solutions, 20,033 resources, 1,059 languages, 6,878 tools, and 21,687 methods were extracted at an average precision of 75%.
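A cue-phrase rule of the kind the abstract describes can be sketched with regular expressions. The rules and the example title below are hypothetical illustrations in the spirit of lexico-syntactic patterns; the paper's actual rule set is not reproduced here.

```python
import re

# Hypothetical cue-phrase rules: a method mention tends to follow
# "using", and a research problem tends to close the title after "for".
RULES = [
    ("method", re.compile(r"\busing ([A-Za-z -]+?)(?: for\b|$)")),
    ("research_problem", re.compile(r"\bfor ([A-Za-z -]+?)$")),
]

def extract_entities(title):
    """Apply each (type, pattern) rule and collect typed matches."""
    found = []
    for entity_type, pattern in RULES:
        for match in pattern.finditer(title):
            found.append((entity_type, match.group(1).strip()))
    return found

title = "A Neural Model using Conditional Random Fields for Named Entity Recognition"
print(extract_entities(title))
```

The paper's rules were selected for being easily recognizable, frequent, and positionally indicative of an entity type; regex cue phrases capture only the simplest of those properties.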
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly Articles, by Angelo Salatino
Classifying research papers according to their research topics is an important task to improve their retrievability, assist the creation of smart analytics, and support a variety of approaches for analysing and making sense of the research environment. In this paper, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science. The CSO Classifier takes as input the metadata associated with a research paper (title, abstract, keywords) and returns a selection of research concepts drawn from the ontology. The approach was evaluated on a gold standard of manually annotated articles, yielding a significant improvement over alternative methods.
Probabilistic topic models are used to create scalable representations of documents that aim to: (1) organize, summarize, and search them; (2) explore them through an index of the ideas they contain; and (3) browse them to find documents dealing with specific areas.
Charting the Digital Library Evaluation Domain with a Semantically Enhanced Methodology, by Giannis Tsakonas
This document proposes a methodology for discovering patterns in scientific literature using a case study of digital library evaluation. It involves:
1. Classifying documents to identify relevant papers using naive Bayes classification.
2. Semantically annotating papers with concepts from a Digital Library Evaluation Ontology using the GoNTogle annotation tool. Over 2,600 annotations were generated.
3. Clustering the annotated papers into coherent groups using k-means clustering.
4. Interpreting the clusters with the assistance of the ontology to discover patterns and trends in the literature. Benchmarking tests were performed to evaluate effectiveness of the methodology.
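Step 3's clustering can be sketched with a bare-bones k-means over per-paper feature vectors. This is a toy sketch under assumed data, not the paper's actual setup (which clustered ontology-annotated papers).

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on equal-length numeric vectors (e.g. per-paper
    counts of ontology concepts).  Illustrative only."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Recompute centroids as cluster means; keep old one if empty.
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters

# Two obvious groups of 2-D "annotation profiles".
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
clusters = kmeans(points, k=2)
print(sorted(len(c) for c in clusters))  # [3, 3]
```

In practice one would use a library implementation with better initialization (e.g. k-means++) and a principled choice of k; the sketch only shows the assign-then-update loop.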
Semi-automated Exploration and Extraction of Data in Scientific Tables, by Elsevier
Ron Daniel and Corey Harper of Elsevier Labs present at the Columbia University Data Science Institute: https://www.elsevier.com/connect/join-us-as-elsevier-data-scientists-present-at-columbia-university
Amit Sheth with TK Prasad, "Semantic Technologies for Big Science and Astrophysics", Invited Plenary Presentation, at Earthcube Solar-Terrestrial End-User Workshop, NJIT, Newark, NJ, August 13, 2014.
Like many other fields of Big Science, Astrophysics and Solar Physics deal with the challenges of Big Data, including Volume, Variety, Velocity, and Veracity. There is already significant work on handling volume-related challenges, including the use of high-performance computing. In this talk, we will mainly focus on the other challenges, from the perspective of collaborative sharing and reuse of the broad variety of data created by multiple stakeholders, large and small, along with tools that offer semantic variants of search, browsing, integration, and discovery capabilities. We will borrow examples of tools and capabilities from state-of-the-art work supporting physicists (including astrophysicists) [1], the life sciences [2], and the material sciences [3], and describe the role of semantics and semantic technologies that make these capabilities possible or easier to realize. This applied and practice-oriented talk will complement more vision-oriented counterparts [4].
[1] Science Web-based Interactive Semantic Environment: http://sciencewise.info/
[2] NCBO Bioportal: http://bioportal.bioontology.org/ ; Kno.e.sis's work on Semantic Web for Healthcare and Life Sciences: http://knoesis.org/amit/hcls
[3] MaterialWays (a Materials Genome Initiative related project): http://wiki.knoesis.org/index.php/MaterialWays
[4] From Big Data to Smart Data: http://wiki.knoesis.org/index.php/Smart_Data
Survey of natural language processing(midp2)Tariqul islam
Document classification is a part of Natural language processing. We have different methodology and technique for processing the document classification. The purpose of this article is to survey some papers related to document classification. Those survey will help the researcher to understand which will be the best approach to use for natural language processing
study or concern about what kinds of things exist
what entities there are in the universe.
the ontology derives from the Greek onto (being) and logia (written or spoken). It is a branch of metaphysics , the study of first principles or the root of things.
Representation of ontology by Classified Interrelated object modelMihika Shah
1. The document discusses representing ontology using the Classified Interrelated Object Model (CIOM) data modeling technique. CIOM represents ontology components like classes, subclasses, attributes, and relationships between classes.
2. Key components of an ontology like classes, subclasses, attributes, and inter-class relationships are described and examples are given of how each would be represented using CIOM notation.
3. CIOM provides a general purpose methodology for representing ontologies using existing database technologies and overcomes limitations of specialized ontology languages and tools.
Generating Lexical Information for Terminologyin a Bioinformatics OntologyHammad Afzal
This document discusses generating lexical information for terms in a bioinformatics ontology. It proposes a model called LexInfo for associating linguistic information with ontologies. The authors lexicalize a bioinformatics ontology called myGrid by creating a LexInfo-based lexicon that captures morphological, syntactic and semantic properties of terms. They generate lexicons both semi-automatically using domain resources and automatically using LexInfo tools. The automatic lexicon has some errors due to POS tagging and tokenization issues that could be addressed using domain knowledge. The enriched ontology may help with automatic annotation of bioinformatics services.
This document summarizes an OKFN Korea hackathon event focused on open data. It discusses modeling Seoul open government data using ontologies, linking it to external datasets like cultural heritage data, and publishing the enriched data in RDF format. It covers topics like linked data, modeling with RDF/RDFS/OWL, reusing existing vocabularies, ontology development best practices, and triple store storage solutions.
This document provides an overview of the Next Generation Science Standards. It discusses that the standards were developed by Achieve in partnership with other organizations to create science standards focused on big ideas. It describes the Framework for K-12 Science Education that the standards are based on, which outlines three dimensions for each standard. It then explains the organization and structure of the Next Generation Science Standards, comparing them to previous standards.
Presented at DocTrain East 2007 by Joe Gelb, Suite Solutions -- Designing, building and maintaining a coherent information architecture is critical to proper planning, creation, management and delivery of documentation and training content. This is especially true when your content is based on a modular or topic-based model such as DITA and SCORM or if you are migrating to such a model.
But where to start? Terms such as taxonomy, semantics, and ontology can be intimidating, and recognized standards like RDF, OWL, Topic Maps (XTM) and SKOS seem so abstract. This pragmatic workshop will provide an overview of the standards and concepts, and a chance to use them hands-on to turn the abstract into tangible skills. We will demonstrate how a well-designed information architecture facilitates reuse and how the information model is integrally connected to conditional and multi-purpose publishing.
We will introduce an innovative, comprehensive methodology for information modeling and content development called SOTA Solution Oriented Topic Architecture. SOTA does not aim to be yet another new standard, but rather a concrete methodology backed up with open-source and accessible tools for using existing standards. We will demonstrate ֖and practice—hands-on—how this powerful methodology can help you organize and express information, determine which content actually needs to be created or updated, and build documentation and training deliverables from your content based on the rules you define.
This workshop is essential for successfully implementing topic models like DITA and SCORM, multi-purpose conditional publishing, and successfully facilitating content reuse.
This document provides an overview of research methods for narrative analysis. It discusses key concepts in narrative analysis including scripts, stories, patterns, themes, coding, and temporal organization. It also covers approaches like contextual analysis, focus groups, retelling narratives, and assumptions related to subjectivity and usefulness. Narrative analysis is presented as an exploratory qualitative methodology to give respondents a venue to articulate their own viewpoints and standards.
Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azman from FSTM, UKM
1. Application of Ontology in Semantic Information Retrieval
Presentation for MyREN Seminar
Berjaya Hotel, Kuala Lumpur
27 November 2014
2. Brief speaker’s info
Shahrul Azman Mohd. Noah, Ph.D.
Knowledge Technology Research Group
Center for AI Technology (CAIT)
[email protected]
Graduated in BSc (Mathematics) from UKM
Graduated in MSc (IS) from Sheffield U.
Graduated in PhD (IS) from Sheffield U. – knowledge-based systems
From Muar, Johor
4. What is ontology?
• Ontology may be considered as a kind of method to represent knowledge.
• From a philosophical discipline – the science of “what is”; the kinds and structures of objects, properties, events, processes and relations in every area of reality.
• Aristotle’s classification of animals is one of the first ontologies developed.
5. Ontology in Computing
• An ontology is an engineering artifact:
– It is constituted by a specific vocabulary used to describe a certain reality, plus
– A set of explicit assumptions regarding the intended meaning of the vocabulary.
• Thus, an ontology describes a formal specification of a certain domain:
– Shared understanding of a domain of interest
– Formal and machine-manipulable model of a domain of interest
6. Ontology Definition
Formal, explicit specification of a shared conceptualization:
– commonly accepted understanding
– conceptual model of a domain (ontological theory)
– unambiguous terminology definitions
– machine-readability with computational semantics
[Gruber93]
7. An ontology is… (Source: Smith & Welty, 2001)
(Figure: a spectrum of increasing complexity – a catalog, a set of text files, a glossary, a thesaurus, a collection of taxonomies, a collection of frames, a set of general logical constraints.)
8. Various approaches to classify ontologies
Classify ontologies according to the information the ontology needs to express and the richness of its internal structure (Lassila & McGuinness, 2001)
Classify into 2 orthogonal dimensions: the amount and type of structure and the subject (Van Heijst et al., 1997)
Classify ontologies according to their level of dependence on a particular task (Guarino, 1998)
9. Ontology language
• Ontology languages are formal languages used to construct ontologies
– allow the encoding of knowledge about specific domains and often
– include reasoning rules that support the processing of that knowledge
• Various languages have been proposed: CycL, KL-One, Ontolingua, F-Logic,
OCML, LOOM, Telos, RDF(S), OIL, DAML+OIL, XOL, SHOE,
OWL etc.
• Usually based on Description Logic (DL).
• Summarised as (Kalibatiene & Vasilecas, 2011):
11
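The defining feature of these languages is that they pair a vocabulary with reasoning rules over it. As a minimal, purely illustrative sketch (not an implementation of RDF(S) or OWL), the following Python fragment stores triples as plain tuples and applies one RDFS-style rule, the transitivity of subClassOf; all class names are invented:

```python
# Illustrative only: triples as (subject, predicate, object) tuples, with one
# RDFS-style inference rule -- subClassOf is transitive -- applied to them.

def infer_superclasses(triples, cls):
    """Return all direct and inherited superclasses of cls."""
    supers = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in supers:
                supers.add(o)
                frontier.append(o)   # follow the subclass chain upwards
    return supers

triples = [
    ("Student",    "subClassOf", "Person"),
    ("Person",     "subClassOf", "Agent"),
    ("Supervisor", "subClassOf", "Person"),
]

print(sorted(infer_superclasses(triples, "Student")))  # ['Agent', 'Person']
```

A real ontology language adds far more (properties, restrictions, DL-based consistency checking), but the encode-then-reason pattern is the same.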
10. Example of ontologies
• Top-level ontology – Suggested Upper Merged Ontology (SUMO)
11. Portion of SUMO ontology with USGS Geo-concepts inserted
17. Concepts
• “Information retrieval (IR) is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)
• Applications of IR: recommendations, Q&A, filtering… and of course searching.
18. Issues in IR
• Some issues in IR:
– Relevance
– Evaluation
– Users and information needs
• Context-based search
• Semantic search
• Etc.
21. Ontology and semantic search
• Various ways to support semantic search:
– Query expansion – users’ queries are expanded with related terminological terms
– Disambiguation – resolving terms or concepts when they refer to more than one topic
– Classifying – classify documents such as ads into ontological topics to support semantic search
– Enhanced IR model – embed ontology into an existing IR model, resulting in a modified IR model
22. Query Expansion
• Query expansion (QE) is needed due to the ambiguity of natural language.
• Main aim of QE – to add new meaningful terms to the initial query.
Bhogal, J., Macfarlane, A. & Smith, A. 2007. A review of ontology based query expansion. Information Processing and Management, 43: 866-886.
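The QE idea can be sketched in a few lines. The expansion map below is invented for illustration; in a real system it would be a lookup into a lexical ontology such as WordNet:

```python
# Hypothetical term -> related-terms map; a real system would query an ontology.
EXPANSIONS = {
    "car":    ["automobile", "vehicle"],   # synonym, hypernym
    "murder": ["homicide", "killing"],
}

def expand_query(query):
    """Add related terminological terms after each original query term."""
    terms = []
    for term in query.lower().split():
        terms.append(term)
        terms.extend(EXPANSIONS.get(term, []))  # unknown terms pass through
    return terms

print(expand_query("car murder"))
# ['car', 'automobile', 'vehicle', 'murder', 'homicide', 'killing']
```

The expanded term list is then submitted to the IR engine in place of the original query.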
24. Semantic index
• Textual documents are indexed according to some ontology model.
• Remember the concept of vocabulary in IR?
(Figure: a computer science collection is processed by extraction and indexing into the collection’s vocabulary of index terms: architecture, bus, computer, database, …, xml.)
25. Semantic index
• Textual documents are indexed according to some ontology model.
• Remember the concept of vocabulary in IR?
(Figure: the same extraction and indexing pipeline over the computer science collection, but the term index – architecture, bus, computer, database, …, xml – is replaced with an ontological index.)
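Replacing the term index with an ontological index can be sketched as below. The term-to-concept map is a toy stand-in for a real ontology lookup, and all names and documents are illustrative:

```python
# Hypothetical mapping from raw index terms to an ontological concept.
TERM_TO_CONCEPT = {
    "architecture": "ComputerScience",
    "bus":          "ComputerScience",
    "database":     "ComputerScience",
    "xml":          "ComputerScience",
}

def ontological_index(docs):
    """Invert the collection to concepts rather than raw terms."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            concept = TERM_TO_CONCEPT.get(term)
            if concept:  # only terms the ontology knows get indexed
                index.setdefault(concept, set()).add(doc_id)
    return index

docs = {"d1": "xml database tuning", "d2": "bus architecture", "d3": "history of art"}
print(ontological_index(docs))  # d1 and d2 fall under ComputerScience; d3 is unindexed
```

A query for the concept then retrieves documents that never mention the concept’s name literally, which is the point of the ontological index.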
26. Examples
• Three research projects that illustrate the applications of ontology-based IR:
– Semantic digital library
– Crime news retrieval
– Multi-modality ontology-based image retrieval
27. Semantic digital library
• Proposed an approach for managing, organizing and populating an ontology for document collections in a digital library.
• The document metadata and content are inserted and populated into a knowledge base which allows sophisticated querying and searching.
• First, proposes an ontology-based information retrieval model, based on the classic vector space model, which includes document annotation, instance-based weighting and concept-based ranking.
29. Semantic digital library
• Involved three ontologies – ACM Topic hierarchies, Geo ontology and Dublin Core metadata
• Portion of the domain ontology focusing on academic theses
32. VSM Index
#create Class Person
#create instance of Class Student
<Student rdf:ID="Student1">
  <rdfs:label>Arifah Alhadi</rdfs:label>
</Student>
<Student rdf:ID="Student2">
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Asyraf Arifin</rdfs:label>
</Student>
#Create Instance of Class Supervisor
<Supervisor rdf:ID="Supervisor1">
  <rdfs:label>PM Dr Shahrul Azman</rdfs:label>
  <rdfs:label>Prof. Madya Dr. Shahrul Azman Mohd Noah</rdfs:label>
</Supervisor>
<Supervisor rdf:ID="Supervisor2">
  <rdfs:label>Prof Aziz Deraman</rdfs:label>
</Supervisor>

Concept-to-instance-to-document index:
Concepts | Instance | Document
http://www.ukm.my/thesis/supervisor#, http://www.ukm.my/thesis/person# | Supervisor1 | Doc1
http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student1 | Doc1
http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student2 | Doc1

Term index (VSM):
Id | Term | TFIDF | Frq | Doc
1 | Arifah Alhadi | 0.11 | 2 | Doc1
2 | Asyraf Arifin | 0.123 | 1 | Doc1
3 | PM Dr Shahrul Azman | 0.45 | 1 | Doc1
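As a reminder of how a TFIDF column like the one in this index is computed, here is the classic formulation; the counts below are invented for illustration, not the ones behind the slide’s numbers:

```python
import math

def tfidf(term_freq, doc_len, num_docs, docs_with_term):
    """Classic tf-idf: within-document term frequency multiplied by the
    log inverse document frequency of the term across the collection."""
    tf = term_freq / doc_len
    idf = math.log(num_docs / docs_with_term)
    return tf * idf

# A label occurring twice in a 100-token document, appearing in 1 of 10 documents:
w = tfidf(term_freq=2, doc_len=100, num_docs=10, docs_with_term=1)
print(round(w, 4))  # 0.0461
```

In the instance-based weighting scheme, the “terms” being weighted are ontology instance labels rather than raw tokens.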
33. Ontology-based IR for crime news retrieval
• Each crime news item must be classified into categories: Traffic Violation, Theft, Sex Crime, Murder, Kidnap, Fraud, Drugs, Cybercrime, Arson and Gang (Chen et al. 2004)
• Useful entities need to be identified: Person, Location, Organisation, Date/Time, Weapon, Amount, Vehicle, Drug, Personal properties, and Age.
• Clustering of crime news into topics, e.g. the Nurin Jazlin murder, Canny Ong, Sosilawati, etc.
• Clustering of a specific topic into various chronological events.
• Mapping of named entities into a news ontology to support semantic querying and retrieval.
34. Example
(Figure: crime news items are first classified into categories – Murder, Kidnap, Theft, Gang – and then clustered into topics such as Nurin Jazlin, Sosilawati and Canny Ong. Each topic is further clustered into chronological events; for the Canny Ong topic: investigation including medical report and trial (17), evidence/suspect (6), DNA test (3), family reaction and negligence suit (9), and court sentence/guilty plea (13).)
35. Required methods
• In order to support the aforementioned requirements:
– Conventional text processing – tokenizing, indexing, stopping, stemming, etc.
– Named entity recognition (NER)
– Classification and clustering
– Ontology mapping
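The conventional text-processing step can be sketched as below. The stopword list and the crude suffix stripper are illustrative stand-ins; a real pipeline would use a proper stemmer (e.g. Porter’s) and a NER tool such as GATE:

```python
import re

# Illustrative stopword list, not a real one.
STOPWORDS = {"the", "a", "an", "of", "in", "was", "were", "into"}

def stem(token):
    """Very crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stopwords, stem -- the 'stopping and stemming' steps."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]

print(preprocess("The suspects were arrested in Kuala Lumpur"))
# ['suspect', 'arrest', 'kuala', 'lumpur']
```

The resulting token stream is what gets indexed and, separately, fed to the NER and classification stages.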
37. Document representation
• Documents will be represented in meaningful forms:
– BoW – Bag of Words
– Named Entity Recognition – using GATE ANNIE and JAPE rules
– Adopt the Vector Space Model (VSM), enhanced with an ontological model
39. Document organization
• Documents need to be organised into categories, topics and events.
– Classification – AdaBoost algorithm
– Clustering – KNN clustering
– Ontology mapping – we have developed a crime news ontology by extending the existing SNaP ontology. It includes classes/entities important to crime, such as classification of crimes, location and weapon.
43. Ontology-based Image Retrieval
•The rapid growth of visual information (VI) leads to difficulty in finding and accessing VI.
•Inability to capture the semantic content.
•A problem arises: a lack of coincidence between the information extracted from VI and user needs.
•Conventional approaches to image retrieval (IMR), TBIR and CBIR, have reached their limits in attempting to solve this problem.
•As a result, the SBIR approach, being ontology-based, provides explicit, domain-oriented semantics for concepts and relationships.
44. Ontology-based Image Retrieval
•Illustrates how images are described based on their visual, textual and domain semantic features.
•Proposes a multi-modality ontology: visual ontology, textual ontology and domain ontology.
•Illustrates how such an ontology can be integrated with an open-source knowledge base (DBpedia) to support a more comprehensive search.
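Integration with DBpedia typically happens through its public SPARQL endpoint; a sketch of constructing such a request (the query shown is an illustrative example of fetching an English abstract, and no network request is actually sent here):

```python
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

def build_sparql_request(resource):
    """Build the URL of a SPARQL query asking for the English abstract
    of a DBpedia resource (e.g. "Kuala_Lumpur")."""
    query = f"""
    SELECT ?abstract WHERE {{
      <http://dbpedia.org/resource/{resource}>
          <http://dbpedia.org/ontology/abstract> ?abstract .
      FILTER (lang(?abstract) = "en")
    }}"""
    return DBPEDIA_ENDPOINT + "?" + urlencode(
        {"query": query, "format": "application/json"})
```

The retrieved abstract (or other DBpedia properties) can then enrich the domain ontology's description of an image's subject.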
48. Conclusion: practical implementation of ontology-based IR
[Figure: architecture of an ontology-based IR system. Documents feed an extraction step that builds the index and, through population and annotation, fills the ontology (TBox and ABox); query processing draws on both the index and the ontology to turn a query into ranked documents.]
49. Research issues
•Index representation: most approaches are still based on the conventional VSM.
•Ranking: weighting and ranking mechanisms.
•Automatic population: supervised and unsupervised.
•Extraction and annotation.
•Multilingual and cross-language retrieval.
50. References
•Castells, P., Fernández, M., & Vallet, D. 2007. An Adaptation of the Vector-Space Model for Ontology-Based Information Retrieval. IEEE Transactions on Knowledge and Data Engineering, 19(2).
•Shahrul Azman Noah, Nor Afni Raziah Alias, Nurul Aida Osman, Zuraidah Abdullah, Nazlia Omar, Yazrina Yahya, & Maryati Mohd Yusof. 2010. Ontology-Driven Semantic Digital Library. AIRS 2010: 141-150.
•Shahrul Azman Noah & Datul Aida Ali. 2010. The Role of Lexical Ontology in Expanding the Semantic Textual Content of On-Line News Images. AIRS 2010: 193-202.
•Fernández, M., Cantador, I., López, V., Vallet, D., Castells, P., & Motta, E. 2011. Semantically enhanced information retrieval: an ontology-based approach. Web Semantics: Science, Services and Agents on the World Wide Web, 9: 434-452.
•Kara, S., Alan, O., Sabuncu, O., Akpınar, S., Cicekli, N.K., & Alpaslan, F.N. 2012. An ontology-based retrieval system using semantic indexing. Information Systems, 37: 294-305.
•Kohler, J., Philippi, S., Specht, M., & Ruegg, A. 2006. Ontology based text indexing and querying for the semantic web. Knowledge-Based Systems, 19: 744-754.
•Etc.