48 KausarMukadam, FuzailMisarwala, Sindhu Nair
International Journal of Innovations & Advancement in Computer Science
IJIACS
ISSN 2347 – 8616
Volume 4, Issue 10
October 2015
Research on Ontology Based Information Retrieval Techniques
KausarMukadam
Undergraduate Student,
Department of Computer Engineering
Dwarkadas J. Sanghvi College of
Engineering, Mumbai, India.
FuzailMisarwala
Undergraduate Student,
Department of Computer Engineering
Dwarkadas J. Sanghvi College of
Engineering, Mumbai, India.
Sindhu Nair
Assistant Professor,
Department of Computer Engineering
Dwarkadas J. Sanghvi College of
Engineering, Mumbai, India.
ABSTRACT
Information retrieval can be a daunting task owing to the
colossal amount of information available on the web.
Search engines have to be precise and efficient with the
information they retrieve: efficient in terms of time,
space and, most importantly, the relevance of the
documents retrieved. Users searching with keywords want
results that are accurate and match their intent. In this
paper, we study and compare a few novel methodologies for
information retrieval in terms of the relevance scores and
precision ratings of their search results. The empirical
data put forth in this paper are taken directly from the
calculations and results presented by the authors of the
respective proposed information retrieval techniques. We
compare the algorithms used in these proposals and their
targeted domains.
KEYWORDS
Ontology, Information, Retrieval.
INTRODUCTION
The advent of the World Wide Web brought with it a
large volume of data and information that is
readily available for public use. Information
retrieval presents a means to gather this information
while reducing information overload. Information
retrieval (IR) can be defined as finding material or
documents of an unstructured nature, usually in the
form of text, that satisfy an information need
from within large collections of data.
As most of the data is unstructured, numerous
information retrieval techniques have been
developed to help deal with the huge amounts of
unstructured knowledge accessible over networks.
The information retrieval techniques commonly
used are keyword-based: they use lists of
keywords to describe the information content. The
main drawback of this method is that it provides no
information about the semantic relationships
between these keywords, which makes these systems
difficult for ordinary users. Users struggle to
describe their needs and translate them into
keyword-based requests, because information needs
cannot always be expressed appropriately in
system terms.
One widely used approach to combat this issue is to
incorporate ontologies into the information system.
Ontologies represent the essential concepts in a
subject area along with the semantic relationships
among them. An ontology provides metadata elements
and a familiar vocabulary for annotating resources,
and uses the class hierarchy and class relations to
interpret that metadata.
1. INFORMATION RETRIEVAL
MODELS
To retrieve related documents, the documents are
usually transformed into an appropriate representation.
Each information retrieval strategy uses a
specialized model for the representation of
documents, which guides research and provides a
blueprint for implementing a retrieval system. The
model predicts what will be relevant to the user
given the user query. To support these predictions,
the models are grounded in some branch of
mathematics in order to formalise the model, ensure
consistency, and establish that it can be
implemented within a real system [1].
The major models for retrieval of information are
the Boolean model, the Statistical models (including
the vector space and probabilistic retrieval models)
and the Linguistic and Knowledge-based models.
1.1. Boolean Model:
The Boolean model is one of the first models of
information retrieval and is a much criticised model.
The model can be defined by thinking of a query
term as an unambiguous definition of a set of
documents. For example, the query term 'finance'
defines the set of all documents that are indexed
with the term finance. In this model, the operators
of George Boole's mathematical logic (logical
product AND, logical sum OR and logical difference
NOT) can be combined along with query
terms and sets of documents to form new document
sets.
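As a minimal sketch of this idea, the Boolean operators
can be mapped onto set operations over an inverted index.
The index contents and the helper name docs below are
hypothetical and serve only as an illustration.

```python
# Toy inverted index: term -> set of document ids (hypothetical data).
index = {
    "finance": {1, 2, 4},
    "ontology": {2, 3},
    "retrieval": {2, 3, 4},
}
all_docs = {1, 2, 3, 4}


def docs(term):
    """Return the set of documents indexed with `term` (empty if unseen)."""
    return index.get(term, set())


# AND -> intersection, OR -> union, NOT -> set difference.
finance_and_retrieval = docs("finance") & docs("retrieval")
finance_or_ontology = docs("finance") | docs("ontology")
retrieval_not_finance = docs("retrieval") - docs("finance")
not_finance = all_docs - docs("finance")
```

Every document either belongs to the result set or does
not, which is why this model offers no ranking.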
1.2. Extended Boolean Model:
Several methods have been developed to overcome
the disadvantages of the traditional model. The
traditional method has no provision for ranking,
does not support the assignment of weights to query
or document terms, and its operators are too strict.
The Smart Boolean approach and extended Boolean
models (for example, the P-norm and Fuzzy Logic
approaches) provide relevance ranking to users.
The P-norm method allows query and document
terms to carry weights that are computed from
term frequency statistics with proper
normalization procedures. The normalized weights
are used to rank documents with a distance-based
measure: in decreasing order of distance from the
point where no query term is present for an OR
query, and in increasing order of distance from the
point where every query term is fully present for an
AND query. Each operator also has an associated
coefficient (P) that indicates the degree of
strictness of the Boolean operator (from 1 for
lowest strictness to infinity for highest strictness).
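The effect of the coefficient P can be illustrated with
a small sketch. It follows the commonly cited p-norm
formulation and assumes that all query term weights equal
1, which is a simplification made here; the function names
and the example weights are hypothetical.

```python
def pnorm_or(weights, p):
    """OR query: normalized distance of the document from the all-zero point."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights, p):
    """AND query: 1 minus normalized distance from the all-one point."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


# Normalized weights of two query terms in one document (hypothetical).
doc = [0.8, 0.2]

or_soft, and_soft = pnorm_or(doc, 1), pnorm_and(doc, 1)
or_strict, and_strict = pnorm_or(doc, 50), pnorm_and(doc, 50)
```

With P = 1 both operators reduce to the average of the
term weights, so AND and OR cannot be told apart. As P
grows, the OR score approaches the largest weight and the
AND score approaches the smallest one, which recovers the
strict Boolean behaviour.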
In Fuzzy Set theory, each element has a degree of
membership in a set, in direct contrast to
traditional binary membership. The index term
weight for a given document reflects the degree to
which the term describes the document content. The
weight indicates the degree of membership of the
document in the fuzzy set associated with that term.
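A short sketch of how such membership degrees could be
combined is given below. It assumes the standard fuzzy
operators (minimum for AND, maximum for OR, complement for
NOT), which the text above does not prescribe; the weights
and function names are hypothetical.

```python
# Degree of membership of each document in the fuzzy set of a term
# (hypothetical weights in [0, 1]).
membership = {
    "finance": {"d1": 0.9, "d2": 0.3},
    "ontology": {"d1": 0.4, "d2": 0.7},
}


def degree(term, doc):
    """Membership of `doc` in the fuzzy set of `term` (0 if not indexed)."""
    return membership.get(term, {}).get(doc, 0.0)


def fuzzy_and(a, b, doc):
    return min(degree(a, doc), degree(b, doc))


def fuzzy_or(a, b, doc):
    return max(degree(a, doc), degree(b, doc))


def fuzzy_not(a, doc):
    return 1.0 - degree(a, doc)
```

Because the result is a degree rather than a yes/no
answer, documents can be ranked by it.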
1.3. Statistical Model:
These models use statistical information (term
frequencies) to determine the relevance of
documents with reference to the query, to produce a
list of documents ranked by an estimated relevance.
Some common types are vector space and
probabilistic models.
1.4. Vector Space Models:
The basic requirement of the vector space model is
that information retrieval objects are modelled as
elements of a vector space. Terms, documents,
queries and concepts are all represented as vectors
in this space. This implies that the system has
linear properties, i.e. any two elements of the
system can be added to create a new element, and
any element can be multiplied by a real number [2].
The index representations and the query are
represented as vectors embedded in a
high-dimensional Euclidean space, in which each
term is assigned a separate dimension.
The similarity measure is usually the cosine of the
angle that separates the two vectors d and q, where
d represents the document's index representation and
q represents the query.
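The cosine measure can be sketched as follows, assuming
raw term counts as vector components (weighting schemes
such as tf-idf are left out for brevity). The helper names
and the sample texts are hypothetical.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a text as a term-count vector, one dimension per term."""
    return Counter(text.lower().split())


def cosine(d, q):
    """Cosine of the angle between document vector d and query vector q."""
    dot = sum(d[t] * q[t] for t in q)  # Counter yields 0 for absent terms
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_d == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_d * norm_q)


documents = {
    "d1": "ontology based information retrieval",
    "d2": "stock market finance news",
}
query = to_vector("information retrieval")

# Rank document ids by decreasing similarity to the query.
ranking = sorted(
    documents,
    key=lambda name: cosine(to_vector(documents[name]), query),
    reverse=True,
)
```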
1.5. Probabilistic Models:
Probabilistic models treat the information
retrieval process as probabilistic inference.
The similarity between a document and a query is
assessed as the probability that the document is
pertinent to the query. These models use
probabilistic theorems such as Bayes' theorem. The
Probabilistic retrieval model implements the
Probability Ranking Principle, which specifies that
an IR system should rank the documents on the basis
of their probability of relevance to the user query,
using all the available information. A variety of
evidence sources are used in this method, the most
common one being the statistical distribution of
terms in the relevant and non-relevant documents.
Other probabilistic models include Bayesian Network
Models, the 2-Poisson Model, the Probabilistic
Indexing Model, etc.
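The role of Bayes' theorem in such a ranking can be made
explicit with a short derivation. The symbols are notation
introduced here for illustration: R and \bar{R} denote
relevance and non-relevance, d a document and q the query.

```latex
P(R \mid d, q) = \frac{P(d \mid R, q)\, P(R \mid q)}{P(d \mid q)}
\qquad\Longrightarrow\qquad
\frac{P(R \mid d, q)}{P(\bar{R} \mid d, q)}
  = \frac{P(d \mid R, q)}{P(d \mid \bar{R}, q)}
    \cdot \frac{P(R \mid q)}{P(\bar{R} \mid q)}
```

The last factor is the same for every document, so ranking
documents by the likelihood ratio
P(d | R, q) / P(d | \bar{R}, q) yields the same order as
ranking them by their probability of relevance. This ratio
is where the distribution of terms in relevant and
non-relevant documents enters.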
1.6. Linguistic and Knowledge-Based Models:
Linguistic and knowledge-based approaches, which
have been developed to address various problems in
information retrieval, perform a semantic and
syntactic analysis in order to retrieve documents
more effectively.
2. ONTOLOGY:
The Oxford English Dictionary [3] defines ontology
as “A set of concepts and categories in a subject
area or domain that shows their properties and the
relations between them.” An ontology is a
specification of a conceptualization, which consists
of a list of terms (names and definitions) and the
relationships between them. The terms are used to
represent important concepts, or classes of objects,
of the domain. For example, in the university domain,
faculty, students, lecture rooms, courses and
departments can be some important concepts. The
ontology concept has found use in Artificial
Intelligence, Computer Science, and Knowledge
Engineering in a myriad of related applications
including natural language processing, e-commerce,
information retrieval, and the Semantic
Web [4]. In information retrieval, ontologies have
been used to overcome the limitations of traditional
keyword-based search: they provide a vocabulary for
classifying content and improve search through
class-hierarchy-based query expansion, multifaceted
browsing and searching, etc.
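Class-hierarchy-based query expansion can be sketched
with a toy ontology for the university domain. The class
names, the subclasses mapping and the expand function are
hypothetical and only illustrate the idea.

```python
# Hypothetical mini-ontology: class -> direct subclasses.
subclasses = {
    "person": ["faculty", "student"],
    "faculty": ["professor", "lecturer"],
    "student": ["undergraduate", "postgraduate"],
}


def expand(term):
    """Return the term together with all of its descendants in the hierarchy."""
    expanded, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in expanded:
            expanded.append(current)
            stack.extend(subclasses.get(current, []))
    return expanded


# A query for "faculty" also matches documents about professors and lecturers.
expanded_query = expand("faculty")
```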
Figure 1. Basic process of text mining information retrieval based on ontology [5].
3. SURVEY OF NOVEL METHODOLOGIES
3.1. Retrieval Model for Traditional Chinese
Medicine:
Some shortcomings of traditional information
retrieval methods are discussed in [6]. The biggest
problems faced in information retrieval in the
Traditional Chinese Medicine (TCM) field are low
coverage and high redundancy. The motivation for
working with TCM literature and databases is the
lack of research in TCM, despite extensive and
relevant research carried out by scholars in other
fields.
The TCM domain ontology is constructed using a
seven-step method. The paper then summarizes the
implementation of its ontology-based information
retrieval technique, which is a two-step technique.
The next part deals with concept similarity. It is
stated that, in the field of ontology, the
correlation between pieces of information is
reflected by the correlation between concepts. In
the TCM domain, the ontology follows a clear
hierarchical architecture. The relevance between
concepts is measured on a scale of 0 to 1. If two
concepts are unrelated, i.e. there is no relevance,
the relevance score is 0. This quantifies the
correlation between concepts and hence allows the
effectiveness of the retrieval system to be measured.
The paper defines three levels of correlation
between concepts. Thus, by determining the degree
of correlation, i.e. the relevance between retrieved
information and information resources, the most
relevant information can be gathered.
In the final step of sorting the search results, a
sorting algorithm is proposed since the traditional
algorithms for sorting of search results may not be
very effective for the TCM domain. The measure of
concept similarity is used for the sorting of the
search results.
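The summary above does not reproduce the similarity
formula of [6], so the sketch below substitutes a generic
path-based measure (in the style of Wu and Palmer) over a
concept hierarchy. It shares the stated properties: scores
lie between 0 and 1, and unrelated concepts score 0. The
hierarchy, concept names and functions are hypothetical.

```python
# Hypothetical fragment of a concept hierarchy: concept -> parent (None = root).
parent = {
    "medicine": None,
    "symptom": None,
    "herb": "medicine",
    "formula": "medicine",
    "ginseng": "herb",
    "licorice": "herb",
}


def ancestors(concept):
    """Path from a concept up to its root, starting with the concept itself."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = parent.get(concept)
    return path


def similarity(a, b):
    """Path-based score in [0, 1]; 0 when the concepts share no ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    common = [c for c in path_a if c in path_b]
    if not common:
        return 0.0
    lowest_common = common[0]
    depth_common = len(ancestors(lowest_common))
    return 2.0 * depth_common / (len(path_a) + len(path_b))


# Sort candidate results by their similarity to the query concept.
candidates = ["symptom", "formula", "licorice", "ginseng"]
ranked = sorted(candidates, key=lambda c: similarity("ginseng", c), reverse=True)
```

Sorting by this score places the query concept first,
then its siblings, then more distant relatives, with
unrelated concepts last.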
Figure 2. Information Retrieval framework for TCM[6].
inference that the newly proposed ontology based
information retrieval system was efficient and
effective in the Traditional Chinese Medicine
domain.
3.2. Semantic Indexing based Information
Retrieval Model:
Some limitations, such as the inability to describe
relations between search terms, are dealt with in [7].
The proposed framework addresses three important
issues in semantic search and information
retrieval: scalability, usability and retrieval
performance. To improve scalability, a semantic
indexing approach based on an entity retrieval
model is suggested. Usability is improved through
the adoption of a keyword-based interface.
Domain-specific information extraction, rules and
inference are proposed to improve retrieval
performance.
The proposed framework is based on three key
processes: representation of semantic knowledge,
semantic indexing, and querying. For the
representation of semantic knowledge, an existing
ontology is reused in the implementation of
information retrieval in transport systems,
expressed in the Web Ontology Language (OWL). At
the end of the first step, OWL files are obtained
and indexed for the search.
The next step is semantic indexing. The indexing
system is designed around an entity retrieval model
because the knowledge base is composed of entities
defined in RDF, RDFS and OWL. The knowledge base
is a weighted, labelled graph in which the nodes
are resources and the edges are properties. The
graph is a set of RDF triples, each consisting of
three components: subject, predicate and object.
The subject identifies the resource described by
the triple, the predicate names the property of
that resource being described, and the object gives
the value of that property. The EAV (Entity
Attribute Value) model is adopted for the indexing
system. The indexing structure is then described,
as it largely affects retrieval performance.
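How triples map onto an entity-attribute-value index can
be sketched as follows. This is not the indexing structure
of [7]; the triples, the transport-domain names and the
function build_eav_index are hypothetical.

```python
from collections import defaultdict

# Hypothetical RDF-style triples: (subject, predicate, object).
triples = [
    ("bus_42", "type", "Bus"),
    ("bus_42", "servesStop", "central_station"),
    ("tram_7", "type", "Tram"),
    ("tram_7", "servesStop", "central_station"),
]


def build_eav_index(triples):
    """Build two lookups: entity -> attribute -> values, and (attribute, value) -> entities."""
    entity_index = defaultdict(dict)
    value_index = defaultdict(set)
    for subject, predicate, obj in triples:
        entity_index[subject].setdefault(predicate, set()).add(obj)
        value_index[(predicate, obj)].add(subject)
    return entity_index, value_index


entity_index, value_index = build_eav_index(triples)
```

The first lookup answers "what is known about this
entity", while the second answers "which entities have
this attribute value", which is the access pattern a
keyword interface needs.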
The next step is semantic querying, the process of
querying the EAV graph after the semantic
knowledge has been represented and indexed. Three
types of query are supported: full-text,
structural and semi-structural. SIRE is used to run
the query, and results are obtained using a Boolean
combination of attribute-value pairs based on
logical operators.
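A Boolean combination of attribute-value pairs can be
illustrated with a small evaluator. It does not reproduce
the interface of SIRE; the index contents, the nested-tuple
query format and the function names are assumptions made
for this sketch.

```python
# Hypothetical index: (attribute, value) -> set of matching entities.
value_index = {
    ("type", "Bus"): {"bus_42", "bus_9"},
    ("type", "Tram"): {"tram_7"},
    ("servesStop", "central_station"): {"bus_42", "tram_7"},
}


def lookup(attribute, value):
    """Entities that have the given attribute-value pair."""
    return value_index.get((attribute, value), set())


def evaluate(query):
    """Evaluate ("AND"|"OR", q1, q2, ...), ("NOT", q) or an (attribute, value) pair."""
    operator = query[0]
    if operator == "AND":
        return set.intersection(*(evaluate(q) for q in query[1:]))
    if operator == "OR":
        return set.union(*(evaluate(q) for q in query[1:]))
    if operator == "NOT":
        all_entities = set().union(*value_index.values())
        return all_entities - evaluate(query[1])
    return lookup(*query)


# Vehicles that serve the central station and are not trams.
result = evaluate(
    ("AND", ("servesStop", "central_station"), ("NOT", ("type", "Tram")))
)
```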
The proposed information retrieval method is then
evaluated using a set of pre-set queries, showing a
high rate of precision.
Figure 3. Framework for Semantic Indexing based Information Retrieval [7].
3.3. Semantic Extension Retrieval Model:
A new technique aimed at tackling some key
problems posed by traditional keyword-based
methods is proposed in [8]. Firstly, keywords do
not always convey the full meaning of the content,
so the retrieved information may be irrelevant.
Secondly, a keyword may have different meanings in
different contexts, which leads to difficulties in
processing query features. Thirdly, due to
polysemy and synonymy in natural language,
keyword-based retrieval can only cover
information containing the same word, while other
information with a similar meaning but different
words is missed [8]. To overcome these
issues, an information retrieval technique based on
semantic extension is proposed.
The semantic retrieval is based on semantic
extension. The strategy considers whether or not a
result is suitable for the user's query. The
proposed model differs from the traditional model,
which expresses content features through keywords:
it uses ontology annotation to summarize the
semantic features of the information, and makes use
of semantic extension for retrieval. The model has
two parts: ontology annotation, and retrieval of
text based on semantic extension.
Firstly, ontology annotation indexes documents
based on the ontology of the domain. This serves as
the foundation for text retrieval. The query keyword
is then extended and the query is turned into a
full-text search. The results obtained are then
reordered. Indexing is executed by an index
writer, which adds documents to the index and is
the core component for constructing the index.
The core component for retrieval is the
index reader, which reads the index. The analyser
pre-processes the documents through ontology
annotation and sends the content to the index
writer. It also matches the keyword from the query
to the domain. The analyser has a subcomponent,
the ontology encoder, which processes the
elements of the domain ontology into a multi-way
tree that is in turn used for annotation and keyword
matching. The results obtained after further
processing are reordered.
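The interplay of annotation, keyword extension and
reordering can be sketched as below. This is a simplified
stand-in, not the mechanism of [8]: the ontology is
flattened to a dictionary instead of a multi-way tree, and
all names (ontology, write, search) are hypothetical.

```python
from collections import defaultdict

# Hypothetical domain ontology: concept -> surface terms that express it.
ontology = {
    "retrieval": ["retrieval", "search", "lookup"],
    "ontology": ["ontology", "taxonomy"],
}
term_to_concept = {t: c for c, terms in ontology.items() for t in terms}

index = defaultdict(set)  # word or "concept:<name>" -> document ids


def write(doc_id, text):
    """Index writer: store each word and, if known, its ontology annotation."""
    for word in text.lower().split():
        index[word].add(doc_id)
        if word in term_to_concept:
            index["concept:" + term_to_concept[word]].add(doc_id)


def search(keyword):
    """Index reader: extend the keyword via its concept, then reorder results."""
    keyword = keyword.lower()
    exact = index.get(keyword, set())
    concept = term_to_concept.get(keyword)
    extended = index.get("concept:" + concept, set()) if concept else set()
    # Exact keyword matches come first, extension-only matches after them.
    return sorted(exact) + sorted(extended - exact)


write(1, "ontology based search")
write(2, "keyword retrieval methods")
write(3, "cooking recipes")
hits = search("retrieval")
```

Document 1 never contains the word "retrieval", yet it is
returned because its annotation shares the concept of the
query keyword.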
For the performance evaluation of the proposed
technique, 1000 papers were collected and tested
on. Precision and recall measures were used for
evaluation, and the experimental results show a
fairly high rate of recall and precision as compared
to traditional keyword based information retrieval.
Figure 4. Proposed framework for Semantic Extension Retrieval Model [8].
4. PERFORMANCE MEASURES:
An ideal performance measure for an information
retrieval system would take into account the
resources used by the system to perform a retrieval
operation, the amount of effort and time spent by a
user to obtain the needed information, and the
ability of the system to retrieve useful items. Such
a measure is extremely hard to implement. In
practice, the user wants the system to retrieve as
many relevant items as possible and to keep the
number of non-relevant items in the response low.
The former criterion is captured by Recall and the
latter by Precision [9].
Recall (also known as sensitivity in binary
classification) is the fraction of the documents
relevant to the user query that are retrieved.
Precision (also known as positive predictive value)
is the fraction of the retrieved documents that
are relevant to the information requirement of the
user.
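Both measures follow directly from the sets of retrieved
and relevant documents, as the sketch below shows. The
function name and the document ids are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents retrieved, five relevant overall, two in common.
precision, recall = precision_recall(
    retrieved=[1, 2, 3, 4],
    relevant=[2, 4, 5, 6, 7],
)
```

Here half of the retrieved documents are relevant
(precision 0.5), while only two of the five relevant
documents were found (recall 0.4).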
5. RESULT ANALYSIS:
Precision and Recall are the best performance
measures for the novel techniques proposed by
researchers, as the primary goal of these
information retrieval techniques is to return
relevant pages as the search result. All of the
techniques studied above report a high success rate
in their authors' own experiments. The reported
average precision and recall rates rise
significantly compared to traditional keyword-based
methods, which serves as evidence that the
proposed methodologies have overcome the
difficulties they set out to address in their
respective domains.
6. CONCLUSION AND FUTURE WORK:
Various novel ontology-based information retrieval
techniques have been proposed by researchers, for
use in one specific domain or in multiple domains,
aimed at overcoming certain problems encountered
with traditional keyword-based algorithms. These
new techniques manage to overcome the stated issues
and return high recall and precision rates, and
hence should be used with increased frequency for
information retrieval.
REFERENCES
[1] D. Hiemstra, "Information Retrieval Models," in
A. Goker and J. Davies (eds.), Information Retrieval:
Searching in the 21st Century, 2009.
[2] V. V. Raghavan, "A critical analysis of vector space
model in information retrieval," Journal of American
Society for Information Science, 1986.
[3] "Oxford English Dictionary," [Online]. Available:
http://www.oxforddictionaries.com/definition/english/ontology.
[4] S. S.Yasodha, "An Ontology-Based Framework for
Semantic Web Content Mining," International
Conference on Computer Communication and
Informatics (IEEE), 2014.
[5] M. Q. Song Yibing, "Research of literature
information retrieval method based on ontology,"
IEEE, 2014.
[6] Y. Z. D. Z. H. L. H. R. Aziguli Wulamu, "The
Research and Application of Ontology-Based
Information Retrieval," IEEE 9th Conference on
Industrial Electronics and Applications (ICIEA),
2014.
[7] M. A. Amir Zidi, "A Generalized Framework for
Ontology-Based Information Retrieval," IEEE,
2013.
[8] H. L. Rui Zhang, "Design and Realization of
Semantic Extension Information Retrieval
Mechanism," Third International Conference on
Information Science and Technology, 2013.
[9] V. V. Raghavan, "A Critical Investigation of Recall
and Precision as Measures of Retrieval System
Performance," ACM Transactions on Information
Systems, 1989.
IOSR Journals
 
An Advanced IR System of Relational Keyword Search Technique
An Advanced IR System of Relational Keyword Search TechniqueAn Advanced IR System of Relational Keyword Search Technique
An Advanced IR System of Relational Keyword Search Technique
paperpublications3
 
Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...
ijcsity
 
Research Paper Selection Based On an Ontology and Text Mining Technique Using...
Research Paper Selection Based On an Ontology and Text Mining Technique Using...Research Paper Selection Based On an Ontology and Text Mining Technique Using...
Research Paper Selection Based On an Ontology and Text Mining Technique Using...
IOSR Journals
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...
IJNSA Journal
 
A SEMANTIC RETRIEVAL SYSTEM FOR EXTRACTING RELATIONSHIPS FROM BIOLOGICAL CORPUS
A SEMANTIC RETRIEVAL SYSTEM FOR EXTRACTING RELATIONSHIPS FROM BIOLOGICAL CORPUS A SEMANTIC RETRIEVAL SYSTEM FOR EXTRACTING RELATIONSHIPS FROM BIOLOGICAL CORPUS
A SEMANTIC RETRIEVAL SYSTEM FOR EXTRACTING RELATIONSHIPS FROM BIOLOGICAL CORPUS
AIRCC Publishing Corporation
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological CorpusA Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
AIRCC Publishing Corporation
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological CorpusA Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
ijcsit
 
An in-depth review on News Classification through NLP
An in-depth review on News Classification through NLPAn in-depth review on News Classification through NLP
An in-depth review on News Classification through NLP
IRJET Journal
 
Ad

Recently uploaded (20)

Engineering Chemistry First Year Fullerenes
Engineering Chemistry First Year FullerenesEngineering Chemistry First Year Fullerenes
Engineering Chemistry First Year Fullerenes
5g2jpd9sp4
 
Upstream_processing of industrial products.pptx
Upstream_processing of industrial products.pptxUpstream_processing of industrial products.pptx
Upstream_processing of industrial products.pptx
KshitijJayswal2
 
Crack the Domain with Event Storming By Vivek
Crack the Domain with Event Storming By VivekCrack the Domain with Event Storming By Vivek
Crack the Domain with Event Storming By Vivek
Vivek Srivastava
 
IntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdfIntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdf
Luiz Carneiro
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
Dust Suppressants: A Sustainable Approach to Dust Pollution Control
Dust Suppressants: A Sustainable Approach to Dust Pollution ControlDust Suppressants: A Sustainable Approach to Dust Pollution Control
Dust Suppressants: A Sustainable Approach to Dust Pollution Control
Janapriya Roy
 
Fort night presentation new0903 pdf.pdf.
Fort night presentation new0903 pdf.pdf.Fort night presentation new0903 pdf.pdf.
Fort night presentation new0903 pdf.pdf.
anuragmk56
 
aset and manufacturing optimization and connecting edge
aset and manufacturing optimization and connecting edgeaset and manufacturing optimization and connecting edge
aset and manufacturing optimization and connecting edge
alilamisse
 
BTech_CSE_LPU_Presentation.pptx.........
BTech_CSE_LPU_Presentation.pptx.........BTech_CSE_LPU_Presentation.pptx.........
BTech_CSE_LPU_Presentation.pptx.........
jinny kaur
 
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Journal of Soft Computing in Civil Engineering
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
Value Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous SecurityValue Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous Security
Marc Hornbeek
 
How to Make Material Space Qu___ (1).pptx
How to Make Material Space Qu___ (1).pptxHow to Make Material Space Qu___ (1).pptx
How to Make Material Space Qu___ (1).pptx
engaash9
 
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design ThinkingDT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DhruvChotaliya2
 
Basic Principles for Electronics Students
Basic Principles for Electronics StudentsBasic Principles for Electronics Students
Basic Principles for Electronics Students
cbdbizdev04
 
comparison of motors.pptx 1. Motor Terminology.ppt
comparison of motors.pptx 1. Motor Terminology.pptcomparison of motors.pptx 1. Motor Terminology.ppt
comparison of motors.pptx 1. Motor Terminology.ppt
yadavmrr7
 
Engineering Chemistry First Year Fullerenes
Engineering Chemistry First Year FullerenesEngineering Chemistry First Year Fullerenes
Engineering Chemistry First Year Fullerenes
5g2jpd9sp4
 
Upstream_processing of industrial products.pptx
Upstream_processing of industrial products.pptxUpstream_processing of industrial products.pptx
Upstream_processing of industrial products.pptx
KshitijJayswal2
 
Crack the Domain with Event Storming By Vivek
Crack the Domain with Event Storming By VivekCrack the Domain with Event Storming By Vivek
Crack the Domain with Event Storming By Vivek
Vivek Srivastava
 
IntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdfIntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdf
Luiz Carneiro
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
Dust Suppressants: A Sustainable Approach to Dust Pollution Control
Dust Suppressants: A Sustainable Approach to Dust Pollution ControlDust Suppressants: A Sustainable Approach to Dust Pollution Control
Dust Suppressants: A Sustainable Approach to Dust Pollution Control
Janapriya Roy
 
Fort night presentation new0903 pdf.pdf.
Fort night presentation new0903 pdf.pdf.Fort night presentation new0903 pdf.pdf.
Fort night presentation new0903 pdf.pdf.
anuragmk56
 
aset and manufacturing optimization and connecting edge
aset and manufacturing optimization and connecting edgeaset and manufacturing optimization and connecting edge
aset and manufacturing optimization and connecting edge
alilamisse
 
BTech_CSE_LPU_Presentation.pptx.........
BTech_CSE_LPU_Presentation.pptx.........BTech_CSE_LPU_Presentation.pptx.........
BTech_CSE_LPU_Presentation.pptx.........
jinny kaur
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
Value Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous SecurityValue Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous Security
Marc Hornbeek
 
How to Make Material Space Qu___ (1).pptx
How to Make Material Space Qu___ (1).pptxHow to Make Material Space Qu___ (1).pptx
How to Make Material Space Qu___ (1).pptx
engaash9
 
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design ThinkingDT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DhruvChotaliya2
 
Basic Principles for Electronics Students
Basic Principles for Electronics StudentsBasic Principles for Electronics Students
Basic Principles for Electronics Students
cbdbizdev04
 
comparison of motors.pptx 1. Motor Terminology.ppt
comparison of motors.pptx 1. Motor Terminology.pptcomparison of motors.pptx 1. Motor Terminology.ppt
comparison of motors.pptx 1. Motor Terminology.ppt
yadavmrr7
 
Ad

Research on ontology based information retrieval techniques

International Journal of Innovations & Advancement in Computer Science (IJIACS), ISSN 2347-8616, Volume 4, Issue 10, October 2015

Kausar Mukadam, Undergraduate Student, Department of Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India.
Fuzail Misarwala, Undergraduate Student, Department of Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India.
Sindhu Nair, Assistant Professor, Department of Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India.

ABSTRACT
Information retrieval can be a daunting task because a colossal amount of information is available on the web. Search engines have to be precise and efficient with the information they retrieve: efficient in terms of time, space and, most importantly, the relevance of the documents retrieved. Users searching with keywords want results that are accurate and match their intent. In this paper, we study and compare a few novel methodologies for information retrieval in terms of their relevance scores and the precision ratings of their search results. The empirical data put forth in this paper are taken directly from the calculations and results presented by the authors of the respective information retrieval techniques. We compare the algorithms used in these proposals and their targeted domains.

KEYWORDS
Ontology, Information, Retrieval.

INTRODUCTION
The advent of the World Wide Web brought with it a large volume of data and information that is readily available for public use. Information retrieval presents a means to gather this information by reducing information overload. Information retrieval (IR) can be defined as finding material or documents of an unstructured nature, usually in the form of text, which satisfy an information need from large collections of data.
As most of the data is unstructured, numerous information retrieval techniques have been developed to deal with the huge amounts of unstructured knowledge accessible over networks. The commonly used techniques are keyword-based: they use lists of keywords to describe the information content. The main drawback of this method is that it provides no information about the semantic relationships between these keywords, which makes such systems difficult for ordinary users. Describing their needs and then translating them into a keyword-based request is a problem, because information needs cannot always be expressed appropriately in the system's terms. One widely used approach to combat this issue is to incorporate ontologies into the information system. These represent the essential concepts in a subject area together with the semantic relationships among them. An ontology provides metadata elements and a familiar vocabulary to describe resources, and uses its class hierarchy and class relations to interpret that metadata.

1. INFORMATION RETRIEVAL MODELS
To retrieve related documents, the documents are usually transformed into an appropriate representation. Each information retrieval strategy uses a specialized model for representing documents, which guides research and provides a blueprint for implementing a retrieval system. The model predicts what will be relevant to the user given the user query. For these predictions, the models are grounded in some branch of mathematics in order to formalise the model, ensure consistency, and establish that it can be implemented within a real system [1]. The major models for retrieval of information are the Boolean model, the Statistical model (including the vector space and probabilistic retrieval models) and the Linguistic and Knowledge-based models.

1.1. Boolean Model:
The Boolean model is one of the first models of information retrieval and is a much-criticised model.
The model can be defined by thinking of the user query term as an unambiguous definition of a document set. For example, the query term 'finance' defines the set of all documents that are indexed with the term finance. In this model, the operators of George Boole's mathematical logic (logical product AND, logical sum OR and logical difference NOT) can be combined along with query
terms and sets of documents to form new document sets.

1.2. Extended Boolean Model:
Several methods have been developed to overcome the disadvantages of the traditional model: it has no provision for ranking, it does not support the assignment of weights to query or document terms, and its operators are too strict. The Smart Boolean approach and extended Boolean models (for example, the P-norm and Fuzzy Logic approaches) provide relevance ranking to users. The P-norm method allows query and document terms to carry weights that are computed from term frequency statistics with proper normalization procedures. The normalized weights are used to rank documents in decreasing order of distance for an OR query, and in increasing order of distance for an AND query. The operators also have an associated coefficient (P) that indicates the degree of strictness of the Boolean operator (from 1 for lowest strictness to infinity for highest strictness). This method uses a distance-based measure. In Fuzzy Set theory, each element has a differing degree of membership of a set, in direct contrast to traditional binary membership. The index term weight for a given document reflects the degree to which the term describes the document content. The weight is an indication of the membership of the document in the associated fuzzy set.

1.3. Statistical Model:
These models use statistical information (term frequencies) to determine the relevance of documents with reference to the query and to produce a list of documents ranked by estimated relevance. Some common types are vector space and probabilistic models.

1.4. Vector Space Models:
The basic requirement of the vector space model is that information retrieval objects are modelled as elements in a vector space.
Terms, documents, queries and concepts are all represented as vectors in the vector space. This implies that the system has linear properties, i.e. any two elements of the system can be added to create a new element, and an element can also be multiplied by a real number [2]. The index representations and the query are represented as vectors embedded in a multidimensional Euclidean space in which each term is assigned a separate dimension. The similarity measure is usually the cosine of the angle that separates the two vectors d and q, where d represents the document's index representation and q represents the query.

1.5. Probabilistic Models:
Probabilistic models consider the information retrieval process as a probabilistic inference. Similarities are assessed as probabilities that the document is pertinent to the query. These models use various probabilistic theorems such as Bayes' theorem. The probabilistic retrieval model implements the Probability Ranking Principle, which specifies that an IR system should rank documents on the basis of their probability of relevance to the user query, using all the available information. A variety of evidence sources are used in this method, the most common one being the statistical distribution of terms in the relevant and non-relevant documents. Other probabilistic models include Bayesian Network Models, the 2-Poisson Model, the Probabilistic Indexing Model, etc.

1.6. Linguistic and Knowledge-Based Models:
Linguistic and knowledge-based approaches, which have been developed to address various problems in information retrieval, perform a semantic and syntactic analysis in order to retrieve documents more effectively.

2. ONTOLOGY:
The Oxford English Dictionary [3] defines ontology as “A set of concepts and categories in a subject area or domain that shows their properties and the relations between them.” An ontology is a specification of a conceptualization, which consists of a list of terms (names and definitions) and the relationships between them.
The terms are used to represent important concepts, or classes of objects, of the domain. For example, in the university domain, faculty, students, lecture rooms, courses and departments can be some important concepts. The ontology concept has found use in Artificial Intelligence, Computer Science, and Knowledge Engineering in a myriad of related applications including natural language processing, e-commerce, information retrieval, and the Semantic Web [4]. In information retrieval, ontologies have been used to overcome the limitations of traditional keyword-based search, to provide a vocabulary for classifying content, and to improve search through class-hierarchy-based query expansion, multifaceted browsing and searching, etc.
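
As a minimal sketch of class-hierarchy-based query expansion, the Python example below expands a query term with all of its subclasses before matching documents. The class names, the `SUBCLASSES` hierarchy, the sample documents and the function names are all hypothetical illustrations for the university domain; they are not taken from any of the surveyed systems.

```python
# Hypothetical toy ontology for the university domain:
# each class maps to its direct subclasses.
SUBCLASSES = {
    "person": ["faculty", "student"],
    "faculty": ["professor", "lecturer"],
    "student": ["undergraduate", "postgraduate"],
}

# Hypothetical document collection (id -> text).
DOCUMENTS = {
    "d1": "the professor teaches the course",
    "d2": "lecture rooms are booked by the department",
    "d3": "each undergraduate enrols in a course",
}


def expand_term(term, hierarchy):
    """Return the term plus all of its descendants in the class hierarchy."""
    expanded = [term]
    stack = list(hierarchy.get(term, []))
    while stack:
        child = stack.pop()
        if child not in expanded:
            expanded.append(child)
            stack.extend(hierarchy.get(child, []))
    return expanded


def search(query_terms, documents, hierarchy):
    """Return sorted ids of documents containing any expanded query term."""
    expanded = set()
    for term in query_terms:
        expanded.update(expand_term(term, hierarchy))
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if expanded & set(text.lower().split())
    )
```

With this toy data, a plain keyword search for "faculty" (an empty hierarchy) matches no document, whereas the expanded query also matches the document that mentions "professor". A real system would of course use a richer ontology and proper text analysis rather than whitespace tokenisation.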

Figure 1. Basic process of text mining information retrieval based on ontology [5].

3. SURVEY OF NOVEL METHODOLOGIES

3.1. Retrieval Model for Traditional Chinese Medicine:
Some shortcomings of traditional information retrieval methods are discussed in [6]. The biggest problems faced in information retrieval in the field of Traditional Chinese Medicine (TCM) are low coverage and high redundancy. The motivation for working with traditional Chinese medicinal literature and databases is the lack of research in TCM, despite extensive and relevant research carried out by scholars in other fields. The TCM domain ontology is constructed using a seven-step method. The paper then summarizes the implementation of its ontology-based information retrieval technique, which is a two-step technique. The next part deals with concept similarity. It is stated that, in the field of ontology, the correlation between pieces of information is a measure of the correlation between concepts. In the domain of TCM, the ontology follows a clear hierarchical architecture. The relevance between concepts is measured on a scale of 0 to 1. If two concepts are unrelated, i.e. there is no relevance, the relevance score is 0. This is an effective way to quantify the correlation between concepts, and hence the effectiveness of the retrieval system can be measured. The paper defines three levels of correlation between concepts. Thus, by determining the degree of correlation, that is, the relevance between the retrieved information and the information resources, the most relevant information can be gathered. In the final step of sorting the search results, a sorting algorithm is proposed, since the traditional algorithms for sorting search results may not be very effective for the TCM domain.
The measure of concept similarity is used for sorting the search results.

Figure 2. Information Retrieval framework for TCM [6].

[...] inference that the newly proposed ontology-based information retrieval system was efficient and effective in the Traditional Chinese Medicine domain.

3.2. Semantic Indexing based Information Retrieval Model:
Some limitations, such as the inability to describe relations between search terms, are dealt with in [7]. The proposed framework addresses three important issues related to semantic search and information retrieval: scalability, usability, and retrieval performance. To improve scalability, a semantic indexing approach based on an entity retrieval model is suggested. Usability is improved through the adoption of a keyword-based interface. Domain-specific information extraction, rules, and inference are proposed to improve retrieval performance. The framework is based on three key processes: representation of semantic knowledge, semantic indexing, and querying. For the representation of semantic knowledge, an existing ontology is reused in the implementation of information retrieval in transport systems, and the OWL Web Ontology Language is used. At the end of this first step, OWL files are obtained that are indexed for the search. The next step is semantic indexing. An indexing system is designed using the entity retrieval model, because the knowledge base is composed of entities defined for RDF, OWL and RDFS. The knowledge base is a weighted and labelled graph in which the edges are properties and the nodes are resources. The graph is a set of RDF triples, each of which consists of three components: subject, predicate, and object.
The subject identifies the resource that the triple describes, the predicate names the property of that resource that is given a value, and the object is that value. The EAV (Entity Attribute Value) model is adopted for the indexing system. The indexing structure is then described, as it largely affects the retrieval performance. The next step is semantic querying, which is the process of querying the EAV graph after the semantic knowledge has been represented and indexed. Three types of queries are supported: full-text, structural, and semi-structural. SIRE is used to run the search query, and results are obtained from a Boolean combination of attribute-value pairs based on logical operators. The proposed information retrieval method is then evaluated using a set of pre-set queries, showing a high rate of precision.

Figure 3. Framework for Semantic Indexing based Information Retrieval [7].
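
To make the indexing and querying steps more concrete, the following Python sketch builds an entity-attribute-value index from a handful of triples and answers Boolean combinations of attribute-value pairs. It is an illustration under stated assumptions only: the transport-domain triples, the predicate names and the functions `build_eav_index`, `query_and` and `query_or` are hypothetical, and the code does not reproduce the index structure or the SIRE interface used in [7].

```python
from collections import defaultdict

# Hypothetical RDF-style triples (subject, predicate, object)
# for a small transport domain.
TRIPLES = [
    ("line1", "type", "BusLine"),
    ("line1", "serves", "Central"),
    ("line2", "type", "TramLine"),
    ("line2", "serves", "Central"),
    ("line3", "type", "BusLine"),
    ("line3", "serves", "Harbour"),
]


def build_eav_index(triples):
    """Map each (attribute, value) pair to the set of entities that carry it."""
    index = defaultdict(set)
    for entity, attribute, value in triples:
        index[(attribute, value)].add(entity)
    return index


def query_and(index, *pairs):
    """Entities that match every given (attribute, value) pair."""
    sets = [index.get(pair, set()) for pair in pairs]
    return set.intersection(*sets) if sets else set()


def query_or(index, *pairs):
    """Entities that match at least one given (attribute, value) pair."""
    return set().union(*(index.get(pair, set()) for pair in pairs))
```

For example, `query_and(build_eav_index(TRIPLES), ("type", "BusLine"), ("serves", "Central"))` returns only `line1`, the single bus line serving the hypothetical stop "Central". Keying the index on attribute-value pairs keeps each lookup to one dictionary access, at the cost of storing one entry per distinct pair.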

3.3. Semantic Extension Retrieval Model:
A new technique aimed at tackling some key problems posed by traditional keyword-based methods is proposed in [8]. Firstly, keywords do not always convey the full meaning of the content, so the retrieved information may be irrelevant. Secondly, a keyword may have different meanings in different contexts, which makes the query features difficult to process. Thirdly, due to polysemy and synonymy in natural language, keyword-based retrieval can only cover information containing the same word, while other information with a similar meaning but different words is missed [8]. To overcome these issues, an information retrieval technique based on semantic extension is proposed. The strategy considers whether or not the result is suitable for the user's query. The proposed model differs from the traditional model of expressing content features through keywords, since it uses ontology annotation to summarize the semantic features of the information and uses semantic extension for retrieval. The model has two parts: ontology annotation and retrieval of text based on semantic extension. Firstly, ontology annotation indexes documents based on the ontology of the domain. This serves as the foundation for the text retrieval. Next, the query keyword is extended and turned into a full-text search. The results obtained are then reordered. The indexing is executed by an index writer, which adds documents to the index and serves as the core component for the construction of the index.
The core component for retrieval is the index reader, which reads the index. The analyser pre-treats the documents through ontology annotations and sends the content to the index writer. It also matches the keyword from the query to the domain. The analyser has a subcomponent, the ontology encoder, which processes the elements of the domain ontology into a multi-tree that is in turn used for annotation and keyword matching. The results obtained after further processing are reordered. For the performance evaluation of the proposed technique, 1000 papers were collected and tested on. Precision and recall measures were used for evaluation, and the experimental results show a fairly high rate of recall and precision compared to traditional keyword-based information retrieval.

Figure 4. Proposed framework for Semantic Extension Retrieval Model [8].
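
The two measures used in such evaluations can be computed from the set of retrieved documents and the set of relevant documents. The Python sketch below uses the standard definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The function name and the convention of returning 0.0 for an empty set are assumptions made for this illustration, and the example ids are made up rather than taken from any of the surveyed experiments.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids of the documents returned by the system.
    relevant:  ids of the documents judged relevant to the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Assumed convention: an empty denominator yields 0.0.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

If a system returns four documents of which two are relevant, and three relevant documents exist in total, precision is 2/4 and recall is 2/3. The two measures usually trade off against each other: returning more documents tends to raise recall while lowering precision.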

4. PERFORMANCE MEASURES:
An ideal performance measure for an information retrieval system would take into account the resources used by the system to perform a retrieval operation, the amount of effort and time spent by a user to obtain the needed information, and the ability of the system to retrieve useful items. But this approach is extremely hard to implement. The user would want the system to retrieve the highest possible number of appropriate items and to reduce the number of non-relevant items in the response. The former criterion is represented by Recall, and the latter is the concept of Precision [9]. Recall (also known as sensitivity in binary classification) can be defined as the fraction of the documents relevant to the user query that are retrieved. Precision (also known as positive predictive value) is the fraction of the retrieved documents which are relevant to the information requirement of the user.

5. RESULT ANALYSIS:
Precision and Recall are the best performance measures for any novel technique proposed by researchers, as the primary goal of these information retrieval techniques is to return relevant pages as the search result. All of the techniques studied above have reported a high success rate in their own experiments. The average precision and recall rates show a significant rise compared to traditional keyword-based methods, and serve as evidence that the proposed methodologies have overcome the difficulties that they set out to address in their respective domains.

6. CONCLUSION AND FUTURE WORK:
Various novel ontology-based information retrieval techniques have been proposed by researchers. They have been used in one or more domains and are aimed at overcoming certain problems encountered with traditional keyword-based algorithms.
These new techniques manage to overcome the stated issues and return high recall and precision rates, and hence should be used with increased frequency for information retrieval.

REFERENCES
[1] D. Hiemstra, "Information Retrieval Models," in Goker, A., and Davies, J., Information Retrieval: Searching in the 21st Century, 2009.
[2] V. V. Raghavan, "A critical analysis of vector space model in information retrieval," Journal of American Society for Information Science, 1986.
[3] "Oxford English Dictionary," [Online]. Available: http://www.oxforddictionaries.com/definition/english/ontology.
[4] S. S.Yasodha, "An Ontology-Based Framework for Semantic Web Content Mining," International Conference on Computer Communication and Informatics (IEEE), 2014.
[5] M. Q. Song Yibing, "Research of literature information retrieval method based on ontology," IEEE, 2014.
[6] Y. Z. D. Z. H. L. H. R. Aziguli Wulamu, "The Research and Application of Ontology-Based Information Retrieval," IEEE 9th Conference on Industrial Electronics and Applications (ICIEA), 2014.
[7] M. A. Amir Zidi, "A Generalized Framework for Ontology-Based Information Retrieval," IEEE, 2013.
[8] H. L. Rui Zhang, "Design and Realization of Semantic Extension Information Retrieval Mechanism," Third International Conference on Information Science and Technology, 2013.
[9] V. V. Raghavan, "A Critical Investigation of Recall and Precision as Measures of Retrieval System Performance," ACM Transactions on Information Systems, 1989.