Mobile Image Retrieval Using Integration of Geo-Sensing and Visual Descriptor
Mobile Image Retrieval Using Integration of Geo-Sensing and Visual Descriptor
Abstract—In this paper, we propose a new, efficient photo image retrieval method that automatically indexes and searches relevant photo images using a combination of geo-referenced attributes and low-level visual features. Each photo image is labeled with its GPS (Global Positioning System) coordinates at the moment of capture, and a flat layer index is generated from the pair of latitude and longitude. After upload to the media server, these coordinates are used to build a hierarchical layer index of spatial information. Low-level visual features such as the color histogram and edge histogram are then extracted and combined with the geo-spatial information for indexing and searching photo images. For the user's querying process, the proposed method adopts a progressive, two-step approach: filtering and/or selecting the relevant subset prior to content-based retrieval. To evaluate the performance of the proposed descriptor, we assess simulation performance in terms of average precision and F-score on digital photo collections. Compared with search using visual content alone, an improvement of around 20% was observed in experimental trials. These results reveal that combining context and content analysis is markedly more effective and meaningful than using visual content alone for this task.

Keywords- Photo Image Retrieval; Mobile Image Search; Geo-spatial Indexing; Content-based Visual Descriptor

I. INTRODUCTION

The number of mobile devices used in our daily life continues to grow. Most are equipped with powerful cameras, and current mobile devices, including digital cameras, provide high-quality images at low cost. This has led to growing research interest in image retrieval. An image retrieval system is a system that enables a user to browse, search, and retrieve images from a large database of digital images [1]. There are four major tasks in realizing an image retrieval system [2]: organizing the image data collection, building up the feature database, querying the database, and arranging the query results in some order.
(1) Image data is organized by collecting it automatically from mobile devices via transmission, and by an Internet spider program that automatically collects web images through their URLs.
(2) The image feature database is built by analyzing the collected images and extracting their feature information. Currently, the primitive features widely used in the area of content-based image retrieval (CBIR) are low-level features such as color and texture.
(3) Querying is done by searching with a user's query image. The retrieval system accepts the whole or a part of an image as input and, when the user submits a query image, calculates the similarity distance between the query image and the images in the collection. Several relevant images with the minimum similarity distance are then returned to the user.
(4) After searching is done, the relevant results are arranged and presented on the user's interface according to some rule. If the user is not satisfied with the results, he/she can re-query and search the database again.
There are two approaches to image retrieval: text-based and content-based [3]. The popular text-based approach requires images to be annotated with one or more keywords that can then be easily searched. However, this involves a vast amount of labor and tends to be colored by personal subjectivity; the resulting lack of clarity often leads to mismatches in the retrieval process. The alternative approach, content-based image retrieval (CBIR), indexes images in a database by identifying similarities between them based on visual features such as color, texture, and shape [3-5]. Typically, a CBIR system requires the construction of an image descriptor, which is characterized by two functions: an extraction process that encodes an image into a feature vector, and a similarity measure that compares two images. The image descriptor D is formulated as a 2-tuple (FD, SD), where FD is a function that extracts a feature vector f from an image I, and SD is a distance measure that computes the similarity between the feature vectors of two images in the database [5].
Figure 1 shows the approaches traditionally used to search digital images. The first type of image search and retrieval system, shown in Figure 1(a), requires each image to be associated with one or more keywords entered by a human operator, while the second, shown in Figure 1(b), uses an image as a query and attempts to retrieve other images that are similar. The latter represents the current state of the art in CBIR systems from the JPEG standardization JPSearch [6].
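The descriptor formulation D = (FD, SD) above can be sketched in code. This is an illustrative sketch, not the paper's implementation: the 4-bin grayscale histogram and the image names are hypothetical stand-ins for the paper's color and edge histogram features.

```python
# Illustrative sketch of the image descriptor D = (FD, SD): FD extracts
# a feature vector from an image, SD measures the distance between two
# feature vectors, and query-by-example ranks the collection by that
# distance. The 4-bin grayscale histogram is a hypothetical stand-in
# for the paper's color/edge histograms.

def fd_extract(pixels, bins=4, levels=256):
    """FD: encode an image (a list of gray levels) as a normalized histogram."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // levels] += 1
    return [h / len(pixels) for h in hist]

def sd_distance(f1, f2):
    """SD: L1 distance between feature vectors (smaller = more similar)."""
    return sum(abs(a - b) for a, b in zip(f1, f2))

def rank(query_pixels, collection):
    """Query-by-example: rank database images by similarity to the query."""
    fq = fd_extract(query_pixels)
    scored = sorted((sd_distance(fq, fd_extract(px)), name) for name, px in collection)
    return [name for _, name in scored]

db = [("bright.jpg", [200] * 64), ("dark.jpg", [10] * 64), ("mid.jpg", [120] * 64)]
print(rank([210] * 64, db))  # 'bright.jpg' is ranked first
```

Any feature (color histogram, edge histogram, or a combination) can be plugged in as FD, and any vector distance as SD, without changing the ranking step.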
† Corresponding author.
Authorized licensed use limited to: REVA UNIVERSITY. Downloaded on June 28,2023 at 07:53:17 UTC from IEEE Xplore. Restrictions apply.
[Figure caption] The photos are taken at certain times (from datetime1 to datetimen) and displayed ordered as a time series, and Ci indicates their classification in terms of geo-location: a set of consecutive photos taken in close proximity to each other, corresponding to the same location cluster. Arrows in the figure represent a user's movements. Note that non-sequential photos may be included in the same cluster; for example, p1 and p10 are both included in the same cluster.
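The grouping of time-ordered photos into location clusters Ci described in the caption can be illustrated in code. The paper's actual clustering criterion is not shown in this excerpt, so the fixed distance threshold and centroid test below are assumptions made purely for illustration.

```python
# Minimal sketch of grouping time-ordered photos into location clusters.
# ASSUMPTION: a photo joins the nearest existing cluster whose centroid
# lies within a fixed coordinate threshold; this stands in for the
# paper's (unshown) clustering rule. Coordinates are (lat, long) pairs.

def assign_clusters(photos, threshold_deg=0.01):
    """photos: time-ordered list of (lat, long). Returns a cluster id per photo.
    Non-sequential photos (e.g. p1 and p10) can share a cluster if the
    user returns to a previously visited location."""
    centroids = []   # one (lat, long) per cluster, fixed at first sighting
    labels = []
    for lat, lon in photos:
        best, best_d = None, threshold_deg
        for i, (clat, clon) in enumerate(centroids):
            d = max(abs(lat - clat), abs(lon - clon))  # cheap proximity test
            if d <= best_d:
                best, best_d = i, d
        if best is None:
            centroids.append((lat, lon))
            labels.append(len(centroids) - 1)
        else:
            labels.append(best)
    return labels

track = [(37.500, 127.000), (37.5001, 127.0001),  # p1, p2: same spot
         (37.600, 127.100),                        # p3: user moved away
         (37.5002, 127.0002)]                      # p4: back at the first spot
print(assign_clusters(track))  # → [0, 0, 1, 0]
```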
C. Perform Query-by-Example and Measure of Similarity

For the user's querying process, the proposed method adopts two different steps as a progressive approach to filter the relevant subset prior to content-based retrieval. In the first approach, for the flat indexing, all images outside a certain distance from the user's query image are removed from the collection, and the remaining images are ranked in terms of the similarity of their visual contents. The relevant images are first filtered by equation (1).

filtering with flat index ∈ { ∀I : Dqt ≤ ε }    (1)

where I is the subset of images filtered, t is the user-defined criterion, and ε is the user-defined boundary range for location. Dqt is the distance between the locations at which the query image and the target image were generated, computed with equation (2).

Dqt = C × acos[ sin(latq) × sin(latt) + cos(latq) × cos(latt) × cos(longq − longt) ]    (2)

where lat and long are the decimal degrees of latitude and longitude, q and t denote the query and the target image, and C is a constant used to convert the angle from radians to degrees.

In the other approach, for the hierarchical indexing, all images in the same node as the query image are selected and included in the ranking retrieval process. The selected subset is defined by equation (3).

selecting with hierarchy ∈ { ∀I : Σj=1..m [Sj] }    (3)

… collections to enable important clues pertaining to geo-coded image content to be utilized. The image database consisted of 2,800 images belonging to 10 classes, rescaled to 640×480 JPEG format. In this study, the classes were segmented to generate the ground truth for evaluation, and were used only to calculate the effectiveness of the new approach. In the experiment, retrieved images were considered relevant if they belonged to the same class as the query image.

Because GPS requires a line-of-sight connection with the satellites, the signal can be lost inside buildings or in heavily built-up areas. While it is possible to integrate the cell ID from the mobile phone or advanced indoor GPS technology [19], this issue was not addressed in this study. Instead, consecutive photos with no location were assumed to belong to the same GPS track:

(pl, pn ∈ Ci ∧ (time(pl) ≤ time(pm) ≤ time(pn))) → pm ∈ Ci    (5)

The most common evaluation measures used in IR (information retrieval) are precision and recall [22]. Precision denotes the fraction of the images retrieved that are actually relevant to the query, and recall indicates the fraction of all possible relevant images that are retrieved. They are calculated as precision = a/(a+c) and recall = a/(a+b), where a is the number of relevant images retrieved, b is the number of relevant images that were not retrieved, and c is the number of irrelevant images retrieved.

Since neither score alone is always the most appropriate measure for evaluating retrieval [23], they are often combined into a single measure of performance, known as the F-score, which is formulated as the harmonic mean of precision and recall: F = 2 × precision × recall / (precision + recall).
… collection. The results for this dataset also confirm that the retrieval performance using the hierarchical layer, which selects photographs at the same level of the hierarchy, is as effective as filtering based on distance from the query image. These results revealed that H-CH and H-EHD both exhibited a slight improvement in retrieval effectiveness.

The extraction time for an image and the retrieval time for relevant images over the entire database (2,800 images) were also computed. Even though the proposed methods required a little more computing time and storage than the other methods, they were noticeably more accurate when applied to an image search, within a reasonable time and system resource allocation.
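The F-score comparison reported here can be reproduced with the standard definitions, precision = a/(a+c) and recall = a/(a+b), combined as a harmonic mean. This is a minimal sketch; the image identifiers are hypothetical.

```python
# Sketch of the evaluation measures: a = relevant images retrieved,
# b = relevant images not retrieved, c = irrelevant images retrieved.
# Precision = a/(a+c), recall = a/(a+b), F-score = harmonic mean.

def evaluate(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    a = len(retrieved & relevant)   # relevant and retrieved
    b = len(relevant - retrieved)   # relevant but missed
    c = len(retrieved - relevant)   # retrieved but irrelevant
    precision = a / (a + c) if a + c else 0.0
    recall = a / (a + b) if a + b else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

p, r, f = evaluate(retrieved=["i1", "i2", "i3", "i4"], relevant=["i1", "i2", "i5"])
print(p, r, f)  # precision 0.5, recall 2/3, F-score 4/7
```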
TABLE I
COMPARISON OF RETRIEVAL EFFECTIVENESS USING F-SCORE OVER ALL QUERIES, AND SUMMARY OF COMPUTATIONAL CHARACTERISTICS OF THE METHODS. THE PROPOSED F-CH AND F-EHD INDICATE THE ADDITION OF LOCATION-BASED FILTERING PRIOR TO RANKING RETRIEVAL FOR CH-HSV AND EHD-G, RESPECTIVELY. ALSO, THE PROPOSED H-CH AND H-EHD DENOTE PRIOR SELECTION BASED ON THE HIERARCHICAL INDEX FOR CH-HSV AND EHD-G, RESPECTIVELY.
Figure 5. Example of the top-10 ranking lists for each retrieval method. The query image is at the top-left. (a) Query with F-CH, (b) Query with F-EHD, and (c) Query with Location-only.
REFERENCES
[1] Website: http://en.wikipedia.org
[2] Nidhi Singhai and Shishir K. Shandilya, "A Survey on: Content Based Image Retrieval Systems", International Journal of Computer Applications, Vol.4, No.2, July 2010.
[3] Arnold W.M. Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta, and Ramesh Jain, "Content-based Image Retrieval at the End of the Early Years", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.22, No.12, 2000.
[4] B.S. Manjunath, Jens-Rainer Ohm, Vinod V. Vasudevan, and Akio Yamada, "Color and Texture Descriptors", IEEE Transactions on Circuits and Systems for Video Technology, Vol.11, No.6, pp.703-715, June 2001.
[5] Ricardo da Silva Torres and Alexandre Xavier Falcão, "Content-based Image Retrieval: Theory and Applications", Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI), Oct. 2006.
[6] Mun-Kew Leong and Wo Chang, "Framework and System Components", ISO/IEC JTC1/SC29/WG1 N3684, July 2005.
[7] A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-Based Image Retrieval at the End of the Early Years", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.22, No.12, 2000.
[8] Hui Wang, Dzulkifli Mohamad, and N.A. Ismail, "Semantic Gap in CBIR: Automatic Objects Spatial Relationships Semantic Extraction and Representation", International Journal of Image Processing, Vol.4, No.3, 2010.
[9] Lakdashti, Shahram Moin, and Badie, "A Novel Semantic-based Image Retrieval Method", International Conference on Advanced Communication Technology, pp.969-974, 2008.
[10] V. Vijaya Kumar, N. Gnaneswara Rao, and A.L. Narsimha Rao, "RTL: Reduced Texture Spectrum with Lag Value Based Image Retrieval for Medical Images", International Journal of Future Generation Communication and Networking, Vol.2, No.4, 2009.
[11] Ramesh Jain, "Photo Retrieval: Multimedia's Chance to Solve a Real Problem for Real People", IEEE Multimedia, Vol.14, Issue 3, 2007.
[12] Yong-Hwan Lee, B.N. Kim, and H.J. Kim, "Photograph Indexing and Retrieval using Combined Geo-information and Visual Features", International Conference on Complex, Intelligent and Software Intensive Systems, pp.790-793, 2010.
[13] Kentaro Toyama, Ron Logan, Asta Roseway, and P. Anandan, "Geographic Location Tags on Digital Images", ACM International Conference on Multimedia, pp.156-166, 2003.
[14] Yu Zheng, Yukun Chen, Xing Xie, and Wei-Ying Ma, "GeoLife2.0: A Location-based Social Networking Service", International Conference on Mobile Data Management: Systems, Services and Middleware, pp.357-358, 2009.
[15] Yih-Farn Chen, Giuseppe Di Fabbrizio, David Gibbon, Rittwik Jana, Serban Jora, Bernard Renger, and Bin Wei, "GeoTracker: Geospatial and Temporal RSS Navigation", International Conference on World Wide Web, pp.41-50, 2007.
[16] Arun Qamra and Edward Y. Chang, "Scalable Landmark Recognition using EXTENT", Multimedia Tools and Applications, Vol.38, Issue 2, pp.187-208, 2008.
[17] Lyndon Kennedy, Mor Naaman, Shane Ahern, Rahul Nair, and Tye Rattenbury, "How Flickr Helps us Make Sense of the World: Context and Content in Community-Contributed Media Collections", International Conference on Multimedia, pp.631-640, 2007.
[18] Konrad Tollmar, Tom Yeh, and Trevor Darrell, "IDeixis - Searching the Web with Mobile Images for Location-based Information", Mobile Human-Computer Interaction, LNCS 3160, pp.288-299, 2004.
[19] Muhammad Fahad Khan, Saira Beg, and Fakhra Kashif, "Image Transference and Retrieval over SMS", International Journal of Computer Science Issues, Vol.8, Issue 5, No.2, 2011.
[20] Akio Yamada, Robert O'Callaghan, and S.K. Kim, "MPEG-7 Visual Part of Experimentation Model Version 27.0", ISO/IEC JTC1/SC29/WG11 N7808, 2006.
[21] Rathiah Hashim, Mohammad Sibghotulloh Ikhmatiar, Miswan Surip, Masiri Karmin, and Tutut Herawan, "Mosque Tracking on Mobile GPS and Prayer Times Synchronization for Unfamiliar Area", International Journal of Future Generation Communication and Networking, Vol.4, No.2, 2011.
[22] Vittorio Castelli and Lawrence D. Bergman, Image Databases: Search and Retrieval of Digital Imagery, Wiley Inter-Science, 2002.
[23] Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval, ACM Press, 1999.