Currency Recognition On Mobile Phones Proposed System Modules
Segmentation
Feature Extraction
Instance Retrieval
1. Building a Visual Vocabulary
2. Image Indexing Using Text Retrieval Methods
3. Retrieval Stage
4. Spatial re-ranking
5. Classification
Adaptation to Mobile
Performance analysis
Module description
A. Segmentation
The images may be captured in a wide variety of environments, in terms of lighting conditions and background, and the bill in the image itself may be deformed. Image segmentation is important not only for reducing the amount of data to process but also for removing irrelevant features (the background region) that would affect the decision-making. This work starts with a fixed rectangular region of interest (ROI) that is forty pixels smaller than the image on all four sides, assuming that a major part of the bill lies inside this region. Everything outside this ROI is treated as probable background.
Once this region is obtained, it must be extended to a segmentation of the entire object. Let x be an image and let y be a partition of the image into foreground (object) and background components. Let x_i ∈ R^3 be the color of the i-th pixel, and let y_i be equal to +1 if the pixel belongs to the object and to -1 otherwise. For segmentation this work uses a graph-cut based energy minimization formulation. The cost function is given by

E(x, y) = Σ_i -log p(y_i | x_i) + Σ_(i,j)∈E S(y_i, y_j | x)
The edge system E determines the pixel neighborhoods and is the popular eight-way connectivity. The pairwise potential S(y_i, y_j | x) favors neighboring pixels with similar colors taking the same label. The segmentation is then defined as the minimizer arg min_y E(x, y). We use the GrabCut algorithm, which is based on iterative graph cuts, to carry out foreground/background segmentation of the images captured by the user. The system should be able to segment the foreground object correctly and quickly without any user interaction. Whenever the segmented foreground area is smaller than a pre-decided threshold, a fixed central region of the image is marked as foreground instead.
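A minimal sketch of this step using OpenCV's GrabCut is shown below; the 40-pixel ROI margin and the central-region fallback follow the description above, while the function name, iteration count, and foreground-area threshold are illustrative assumptions rather than the exact implementation.

import cv2
import numpy as np

def segment_bill(image, margin=40, min_fg_fraction=0.05):
    """Segment the bill from the background with GrabCut (iterative graph cuts)."""
    h, w = image.shape[:2]
    # Rectangle 40 pixels inside the borders: the probable-foreground ROI.
    rect = (margin, margin, w - 2 * margin, h - 2 * margin)

    mask = np.zeros((h, w), np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)

    # Iterative graph-cut segmentation initialised from the rectangle.
    cv2.grabCut(image, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

    # Pixels labelled (probable) foreground form the object mask.
    fg_mask = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)

    # Fallback: if the foreground is too small, mark a fixed central region.
    if fg_mask.sum() < min_fg_fraction * h * w:
        fg_mask[:] = 0
        fg_mask[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = 1
    return fg_mask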
B. Instance Retrieval
5.3.1. Building a Visual Vocabulary
This work first locates keypoints in the foreground region of the image (obtained from segmentation) and describes the keypoint regions using a descriptor extractor such as SIFT, SURF, or ORB-FREAK. It then obtains a set of clusters of features using a hierarchical k-means algorithm. The distance function between two descriptors x1 and x2 is given by

d(x1, x2) = (x1 - x2)^T Σ^(-1) (x1 - x2)

where Σ is the covariance matrix of the descriptors. As is standard, the descriptor space is affine transformed by the square root of Σ so that Euclidean distance may be used. The set of clusters forms the visual vocabulary of the images.
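A hedged sketch of the vocabulary-building step follows. It uses SIFT descriptors and OpenCV's flat k-means for brevity, whereas the text describes hierarchical k-means; the vocabulary size, the whitening via the covariance matrix, and all names are illustrative assumptions.

import cv2
import numpy as np

def build_vocabulary(images, masks, vocab_size=1000):
    """Cluster whitened foreground descriptors into a visual vocabulary."""
    sift = cv2.SIFT_create()
    all_desc = []
    for img, mask in zip(images, masks):
        # Detect and describe keypoints only inside the segmented foreground.
        _, desc = sift.detectAndCompute(img, mask)
        if desc is not None:
            all_desc.append(desc)
    all_desc = np.vstack(all_desc).astype(np.float64)

    # Whiten the descriptor space so Euclidean distance approximates the
    # Mahalanobis distance d(x1, x2) = (x1 - x2)^T Sigma^-1 (x1 - x2).
    mean = all_desc.mean(axis=0)
    cov = np.cov(all_desc, rowvar=False) + 1e-6 * np.eye(all_desc.shape[1])
    whiten = np.linalg.cholesky(np.linalg.inv(cov)).T
    all_desc = ((all_desc - mean) @ whiten.T).astype(np.float32)

    # Flat k-means stands in here for the hierarchical k-means described above.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1e-3)
    _, _, centers = cv2.kmeans(all_desc, vocab_size, None, criteria, 3,
                               cv2.KMEANS_PP_CENTERS)
    return centers, mean, whiten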
5.3.2. Image Indexing Using Text Retrieval Methods
For every training image, after matching each descriptor to its nearest cluster, we get a vector
of frequencies (histogram) of visual words in the image. Instead of directly using visual word
frequencies for indexing, we employ a standard term frequency - inverse document
frequency (tf-idf ) weighting. Suppose there is a vocabulary of k words, then each image is
represented by a k-vector
V_d = (t_1, ..., t_i, ..., t_k)

of components

t_i = (n_id / n_d) log(N / n_i)

Here n_id is the number of occurrences of word i in document d, n_d is the total number of words in document d, n_i is the total number of occurrences of word i in the whole database, and N is the total number of documents in the whole database. The weighting is a product of two terms: the word frequency n_id / n_d, and the inverse document frequency log(N / n_i).
However, retrieval on this representation is slow and requires a lot of memory, which makes it impractical for applications on mobile phones. Therefore, we use an inverted index for instance retrieval. The inverted index contains a posting list for each term, where each posting holds the occurrence information (e.g., frequencies and positions) for a document that contains the term. To rank the documents in response to a query, the posting lists for the terms of the query must be traversed, which can be costly, especially for long posting lists.
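The tf-idf weighting and the inverted index can be sketched as below; this is a minimal in-memory version with illustrative names, not the compressed index used on the phone.

import numpy as np
from collections import defaultdict

def build_index(word_histograms, vocab_size):
    """word_histograms: one dict {word_id: count} per training image."""
    N = len(word_histograms)
    # n_i: total occurrences of each visual word over the whole database.
    n_i = np.zeros(vocab_size)
    for hist in word_histograms:
        for w, c in hist.items():
            n_i[w] += c
    idf = np.log(N / np.maximum(n_i, 1))

    inverted_index = defaultdict(list)  # word_id -> [(doc_id, weight), ...]
    for doc_id, hist in enumerate(word_histograms):
        n_d = float(sum(hist.values()))
        # t_i = (n_id / n_d) * log(N / n_i), then L2-normalise the vector.
        weights = {w: (n_id / n_d) * idf[w] for w, n_id in hist.items()}
        norm = np.sqrt(sum(v * v for v in weights.values())) + 1e-12
        for w, v in weights.items():
            inverted_index[w].append((doc_id, v / norm))
    return inverted_index, idf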
5.3.3. Retrieval Stage
At the retrieval stage, this work obtains a histogram of visual words (query vector) for the
test image. Image retrieval is performed by computing the normalized scalar product (cosine
of the angle) between the query vector and all tf-idf weighted histograms in the database.
They are then ranked according to decreasing scalar product. This work selects the first 10
images for further processing.
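Scoring over the inverted index then reduces to accumulating partial dot products only for the visual words present in the query; the sketch below assumes the normalised posting weights produced above.

import numpy as np

def retrieve(query_hist, inverted_index, idf, n_docs, top_k=10):
    """Rank database images by cosine similarity to the query histogram."""
    n_q = float(sum(query_hist.values()))
    q = {w: (c / n_q) * idf[w] for w, c in query_hist.items()}
    q_norm = np.sqrt(sum(v * v for v in q.values())) + 1e-12

    scores = np.zeros(n_docs)
    # Only posting lists of words present in the query are traversed.
    for w, qw in q.items():
        for doc_id, dw in inverted_index.get(w, []):
            scores[doc_id] += (qw / q_norm) * dw
    # Database weights are already L2-normalised, so the scores are cosines.
    return np.argsort(-scores)[:top_k]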
5.3.4. Spatial re-ranking
The Bag of Words (BoW) model fails to incorporate spatial information into the ranking of retrieved images. In order to confirm image similarity, this work checks whether the keypoints in the test image are spatially consistent with those of the retrieved images. It uses the popular method of geometric verification (GV), fitting a fundamental matrix to find the number of keypoints of the test image that are spatially consistent with those of each retrieved image.
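Geometric verification can be sketched with OpenCV's RANSAC fundamental-matrix fit, where the inlier count serves as the spatial-consistency score. The ratio-test threshold and RANSAC parameters here are illustrative assumptions.

import cv2
import numpy as np

def spatial_consistency(desc_q, kps_q, desc_r, kps_r, ratio=0.75):
    """Count query keypoints spatially consistent with a retrieved image."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(desc_q, desc_r, k=2)
    good = []
    for pair in matches:
        # Lowe's ratio test to keep only distinctive matches.
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    if len(good) < 8:  # at least 8 correspondences are needed to fit F
        return 0
    pts_q = np.float32([kps_q[m.queryIdx].pt for m in good])
    pts_r = np.float32([kps_r[m.trainIdx].pt for m in good])
    # Fit a fundamental matrix with RANSAC; inliers are the consistent keypoints.
    _, inlier_mask = cv2.findFundamentalMat(pts_q, pts_r, cv2.FM_RANSAC, 3.0, 0.99)
    return 0 if inlier_mask is None else int(inlier_mask.sum())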
5.3.5. Classification
In the voting mechanism, each retrieved image adds votes to its image class (type of bill) equal to the number of spatially consistent keypoints it has (computed in the previous step). The class with the highest vote count is declared as the result.
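The voting step is then straightforward; the sketch below assumes the denomination label of every database image is known from training, and all names are illustrative.

from collections import Counter

def classify(retrieved_ids, consistency_scores, labels):
    """Vote for the bill class using spatially consistent keypoint counts."""
    votes = Counter()
    for doc_id, score in zip(retrieved_ids, consistency_scores):
        # Each retrieved image votes for its class with its inlier count.
        votes[labels[doc_id]] += score
    return votes.most_common(1)[0][0] if votes else None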
C. Adaptation to Mobile
The recognition model needed for retrieval cannot be used directly on a mobile phone because of its memory requirements. The system adapts the above solution to a mobile environment by reducing the complexity of each stage as much as possible without sacrificing accuracy. This allows us to achieve the best possible performance given the severe restrictions on memory and processing power that we have to contend with.
D. Performance analysis
In this step, we evaluate performance metrics such as accuracy and precision for the proposed system.
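Accuracy and per-class precision can be computed from the predicted and true labels as in the minimal sketch below; it is not tied to the report's actual evaluation script.

def evaluate(y_true, y_pred):
    """Return overall accuracy and per-class precision."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = {}
    for c in set(y_true) | set(y_pred):
        # Precision for class c: correct predictions of c over all predictions of c.
        predicted_c = [t for t, p in zip(y_true, y_pred) if p == c]
        precision[c] = (sum(t == c for t in predicted_c) / len(predicted_c)
                        if predicted_c else 0.0)
    return accuracy, precision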
CHAPTER 2
INTRODUCTION
2.1 Computer Imaging
It can be defined as the acquisition and processing of visual information by computer. The computer representation of an image requires the equivalent of many thousands of words of data, and this massive amount of data is a primary reason for the development of many sub-areas within the field of computer imaging, such as image compression and segmentation. Another important aspect of computer imaging involves the ultimate receiver of the visual information: in some cases the human visual system, and in others the computer itself.
Computer imaging can be separated into two primary categories:
1. Computer Vision
2. Image Processing
Image compression involves reducing the typically massive amount of data needed to represent an image. This is done by eliminating data that are visually unnecessary and by taking advantage of the redundancy that is inherent in most images. Image processing systems are used in many and various types of environments, such as:
1. Medical community
2. Computer Aided Design
3. Virtual Reality
4. Image Processing
turned into a digital image by sampling the continuous signal at a fixed rate. The value of the voltage at each instant is converted into a number that is stored, corresponding to the brightness of the image at that point. Note that the brightness of the image at that point depends on both the intrinsic properties of the object and the lighting conditions in the scene.
2.6. Image Representation
We have seen that the human visual system (HVS) receives an input image as a collection of spatially distributed light energy; this form is called an optical image. Optical images are the type we deal with every day: cameras capture them, monitors display them, and we see them. We know that these optical images are represented as video information in the form of analog electrical signals and have seen how these are sampled to generate the digital image I(r, c).
The digital image I(r, c) is represented as a two-dimensional array of data, where each pixel value corresponds to the brightness of the image at the point (r, c). In linear algebra terms, a two-dimensional array like our image model I(r, c) is referred to as a matrix, and one row (or column) is called a vector.
The image types we will consider are:
1. Binary Image
Binary images are the simplest type of images and can take on only two values, typically black and white, or 0 and 1. These types of images are used most frequently in computer vision applications where the only information required for the task is general shape or outline information, for example to position a robotic gripper to grasp an object, or in optical character recognition (OCR). Binary images are often created from gray-scale images via a threshold operation: every pixel above the threshold value is turned white (1), and those below it are turned black (0).
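For example, a gray-scale image can be turned into a binary image with a single threshold, as in this small NumPy sketch (the threshold value 128 is arbitrary):

import numpy as np

def to_binary(gray, threshold=128):
    """Pixels at or above the threshold become white (1); the rest become black (0)."""
    return (gray >= threshold).astype(np.uint8)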
CHAPTER 3
LITERATURE SURVEY
1) Monitoring of the Rice Cropping System in the Mekong Delta Using
ENVISAT/ASAR Dual Polarization Data-Alexandre Bouvet, Thuy Le Toan, and Nguyen
Lam-Dao, 2009.
Introduction
In recent years, changes in cultural practices have been observed in different regions of the
world. The rice growth region in the Mekong Delta in Vietnam is a good example of changes
from the traditional to modern rice cultivation system in the last ten years. A multiple
cropping system is implemented, increasing the number of crops per year from one or two to
two, three, or even more. Dike infrastructures have been built and intensified after 2000 to
block the flood way into the fields during the flood season so as to allow an additional crop
cycle. Short-cycle rice varieties (80-100 days) are planted in order to harvest three crops per year instead of one or two. Finally, modern water management has been partly introduced in the last three years, consisting of intermittent drainage between two irrigation operations. Because of these changes in cultural practices, the intensity temporal-change method for rice mapping and monitoring needs to be upgraded. In this work, a method using polarization information
is developed and assessed for this purpose. Because of the vertical structure of rice plants, the
difference between HH and VV backscattering is expected to be higher than that of other crop
or land cover types, and through the relation with wave attenuation in the canopy, the ratio of
the HH and VV backscattering coefficients (hereafter called HH/VV) can be related to the
vegetation biomass. A joint analysis of ERS and RADARSAT-1 data , and the modeling of Cband HH and VV revealed that HH is significantly higher than VV, and the difference can
reach 67 dB at the peak growth stage. Based on these findings, HH/VV is potentially a good
classifier for rice monitoring, and methods using HH/VV need to be developed and assessed.
Specifically, in this work, the method is developed using a time series of dual polarization
(HH and VV) ASAR data and tested in the province of An Giang in the Mekong Delta.
Advantages
This promising result shows that methods using SAR data can be timely and cost
effective.
The method is well-suited to regions where fields have multiple crops and shifted
calendars.
Disadvantages
The method needs to be improved by using HH/VV together with the temporal change of HH and/or VV in a multi-date approach.
The proposed method is able to provide estimates of rice fields based on dual-pol SAR imagery.
It achieves results with higher resolution ground truth data in order to validate this
methodology.
Disadvantages
Need to study the generation of models for other kinds of crops that behave in a similar way, and try to apply an analogous approach to phenological estimation.
Disadvantages
The main drawback of using dual-pol TerraSAR-X images for this application is their
narrow swath (around 15 km on the ground), which is too small for devising a
monitoring scheme on large-scale rice plantations.
The noise level of the system (NESZ around -19 dB) may be very close to, or even higher than, the backscattering from rice fields, especially at the early stages of the cultivation cycle.
Disadvantages
The 3D instantaneous rate vectors and corresponding variances are usually difficult to identify exactly without any a priori information.
The proposed approach can be potentially used to include other measurements, such
as GPS and leveling, in the solutions.
Disadvantages
It becomes unstable when the measurement noise is high due to the polar-orbiting
imaging geometries of the current satellite SAR sensors.
The WOFOST model can simulate the growth curve and yield of corn, especially with
respect to crop carbon absorption in agri-ecological systems
This study aimed to assess the feasibility of assimilating areal observation data into a
crop growth model to improve spatial estimates of crop yields and carbon pools.
Disadvantages
GFS-patterns is used to extract sets of pixels sharing similar evolution from Satellite Image Time Series over cultivated areas.
Even on poor-quality inputs (i.e., noisy images, rough quantization), the method can exhibit various levels of detail of primary interest in agro-modelling.
Disadvantages
The contribution due to the stratified atmosphere can be roughly estimated by using
DEMs and meteorological data, but the effects of the turbulent atmosphere still
degrade interferograms.
The proposed method is validated to estimate cereal yield levels using solely optical and SAR satellite data.
The averaged composite SAR modeled grain yield level was 3,750 kg/ha (RMSE =
10.3%, 387 kg/ha) for high latitude spring cereals.
Disadvantages
The early emergence in the vegetative phase (ap, BBCH 0-12) at the two-leaf stage before double ridge induction, and the senescence phase after full maturity and harvest (dp, BBCH > 90), were difficult to estimate.
Disadvantages
The proposed method enables processing of very large datasets, either in spatial extent
or through time
Disadvantages
These step-wise land-use transitions result in deviations from the assumption of sequential phases in the land conversion process and limit the applicability of most change detection algorithms.
CHAPTER 4
CONCLUSION AND FUTURE WORK
CONCLUSION
Visual object recognition is a recent trend that enables systems to recognize objects visually. Currency recognition on mobile phones is an effective methodology that is especially useful for visually impaired persons. In this work, we have ported the system to a mobile environment, working around constraints such as limited processing power and memory, while achieving high accuracy and low reporting time. Currency retrieval and subsequent recognition is an example of fine-grained retrieval of instances that are highly similar. Our experimental results show that the proposed approach is more robust to illumination changes than the SIFT descriptor.
FUTURE WORK
The system implemented in this work targets Indian rupee notes; further research can extend it to support currency notes from around the world.
REFERENCES
1. Jayaguru Panda, Michael S. Brown, C. V. Jawahar, "Offline Mobile Instance Retrieval with a Small Memory Footprint", Computer Vision (ICCV), 2013 IEEE International Conference on, pp. 1257-1264, 1-8 Dec. 2013.
2. Ms. Trupti Pathrabe and Dr. N. G. Bawane, "Paper Currency Recognition System using Characteristics", International Journal of Latest Trends in Computing, Vol. 1, Issue 2, December 2010.
3. H. Hassanpour and E. Hallajian, "Using Hidden Markov Models for Feature Extraction in Paper Currency Recognition", Advances in Soft Computing and Its Applications, Volume 8266, 2013, pp. 403-412.
4. Bhawani Sharma, Amandeep Kaur, Vipan, "Recognition of Indian Paper Currency Based on LBP", International Journal of Computer Applications, Vol. 59, Dec. 2012.
5. Faiz M. Hasanuzzaman, Xiaodong Yang, and YingLi Tian,
10. Petri Tanskanen, Kalin Kolev, Lorenz Meier, Federico Camposeco, Olivier Saurer, Marc Pollefeys, "Live Metric 3D Reconstruction on Mobile Phones", Computer Vision (ICCV), 2013 IEEE International Conference on, 1-8 Dec. 2013.