
Object recognition in infrared image sequences using scale invariant feature transform

Changhan Park (a), Kyung-hoon Bae (a), and Jik-Han Jung (b)

(a) Advanced Technology R&D Center, Samsung Thales Co., Ltd., San 14-1, Nongseo-dong, Giheung-Gu, Yongin, Gyeonggi 446-712, Korea
(b) School of EECS, KAIST, 373-1, Guseong-Dong, Yuseong-Gu, Daejeon, Korea
ABSTRACT
In this paper, we propose an automated target recognition method that uses the scale-invariant feature transform (SIFT) in a PowerPC-based infrared (IR) imaging system. An IR image yields more feature values at night than in the daytime, whereas a visual image yields more feature values in the daytime. IR-based object recognition is therefore well suited to digital surveillance systems, which must operate at night when visual features are scarce; in the daytime, however, an IR image produces relatively few features, far fewer effective features than a visual image. The proposed method consists of two stages. First, interest points of moving objects are localized in position and scale. Second, a description of each interest point is built and used to recognize the moving objects. The proposed method uses SIFT for effective feature extraction in the PowerPC-based IR imaging system, and comprises scale-space construction, extrema detection, orientation assignment, keypoint description, and feature matching. Because field tests show that objects in IR images appear in a more spread-out form than in visual images, and the number of SIFT features in an IR image is smaller than in a visual image, the SIFT descriptor is set up over a region about 1.5 times wider than for a visual image to allow precise object matching. Experimental results show that the proposed method extracts object feature values in the PowerPC-based IR imaging system.
Keywords: Automated Target Recognition, Infrared (IR) Image, Visual Image, Scale Invariant Feature Transform (SIFT), SIFT Descriptor, Gaussian Noise

1. INTRODUCTION
Recently, visual simultaneous localization and map building (v-SLAM) has come to require multi-modal sensors, such as ultrasound sensors, range sensors, infrared (IR) sensors, encoders (odometers), and multiple visual sensors. Recognition-based localization is considered the most promising approach to image-based SLAM [1]. Building a multi-sensor imaging system is a challenging task with many applications, such as security systems, defense applications, and intelligent machines. Object recognition techniques have been actively investigated and find wide application in various fields. Combining infrared and visual images is a potential way to improve target detection, tracking, recognition, and fusion performance. Tracking and recognition based on visual images is sensitive to variations in illumination conditions [2]. Tracking and recognition of targets using different imaging modalities, in particular infrared (IR) images, has become an area of growing interest. Thermal IR imagery is nearly invariant to changes in ambient illumination and provides a capability for identification under all lighting conditions, including total darkness. Infrared sensors are routinely used in remote sensing applications. Coupling an infrared sensor with a visible-band sensor, for a frame of reference or for additional spectral information, and properly processing the two information streams has the potential to provide valuable information at night and/or in poor-visibility conditions [3].

*Correspondence: [email protected]; phone +82-31-280-1638; fax +82-31-280-1591; http://www.samsungthales.com


Signal Processing, Sensor Fusion, and Target Recognition XVII, edited by Ivan Kadar
Proc. of SPIE Vol. 6968, 69681P, (2008) 0277-786X/08/$18 doi: 10.1117/12.777976


Object detection and recognition techniques have proven more popular than other biometric-feature-based approaches because of their efficiency and convenience. They can also use a low-cost personal computer (PC) camera instead of expensive equipment, and they require minimal user interaction. Object authentication has become a promising research field related to object recognition [4]. Object recognition using infrared imaging sensors has become an area of growing interest in recent years. Thermal infrared techniques have been used in object recognition systems, where they offer advantages in object detection, detection of disguised objects, and object recognition under poor lighting conditions. However, thermal infrared sensing is not always desirable because of the higher cost of thermal sensors and their instability across temperatures, whereas IR has attracted more and more attention due to its preferable attributes and low cost, which is why it is also adopted in this paper.

Object recognition systems can be classified into three categories. The first category includes geometric feature-based methods, in which feature vectors are used for representation. Feature vectors can decide the identity of an object as well as the validity of an object region. The most popular solution for robust recognition is the scale-invariant feature transform (SIFT) approach, which transforms an input image into a large collection of local feature vectors, each of which is invariant to image translation, scaling, and rotation [1]. Local descriptors [5] are commonly employed in a number of real-world applications such as object recognition [5] and image retrieval [6] because they can be computed efficiently, are resistant to partial occlusion, and are relatively insensitive to changes in viewpoint. In this paper, the proposed feature extraction for object recognition uses SIFT. The second category comprises template-based methods, including correlation-based, Karhunen-Loeve (KL) expansion, linear discriminant, singular value decomposition (SVD), matching pursuit, neural network (NN), and dynamic link methods. A template is made from multiple features. Template-based methods are very robust to illumination change but sensitive to scale change. The third category includes model-based methods built on hidden Markov models (HMMs) for object recognition and detection. They integrate information across various scales and directions by using a probability model. Model-based methods are particularly useful when feature vectors cannot decide the originality of the object template [4].

The proposed method consists of two stages. First, interest points of moving objects are localized in position and scale. Second, a description of each interest point is built and used to recognize the moving objects. The proposed method uses SIFT for effective feature extraction in a PowerPC-based IR imaging system [7]. The proposed SIFT pipeline consists of scale-space construction, extrema detection, orientation assignment, keypoint description, and feature matching. Because field tests show that objects in IR images appear in a more spread-out form than in visual images, and fewer SIFT features are found in an IR image than in a visual image, the SIFT descriptor is set up over a region about 1.5 times wider than for a visual image to allow precise matching of objects. Based on experimental results, the proposed method extracts object feature values in the PowerPC-based IR imaging system. This paper is organized as follows: Section 2 describes feature extraction and recognition using SIFT in our system. Section 3 presents a robust algorithm against Gaussian noise. Section 4 presents experimental results, and Section 5 concludes the paper.

2. FEATURE EXTRACTION AND RECOGNITION USING SIFT


In this section, we deal with feature extraction for object recognition using SIFT in the PowerPC-based IR imaging system. The proposed object recognition method builds on the algorithm presented in [5], which we simulated and then applied to our system.
2.1 Feature extraction using scale invariant feature transform (SIFT)
In this subsection, we deal with feature extraction for object recognition in the PowerPC-based IR imaging system. Feature extraction and object matching are fundamental to many problems in computer vision, such as object or scene detection and recognition, stereo correspondence, and motion tracking. Image features have many properties that make them suitable for matching differing images of an object captured by an IR sensor. The features are invariant to image scaling and rotation, and partially invariant to changes in illumination.

The proposed method uses SIFT for effective feature extraction in the PowerPC-based IR imaging system. The major stages of computation [5] used to generate the set of IR image features are:
1) Scale-space extrema detection
2) Keypoint localization
3) Orientation assignment
4) Keypoint descriptor
For object matching and recognition, SIFT features are first extracted from a set of reference IR images and stored in a database. In this paper, we adopt a matching scheme in which a new IR image is matched by individually comparing each of its features to this database and finding candidate matching features based on the Euclidean distance between their feature vectors, as sketched below.
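As an illustration of this extraction-and-matching pipeline, the following minimal sketch uses OpenCV's SIFT implementation and a brute-force Euclidean-distance matcher; it stands in for the paper's PowerPC implementation, and the file paths and the ratio threshold are illustrative assumptions.

```python
# Minimal sketch of SIFT extraction and database matching with OpenCV;
# file paths and the ratio threshold are assumptions, not the paper's values.
import cv2

def build_reference_db(reference_paths):
    """Extract SIFT keypoints and descriptors from reference IR images."""
    sift = cv2.SIFT_create()
    db = []
    for path in reference_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        keypoints, descriptors = sift.detectAndCompute(img, None)
        db.append((path, keypoints, descriptors))
    return db

def match_to_db(query_img, db, ratio=0.8):
    """Match a new IR frame against the database by Euclidean (L2) distance."""
    sift = cv2.SIFT_create()
    _, query_desc = sift.detectAndCompute(query_img, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)   # brute-force Euclidean matcher
    scores = []
    for path, _, ref_desc in db:
        # Two nearest neighbours per query feature; keep distinctive matches only.
        pairs = matcher.knnMatch(query_desc, ref_desc, k=2)
        good = [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        scores.append((path, len(good)))
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

In such a sketch, the reference image that collects the largest number of distinctive matches would be taken as the recognized object.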
The first stage of keypoint detection is to identify locations and scales that can be repeatedly assigned under differing views of the same object. According to SIFT [5], the scale-space kernel is the Gaussian function. The scale space of an image is defined as a function, L(x, y, \sigma), produced from the convolution of a variable-scale Gaussian, G(x, y, \sigma), with an input image, I(x, y):

L(x, y, \sigma) = G(x, y, \sigma) * I(x, y),                                            (1)

where * is the convolution operation in x and y, and

G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} e^{-(x^{2} + y^{2})/(2\sigma^{2})}.           (2)

To efficiently detect stable keypoint locations in scale space, we use scale-space extrema in the difference-of-Gaussian function convolved with the image, D(x, y, \sigma), which can be computed from the difference of two nearby scales separated by a constant multiplicative factor k:

D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma).      (3)

In order to detect the local maxima and minima of D(x, y, \sigma), each sample point is compared to its eight neighbors in the current image and its nine neighbors in the scales above and below. It is selected only if it is larger than all of these neighbors or smaller than all of them.
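As a rough illustration of Eqs. (1)-(3), the sketch below builds a small stack of Gaussian-smoothed images, forms their differences, and keeps samples that are extrema among their 26 neighbours. It is not the paper's implementation; the base scale, the factor k, the number of scales, and the contrast threshold are assumed values.

```python
# Sketch of scale-space construction and DoG extrema detection (Eqs. 1-3);
# parameter values are illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_extrema(image, sigma0=1.6, k=2 ** 0.5, num_scales=5, thresh=0.01):
    img = image.astype(np.float32) / 255.0
    # L(x, y, sigma) for a stack of nearby scales (Eq. 1).
    L = [gaussian_filter(img, sigma0 * k ** i) for i in range(num_scales)]
    # D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma) (Eq. 3).
    D = [L[i + 1] - L[i] for i in range(num_scales - 1)]
    keypoints = []
    for s in range(1, len(D) - 1):
        for y in range(1, img.shape[0] - 1):
            for x in range(1, img.shape[1] - 1):
                cube = np.stack([D[s - 1][y - 1:y + 2, x - 1:x + 2],
                                 D[s][y - 1:y + 2, x - 1:x + 2],
                                 D[s + 1][y - 1:y + 2, x - 1:x + 2]])
                v = D[s][y, x]
                # Extremum: larger (or smaller) than all 26 neighbours.
                if abs(v) > thresh and (v >= cube.max() or v <= cube.min()):
                    keypoints.append((x, y, sigma0 * k ** s))
    return keypoints
```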
2.2 Keypoint localization
Once a keypoint candidate has been found by comparing a pixel to its neighbors, the next step is to perform a detailed fit to the nearby data for location, scale, and ratio of principal curvatures. Lowe's approach uses the Taylor expansion of the scale-space function, D(x, y, \sigma), shifted so that the origin is at the sample point:

D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^{T} \mathbf{x} + \frac{1}{2} \mathbf{x}^{T} \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \mathbf{x},                       (4)

where D and its derivatives are evaluated at the sample point and \mathbf{x} = (x, y, \sigma)^{T} is the offset from this point. The location of the extremum, \hat{\mathbf{x}}, is determined by taking the derivative of this function with respect to \mathbf{x} and setting it to zero, giving


\hat{\mathbf{x}} = -\left(\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}.                       (5)
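A minimal sketch of this refinement step is given below, assuming D is the list of difference-of-Gaussian images from the previous sketch; the derivatives are approximated by central finite differences, and an offset larger than 0.5 in any dimension is treated as an unstable fit, following Lowe [5].

```python
# Sketch of the sub-pixel/sub-scale refinement of Eqs. (4)-(5): fit a quadratic
# to the DoG values around a candidate (s, y, x) and solve for the offset.
# `D` is assumed to be the DoG stack from the previous sketch, and the
# candidate must not lie on an image or scale border.
import numpy as np

def refine_keypoint(D, s, y, x):
    # Gradient of D with respect to (x, y, sigma), by central differences.
    dD = np.array([(D[s][y, x + 1] - D[s][y, x - 1]) / 2.0,
                   (D[s][y + 1, x] - D[s][y - 1, x]) / 2.0,
                   (D[s + 1][y, x] - D[s - 1][y, x]) / 2.0])
    # Hessian of D, by finite differences.
    dxx = D[s][y, x + 1] - 2 * D[s][y, x] + D[s][y, x - 1]
    dyy = D[s][y + 1, x] - 2 * D[s][y, x] + D[s][y - 1, x]
    dss = D[s + 1][y, x] - 2 * D[s][y, x] + D[s - 1][y, x]
    dxy = (D[s][y + 1, x + 1] - D[s][y + 1, x - 1]
           - D[s][y - 1, x + 1] + D[s][y - 1, x - 1]) / 4.0
    dxs = (D[s + 1][y, x + 1] - D[s + 1][y, x - 1]
           - D[s - 1][y, x + 1] + D[s - 1][y, x - 1]) / 4.0
    dys = (D[s + 1][y + 1, x] - D[s + 1][y - 1, x]
           - D[s - 1][y + 1, x] + D[s - 1][y - 1, x]) / 4.0
    H = np.array([[dxx, dxy, dxs],
                  [dxy, dyy, dys],
                  [dxs, dys, dss]])
    # Eq. (5): offset = -H^{-1} * dD; a large offset indicates an unstable fit.
    offset = -np.linalg.solve(H, dD)
    return offset if np.all(np.abs(offset) < 0.5) else None
```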

2.3 Orientation assignment


By assigning a consistent orientation to each keypoint based on local image properties, the keypoint descriptor can be represented relative to this orientation and therefore achieve invariance to image rotation. The scale of the keypoint is used to select the Gaussian-smoothed IR image, L, with the closest scale, so that all computations are performed in a scale-invariant manner. In the proposed method these parameters set up a region in proportion to the scale, and the gradient is computed for each pixel in that region. For each IR image sample, L(x, y), at this scale, the gradient magnitude, m(x, y), and orientation, \theta(x, y), are pre-computed using pixel differences:

m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^{2} + (L(x, y+1) - L(x, y-1))^{2}},                       (6)

\theta(x, y) = \tan^{-1}\big((L(x, y+1) - L(x, y-1)) / (L(x+1, y) - L(x-1, y))\big).              (7)
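The following sketch evaluates Eqs. (6) and (7) over the interior of a Gaussian-smoothed image L with NumPy; using arctan2 instead of tan^{-1} is an implementation convenience to obtain a full-range angle.

```python
# Sketch of Eqs. (6)-(7): gradient magnitude and orientation from pixel
# differences of the Gaussian-smoothed image L closest to the keypoint scale.
# Rows are treated as y and columns as x.
import numpy as np

def gradient_mag_ori(L):
    dx = L[1:-1, 2:] - L[1:-1, :-2]    # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]    # L(x, y+1) - L(x, y-1)
    magnitude = np.sqrt(dx ** 2 + dy ** 2)   # Eq. (6)
    orientation = np.arctan2(dy, dx)         # Eq. (7), in radians
    return magnitude, orientation
```

In Lowe's formulation [5], these values are accumulated into a 36-bin orientation histogram weighted by gradient magnitude and a Gaussian window, and the dominant peak gives the keypoint orientation.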

2.4 The local image descriptor


The previous operations have assigned an image location, scale, and orientation to each keypoint. These parameters describe the keypoint independently of its size and rotation. The next step is to compute a descriptor for the local image region that is highly distinctive yet as invariant as possible to remaining variations, such as changes in illumination or 3D viewpoint. We use gradients instead of intensities in the IR image because gradients change little under illumination variations. The keypoint is described as follows (an illustrative sketch is given after Figure 1):
1) The local image region around the keypoint is centered on it, with its extent set in proportion to the scale.
2) The orientation obtained in the previous stage is aligned with the y axis.
3) The region is divided along the x and y axes as shown in Figure 1.
4) The gradients inside each sub-region are accumulated and quantized into eight directions.

Fig. 1. Keypoint description: (a) image gradients and (b) keypoint descriptor
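As referenced in the list above, the sketch below illustrates this descriptor construction: gradients around the keypoint are rotated relative to the keypoint orientation, pooled over a 4x4 grid of cells, and quantized into eight orientation bins per cell, giving a 128-dimensional vector. The window size, grid layout, and normalization follow Lowe [5] and are assumptions here, since the paper does not state them; trilinear interpolation and Gaussian weighting are omitted for brevity.

```python
# Illustrative sketch of descriptor steps 1)-4); parameters follow Lowe [5]
# and are assumptions, not values stated in the paper.
import numpy as np

def sift_descriptor(magnitude, orientation, kp_x, kp_y, kp_ori,
                    cell=4, grid=4, bins=8):
    half = cell * grid // 2
    desc = np.zeros((grid, grid, bins), dtype=np.float32)
    for dy in range(-half, half):
        for dx in range(-half, half):
            y, x = kp_y + dy, kp_x + dx
            if not (0 <= y < magnitude.shape[0] and 0 <= x < magnitude.shape[1]):
                continue
            # Orientation of the sample relative to the keypoint orientation.
            theta = (orientation[y, x] - kp_ori) % (2 * np.pi)
            b = int(theta / (2 * np.pi) * bins) % bins
            desc[(dy + half) // cell, (dx + half) // cell, b] += magnitude[y, x]
    v = desc.ravel()
    return v / (np.linalg.norm(v) + 1e-7)   # normalise for illumination robustness
```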


2.5 Image feature matching


An important remaining issue for measuring the distinctiveness of features is how the reliability of matching varies as a function of the number of features in the database being matched. The proposed matching method is as follows (a sketch is given after this list):
1) The distance between keypoints is defined as the sum of the differences of the corresponding elements of their descriptors.
2) Calculate the distance between keypoint K_i and keypoint K_k^j, where K denotes a keypoint and i and j index an IR image in our database (DB), respectively.
3) For each K_i, save the correspondence pair (K_i, K_k^j) with the closest and second-closest K_k^j in our DB.
4) Steps 2)-3) are performed for every keypoint detected in the current IR image.
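A small NumPy sketch of steps 1)-4) is given below; descriptors are assumed to be stored as row vectors, and the element-wise difference sum of step 1) is read here as an L1 distance.

```python
# NumPy sketch of matching steps 1)-4); the L1 reading of "difference sum of
# each element" is an assumption.
import numpy as np

def correspondence_pairs(query_desc, db_desc):
    """query_desc: (N, 128) current-image descriptors; db_desc: (M, 128) DB descriptors."""
    pairs = []
    for i, q in enumerate(query_desc):
        dist = np.abs(db_desc - q).sum(axis=1)   # step 1): element-wise difference sum
        order = np.argsort(dist)                 # step 2): distances to every DB keypoint
        first, second = order[0], order[1]       # step 3): closest and second-closest
        pairs.append((i, int(first), int(second), dist[first], dist[second]))
    return pairs                                 # step 4): one entry per detected keypoint
```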


Figure 2 shows how falsely detected keypoints in our DB are suppressed. When a keypoint belonging to DB 2 is detected within the DB 1 group, its correspondence pair is deleted because the keypoint exists in another DB. The proposed method then looks for correspondence pairs by using an affine transform.

Fig. 2. Suppression of falsely detected keypoints

The affine transformation of a model point [x  y]^{T} to an image point [u  v]^{T} can be written as

\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix},                       (8)

where the model translation is [t_x  t_y]^{T} and the affine rotation, scale, and stretch are represented by the m_i parameters. The positions of a keypoint pair related by the affine transform are (x, y) and (u, v), respectively.

To solve for the transformation parameters, the equation above can be rewritten to gather the unknowns into a column vector:

\begin{bmatrix} x & y & 0 & 0 & 1 & 0 \\ 0 & 0 & x & y & 0 & 1 \\ & & & \cdots & & \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_x \\ t_y \end{bmatrix} = \begin{bmatrix} u \\ v \\ \vdots \end{bmatrix}.                       (9)

Eq. (9) shows a single match, but any number of further matches can be added, with each match contributing two more rows to the first and last matrices. At least three matches are needed to provide a solution. We can write this linear system as

A\mathbf{x} = \mathbf{b}.                       (10)

The parameters are computed with the pseudoinverse: the least-squares solution for the parameters \mathbf{x} is determined by solving the corresponding normal equations,

\mathbf{x} = [A^{T} A]^{-1} A^{T} \mathbf{b},                       (11)

which minimizes the sum of the squares of the distances from the projected model locations to the corresponding image locations. Recognition can then be performed with the estimated parameters, as sketched below.
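As a sketch of Eqs. (8)-(11), the function below stacks two rows per correspondence and solves the normal equations for the six affine parameters; it assumes at least three non-degenerate matches and uses NumPy rather than the paper's PowerPC code.

```python
# Sketch of Eqs. (8)-(11): least-squares affine fit from matched point pairs.
import numpy as np

def fit_affine(model_pts, image_pts):
    """model_pts, image_pts: (N, 2) arrays of matched (x, y) and (u, v) points, N >= 3."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(model_pts, image_pts):
        rows.append([x, y, 0, 0, 1, 0])   # row for u (Eq. 9)
        rows.append([0, 0, x, y, 0, 1])   # row for v (Eq. 9)
        rhs.extend([u, v])
    A, b = np.array(rows, dtype=float), np.array(rhs, dtype=float)
    # Eq. (11): x = (A^T A)^{-1} A^T b, the pseudoinverse (least-squares) solution.
    m1, m2, m3, m4, tx, ty = np.linalg.solve(A.T @ A, A.T @ b)
    return np.array([[m1, m2], [m3, m4]]), np.array([tx, ty])
```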

3. ROBUST ALGORITHM AGAINST GAUSSIAN NOISE


The performance of SIFT-based recognition is affected by noise, especially zero-mean Gaussian noise. Gaussian noise can shift the position of a maximal extreme point or create a false maximal extreme point, as illustrated in Figure 3, which compares the original function with its noise-deformed version and marks the maximal extreme points.

Fig. 3. (a) Position shift and (b) generation of a false maximum extreme point by Gaussian noise

3.1 Robust method for object recognition against Gaussian Noise


To mitigate the noise effect, various details can be used. Under the assumption that an object is brighter than the background, the intensity of the object is likely to be larger than the noise standard deviation. Let us consider the situation where an object is moving from the (k-1)th frame to the kth frame. If the object is small and moving, the probability that the difference between the intensity of an object pixel in the current frame and that of the same-positioned pixel in the previous frame is larger than the standard deviation of the noise is close to 1. Let I_i(r, c) be the intensity I(r, c) at the ith frame, and n_i(r, c) be the Gaussian noise at (r, c) in the ith frame, where r and c denote a row and a column of the image, respectively. We assume:


n_i(r, c) \sim N(0, \sigma_n^{2}).                       (12)
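A minimal sketch of this frame-differencing assumption follows; the noise standard deviation sigma_n is assumed to be known or estimated separately, and in practice the threshold could be scaled by a safety factor.

```python
# Sketch of the assumption above: a pixel is flagged as a candidate object
# pixel when its frame-to-frame intensity change exceeds the noise standard
# deviation sigma_n (assumed known or estimated separately).
import numpy as np

def moving_object_mask(frame_k, frame_k_minus_1, sigma_n):
    diff = np.abs(frame_k.astype(np.float32) - frame_k_minus_1.astype(np.float32))
    return diff > sigma_n   # boolean mask of candidate moving-object pixels
```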

Figure 4 shows the procedure of object recognition in our system, which is implemented as a PowerPC-based IR imaging system. In this architecture, IR or CCD image sequences and the parameters involved in the recognition algorithms are first transmitted to the PowerPC. Recognition can process image sequences captured by the frame grabber in the slave unit or stored in the master unit. The parameters are selected by the user and transmitted to the recognition module from the master unit.
Fig. 4. Block diagram for recognition in the PowerPC-based IR imaging system

4. EXPERIMENTAL RESULTS
In this section we present some of the experiments on image sequences using the PowerPC-based IR imaging system and the recognition and tracking algorithms. Figure 5 shows the recognition and tracking result for an object; if the recognized object disappears, our system switches to tracking mode. Figure 6 shows the recognition and tracking result for an object with zero-mean Gaussian noise N(0, 15^2).
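For reproducing this test condition, the sketch below adds zero-mean Gaussian noise with standard deviation 15, i.e. N(0, 15^2), to an 8-bit IR frame; clipping to [0, 255] is an assumption about the frame format.

```python
# Sketch of generating the noisy test sequence: add N(0, 15^2) noise per frame.
import numpy as np

def add_gaussian_noise(frame, sigma=15.0, seed=None):
    rng = np.random.default_rng(seed)
    noisy = frame.astype(np.float32) + rng.normal(0.0, sigma, frame.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```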

5. CONCLUSIONS
In this paper, we proposed an automated target recognition method using SIFT in a PowerPC-based IR imaging system. The proposed method consists of two stages. First, interest points of moving objects are localized in position and scale. Second, a description of each interest point is built and used to recognize the moving objects. The proposed method uses SIFT for effective feature extraction in the PowerPC-based IR imaging system, and comprises scale-space construction, extrema detection, orientation assignment, keypoint description, and feature matching. Because field tests show that objects in IR images appear in a more spread-out form than in visual images, and fewer SIFT features are found in an IR image than in a visual image, the SIFT descriptor is set up over a region about 1.5 times wider than for a visual image to allow precise matching of objects. Based on the experimental results, the proposed method extracts object feature values in the PowerPC-based IR imaging system.


REFERENCES
[1] J. Lee, Y. Kim, C. Park, C. Park, and J. Paik, "Robust feature detection using 2D wavelet transform under low light environment," Proc. Intelligent Computing in Signal Processing and Pattern Recognition (ICIC 2006) (345), 1042-1050 (2006).
[2] J. Wang, J. Liang, H. Hu, Y. Li, and B. Feng, "Performance evaluation of infrared and visible image fusion algorithms for face recognition," Proc. International Conf. Intelligent Systems and Knowledge Engineering (ISKE 2007), 1-8 (2007).
[3] D. Socolinsky, A. Selinger, and J. Neuheisel, "Face recognition with visible and thermal infrared imagery," Computer Vision and Image Understanding 91(1-2), 72-114 (2003).
[4] C. Park, "Multimodal human verification using stereo-based 3D information, IR, and speech," Proc. SPIE 6543, 65431D (2007).
[5] D. Lowe, "Distinctive image features from scale-invariant keypoints," Int. Journal of Computer Vision 60(2), 91-110 (2004).
[6] K. Mikolajczyk and C. Schmid, "Indexing based on scale invariant interest points," Proc. Int. Conf. Computer Vision, 525-531 (2001).
[7] J. Lee, J. Youn, and C. Park, "PowerPC-based system for tracking in infrared image sequences," Proc. SPIE 6737, 67370S (2007).


Fig. 5. Experimental results of the proposed recognition and tracking algorithms: (a) 24th frame, (b) 28th frame, (c) 32nd frame, (d) 40th frame

Fig. 6. Experimental results of the proposed recognition and tracking algorithms with Gaussian noise N(0, 15^2): (a) 24th frame, (b) 28th frame, (c) 32nd frame, (d) 40th frame