Paper Title: Author Name Affiliation
1. INTRODUCTION
Template matching is a classical problem in scene analysis: given a reference image of an object, decide whether that object exists in a scene image under analysis and, if it does, find its location. The template matching process involves cross-correlating the template with the scene image and computing a measure of similarity between them to determine the displacement [1]. Since the evaluation of the correlation is computationally expensive, there has been a need for low-cost correlation algorithms for real-time processing, and a large number of correlation-type algorithms have been proposed [2]. One approach is to use an image pyramid for both the template and the scene image and to perform the registration by a top-down search [3]. Other fast matching techniques use two-pass algorithms: a sub-template is matched at a coarsely spaced grid in the first pass, and a better match is sought in the neighborhood of the previously found positions in the second pass [4]. Later, Jane You [5] presented a wavelet-based high-performance hierarchical scheme for image matching that includes dynamic detection of interesting points, adaptive threshold selection, and a guided searching strategy for the best match from the coarse level to the fine level. In order to improve the accuracy of matching and at the same time reduce the computational load, this paper proposes a robust image matching approach that eliminates a large number of unnecessary searches compared with the conventional scheme and achieves better matching accuracy. A discrete wavelet transform is first applied to the reference image and the scene image, and their low-frequency parts are extracted; Harris corner detection is then used to detect interesting points in these low-frequency parts and determine the matching candidate region. SIFT features are extracted from the matching candidate region and the scene image, and the extracted features are matched with a k-d tree and a bidirectional matching strategy to enhance the accuracy of matching. Experiments show that the algorithm improves the matching accuracy while reducing the computational load.
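As a rough illustration only, the overall pipeline might be sketched as follows, assuming OpenCV (Harris, SIFT, matching) and PyWavelets for the DWT; the function name, the Haar wavelet, and all parameter values are our own assumptions, and the candidate-region selection step is only stubbed out:

import cv2
import numpy as np
import pywt

def match_images(reference, scene):
    """Coarse-to-fine matching sketch: DWT -> Harris -> candidate region -> SIFT.
    Inputs are assumed to be 8-bit grayscale images."""
    # Step 1: one-level DWT; keep only the low-frequency (approximation) sub-bands.
    ref_low, _ = pywt.dwt2(reference.astype(np.float32), 'haar')
    scn_low, _ = pywt.dwt2(scene.astype(np.float32), 'haar')

    # Step 2: Harris corner responses on the low-frequency parts.
    ref_resp = cv2.cornerHarris(np.float32(ref_low), blockSize=2, ksize=3, k=0.04)
    scn_resp = cv2.cornerHarris(np.float32(scn_low), blockSize=2, ksize=3, k=0.04)
    # The corner distributions would be compared here to pick the matching
    # candidate region; that selection logic is omitted and the full reference
    # image is used as a placeholder.
    candidate = reference

    # Step 3: SIFT features on the candidate region and the scene image.
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(candidate, None)
    kp2, des2 = sift.detectAndCompute(scene, None)

    # Step 4: matching with cross-checking (a compact stand-in for the
    # bidirectional k-d tree matching sketched later in the paper).
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(des1, des2)
    return kp1, kp2, sorted(matches, key=lambda m: m.distance)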
C = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N} f(i,j)\, k(i+m, j+n)}{\sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N} f(i,j)^{2} \cdot \sum_{i=1}^{M}\sum_{j=1}^{N} k(i+m, j+n)^{2}}}    (1)
where f(i, j) and k(i, j) are the gray values of the template window and the inquiring window, respectively, both of size M × N. The target area is the displacement with the maximum cross-correlation coefficient C. This method is unambiguous, easy to implement, and accurate, but the large amount of calculation requires a long running time, which makes online measurement difficult.
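For illustration, Eq. (1) can be evaluated directly with NumPy as below (a brute-force sketch with hypothetical function names; OpenCV's cv2.matchTemplate with the TM_CCORR_NORMED flag computes an equivalent normalized cross-correlation far faster):

import numpy as np

def ncc(template, scene, m, n):
    """Normalized cross-correlation coefficient C of Eq. (1) at displacement (m, n)."""
    M, N = template.shape
    f = template.astype(np.float64)
    k = scene[m:m + M, n:n + N].astype(np.float64)   # inquiring window
    num = np.sum(f * k)
    den = np.sqrt(np.sum(f ** 2) * np.sum(k ** 2))
    return num / den if den > 0 else 0.0

def full_search(template, scene):
    """Exhaustive search: the displacement with the maximum C is taken as the match."""
    M, N = template.shape
    H, W = scene.shape
    scores = np.array([[ncc(template, scene, m, n)
                        for n in range(W - N + 1)]
                       for m in range(H - M + 1)])
    return np.unravel_index(np.argmax(scores), scores.shape)

The exhaustive search makes the computational cost of Eq. (1) explicit, which is what motivates the coarse-to-fine strategy used in this paper.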
L = \log_{2}(N / 8)    (2)

where L is the maximum number of decomposition levels and the image size is N × N.
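For example, a 512 × 512 image gives L = log2(512/8) = 6 by Eq. (2). A sketch of extracting the low-frequency part with PyWavelets follows; the Haar wavelet and the helper names are our own choices:

import numpy as np
import pywt

def max_decomposition_levels(n):
    """Maximum number of decomposition levels for an n x n image, per Eq. (2)."""
    return int(np.log2(n / 8))

def low_frequency_part(image, levels=1, wavelet='haar'):
    """Return the low-frequency (approximation) sub-band after `levels` DWT steps."""
    approx = np.asarray(image, dtype=np.float32)
    for _ in range(levels):
        approx, _details = pywt.dwt2(approx, wavelet)
    return approx

# max_decomposition_levels(512) == 6; each decomposition level halves the image size.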
The Harris corner response is defined as

R = \det M - k\,(\operatorname{trace} M)^{2}    (3)

M(x, y) = \begin{pmatrix} I_{u}^{2}(x, y) & I_{uv}(x, y) \\ I_{uv}(x, y) & I_{v}^{2}(x, y) \end{pmatrix}    (4)

I_{u}^{2}(x, y) = X^{2} \otimes G(x, y), \quad I_{v}^{2}(x, y) = Y^{2} \otimes G(x, y), \quad I_{uv}(x, y) = XY \otimes G(x, y)    (5)

G(x, y) = \frac{1}{2\pi\sigma^{2}} e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}}    (6)

where k is an empirical value; X and Y are the first-order directional differentials of the gray scale in directions u and v, which can be approximately calculated by convolving the gray scale with difference operators in directions u and v; ⊗ refers to convolution. The Gaussian function is used to reduce the impact of noise, because the first-order directional differentials are sensitive to noise. If R exceeds a certain threshold, the point is taken as a corner.
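A direct NumPy/SciPy sketch of Eqs. (3)-(6) follows; the Sobel operator as the difference operator, k = 0.04, and the relative threshold are common choices assumed here rather than values given in this paper (cv2.cornerHarris provides an optimized equivalent):

import numpy as np
from scipy import ndimage

def harris_corners(gray, sigma=1.0, k=0.04, rel_thresh=0.01):
    """Return (row, col) positions where the Harris response R of Eq. (3) is large."""
    gray = gray.astype(np.float64)
    # X and Y: first-order directional differentials of the gray scale (Eq. (5)).
    X = ndimage.sobel(gray, axis=1)
    Y = ndimage.sobel(gray, axis=0)
    # Elements of M(x, y) in Eq. (4), smoothed with the Gaussian of Eq. (6).
    Iu2 = ndimage.gaussian_filter(X * X, sigma)
    Iv2 = ndimage.gaussian_filter(Y * Y, sigma)
    Iuv = ndimage.gaussian_filter(X * Y, sigma)
    # R = det(M) - k * trace(M)^2, evaluated pointwise (Eq. (3)).
    det_M = Iu2 * Iv2 - Iuv ** 2
    trace_M = Iu2 + Iv2
    R = det_M - k * trace_M ** 2
    # A point whose response exceeds the threshold is taken as a corner.
    return np.argwhere(R > rel_thresh * R.max())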
For SIFT keypoint detection, the variable-scale Gaussian is

G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} e^{-(x^{2} + y^{2}) / 2\sigma^{2}}    (8)
To efficiently detect stable keypoint locations in scale space, David G. Lowe proposed using scale-space extrema in the difference-of-Gaussian function convolved with the image:
D(x, y, \sigma) = [G(x, y, k\sigma) - G(x, y, \sigma)] \otimes I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)    (9)
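A minimal sketch of Eq. (9), assuming cv2.GaussianBlur for the Gaussian convolution and k = √2 (a conventional value, not one stated above):

import cv2
import numpy as np

def difference_of_gaussian(image, sigma, k=np.sqrt(2)):
    """D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma), per Eq. (9)."""
    img = image.astype(np.float32)
    L_sigma = cv2.GaussianBlur(img, (0, 0), sigma)        # L(x, y, sigma)
    L_ksigma = cv2.GaussianBlur(img, (0, 0), k * sigma)   # L(x, y, k*sigma)
    return L_ksigma - L_sigma

# A stack of DoG images at successive scales forms one octave, e.g.:
# dog = [difference_of_gaussian(img, 1.6 * (2 ** (i / 3.0))) for i in range(5)]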
In order to detect the local maxima and minima of D(x, y, σ), each sample point is compared to its eight neighbors in the current image and nine neighbors in the scale above and below (see Fig. 3).
It is selected only if it is larger than all of these neighbors or smaller than all of them. To improve the stability of matching, we must reject points that have low contrast (and are therefore sensitive to noise) or are poorly localized along an edge. The Taylor expansion of the scale-space function D(x, y, σ), shifted so that the origin is at the sample point, is
D(X) = D + \frac{\partial D^{T}}{\partial X} X + \frac{1}{2} X^{T} \frac{\partial^{2} D}{\partial X^{2}} X    (10)

where D and its derivatives are evaluated at the sample point and X = (x, y, σ)^T is the offset from this point. The location of the extremum, X̂, is determined by taking the derivative of this function with respect to X and setting it to zero, giving
\hat{X} = -\left( \frac{\partial^{2} D}{\partial X^{2}} \right)^{-1} \frac{\partial D}{\partial X}    (11)
Substituting Eq. (11) into Eq. (10) gives

D(\hat{X}) = D + \frac{1}{2} \frac{\partial D^{T}}{\partial X} \hat{X}    (12)
The value of D(X̂) is useful for rejecting unstable extrema with low contrast; usually, extrema with |D(X̂)| less than 0.03 are discarded. The difference-of-Gaussian function also has a strong response along edges, even if the location along the edge is poorly determined and therefore unstable to small amounts of noise.
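The 26-neighbor extremum test and the contrast test might be sketched as below (a simplified version that omits the sub-pixel refinement of Eqs. (10)-(12); the array and threshold names are our own):

import numpy as np

def is_scale_space_extremum(dog_below, dog_curr, dog_above, i, j):
    """True if dog_curr[i, j] is larger or smaller than all 26 neighbors
    in the current scale and in the scales above and below."""
    cube = np.stack([dog_below[i-1:i+2, j-1:j+2],
                     dog_curr[i-1:i+2, j-1:j+2],
                     dog_above[i-1:i+2, j-1:j+2]])
    center = dog_curr[i, j]
    neighbors = np.delete(cube.ravel(), 13)   # drop the center sample itself
    return bool((center > neighbors).all() or (center < neighbors).all())

def passes_contrast_test(d_value, threshold=0.03):
    """Reject low-contrast extrema: |D(X_hat)| below the threshold is discarded."""
    return abs(d_value) >= threshold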
The principal curvatures can be computed from the 2 × 2 Hessian matrix

H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}    (13)
Let α be the eigenvalue of H with the largest magnitude and β the smaller one. Then
\operatorname{Tr}(H) = D_{xx} + D_{yy} = \alpha + \beta, \qquad \operatorname{Det}(H) = D_{xx} D_{yy} - (D_{xy})^{2} = \alpha\beta    (14)

Let α = rβ; then

\frac{\operatorname{Tr}(H)^{2}}{\operatorname{Det}(H)} = \frac{(\alpha + \beta)^{2}}{\alpha\beta} = \frac{(r\beta + \beta)^{2}}{r\beta^{2}} = \frac{(r + 1)^{2}}{r}    (15)
This quantity is at a minimum when the two eigenvalues are equal, and it increases with r; Lowe proposed using r = 10.

3.1.2 Descriptor representation

The previous operations have assigned an image location, scale, and orientation to each keypoint. A keypoint descriptor is created by first computing the gradient magnitude and orientation at each image sample point in a region around the keypoint location. These samples are then accumulated into orientation histograms summarizing the contents over 4 × 4 subregions. Lowe proposed using a 4 × 4 array of histograms with 8 orientation bins in each, so the feature vector has 4 × 4 × 8 = 128 elements. The descriptor is formed from a vector containing the values of all the orientation histogram entries. Finally, the vector is normalized to unit length to reduce the effects of illumination change.
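Using the 128-dimensional descriptors, the k-d tree based bidirectional matching step can be sketched with OpenCV's SIFT and FLANN matcher; the index parameters below are typical defaults assumed here rather than values reported in this paper:

import cv2

def bidirectional_match(candidate_region, scene):
    """Match SIFT descriptors with a k-d tree (FLANN) in both directions and
    keep only the mutually consistent pairs."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(candidate_region, None)
    kp2, des2 = sift.detectAndCompute(scene, None)

    # algorithm=1 selects FLANN's k-d tree index.
    flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    forward = flann.match(des1, des2)    # candidate region -> scene
    backward = flann.match(des2, des1)   # scene -> candidate region

    # Keep a forward match only if the backward match points back to it.
    back = {m.queryIdx: m.trainIdx for m in backward}
    mutual = [m for m in forward if back.get(m.trainIdx) == m.queryIdx]
    return kp1, kp2, mutual

The bidirectional check discards correspondences that are not mutual nearest neighbors, which suppresses many of the false matches a one-directional search would accept.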
4. CONCLUSION
In this paper, a robust template matching method based on the combination of the wavelet transform and SIFT has been proposed. A discrete wavelet transform is first applied to the reference image and the template image, and their low-frequency parts are extracted; Harris corner detection is then used to detect interesting points in these low-frequency parts and determine the matching candidate region of the template image within the reference image. SIFT features are extracted from the matching candidate region and the template image, and the extracted SIFT features are matched with a k-d tree and a bidirectional matching strategy. Experiments show that the algorithm improves the matching accuracy while reducing the computational load. The precision, consistency, and efficiency of the algorithm remain open problems. Future work will focus on developing a more sophisticated method for deriving an optimized, high-precision matching result under the influence of noise.
REFERENCES
[1] J. K. Aggarwal, L. S. Davis and W. N. Martin, Correspondence processes in dynamic scene analysis, Proc. IEEE 69(5), 562-572 (1981).
[2] J. P. Secilla and N. Garcia, Template location in noisy pictures, Signal Process. 14, 347-361 (1987).
[3] R. Y. Wong and E. L. Hall, Sequential hierarchical scene matching, IEEE Trans. Comput. 27, 359-366 (1978).
[4] A. Rosenfeld and A. Kak, Digital Picture Processing (2nd ed., Vol. 2), Academic Press, Orlando (1982).
[5] Jane You and Prabir Bhattacharya, A Wavelet-Based Coarse-to-Fine Image Matching Scheme in a Parallel Virtual Machine Environment, IEEE Trans. Image Process. 9, 1547-1559 (2000).