
Fifth International Conference on Computer Graphics, Imaging and Visualisation

Tampered Image Detection using Image Matching

Zhenghao Li1,2, A.Y.C. Nee2, S.K. Ong2, Weiguo Gong1


1. Key Lab of Optoelectronic Technology and System of Ministry of Education, Chongqing University, Chongqing, China 400030
2. Mechanical Engineering Department, Faculty of Engineering, National University of Singapore, Singapore 117576
{lizhenghao|mpeneeyc|mpeongsk}@nus.edu.sg, [email protected]

Abstract

With the development of image processing software, such as Photoshop, scientific analysis for detecting tampered images has become a critical research issue. A hybrid image matching algorithm for image analysis is proposed and evaluated. Features are extracted using blob detectors and interest point detectors. This combination ensures sufficient correspondences in small image patches while maintaining high accuracy. Nearest neighbors are searched using a hybrid spill tree in order to reduce the computational load. Applying the algorithm to tampered image detection, a novel solution is proposed that utilizes the scale space information and gradient orientation information of the corresponding features. Furthermore, an order regularity of scale space between corresponding patches and a description for object rotation are also proposed. In addition, we conducted experiments on tampered images, which show the method to be effective.

1. Introduction

With the development of digital image processing software, it becomes difficult for people to distinguish tampered images from genuine ones. In fact, the authenticity of some news photographs has been questioned more frequently recently. Therefore, it is pertinent to develop theories and algorithms for detecting tampered images, and this has already received the attention of many scientists and researchers.

It is known that tampered digital images should be produced in accordance with the principles of computer vision, such as perspective transformation and the epipolar constraint, and it is necessary to use image processing operations, for example, synthesis, scaling, skewing, Gaussian blur, etc. In this case, we advocate that the essential way to solve these kinds of problems is, first, to extract the hidden information in the images that cannot be quantitatively detected by human eyes, employing computer vision and digital image processing technology, and then to find traces of tampering using this information. Following this procedure, a novel solution for detecting tampered images based on image matching is proposed.

There are two main contributions in this paper. First, we present a hybrid matching algorithm for detailed image analysis based on multi-scale theory and local feature descriptors. The algorithm has two desirable properties: it can generate more accurate and reliable correspondences, and it can obtain a more explicit description of the scale space of image features. We detect the features using Maximally Stable Extremal Regions (MSER) [1] and Laplacian-of-Gaussian (LoG) extrema, and then warp the features to a SIFT-like descriptor [2]. Lastly, we find the correspondences using a hybrid spill tree (SP-Tree) [3]. An evaluation is conducted based on the reconstruction similarity, which shows that the proposed algorithm outperforms others. Second, we explore the use of the information embedded in the correspondences, and introduce image matching to tampered image detection. This idea not only enriches the use of matching technology, but also yields a powerful solution for tampered image detection.

The rest of the paper is organized as follows. In Section 2, the proposed matching algorithm is introduced and evaluated. The order regularity of scale space between corresponding patches is explained in Section 3. In Section 4, the description for object rotation is given. Lastly, in Section 5, conclusions and future work are presented.

978-0-7695-3359-9/08 $25.00 © 2008 IEEE


DOI 10.1109/CGIV.2008.13
2. Matching algorithm

In computer vision, sets of data acquired by sampling the same scene or object at different times, or from different perspectives, will be in different coordinate systems. Image matching is the process of finding the correspondences between these sets of data.

We break a matching task into three modules [4], and a modularized design method [5] is adopted for selecting the appropriate algorithm. It starts with a feature detection module that detects features (interest points, regions) in the images. This is followed by a description module, where local image regions are selected and converted to descriptor vectors that can be compared with other sets of descriptors in the final matching module. Note that the main focus is on precision rather than computational load.

2.1. Feature detection

The intention, and also the challenge, of this module is to find features that are invariant to viewing conditions. Though image features are usually classified into two categories, global and local, the latter have been shown to be better suited for matching as they are robust to occlusion and background clutter.

Different solutions for local feature extraction have been proposed in recent years. Some researchers showed that interest points are robust and stable, while others suggested that distinguished regions are more appropriate for matching. Mikolajczyk et al. [6, 7] systematically reviewed and compared the performance of these detectors. The experimental results revealed that their Harris-Laplace detector and the DoG extrema proposed by Lowe [2], which are an approximation of LoG, outperform other interest point detectors, and that the MSER detector proposed by Matas et al. [1] performs best among the region detectors on a wide range of test sequences.

As a matter of fact, some light-weight detectors already perform well in narrow-baseline systems, but the number of reliable matches drops sharply when there are large viewpoint and illumination changes, even when using the most complex detectors.

We found an effective way to address this problem through the fusion of different detectors. Three candidate detectors, LoG extrema (I), Harris-Laplace points (II) and MSER (III), were selected in our experiments. We extracted features using all the possible detector combinations. The test images are selected from the data set [8], and are resized to 800*600 before use. We discarded duplicate detections that lie within the 8-neighborhood of other features. Reliable matches are obtained by enforcing a RANSAC constraint after matching. The average numbers of extracted features are shown in Table 1. It can be observed from Table 1 that the best performance is obtained using the combination of MSER and LoG extrema.

Table 1. Comparisons of different detector combinations

    Combination          I & II   II & III   I & III   I, II & III
    Totally detected       907      609        648        1082
    Discard duplicates     493      492        519         530
    Matched                185      193        200         202
    Reliably matched       174      185        189         189

It should be noted that the features are detected in a multi-scale representation, in which a set of smoothed images is obtained through successive convolution with the Gaussian kernel:

    L_σ(x, y) = G_σ(x, y) ⊗ I(x, y),

    where G_σ(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²)).

A multi-scale pyramid is built from several octaves with different initial sizes, and each octave is divided into several intervals. We construct the scale factor σ(o, i) = 2^((o + i/I)/2), and bicubic interpolation with factor 1/√2 is performed to obtain the initial image of the next octave when the internal scale factor reaches √2, where o and i denote the sequence numbers of the octave and interval, and I is the number of intervals per octave.

2.2. Description

For a robust image matching algorithm, extracting a sufficient number of distinctive potential features is a prerequisite but certainly cannot guarantee success. In most cases, the description module also plays an important role in the system. Though the algorithms for building descriptors currently vary to a large extent, the influential reviews by Winder et al. [5] and Mikolajczyk et al. [9] ranked the Scale Invariant Feature Transform (SIFT) and the Gradient Location and Orientation Histogram (GLOH) [9] as the best descriptors in terms of accuracy.

In this module, the SIFT descriptor is adopted for its intuitive representation of a 3D space. After a sequence of successive processing steps, the descriptor becomes not only invariant to image scaling and
rotation, but also partially invariant to changes in illumination and the 3D camera viewpoint.

2.3. Matching

Exact NN search structures, such as the hash table and the KD-Tree, may bring the "curse of dimensionality" into our image matching system due to the high dimensionality of the SIFT-like descriptors. In the standard SIFT, the Best-Bin-First (BBF) approach [10] is used, which reduces the search time by maintaining a priority queue of bins that have not been visited. We adopted a newer approximate NN approach called the hybrid SP-Tree [3], which has been shown to outperform other existing approaches. The hybrid SP-Tree can be considered a combination of "defeatist search" in the non-overlapping nodes and the SP-Tree in the overlapping nodes. The key idea of this approach is to properly set an overlapping buffer so that child nodes can share their points, with which the time cost of back-tracking can be greatly reduced.

Finally, an effective way of eliminating erroneous matches, such as RANSAC or an epipolar geometry consistency check, should not be omitted.

Comparisons of the computational time of different image matching algorithms were conducted, and the results are shown in Table 2. The results show that the hybrid SP-Tree works more efficiently than the BBF while maintaining the same accuracy.

Table 2. Comparisons of different image matching algorithms

    Algorithm                   SIFT    GLOH    MSER    LoG+MSER+BBF   Proposed
    Time (sec)                  2.312   2.507   1.475   3.144          2.389
    Reconstruction similarity   0.966   0.969   0.911   0.972          0.972

2.4. Evaluation

In the field of image matching, the performance of different algorithms is usually measured by the repeatability rate, i.e., the percentage of points that are simultaneously present in two images. Though the repeatability rate indicates the potential matches, it cannot fully represent the overall performance. Mikolajczyk et al. [9] and Winder et al. [5] introduced Recall-Precision and Receiver Operating Characteristics (ROC) for evaluation, and gave comprehensive reviews of the current algorithms. Other researchers cannot reproduce such comparisons because they may not know the ground truth of the test data set.

We propose a Reconstruction Similarity test to avoid the need for ground-truth information about the data set. The flow chart is shown in Figure 1. We define a non-overlapping feature as a feature whose distance to any other feature is larger than a threshold (a threshold of 4 pixels is chosen in our experiments); the rest are overlapping features. After performing an image matching task between two images, we first remove the matched overlapping features in one reference image, and a Delaunay triangulation mesh [11, 12] is built on the non-overlapping features. Then, a piece-wise affine deformation is performed on the mesh to reconstruct image 2 using the material from image 1. The last step is similarity measurement, in which features are extracted using 2D-PCA [13], and the similarity is computed using a cosine classifier.

Figure 1. Flow chart of the Reconstruction Similarity test (image matching → non-overlapping feature selection → Delaunay triangulation → affine reconstruction → trimming → similarity measurement)

It is known that local image deformations cannot be realistically approximated even when using a full affine transformation in a wide baseline system. However, there is no doubt that if more accurate correspondences and a greater number of reliable non-overlapping matches can be obtained, then a higher reconstruction similarity can be achieved.

Based on our experience, in a wide baseline system some wrong matches are hard to eliminate using RANSAC or the epipolar constraint, and the remaining ones are usually the cause of fatal errors. Though ROC and Recall-Precision can reflect the number of matches, neither of them can describe the degree of error. The advantage of the reconstruction similarity is that it is covariant with both of these factors.
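The Gaussian scale space of Section 2.1 can be sketched in a few lines. This is a minimal NumPy sketch, not the authors' implementation: plain 2x decimation stands in for the paper's 1/√2 bicubic interpolation, and the kernel radius and octave/interval counts are illustrative choices.

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian kernel G_sigma, normalized to sum to 1.
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

def smooth(img, sigma):
    # Separable 2-D convolution: G_sigma(x, y) * I applied row-wise then column-wise.
    g = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, g, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode='same'), 0, tmp)

def scale_space(img, octaves=3, intervals=3):
    # sigma(o, i) = 2 ** ((o + i / intervals) / 2) as in Section 2.1,
    # applied directly to each octave's base image for simplicity.
    pyramid = []
    base = np.asarray(img, dtype=float)
    for o in range(octaves):
        pyramid.append([smooth(base, 2 ** ((o + i / intervals) / 2))
                        for i in range(intervals)])
        # Assumption: plain decimation replaces the paper's 1/sqrt(2)
        # bicubic interpolation between octaves.
        base = base[::2, ::2]
    return pyramid
```

Feature detectors such as the LoG extrema would then search for local maxima across the smoothed levels of each octave.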

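The similarity-measurement step of the Reconstruction Similarity test (2D-PCA features [13] compared by a cosine classifier) can be sketched as follows. The projection direction and the number of retained axes k are assumptions for illustration; the paper does not specify them, and Yang et al. estimate the covariance from a training set rather than a single image.

```python
import numpy as np

def twod_pca_features(img, k=4):
    # 2D-PCA: project the image rows onto the top-k eigenvectors of the
    # (single-image) column covariance G = (A - mean)^T (A - mean) / n.
    A = np.asarray(img, dtype=float)
    A0 = A - A.mean(axis=0)
    G = A0.T @ A0 / A.shape[0]
    vals, vecs = np.linalg.eigh(G)   # eigenvalues in ascending order
    return A @ vecs[:, -k:]          # feature matrix: rows x k

def reconstruction_similarity(img1, img2_star, k=4):
    # Cosine similarity between the flattened 2D-PCA feature matrices of
    # the reference image and the reconstructed image.
    f1 = twod_pca_features(img1, k).ravel()
    f2 = twod_pca_features(img2_star, k).ravel()
    return float(f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2) + 1e-12))
```

A perfect reconstruction scores close to 1; the scores in Table 2 (0.91-0.97) sit in the same range.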
Figure 2. Two matched images with 3 adjacent corresponding pairs A, B, and C

Figure 4. Two matched images with corresponding regions A and B

We tested our algorithm on the image library [8] against three highly recognized methods, namely SIFT, GLOH and MSER. The experimental results shown in Table 2 demonstrate that the proposed algorithm performs better than the others, benefiting from the efficient search engine and the fusion of region-based and point-based detectors.

3. Order regularity of scale space

According to the theory of scale space, the linear scale space mainly reflects the distance between the camera and the object. In a simplified model, the value of ξ (denote ξ = log₂σ as an indicator of the scale space) is proportional to the distance. Assuming objects A and B appear simultaneously in images 1 and 2 and are adjacent, the order regularity can be expressed as

    (Eξ_1A − Eξ_2A) / (Eξ_1B − Eξ_2B) > 0,

where Eξ is the average of the scale space indicators in an image patch. Intuitively, as illustrated in Figure 2, when the camera is closer to A in image 1 than in image 2, the camera will accordingly be closer to 1B than to 2B. This can be decomposed as follows:

    Eξ_1A / Eξ_1B ≈ Eξ_2A / Eξ_2B,
    Eξ_1A − Eξ_2A ≈ Eξ_1B − Eξ_2B.

However, the situation is more complicated in a real system due to changes in the camera settings and the viewpoint. More significantly, the durations of certain features in scale space are not the same. Some features can be detected over large scale variations (stable), while other features only exist in a narrow range of scale change (unstable). Thus, there is a common phenomenon that a feature from a higher scale space is matched with one from a lower scale space, which makes Eξ usually smaller than its actual value. Wang et al. investigated the relation between the two types of extrema, Φ1 and Φ2, and the information content in the scale space [14]. They concluded that the higher the variation rate of the information content, the lower the ratio of stable to unstable features. Therefore, we can
compensate for the change of Eξ using information theory. Owing to the instability of the cardinalities of Φ1 and Φ2, we substitute the entropy for the cardinality:

    (Eξ_1A · Eξ_2B) / (Eξ_2A · Eξ_1B) ∈ R,

    R = [1, 1 + α]   if En′_1A > En′_2A or En′_1B > En′_2B,
    R = [1 − α, 1]   if En′_1A < En′_2A or En′_1B < En′_2B,

where α is set to 0.16 in our experiments.

For instance, as shown in Figure 2, it is easy to find that the change of Eξ_C from image 1 to image 2 cannot satisfy the order regularity together with Eξ_A and Eξ_B. Hence, if image 1 is authentic, we can say that region C in image 2 has been tampered with.

4. Description for object rotation

Figure 3. Rotation in a 3D space (orthogonal axes X, Y, and Z)

In Section 2.2, the orientation θ_x,y and the gradient magnitude m_x,y of a descriptor are computed using pixel differences in order to ensure that they are invariant to rotation:

    Z:m_x,y = sqrt((L_{x+1,y} − L_{x−1,y})² + (L_{x,y+1} − L_{x,y−1})²) · w,
    Z:θ_x,y = tan⁻¹((L_{x,y+1} − L_{x,y−1}) / (L_{x+1,y} − L_{x−1,y})),

where m_x,y is weighted by the Gaussian-weighted circular window

    w = exp(−r_Z² / (2 · (1.5σ)²)).

Z:θ_x,y and Z:m_x,y can be treated as indicators of the rotation of an object with respect to the Z-axis, as illustrated in Figure 3. We extend two more orthogonal orientations to fully describe the rotation in a 3D space:

    X:m_x,y = sqrt((L_{x,y+1} − L_{x,y})² + (L_{x,y−1} − L_{x,y})²) · w′,
    X:θ_x,y = tan⁻¹((L_{x,y+1} − L_{x,y}) / (L_{x,y−1} − L_{x,y})),
    Y:m_x,y = sqrt((L_{x+1,y} − L_{x,y})² + (L_{x−1,y} − L_{x,y})²) · w′,
    Y:θ_x,y = tan⁻¹((L_{x+1,y} − L_{x,y}) / (L_{x−1,y} − L_{x,y})).

The absence of depth information usually causes instability in the two presented orientations. Therefore, a larger Gaussian-weighted circular window radius should be adopted to avoid this problem (r_X and r_Y are set to two times r_Z).

Figure 4 demonstrates the use of this rotation expression. The horizontal view angles of the two given images vary greatly, but it is abnormal that Y:θ_x,y of regions A and B remains the same. In other words, the change of Y:θ_x,y does not reflect the rotation relative to the camera, so the authenticity is questionable.

5. Conclusions and future work

In conclusion, a hybrid matching algorithm is proposed in this paper. It has been shown to be well suited to scientific image analysis due to its desirable properties. We also propose the order regularity of the scale space and the description for object rotation, which can be used as powerful tools for tampered image detection. In future, the following three aspects will be further studied.

- Finding reliable correspondences from images of a scene taken from arbitrary viewpoints under different illumination conditions is still a difficult and critical research issue. A potential solution is to extend the current algorithm, usually designed for gray-level images, to the color space. Thus, we are planning to replace gray-level based detectors with color-based detectors in our algorithm.
- While most researchers have concentrated on obtaining more reliable matches, the useful information hidden in matched pairs has been largely ignored. An attempt to explore this is made in this paper, but it covers only a very small percentage of the total matches. More embedded information can be mined and utilized in image processing.
- Completing a semi-automatic system for detecting tampered images is now under consideration.

Acknowledgements

This research is supported by the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2007AA01Z423, by the Defense Basic Research Project of the 'Eleventh Five-Year-Plan' of China under Grant No. C10020060355, by the Key Project of the Chinese Ministry of Education under Grant No. 02057, and by the Key Research Projects of the Natural Science Foundation of Chongqing Municipality of China under Grant Nos. CSTC2005BA2002 and CSTC2007AC2018. In addition, the research was conducted at the CIPMAS AR Laboratory at the National University of Singapore.
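The order-regularity test of Section 3 can be condensed into a small predicate. This is a sketch under assumptions: the patch indicators Eξ are passed in directly, and the entropy-based choice between the intervals [1, 1+α] and [1−α, 1] is collapsed into a single tolerance band [1−α, 1+α] with the paper's α = 0.16.

```python
def order_regularity_ok(xi_1A, xi_2A, xi_1B, xi_2B, alpha=0.16):
    """Check the scale-space order regularity between two matched
    patches A and B across images 1 and 2 (Section 3).

    xi_* are the mean scale-space indicators E(xi) = E(log2 sigma)
    of the matched features inside each patch."""
    # Same sign of change: (E xi_1A - E xi_2A) / (E xi_1B - E xi_2B) > 0.
    num, den = xi_1A - xi_2A, xi_1B - xi_2B
    if den == 0 or num / den <= 0:
        return False
    # Cross ratio must stay within the tolerance band around 1.
    r = (xi_1A * xi_2B) / (xi_2A * xi_1B)
    return 1 - alpha <= r <= 1 + alpha
```

A patch whose scale change contradicts its neighbours (like region C in Figure 2) fails this predicate, flagging it as possibly tampered.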
References

[1] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Wide-Baseline Stereo from Maximally Stable Extremal Regions", Image and Vision Computing, 2004, 22(10), pp. 761-767.

[2] D.G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, 2004, 60(2), pp. 91-110.

[3] T. Liu, A.W. Moore, A. Gray, and K. Yang, "An Investigation of Practical Approximate Nearest Neighbor Algorithms", Advances in Neural Information Processing Systems, 2005, pp. 825-832.

[4] P.E. Forssén, "Maximally Stable Colour Regions for Recognition and Matching", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007, pp. 1220-1227.

[5] S.A.J. Winder and M. Brown, "Learning Local Image Descriptors", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007, pp. 17-24.

[6] K. Mikolajczyk and C. Schmid, "Scale & Affine Invariant Interest Point Detectors", International Journal of Computer Vision, 2004, 60(1/2), pp. 63-86.

[7] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool, "A Comparison of Affine Region Detectors", International Journal of Computer Vision, 2005, 65(1/2), pp. 43-72.

[8] Affine Covariant Regions Datasets, http://www.robots.ox.ac.uk/~vgg/data/, 2004.

[9] K. Mikolajczyk and C. Schmid, "A Performance Evaluation of Local Descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(10), pp. 1615-1630.

[10] J.S. Beis and D.G. Lowe, "Shape Indexing Using Approximate Nearest-Neighbour Search in High-Dimensional Spaces", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1997, pp. 1000-1006.

[11] A. Bowyer, "Computing Dirichlet Tessellations", Computer Journal, 1981, 24(2), pp. 162-166.

[12] Ø. Hjelle and M. Dæhlen, "Triangulations and Applications", Mathematics and Visualization, Springer, Berlin, 2006.

[13] J. Yang, D. Zhang, A.F. Frangi, and J.Y. Yang, "Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(1), pp. 131-137.

[14] Z.Y. Wang, Z.X. Cheng, and S.J. Tang, "Information Measures of Scale-Space Based on Visual Characters", Journal of Image and Graphics, 2005, 10(7), pp. 922-928.
