Steganography and Steganalysis Methods
CHAPTER 2
STEGANOGRAPHY AND STEGANALYSIS METHODS
2.1 INTRODUCTION
The term steganography is derived from the Greek words steganos
(covered) and graphein (writing). The goal of
steganography is to provide the secret transmission of data.
Steganalysis provides a way of detecting the presence of hidden
information.
[Figure: Carrier medium; Steganography; Spatial Domain; Transform Domain]
1. Spatial domain
This technique embeds messages in the intensity of the pixels
directly. Some of the spatial domain methods are:
1. Least Significant Bit (LSB) Matching.
2. Least Significant Bit (LSB) Replacement.
3. Matrix Embedding.
4. Pixel-value-based image hiding.
5. Difference Expansion (DE).
6. Histogram modification.
7. Prediction-based image hiding.
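As an illustration of the first two items in the list, the following
minimal Python sketch contrasts LSB replacement with LSB matching on a
flat list of 8-bit pixel values. The function names and the use of a flat
list are assumptions made for this example and do not come from the cited
works. Replacement overwrites the lowest bit, while matching adds or
subtracts 1 at random whenever the lowest bit disagrees with the message
bit.

```python
import random

def lsb_replace(pixels, bits):
    """LSB replacement: overwrite the lowest bit of each pixel with a message bit."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit
    return stego

def lsb_match(pixels, bits, rng=None):
    """LSB matching: if the lowest bit differs from the message bit, add or subtract 1."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    rng = rng or random.Random()
    stego = list(pixels)
    for i, bit in enumerate(bits):
        if (stego[i] & 1) != bit:
            if stego[i] == 0:        # cannot go below 0
                stego[i] = 1
            elif stego[i] == 255:    # cannot go above 255
                stego[i] = 254
            else:
                stego[i] += rng.choice((-1, 1))
    return stego

def lsb_extract(pixels, n_bits):
    """Both variants are read back the same way: take the lowest bit."""
    return [p & 1 for p in pixels[:n_bits]]
```

In both cases the receiver only reads the lowest bit of each pixel, so
the two variants differ in how the cover is disturbed, not in how the
message is recovered.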
This research focuses on the LSB Replacement method for data hiding,
which is described in detail in Section 2.3.2. Among all message
embedding techniques, LSB insertion/modification is considered a
difficult one to detect (Wayner [115]; Petitcolas et al. [83]). Spatial
domain reversible data hiding is based on two methods: difference
expansion (DE) [146] and histogram modification [153], [147]. The
former provides higher capacity, whereas the latter provides better
image quality. In the DE method, the embedded bit stream has two
parts. The first part is the payload that conveys the secret message,
and the second part is the auxiliary information that contains
embedding information. The size of the second part should be kept very
small to increase the embedding capacity.
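The pair-wise arithmetic behind difference expansion can be sketched as
follows. This is a minimal illustration of the commonly described integer
transform (average and difference of a pixel pair, with the difference
doubled to make room for one bit). The function names are assumptions for
this example, and the handling of the auxiliary information is reduced to
a simple range check.

```python
def de_embed(x, y, bit):
    """Embed one bit into the pixel pair (x, y) by expanding their difference."""
    avg = (x + y) // 2           # integer average, preserved by the transform
    diff = x - y
    expanded = 2 * diff + bit    # the new lowest bit of the difference carries the payload
    return avg + (expanded + 1) // 2, avg - expanded // 2

def de_extract(x2, y2):
    """Recover the bit and the original pair from a marked pair."""
    avg = (x2 + y2) // 2
    expanded = x2 - y2
    bit = expanded & 1
    diff = expanded >> 1         # floor division by 2, also for negative differences
    return bit, (avg + (diff + 1) // 2, avg - diff // 2)

def stays_in_range(x, y, bit):
    """True if the marked pair is still a valid 8-bit pair (no overflow or underflow)."""
    x2, y2 = de_embed(x, y, bit)
    return 0 <= x2 <= 255 and 0 <= y2 <= 255
```

For example, the pair (206, 201) with bit 1 becomes (209, 198), and
de_extract(209, 198) returns the bit together with the original pair.
Pairs for which stays_in_range is false cannot be expanded directly; in
this sketch they would simply be skipped, and recording which pairs were
skipped is one kind of auxiliary information a full scheme has to carry.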
Tian [155] proposed a prototype using DE embedding that has a
larger embedding capacity and is also easy to embed. Ni et al. [153]
proposed a reversible data hiding scheme based on histogram
modification. This scheme adjusts pixel values between the peak point
and the zero point to conceal data and to achieve reversibility. In this
scheme, part of the cover image histogram is shifted rightward or
leftward to produce redundancy for data embedding. Li et al. [154] proposed
a reversible data hiding method called adjacent pixel difference (APD).
This method is based on modifying the differences between neighboring
pixels, and it scans the image pixels in an inverse S order. Tai et
al. [147] proposed a pixel-difference-based reversible data hiding
scheme. Tsai et al. [156] proposed a block-based reversible data hiding
scheme using prediction coding. However, this scheme had problems in
prediction coding and in dividing the histogram into two sets.
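The histogram modification idea described above (shifting the part of
the histogram between the peak point and the zero point) can be sketched
as follows. This is a simplified illustration under stated assumptions:
pixels are a flat list of 8-bit values, an empty histogram bin exists to
the right of the peak, and the peak and zero positions are passed to the
extractor directly. All function names are hypothetical.

```python
from collections import Counter

def find_peak_and_zero(pixels):
    """Peak = most frequent value; zero = first empty bin to its right (assumed to exist)."""
    hist = Counter(pixels)
    peak = max(range(256), key=lambda v: hist[v])
    for z in range(peak + 1, 256):
        if hist[z] == 0:
            return peak, z
    raise ValueError("no empty bin to the right of the peak")

def hs_embed(pixels, bits):
    """Shift bins strictly between peak and zero right by 1, then hide bits at the peak."""
    peak, zero = find_peak_and_zero(pixels)
    if len(bits) > sum(1 for p in pixels if p == peak):
        raise ValueError("message exceeds the capacity of the peak bin")
    remaining = iter(bits)
    stego = []
    for p in pixels:
        if peak < p < zero:
            stego.append(p + 1)                   # make room next to the peak
        elif p == peak:
            stego.append(p + next(remaining, 0))  # bit 1 moves the pixel to peak + 1
        else:
            stego.append(p)
    return stego, peak, zero

def hs_extract(stego, peak, zero, n_bits):
    """Read the bits back and undo the shift, restoring the cover exactly."""
    bits, cover = [], []
    for p in stego:
        if p == peak:
            bits.append(0)
            cover.append(p)
        elif p == peak + 1:
            bits.append(1)
            cover.append(peak)
        elif peak + 1 < p <= zero:
            cover.append(p - 1)
        else:
            cover.append(p)
    return bits[:n_bits], cover
```

The capacity of this sketch equals the number of pixels at the peak
value, and no pixel changes by more than 1. That matches the trade-off
described earlier in this section: lower capacity than DE, but better
image quality.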
2. Transform domain
In the transform domain, the image is first transformed and the
message is then embedded into it. These methods are robust for data
hiding, but hiding a secret message in an image this way is more
complex. Data hiding is performed by manipulating mathematical
functions and image transformations: the cover image is transformed,
its coefficients are tweaked, and the transformation is inverted. Popular
transformations include the two-dimensional discrete cosine
transformation (DCT) (Dongdong et al. [18]), the discrete Fourier
transformation (DFT) (Shi et al. [101]), and the discrete wavelet
transformation (DWT) (Mehrabi et al. [74]), which are also commonly
used in image steganalysis. Data hiding is an active field in which new
methods are constantly introduced, which makes it a natural starting
point for research work on steganalysis.
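The transform, tweak, and invert sequence described above can be sketched
with a one-dimensional 8-point DCT. This is only an illustration under
stated assumptions: a 1-D orthonormal DCT stands in for the 2-D block
transform, the quantization step of 8 and the coefficient index 3 are
arbitrary choices, and the function names are hypothetical. It is not the
method of any of the cited works.

```python
import math

def _scale(k, n):
    # Orthonormal DCT-II scaling factor
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct(x):
    """Forward orthonormal DCT-II of a list of samples."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n))
            for k in range(n)]

def idct(c):
    """Inverse of dct()."""
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]

def embed_bit(samples, bit, index=3, step=8):
    """Transform, quantize, set the LSB of one quantized coefficient, invert."""
    q = [round(c / step) for c in dct(samples)]
    q[index] = (q[index] & ~1) | bit
    stego = idct([v * step for v in q])   # real-valued stego samples
    return q, stego

def extract_bit(q, index=3):
    """Read the hidden bit from the quantized coefficients."""
    return q[index] & 1
```

The bit is read from the quantized coefficients, not from the samples.
If the stego samples were rounded back to integer pixel values, the
coefficient could change, which is one reason JPEG-domain tools operate
on the quantized coefficients stored in the file.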
[Figure: Stego image]
software used for embedding data into JPEG images. Later, JSteg-Shell
was designed.
WbStego4open: It does not require registration. It is an open source
application that runs on Windows and Linux platforms. Bitmaps, text
files, PDF files, and HTML files can be used as carrier files. It is an
effective tool for embedding copyright information without modifying
the carrier file.
Invisible Secrets: This tool is used to hide data in image or sound files.
It provides extra protection by using the AES encryption algorithm.
During the creation of stego files, a password is created and stored.
Other steganography tools: Other tools used for image
steganography include Crypto123, Hermetic stego, IBM DLS,
Invisible Secrets, Info stego, Syscop, StegMark, Cloak, Contraband Hell,
Contraband, Dound, Gif it Up, S-Tools, JSteg_Shell, Blindside,
CameraShy, dc-Steganograph, F5, Gif Shuffle, Hide4PGP, JstegJpeg,
Mandelste, PGMStealth, Steghide.
on the Internet, but JPSeek is rather special. Its design objective is
the same as that of JPHide.
StegSecret: It is an open source steganalysis project that makes it
possible to detect hidden information in different digital media.
StegSecret is a Java-based multiplatform steganalysis tool that detects
information hidden with the best-known steganographic methods. It
detects EOF, LSB, and DCT-like techniques.
StegBreak: It launches brute-force dictionary attacks against the
specified JPG images.
Other steganalysis tools: Some more image steganalysis tools are
2Mosaic, StirMark Benchmark, Phototile, StegSpy, Stego Suite,
Steganalysis Analyzer Real-Time Scanner, JSteg detection, JPHide
detection, OutGuess detection.
[Figure: labels X1, X2, Wij, WX, Output]
2.8.4 RS steganalysis
Fridrich et al. [35] developed a steganalytic technique for detecting
LSB embedding in color and grayscale images. They analyze the capacity
for lossless data embedding in the LSBs; randomizing the LSBs decreases
this capacity. To examine an image, they define Regular (R) and
Singular (S) groups of pixels depending on certain properties. They then
use the relative frequencies of these groups in three images (the given
image, the image obtained from it by flipping the LSBs, and the image
obtained by randomizing its LSBs) to predict the level of embedding.
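The classification of pixel groups into Regular and Singular can be
sketched as follows. The sketch uses choices commonly described for RS
analysis, all of which are assumptions here and not taken from this
chapter: groups of four consecutive pixels, the mask (0, 1, 1, 0), the
sum of absolute neighbor differences as the smoothness measure, and the
flipping functions 0<->1, 2<->3, ... and -1<->0, 1<->2, .... It only
counts the groups and does not reproduce the message-length estimate.

```python
def flip_pos(x):
    """Positive flipping: 0<->1, 2<->3, ... (toggle the LSB)."""
    return x ^ 1

def flip_neg(x):
    """Negative (shifted) flipping: -1<->0, 1<->2, 3<->4, ..."""
    return ((x + 1) ^ 1) - 1

def smoothness(group):
    """Discrimination function: sum of absolute differences of neighbors."""
    return sum(abs(b - a) for a, b in zip(group, group[1:]))

def classify(group, mask):
    """Return 'R' (regular), 'S' (singular) or 'U' (unusable) for one group."""
    flipped = [flip_pos(x) if m == 1 else flip_neg(x) if m == -1 else x
               for x, m in zip(group, mask)]
    before, after = smoothness(group), smoothness(flipped)
    if after > before:
        return "R"
    if after < before:
        return "S"
    return "U"

def rs_fractions(pixels, mask=(0, 1, 1, 0)):
    """Relative frequencies of R and S groups for the mask and its negative."""
    n = len(mask)
    neg_mask = tuple(-m for m in mask)
    groups = [pixels[i:i + n] for i in range(0, len(pixels) - n + 1, n)]
    if not groups:
        raise ValueError("not enough pixels for one group")
    tally = {"R_M": 0, "S_M": 0, "R_-M": 0, "S_-M": 0}
    for g in groups:
        kind = classify(g, mask)
        if kind != "U":
            tally[kind + "_M"] += 1
        kind = classify(g, neg_mask)
        if kind != "U":
            tally[kind + "_-M"] += 1
    return {name: count / len(groups) for name, count in tally.items()}
```

The intermediate values from flip_neg may fall outside 0..255 (for
example -1), which is harmless here because they are only used to measure
smoothness. In this family of methods, a large gap between the fractions
for the mask and for its negative is the signal that suggests LSB
embedding.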
image and comparing it with that of a stego image, the hidden message
length can be calculated. Sullivan et al. [82] use an empirical matrix as
the feature set to construct a steganalyzer. Chen et al. [14] enhanced
statistical moments and applied them to JPEG image steganalysis.
Another LSB detection scheme was proposed using binary
similarity measures between the 7th bit plane and the 8th (least
significant) bit plane. It is assumed that there is a natural correlation
between the bit planes that is disrupted by LSB hiding. This scheme does
not auto-calibrate on a per image basis, and instead calibrates on a
training set of cover and stego images. The scheme works better than a
generic steganalysis scheme, but not as well as state-of-the-art LSB
steganalysis.
The scheme proposed by Fridrich et al. [27] is a specific steganalysis
method for detecting LSB data hiding in images. Sample pair analysis,
due to Dumitrescu et al. [19], is a more rigorous analysis of the basis
of the RS method that explains why and when it works. Roue et al. [92]
use estimates of the joint probability mass function (PMF) to increase
the detection rate of RS/sample pair analysis. Fridrich et al. [26] use
local estimators based on pixel neighborhoods to slightly improve LSB
detection over RS.
by adding data. In some respects this assumption lies at the heart of all
steganalysis [...]
the systems learn using some form of supervised training.
An early approach to detecting arbitrary hiding schemes was proposed
by Avcibas et al. [7]. They design a feature set based on image quality
metrics (IQMs), which are metrics designed to mimic the human visual
system (HVS). In particular, they measure the difference between a
received image and a filtered version of it (a weighted sum over the
3 × 3 neighborhood). This is very similar in spirit to the work by Celik
et al. [9], except that it uses filtering instead of compression. The key
observation is that filtering an image without hidden data changes the
IQMs differently than filtering an image with hidden data. The reasoning
is that the embedding is done locally (either pixel-wise or block-wise),
causing localized discrepancies.
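The idea of comparing an image with a filtered version of itself can be
sketched as below. This is only an illustration. The 3 × 3 weights, the
edge handling, and the use of mean squared error as the single quality
metric are assumptions for this example; the cited work uses a set of
image quality metrics that is not reproduced here.

```python
# Assumed smoothing weights (they sum to 16); the weights of the cited work are not given here.
KERNEL = ((1, 2, 1),
          (2, 4, 2),
          (1, 2, 1))

def smooth3x3(img):
    """Weighted sum over each 3 x 3 neighborhood; edge pixels are replicated."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += KERNEL[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = acc / 16
    return out

def mse(a, b):
    """Mean squared error between two images of the same size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def filter_feature(img):
    """One feature: how much the image changes when it is smoothed."""
    return mse(img, smooth3x3(img))
```

A feature vector built from several such metrics, computed on cover and
stego images, could then be passed to a classifier.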
Supervised learning has also been used for general steganalysis
(Lyu et al. [68]). Lyu et al. [67] use a feature set based on higher-order
statistics of wavelet subband coefficients for generic detection. The
earlier work used a two-class classifier to discriminate between cover
images and stego images made with one specific hiding scheme. Later
work, however, uses a one-class, multiple-hypersphere SVM classifier.
The single class is trained to cluster clean cover images, and any image
whose feature set falls outside this class is classified as stego. In this
way, the same classifier can be used for many different embedding schemes.
The one-class cluster of feature vectors can be said to capture a
s et al. [5], the general
applicability leads to a performance hit in detection power compared with
detectors tuned to a specific embedding scheme. However, the results are
acceptable for many applications.
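The one-class idea (train on clean cover images only and flag anything
that falls outside the learned region) can be illustrated with a much
simpler stand-in than the SVM used in the cited work. The sketch below
fits a single hypersphere, using the mean of the cover feature vectors
as the center and a distance quantile as the radius. It is not an SVM,
and the function names and the keep parameter are assumptions made for
this example.

```python
import math

def fit_hypersphere(cover_features, keep=0.95):
    """Fit a center and radius that enclose roughly `keep` of the cover feature vectors."""
    dim = len(cover_features[0])
    count = len(cover_features)
    center = [sum(v[i] for v in cover_features) / count for i in range(dim)]
    dists = sorted(math.dist(v, center) for v in cover_features)
    radius = dists[min(count - 1, int(keep * count))]
    return center, radius

def is_stego(feature, center, radius):
    """Anything outside the sphere of clean covers is flagged as stego."""
    return math.dist(feature, center) > radius
```

Because no stego examples are needed for training, the same detector can
in principle be applied to any embedding scheme, at the cost of the
detection power noted above.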
Ying et al. [134]; Mei et al. [75]; Yuan et al. [135]; Lingna et al. [64];
Ferreira et al. [24]; Han et al. [44]; Xiongfei et al. [131]; Ziwen et al.
[141]; Malekmohamadi et al. [70]). In a general scenario of supervised
learning steganalysis, some image features are first extracted and
given as training input to a learning machine. These examples include
both stego and non-stego messages. The learning classifier iteratively
updates its classification rule based on its prediction and the ground
truth. Upon convergence, the final stego classifier is obtained. Some
of the major advantages of supervised learning based steganalysis are
as follows:
1. Universal steganalysis detectors can be constructed using learning
techniques, and
2. Several freely available software packages on the Internet can be
used directly to train a steganalysis detector.
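The training loop described before the list (predict, compare with the
ground truth, update the rule, stop on convergence) can be illustrated
with a perceptron. The perceptron is chosen here only because it is the
simplest classifier with this behavior; it is not the learning machine
used in the cited works, and the feature vectors in the usage note are
made-up toy values.

```python
def predict(x, weights, bias):
    """Return +1 (stego) or -1 (cover) for one feature vector."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1

def train_perceptron(samples, labels, epochs=100, rate=1.0):
    """Update the rule on every wrong prediction until an epoch has no errors."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):   # y is +1 for stego, -1 for cover
            if predict(x, weights, bias) != y:
                weights = [w + rate * y * xi for w, xi in zip(weights, x)]
                bias += rate * y
                errors += 1
        if errors == 0:                      # converged: final classifier obtained
            break
    return weights, bias
```

With the toy cover features [0.1, 0.2], [0.2, 0.1], [0.0, 0.3] (label
-1) and the toy stego features [0.9, 0.8], [0.8, 1.0], [1.0, 0.7]
(label +1), the loop stops after a few passes with a rule that
separates the two sets.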
Martin et al. [72] found that although hidden data certainly caused
shifts from the natural set, knowledge of the specific data hiding scheme
provides far better detection performance. A variation of passive
steganalysis is active steganalysis, which deals with determining or
estimating the length of the secret message and extracting the actual
contents of the message (Chandramouli et al. [11]; Fridrich et al. [30];
Chandramouli [12]; Jacob et al. [54]; Ming et al. [78]; Shaohui et al.
[99]; Xiangyang et al. [44]). The methods that estimate the length of the
secret message or extract the hidden contents are known as
embedding-specific methods. A universal or generic steganalytic method,
which should be independent of any embedding-specific method, suits
digital forensics best.
Most of the present literature on steganalysis follows either a blind
model (Farid [22]; Jacob et al. [54]; Lyu [67]; Celik et al. [9]; Guo [43];
Hongchen et al. [50]; Chen et al. [14]; Gul et al. [42]; Zhuo et al.
[140]; Xiao et al. [125]; Xue et al. [132]; Wang et al. [23]; Feng et al.
2.9 SUMMARY
This chapter has presented an overview of various types of
steganography and steganalysis methods. Some steganographic and
steganalysis tools have been discussed. The limitations of steganalysis
and a review of the literature on steganalysis have also been described.
Generation of data is described in Chapter 3.