Steganography in Digital Forensics
ABSTRACT
This review paper focuses on steganography and its role in modern digital forensics.
Steganography is the practice of hiding information inside innocuous-looking files, such as text,
audio, or video files, so that the information stays confidential. It is used in cybercrime to
conceal data, which complicates investigations. The review examines steganalysis techniques,
such as statistical analysis and pattern recognition, and their importance in forensic
investigations. It also discusses the difficulties that forensic analysts face as cybercriminals
develop more complex methods, and the strategies that law enforcement can use to meet these
challenges in the digital realm. The goal of this paper is to help forensic investigators
understand how criminals can use steganography to commit crimes.
INTRODUCTION
The widespread use of the Internet allows anyone to communicate data anywhere and at any time.
The problem is that many people take advantage of this to exchange data that is illegal or
otherwise illegitimate. To do this on a public network like the Internet, they must make sure
that not everyone can read that data. There are two main methods for this: steganography and
cryptography. In this paper, we give an overview of steganography, how it can be detected, and
how digital forensics investigators can use it.
STEGANOGRAPHY
The word steganography is of Greek origin: ‘steganos’ means hidden or concealed, and ‘graphy’
means writing. Steganography is the science of communicating information in such a way that it
remains hidden. It hides the existence of a message by transmitting information through various
carriers, and its goal is to prevent the detection of a secret message. [1]
Carriers such as image, audio, or video files can contain hidden information embedded in such a
way that only the recipient knows how to read or extract it. These carriers are also called cover
carriers. Steganography involves two basic elements: the message and the cover image. The
message is the secret data to be hidden, and the cover image is the carrier that hides the
message inside it. [2] This is summarized by the equation ‘Stego-medium = Cover medium +
Secret message + Stego key’.[3] The process of hiding information involves three main
components: a sender, a receiver, and a medium. Each of these components is a potential point
of attack. Some attacks target the sender and others the receiver. The medium can also be
attacked, either passively or actively. A passive attack relies on observing the communication,
the two parties, and the medium they are using. Once an attacker has these details, recovering
the hidden message becomes easier, especially with the use of advanced technologies.[4]
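The equation ‘Stego-medium = Cover medium + Secret message + Stego key’ can be illustrated with a
minimal Python sketch. It assumes that the cover is a list of 8-bit sample values, that the
message is a list of bits, and that the stego key seeds a pseudo-random choice of embedding
positions. The function names are hypothetical, and Python's random module is used only for
illustration; it is not a cryptographically secure generator.

```python
import random

def select_positions(stego_key, cover_length, count):
    """Derive `count` distinct cover positions from the shared stego key."""
    if count > cover_length:
        raise ValueError("message is too long for this cover")
    rng = random.Random(stego_key)  # same key -> same positions for sender and receiver
    return rng.sample(range(cover_length), count)

def embed(cover, bits, stego_key):
    """Stego medium = cover medium + secret message + stego key."""
    stego = list(cover)
    positions = select_positions(stego_key, len(cover), len(bits))
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & ~1) | bit  # overwrite the lowest bit only
    return stego

def extract(stego, count, stego_key):
    """Read the hidden bits back from the key-selected positions."""
    positions = select_positions(stego_key, len(stego), count)
    return [stego[pos] & 1 for pos in positions]
```

Without the key, an observer does not know which positions carry the message, which is the role
the stego key plays in the equation.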
STEGANALYSIS
The method of discovering the information hidden through steganography is called steganalysis.
It is to steganography what cryptanalysis is to cryptography. The goal of steganalysis is to
discover hidden information and to break the security of its carriers.
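As an illustration of the statistical side of steganalysis, the sketch below computes a
pairs-of-values chi-square statistic, the idea behind the chi-square attack published by Westfeld
and Pfitzmann. This specific test is not taken from the sources reviewed here. It assumes 8-bit
sample values and relies on the observation that replacing least significant bits with
random-looking message bits tends to equalize the counts of the values 2k and 2k+1. The function
name is hypothetical.

```python
from collections import Counter

def pov_chi_square(values):
    """Chi-square statistic over the value pairs (2k, 2k+1).

    Returns (statistic, degrees_of_freedom). A statistic that is small
    relative to the degrees of freedom means the two counts in each pair
    are close to equal, which is what heavy LSB replacement produces.
    """
    hist = Counter(values)
    statistic = 0.0
    pairs_used = 0
    for even in range(0, 256, 2):
        total = hist[even] + hist[even + 1]
        if total == 0:
            continue  # skip pairs that never occur in the data
        expected = total / 2  # expected count of each value after embedding
        statistic += (hist[even] - expected) ** 2 / expected
        pairs_used += 1
    return statistic, max(pairs_used - 1, 0)
```

In practice the statistic would be converted to a p-value and computed over growing portions of
the file; that step is omitted here to keep the sketch free of external libraries.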
Forensic experts use automated tools to detect steganography and to recover hidden data in slack
or unallocated space on storage devices.
It is also possible to hide small amounts of data in unused file header fields. Digital forensic
experts also examine network channels such as TCP/IP traffic, because these channels can carry
data connected to offenses such as illicit messaging, theft, manipulation of electronic
payments, gaming, prostitution, abuse, malware, and pedophilia.[7]
Advanced technology has many benefits, but it also enables new forms of criminal activity, which
require sophisticated investigative methods to counteract. Steganography comes in several types:
text steganography, which hides data within text; image steganography, which embeds data in
images; audio steganography, which conceals data in audio files; and video steganography, which
hides data within video streams. The following sections discuss methods of each type to show how
criminals hide data and how this hidden data can be uncovered.
Image Steganography in forensics
The central concept of image steganography is hiding data within an image so that the change is
invisible to the eye. The image serves as the cover object that conceals the information, and
the embedding relies on the values of its pixels.[8] Several techniques are used to hide
information in images, including the least significant bit (LSB), transform domain, and masking
and filtering techniques.
LSB Technique: The first technique in image steganography is least significant bit (LSB)
substitution, in which the LSB of each selected pixel value is replaced with a message bit, so
the message is embedded directly in the image’s pixel data. These changes are likely to be
invisible to the human visual system (HVS). The embedding algorithm of LSB steganography is
based on the following formula:

$y_i = 2 \lfloor x_i / 2 \rfloor + m_i$

where $m_i$ is the i-th message bit, $x_i$ is the i-th selected pixel value before embedding,
and $y_i$ is the i-th selected pixel value after embedding. Let $P_x(x = 0)$ and $P_x(x = 1)$
denote the distribution of the least significant bits of the cover image, and $P_m(m = 0)$ and
$P_m(m = 1)$ the distribution of the secret binary message bits. To keep the message secret, it
is encrypted before embedding, so the message bits are approximately uniformly distributed:
$P_m(m = 0) \cong P_m(m = 1) \cong 1/2$. The embedding rate $p$ is measured in bits per pixel
(bpp). With this embedding technique, the embedded message can be extracted by reading the LSBs
of the selected pixels [9].
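The LSB substitution rule described above can be sketched in Python as follows. This is a
minimal illustration under stated assumptions, not the implementation from [9]: pixels are given
as a flat list of 8-bit values, the first pixels are used in order as the "selected" pixels, and
the helper names are hypothetical. Each stego value is computed as y = 2 * floor(x / 2) + m.

```python
def embed_lsb(pixels, message_bits):
    """Replace the LSB of the first len(message_bits) pixels with the message."""
    if len(message_bits) > len(pixels):
        raise ValueError("message does not fit in the cover image")
    stego = list(pixels)
    for i, m in enumerate(message_bits):
        stego[i] = 2 * (pixels[i] // 2) + m  # y_i = 2 * floor(x_i / 2) + m_i
    return stego

def extract_lsb(stego, n_bits):
    """Read the message back from the LSBs of the selected pixels."""
    return [y % 2 for y in stego[:n_bits]]

def text_to_bits(text):
    """UTF-8 bytes as a list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]

def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))
    return data.decode("utf-8")
```

Each pixel value changes by at most 1, which is why the modification is hard to see, and also
why simple statistical tests on the LSB plane can reveal it.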
Transform Domain Technique: The transform domain technique, also known as the frequency domain
technique, embeds messages by altering the coefficients obtained when the image is transformed
into the frequency domain. In image steganography, several different algorithms use this
approach, converting images into their frequency domain so that data can be embedded more
securely.
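To make the idea of embedding in transform coefficients concrete, the following sketch hides one
bit in a single mid-frequency coefficient of a one-dimensional discrete cosine transform. The
choice of the DCT, the 8-sample block, the coefficient index, and the quantization step are all
assumptions made for illustration; practical schemes work on two-dimensional blocks and follow
the quantization of the image format. The bit is stored as the parity of the quantized
coefficient. Values are not clipped to the 0-255 range, so the sketch assumes a mid-range block.

```python
import math

N = 8  # block length (assumption: one 8-sample row of an image block)

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(block):
    """Orthonormal 1-D DCT-II of an N-sample block."""
    return [_scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                            for n, x in enumerate(block))
            for k in range(N)]

def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
                for k, c in enumerate(coeffs))
            for n in range(N)]

def embed_bit(block, bit, index=3, step=8.0):
    """Force the parity of one quantized coefficient to equal `bit`."""
    coeffs = dct(block)
    scaled = coeffs[index] / step
    q = round(scaled)
    if q % 2 != bit:
        q += 1 if scaled >= q else -1  # move to the nearest value with the right parity
    coeffs[index] = q * step
    return [round(v) for v in idct(coeffs)]  # back to integer sample values

def extract_bit(block, index=3, step=8.0):
    """Recover the bit from the parity of the quantized coefficient."""
    return round(dct(block)[index] / step) % 2
```

The step of 8 is large enough that rounding the samples back to integers does not move the
coefficient into a neighboring quantization bin, so the bit survives the round trip.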
Audio steganography
Audio files are widely used for hiding data because they are readily available and can carry
hidden information. However, audio steganalysis remains less explored. Because the human
auditory system can detect minor changes, hiding data in audio is difficult, yet it is still
exploitable for criminal activities. Audio steganography hides information in sound files using
techniques such as phase encoding, spread spectrum, and echo hiding. Phase encoding modifies the
phase components of the audio in a way that is imperceptible to the human ear, so that the data
is embedded with considerable resistance to signal distortion. Spread spectrum spreads the
encoded data as noise-like content across the frequency spectrum so that it can withstand
interference, although it may introduce additional noise. Echo hiding modulates the amplitude
and decay of an echo in order to embed information discreetly in sound signals. These techniques
provide varying levels of security and embedding capacity, and each of them in one way or
another hinders steganalysis and forensic detection.
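Of the three audio techniques, spread spectrum is the easiest to sketch briefly. The example
below is a simplified direct-sequence variant that works in the time domain: each bit is spread
over many samples by a key-seeded pseudo-noise sequence of +1 and -1 values, and the receiver
recovers it from the sign of a correlation. The constants, the function names, and the use of
Python's random module are assumptions for illustration; the embedding strength is deliberately
exaggerated and would be audible in practice.

```python
import math
import random

CHIPS_PER_BIT = 1024  # samples used to carry one bit (assumption)
STRENGTH = 0.2        # embedding amplitude (assumption, exaggerated for clarity)

def pn_sequence(key, length):
    """Key-seeded pseudo-noise sequence of +1.0 / -1.0 values."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(samples, bits, key):
    """Add the bit-modulated noise sequence to the host samples."""
    needed = len(bits) * CHIPS_PER_BIT
    if needed > len(samples):
        raise ValueError("audio is too short for this message")
    pn = pn_sequence(key, needed)
    stego = list(samples)
    for i in range(needed):
        sign = 1.0 if bits[i // CHIPS_PER_BIT] else -1.0
        stego[i] += STRENGTH * sign * pn[i]
    return stego

def extract(stego, n_bits, key):
    """Correlate each segment with the same noise sequence and take the sign."""
    pn = pn_sequence(key, n_bits * CHIPS_PER_BIT)
    bits = []
    for b in range(n_bits):
        start = b * CHIPS_PER_BIT
        corr = sum(stego[start + j] * pn[start + j] for j in range(CHIPS_PER_BIT))
        bits.append(1 if corr > 0 else 0)
    return bits
```

Because the receiver needs the key to regenerate the noise sequence, an analyst without the key
sees only a slightly noisier recording, which illustrates why this family of techniques
complicates detection.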
Video steganography
This technique uses a digital video file to hide data. A video file, which is a sequence of
image frames, serves as the carrier that covers the data. Generally, the discrete cosine
transform (DCT) is used because the changes it introduces are hard for the human eye to
perceive. Formats used in video steganography include H.264, MP4, MPEG, and AVI. The basic block
diagram is given in Figure 3.
This technique obscures hidden information by modifying the visually less significant regions of
the video. For example:
LSB (Least Significant Bit): The least significant bits of the pixel values are replaced with
data bits; however, this method is easy to detect and attack.
Adaptive Technique:
This technique analyzes the video to identify active or significant regions (such as areas in
motion or areas containing a lot of detail) that offer a secure place to conceal data. The
embedding is tailored to the video’s characteristics so that the data is inserted as seamlessly
as possible, resulting in higher visual quality.
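The adaptive idea can be sketched as follows, under strong simplifying assumptions: frames are
uncompressed grayscale images stored as lists of rows, the previous frame is left unmodified and
is available to the receiver, and "active" simply means that a pixel changed by at least a
threshold between frames. The function names and the threshold are hypothetical. The activity
test ignores the least significant bit so that embedding does not change which pixels are
selected.

```python
def active_positions(prev_frame, cur_frame, threshold=8):
    """Positions whose value changed noticeably between frames, ignoring the LSB."""
    positions = []
    for r, (prev_row, cur_row) in enumerate(zip(prev_frame, cur_frame)):
        for c, (p, q) in enumerate(zip(prev_row, cur_row)):
            if abs((q >> 1) - (p >> 1)) >= threshold:
                positions.append((r, c))
    return positions

def embed(prev_frame, cur_frame, bits, threshold=8):
    """Hide bits in the LSBs of the active pixels of the current frame."""
    positions = active_positions(prev_frame, cur_frame, threshold)
    if len(bits) > len(positions):
        raise ValueError("not enough active pixels in this frame")
    stego = [list(row) for row in cur_frame]
    for (r, c), bit in zip(positions, bits):
        stego[r][c] = (stego[r][c] & ~1) | bit
    return stego

def extract(prev_frame, stego_frame, n_bits, threshold=8):
    """Recompute the active positions and read their LSBs."""
    positions = active_positions(prev_frame, stego_frame, threshold)
    return [stego_frame[r][c] & 1 for r, c in positions[:n_bits]]
```

Static regions are left untouched, so the changes fall where motion already masks them. A
payload embedded this way would not survive lossy compression, which is one reason practical
schemes embed in transform coefficients instead.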
Text steganography
This method is considered one of the oldest techniques in steganography as well as the most
difficult, because a text file contains little redundant information. Nowadays, computer systems
have made hiding information in text simpler, and the range of uses for hidden information in
text has grown accordingly. Text steganography is broadly classified into three types:
format-based methods, random and statistical generation, and linguistic methods. Two techniques
are described here:
1) Format-based Technique: This technique alters the formatting of the cover text to hide
information. It does not modify any word or sentence; it typically changes the layout of the
existing text to conceal the steganographic text. The open space method is one example of a
format-based text steganography method. [12]
2) Lexical Steganography: This technique selects certain words from the text and identifies
their synonyms. The words and their synonyms are then used to hide the secret message in the
text: which word is chosen from each list of synonyms depends on the secret bits.[13]
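The synonym-based approach can be sketched in a few lines of Python. The synonym table below is
hypothetical and tiny, each entry has exactly two alternatives so that one substitutable word
carries one bit, and the text is assumed to be lowercase words separated by spaces without
punctuation.

```python
# Hypothetical synonym table: position 0 encodes bit 0, position 1 encodes bit 1.
SYNONYMS = {
    "big": ("big", "large"),
    "large": ("big", "large"),
    "quick": ("quick", "fast"),
    "fast": ("quick", "fast"),
    "begin": ("begin", "start"),
    "start": ("begin", "start"),
}

def embed(cover_text, bits):
    """Choose, for each substitutable word, the synonym selected by the next bit."""
    words = cover_text.split()
    remaining = list(bits)
    for i, word in enumerate(words):
        if remaining and word in SYNONYMS:
            words[i] = SYNONYMS[word][remaining.pop(0)]
    if remaining:
        raise ValueError("cover text has too few substitutable words")
    return " ".join(words)

def extract(stego_text, n_bits):
    """Read one bit from each substitutable word, in order."""
    bits = [SYNONYMS[w].index(w) for w in stego_text.split() if w in SYNONYMS]
    return bits[:n_bits]
```

For example, embedding the bits 1, 0, 1 in "we begin the quick test on a big file" yields
"we start the quick test on a large file", which reads naturally while carrying three bits.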
OPEN CHALLENGES:
Although steganalysis has received attention for many years, some of its challenges remain
unsolved:
First, to date, there is no generalized CNN model that can detect hidden messages in unseen
data. Second, many datasets are available with different specifications, such as the data
domain, yet current deep learning steganalysis models use specific datasets. There is therefore
a need for a deep learning steganalysis model that can learn from different datasets so that the
available data is used efficiently. Third, there are open questions regarding the use of 3D
datasets in steganalysis with deep learning. Deep learning methods have shown promising results
in 2D image steganalysis, but two potentially challenging questions remain: Do 3D steganalytic
methods based on deep learning provide better performance? Is there enough 3D data to train
steganalysis CNN models? Finally, steganography is designed to pass hidden messages through
media such as the Internet, and the data may be manipulated during transmission (e.g., by
rotation or corruption). An interesting direction of research would therefore be to build deep
learning models that can still predict the hidden messages correctly after such manipulation.
Feature dimensionality is also still a problem, especially for real-time steganalysis, so an
appropriate acceleration algorithm is needed to speed up the learning process, even in deep
learning methods, without compromising accuracy.[14]
As steganography becomes more widely available and the amount of data on local machines and the
Internet increases, detecting the use of steganography becomes increasingly important for
digital forensics personnel. In theory, this should be evaluated in any type of case involving
computer use. In practice, most cases will involve audiovisual files, such as in child
pornography cases. However, cases of industrial espionage and fraud may also be encountered.[15]
Though steganography tools may be used for legitimate business applications such as protecting
strategic corporate information during transmission (Schmidt et al., 2004), they have emerged as
a significant issue for forensic investigators and others who are concerned with malicious and
illegal uses. As steganography tools become more widely available and easier to use, protection
against malicious use demands attention, and the balance between protection from illicit use and
interference with legitimate use emerges as a new challenge.[15]
CONCLUSION
Digital steganography has a multitude of applications, both legal and illegal, and is an
advantageous technique that is relatively easy for ordinary users to access. It is important to
note that, while it has legitimate uses (e.g., exposing plagiarism), it is also widely used as
an anti-forensic measure. [5] Steganography is a science that is steadily becoming more
advanced. Although the resources to learn and implement it are easy to obtain, it is still not
well known to the general public. As it advances, however, more people are likely to take it up
and develop new techniques, which will make it harder for law enforcement agencies to detect.
For this reason, digital forensics analysts and steganalysts need to keep improving their tools
and methods in order to keep pace with steganographic techniques. They also need to stay
familiar with these techniques so that they can apply them in their own work.
REFERENCES
[1] Richer, P. "Steganalysis: Detecting hidden information with computer forensic analysis."
[3] Sumathi, C. P., Santanam, T., and Umamaheswari, G. "A study of various steganographic
techniques used for information hiding."
[4] Alabdali, N., and Alzahrani, S. "An overview of steganography through history."
[5] Fernandes, C. "Steganography and computer forensics - the art of hiding information: A
systematic review."
[6] Hasan, Raza, Salman Mahmood, and Akshyadeep Raghav. "Overview of computer forensics
tools." Proceedings of 2012 UKACC International Conference on Control. IEEE, 2012.
[7] Cosić, Jasmin, and Miroslav Bača. "Steganography and its implication in forensic
investigation." Infoteh 2010. 2010.
[8] Hariri, Mehdi, Ronak Karimi, and Masoud Nosrati. "An introduction to steganography
methods." World Applied Programming 1.3 (2011): 191-195.
[10] Bodhak, V., and L. Gunjal. "Improved protection in video steganography using DCT &
LSB." International Journal of Engineering and Innovative Technology (IJEIT) vol. 1, issue 4.
(2012).
[11] Sadek, Mennatallah M., Amal S. Khalifa, and Mostafa GM Mostafa. "Video steganography:
A comprehensive review." Multimedia Tools and Applications 74.17 (2015): 7063-7094.
[12] Bender, Walter, et al. "Techniques for data hiding." IBM Systems Journal 35.3.4 (1996):
313-336.
[13] Alsaidi, Nawal, et al. "Digital Steganography in Computer Forensics." College of Computer
Science and Engineering, Cybersecurity Department, University of Jeddah, Saudi Arabia.
[14] Eid, W. M., Alotaibi, S. S., Alqahtani, H. M., & Salehd, S. Q. "Digital image steganalysis:
Current methodologies and future challenges."
[15] Warkentin, M., Bekkering, E., & Schmidt, M. B. "Steganography forensic, security, and
legal issues." Mississippi State University, Northeastern State University, St. Cloud State
University.