Underwater communications with acoustic steganography - Recovery analysis and modeling
2021-12
Strelkoff, Samuel
Monterey, CA; Naval Postgraduate School
https://hdl.handle.net/10945/68748
This publication is a work of the U.S. Government as defined in Title 17, United
States Code, Section 101. Copyright protection is not available for this work in the
United States.
THESIS
by
Samuel Strelkoff
December 2021
Approved for public release. Distribution is unlimited.
Samuel Strelkoff
Lieutenant, United States Navy
BS, United States Naval Academy, 2013
Charles D. Prince
Second Reader
Gurminder Singh
Chair, Department of Computer Science
ABSTRACT
Table of Contents
1 Introduction 1
1.1 Benefit to the Department of Defense . . . . . . . . . . . . . . . . 1
1.2 Research Questions . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Approach and Significant Findings . . . . . . . . . . . . . . . . . 3
1.5 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Background 7
2.1 Acoustic Steganography Communication System . . . . . . . . . . . . 7
2.2 Fading . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Underwater Acoustic Communication . . . . . . . . . . . . . . . . 15
2.4 Data Recovery . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Methodology 33
3.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2 Mathematical Refutations . . . . . . . . . . . . . . . . . . . . . 37
3.3 Alternative Solutions . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Channel Effects . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1 Key Findings . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2 Recommendations for Future Work . . . . . . . . . . . . . . . . . 76
5.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
List of References 79
List of Figures
Figure 4.2 Average Root Mean Square Error (RMSE) vs. Symbol Space. . . 59
Figure 4.3 Average Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR) vs. Symbol Space. . . . . . . . . . . . . . . . . . . 61
List of Tables
List of Acronyms and Abbreviations
Acknowledgments
First, I need to thank my advisor, Dr. Rohrer, and second reader, Charles Prince, without
whom none of this would be possible. Thank you for introducing me to this topic, treating
me as a peer, maintaining a high standard, and providing much needed guidance and support.
To my wife, I owe an unpayable debt of gratitude. Without her patience, love, support,
pragmatism, and sacrifices, I never would have completed this project.
Finally, I am grateful to all my friends and relatives who showed interest in my work, and
in ways both large and small, provided moral support. I am humbled by the quality of the
people who have graced my life.
CHAPTER 1:
Introduction
In the modern warfare environment, communication is a critical but not necessarily guaran-
teed requirement. The ever more sobering reality of communications-denied environments
highlights the need for communications systems with Low Probability of Intercept and De-
tection (LPI/LPD). This is doubly true in the undersea environment, where communications
and sonar systems can reveal the tactical location of platforms and capabilities, subverting
their operational and strategic advantage.
This thesis explores the ability to improve symbol recovery in the frequency domain stegano-
graphic technique developed by LT Ferrao [1], based on Passerieux’s time domain work [2],
[3]. We seek to expand on the system presented by Ferrao in his thesis by developing tech-
niques to expand the symbol space, mitigate channel effects, and improve symbol recovery.
We utilize simulation-based experimentation to assess the effectiveness of these techniques.
In this chapter, we describe the benefit of our research to the Department of Defense (DOD),
state our objectives, define the scope of our research, present our findings, and outline how
this thesis is organized.
detectable, interceptable, locatable, and deniable by our adversaries. Any of these four out-
comes are undesirable for subsurface platforms, and their very possibility degrades mission
capability. Since the research is designed to result in an LPI/LPD communications software
system, no new hardware contracts would be necessary, easing adoption across the USN to
enhance warfighting capability.
• What effect does the underwater acoustic channel have on transmitted signals? Can
equalization techniques reduce the Bit Error Rate (BER) below 10⁻³?
• What channel equalization techniques can mitigate channel effects and operate within
the constrained structure of our steganographic transmission system?
• Do the symbol recovery techniques and symbol rate improvements used to minimize BER maintain the inherent covertness of the scheme in terms of relative distortion, Mean Square Error (MSE), and Peak Signal-to-Noise Ratio (PSNR)?
• Are brute-force methods likely capable of detecting our steganographic system?
• How well does our system perform—in terms of data rate, BER, and PSNR—at
various depths, ranges, and acoustic environments?
1.3 Scope
The primary goals of the study are to understand the impediments to symbol recovery
and present a technique that maximizes accurate reception under simulated underwater
transmission conditions. We also pursue an improved symbol rate. At the conclusion of
our research, we make recommendations for subsequent research and relevant applications.
In order to reduce complexity sufficiently to progress the state of the art, the following are
assumed to be solved or negligible for the purposes of our work and are outside the scope
of this thesis:
• Doppler effects in the transmission channel
• Particulars of and compatibility with specific hydrophone and array designs or pa-
rameters
• Channel equalization techniques
• Signal detection and receiver synchronization
• System performance in an actual at-sea environment
• Different symbol encoding schemes
• Different steganographic systems and the resistance of our chosen scheme to attack
We first derive an analytical recovery method based on the receiver’s knowledge of the
structure of the received signal and a shared secret key. Secondly, we develop a numerical
recovery solution by implementing numerical derivation and integration techniques that
attempt to isolate a symbol value that is constant over some fixed number of samples. For
our third recovery solution, we present a basic comparative recovery technique based on
shared knowledge of the cover signal. While we implement these recovery schemes within
the system given by Ferrao [1] that uses a 2048 Hz bandwidth centered around 2500 Hz,
our steganographic technique is fundamentally agnostic to the particular bandwidth.
We model the effects of transmission on recovery using two underwater waveguides out-
put by the Monterey-Miami Parabolic Equation (MMPE), then we simulate transmission
through a deep water channel and a more adverse shallow channel. We show that our symbol
encoding technique does not negatively affect the steganographic properties of our signal.
We also show that statistical recovery solutions are unlikely to be successful for our stegano-
graphic scheme. We find that because the Discrete Fourier Transform (DFT) generates a
point-wise approximation of a continuous Fourier transform, its derivative with respect to
frequency is inherently discontinuous, and therefore, a numerical approximation of the in-
definite integral of this derivative cannot exist. We successfully demonstrate that a recovery
scheme based on locally constant variables can be highly performant in a variety of chan-
nels, given an assumption of accurate synchronization and equalization. The assumption of
perfect synchronization and equalization results in the unexpected phenomenon of constant
recovery performance with distance, so we also show how different assumptions impact
performance. Anything other than near-perfect equalization of phase results in recovery
performance effectively no better than random guessing.
1.5 Organization
The remainder of this thesis uses the following organization:
Chapter 2: Background
This chapter provides a survey of existing work related to wireless and acoustic communications. Wireless channel effects and their specific application within the underwater acoustic channel are discussed. Existing statistical models of wireless communications are briefly presented to provide context for the existing work on data recovery methods. The bulk of this chapter
consists of a discussion on existing data recovery methods.
Chapter 3: Methodology
First, mathematical refutations of assertions from previous works are presented. Then the
specific data recovery techniques chosen for analysis are presented and formally described in
this section. The theoretical bases, as well as any assumptions or constraints, are discussed.
Broadly, this chapter outlines our research goals and the path to achieve them.
Chapter 4 introduces the simulation method along with specific modelling parameters.
We present the data collected, describe and apply the models and methods of analysis, and
explain the results. Challenges encountered are explained, accompanied by solutions and
mitigating factors. Hypotheses for observed trends and anomalies are discussed, and we
compare different recovery schemes.
Chapter 5: Conclusions and Future Work
This chapter revisits the research and major findings of the previous chapters. Limitations
and remaining unanswered questions are discussed. The limitations and gaps, as well as new
questions discovered through the research, generate recommendations for future work.
CHAPTER 2:
Background
The problem of signal detection and synchronization in both the underwater and radio fre-
quency environments is well studied, though not all solutions are relevant to our study. In this
chapter, we position our research within the problem domain of existing work. Section 2.1
provides an overview of our chosen steganographic system. Section 2.2 describes various
sources and models of fading in the wireless communications channel, while Section 2.3
details the particular challenges and characteristics of the underwater acoustic communica-
tions channel. Finally, Section 2.4 surveys existing approaches to detection, synchronization
and equalization.
Passerieux's method can be divided into two components: the transmitter and the receiver. At
the transmitter, a variable-length cover signal is selected and divided into fixed duration
intervals. These intervals are contiguous, but non-overlapping, and generally on the order
of several tenths of a second. One symbol is encoded into each interval. To encode a
symbol, a given interval is further divided into sub-intervals. Passerieux gives two sub-
division schemes. In one method, the sub-intervals are divided in the same way as the main
intervals—constant length and contiguous, but not overlapping. The second method divides
the main interval into an even number of variable length sub-intervals. The sub-intervals
are separated by guard periods and symmetrically paired about the halfway point of the
interval. Two sub-intervals that are the same distance from the halfway point of the main interval
have the same length, and sub-intervals that are closer to halfway are longer than those that
are farther.
The methods for generating an auxiliary signal are different, though analogous, between
the two sub-division schemes. Though the first method has a stronger steganographic key,
it breaks down when there is relative motion between the transmitter and receiver. Thus,
the interested reader is referred to Passerieux’s patent application [3] for a presentation
of constant sub-intervals, and we proceed only with the second sub-division scheme here.
Pairs of sub-intervals are swapped over the halfway point of the main interval, and then
each sub-interval is time-reversed (flipped about its halfway point in the sub-interval). The
permuted and time-reversed sub-intervals are each multiplied by a phase-term. The set of
phase terms applied across the sub-intervals constitutes the steganographic key. A smoothing function is applied to the edges of the sub-intervals to minimize broadband transients at the transitions.
The signal, reconstructed from the guard periods and modified sub-intervals, constitutes the
auxiliary signal for the interval. The interval auxiliary signal is multiplied by an amplitude
factor and a phase factor. The amplitude factor should be a gain that reduces the amplitude
of the auxiliary signal relative to the original signal so as to make it imperceptible. The data
symbol is encoded by the combination of the amplitude and phase factors. The full auxiliary
signal is thus the concatenation of all the interval auxiliary signals. The auxiliary signal is added
onto the original signal and the result is then transmitted.
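The following sketch illustrates the interval-level encoding just described. It is a simplification, not Passerieux's or Ferrao's actual implementation: it operates on the analytic (complex) form of one cover interval, assumes equal-length sub-intervals, and omits the guard periods and edge-smoothing window; all names are illustrative.

```python
import numpy as np
from scipy.signal import hilbert

def encode_interval(cover, key_phases, symbol, gain=0.01):
    """Embed one data symbol into one cover-signal interval.

    cover      : real samples of one interval of the cover signal
    key_phases : one phase term per sub-interval (the steganographic key)
    symbol     : complex factor (amplitude and phase) carrying the data
    gain       : scale that keeps the auxiliary signal imperceptible
    """
    analytic = hilbert(cover)                        # complex form so phase terms apply
    subs = np.array_split(analytic, len(key_phases))
    n = len(subs)

    aux_parts = []
    for i, phase in enumerate(key_phases):
        # Pair-swap sub-intervals about the interval midpoint, time-reverse
        # each one, and apply the key's phase term for that position.
        aux_parts.append(subs[n - 1 - i][::-1] * np.exp(1j * phase))

    aux = np.concatenate(aux_parts)                  # interval auxiliary signal
    # The symbol is carried by the amplitude and phase applied to the whole
    # interval's auxiliary signal, which is then added onto the cover.
    return cover + np.real(gain * symbol * aux)
```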
The receiver processes a signal which is a combination of the original signal, the auxiliary
signal, and channel effects—such as noise, Doppler spread, multipath, etc. However, if the
gain on the auxiliary signal was sufficiently small and the acoustic channel was not exces-
sively adverse, the received signal will approximate the original signal. The receiver then
uses the approximate signal as the original signal and completes the same sequence of steps
as the transmitter, applying the set of phase-terms on the sub-intervals as the steganographic
key. Instead of encoding a symbol, however, the approximate interval auxiliary signal, with-
out gain or phase factors, is used to determine the symbol encoded in the received signal.
The cross-correlation between the received signal and the approximate auxiliary signal
will have peaks separated by the interval duration when the approximate auxiliary signal
is generated with the correct steganographic key. The peaks will be maximized when the
approximate auxiliary signal is adjusted by a Doppler compensation term, which is deter-
mined iteratively. Between the peaks, the approximate auxiliary signal will have a phase
and amplitude proportional to the transmitted symbol.
The auxiliary signal is generated only from X2, which is sub-divided into guard bands and
sub-intervals. In lieu of permutation and time-reversal, as in the time domain, frequency
domain analogues are applied—complex conjugate and multiplication by a phase term. The
steganographic key is then applied across the sub-intervals via multiplication by a complex
exponential factor. As in the time domain, a gain and phase factor are applied to the interval
to reduce its relative amplitude and encode a symbol. The intervals are added onto the
original signal, the inverse transform is applied to the result, and the time domain signal is
transmitted.
At the receiver, the signal is received in time domain and transformed to frequency domain
via the DFT. Out-of-band frequencies are ignored, and symbols are recovered from frequencies in the transmission range. The auxiliary signal is generated from the received signal—mirroring the process applied to the original signal at the emitter—and then the complex conjugate is taken. The complex conjugate is multiplied by the received signal, and the real terms of the result are proportional to the transmitted symbol.
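As a generic illustration of this last step (not the thesis's exact estimator), if the in-band received spectrum is modeled as R = S + gA, where A is the auxiliary spectrum regenerated at the receiver and g is the complex factor carrying the symbol, then a least-squares estimate of g projects R onto A. The function name and the orthogonality assumption are ours.

```python
import numpy as np

def estimate_symbol_factor(received_bins, aux_bins):
    """Least-squares estimate of g in R = S + g*A, assuming the cover term S
    is approximately orthogonal to the regenerated auxiliary spectrum A over
    the transmission band."""
    numerator = np.vdot(aux_bins, received_bins)      # sum(conj(A) * R)
    denominator = np.vdot(aux_bins, aux_bins).real    # sum(|A|^2)
    g_hat = numerator / denominator
    return g_hat  # amplitude: abs(g_hat); encoded phase: np.angle(g_hat)
```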
Ferrao tested the frequency domain implementation of Passerieux’s technique over two
different simulated channels. Using the MMPE acoustic propagation model [5], a benign
deep water channel and a challenging shallow water channel were generated. The deep
channel consisted of a stationary source and receiver at a depth of 500m and maximum
range of 1 km in a 2km deep channel. The shallow channel consisted of a stationary
source and receiver at a depth of 60m and maximum range of 3km in a 200m deep
channel. In the deep channel, multipath is minimized and spherical spreading dominates
in transmission losses. Conversely, the shallow channel is characterized by multipath and
cylindrical spreading. The range of detectable transmission was primarily dependent on
Signal-to-Noise Ratio (SNR), but the simulations demonstrated adequate symbol recovery
from successful steganographic transmissions at ranges of 1000 meters or better and a bit
rate of four bits per second (bps).
Ferrao identified several lines of inquiry for future work, but of particular note are the
following:
• Evaluating the algorithm’s performance with an even more realistic channel model
by implementing spectrum-dependent noise and leveraging MMPE’s options and
parameters for increased accuracy.
• Developing a method for synchronizing the receiver and estimating data alignment.
• Developing a method to resist or compensate for Doppler effects.
• Improving data rate.
2.2 Fading
Solutions to the channel detection and synchronization problem are well-studied in the radio
frequency (RF) environment and are generally classified by their applicability to different
fading characteristics and channel models. These have varied relevance to the underwater
acoustic channel, but a brief overview of fading and models is warranted to provide context
to the state of the art in our area of research.
Sources
In general, all fading can be considered the complex interaction of five source phenomena:
diffraction, scattering, reflection, multipath, and Doppler effects.
Diffraction
Diffraction describes the modifications to a wave’s propagation path when obstructed by
an object larger than the wavelength. The diffraction effect is commonly referenced in the
literature as shadowing. Diffraction results in periodic areas of loss and availability given by
Fresnel zones. Fresnel zones describe regions where the difference between the diffracted
path and the line of sight (LOS) path is a multiple of a half wavelength. Fresnel zones
describe diffraction-loss observed as a function of distance from an obstruction, since the
signal intensity increases up to the first Fresnel zone, decreases until the second Fresnel
zone, etc. [6].
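For reference, the standard expression (not derived in the cited source) for the radius of the n-th Fresnel zone at a point a distance $d_1$ from the transmitter and $d_2$ from the receiver, for wavelength $\lambda$, is

$r_n = \sqrt{\dfrac{n \lambda d_1 d_2}{d_1 + d_2}},$

so an obstruction that penetrates the first zone ($n = 1$) already produces appreciable diffraction loss.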
Scattering
Scattering defines the interaction between a wave and a rough surface smaller than (or equal
to) the wavelength. A wave is generally scattered in all directions upon interaction with such
an object. Scattering results in received power that is less than what would be predicted by
reflection and diffraction alone [6]. Additionally, scattering can result in increased noise at
the transmitter, which can cause interference if the transmitter is a transducer [7].
Reflection
Reflection is the result of a wave interacting with a smooth surface that is large compared to
the wavelength. In such an interaction, various portions of the wave’s energy are reflected,
absorbed, and transmitted. The proportion of these effects depends primarily on the permit-
tivity, permeability, and conductance of the reflector, which are a function of the material.
Inasmuch as the reflector is smooth (i.e., given a large conductance), the majority of the
signal is reflected such that the incidence angle and reflection angle are equal. Any imper-
fections in the surface result in some scattering of the signal, so that the reflected signal is
more diffuse than the arriving signal, but in general, the reflected waves can be considered
a single signal. The proportion of the signal that is absorbed is a function of permittivity,
and the amount that transmits through the reflector is a function of the permeability. The
amplitude of the reflected and transmitted signals is a function of the reflector’s material
properties and the wave’s polarization, wavelength, and arrival angle [6].
While the precise types and locations of reflectors and scatterers cannot be determined or
anticipated, signals will certainly encounter both effects. Together, they result in variations in
amplitude, phase, and time delay, which are random since the distribution and characteristics
of the obstructions in the channel are random [6]. Of note, interaction with high conductance
reflectors is the primary source of multipath signals.
Multipath
Multipath describes the phenomenon wherein multiple copies of the same signal arrive at
the receiver by different propagation paths. Most commonly, the primary path will be the
direct, straight-line path from emitter to receiver, and the additional paths will be longer
and so the signals will arrive at the receiver after some time delay. The most common
source of multipath is reflection (e.g., off the ground), though scattering and refraction
(where stratified media have different propagation speeds, causing the signal to bend) can
also contribute. Since each path is different, each arrival will not only have a different time
delay, but different phase and amplitude fluctuation [6]. The interaction of these arriving
signals can be either constructive or destructive. Multipath causes significant fading and
Intersymbol Interference (ISI), where previously transmitted encodings corrupt the current
signal [8].
Doppler Effects
Doppler effects are most commonly caused by relative motion between the transmitter and
the receiver, though motion of the medium can also result in Doppler effects [9]. Divergent
motion results in expansion of the signal and an apparent down-shift in carrier frequency.
Closing motion causes the signal to compress and generates an apparent increase in carrier
frequency. The greater the speed of the relative motion compared to propagation speed,
the greater the Doppler effect, though each path in a multipath arrival will have a different
Doppler shift, since the angle of arrival affects the Doppler shift [6]. The shift in frequency
can cause intercarrier, or interchannel, interference in a frequency division system. Even in
the absence of interference, Doppler shifts negatively affect carrier demodulation and data
decoding.
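For a single propagation path, the textbook expression for the Doppler shift in terms of the relative speed $v$, propagation speed $c$, carrier frequency $f_c$, and angle $\theta$ between the direction of motion and the arriving path is

$f_d = \dfrac{v}{c} f_c \cos\theta,$

which makes explicit why each multipath arrival, with its own $\theta$, experiences a different shift.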
2.2.1 Models
In general, there are two types of channel models—large scale and small scale. Large
scale models describe general signal characteristics within a wide area, and are typically a
function of distance between emitter and receiver. Small scale models are generally more
complicated, and attempt to describe the signal variations arising from minor environmental
fluctuations or changes in position as small as half a wavelength [6].
Large-Scale Models
There are two foundational large-scale models, the Free-Space model and the Lognormal
Path Loss model. The Free-Space propagation model describes unimpeded LOS transmis-
sions as a frequency-selective function of transmitted power, antenna gains, wavelength,
and distance, where the only losses are due to spherical spreading. The Lognormal Path
Loss propagation model describes transmission losses independent of transmitter power,
antenna gains, or the presence of a direct transmission path. The lognormal path loss is a
function of the distance between the transmitter and receiver divided by a reference distance
and raised to a path loss exponent. Reference distances and path loss exponents have
been experimentally determined for a number of different media and environments [6].
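In equation form (standard statements of these models, included here for reference), the Free-Space model gives the received power as

$P_r(d) = \dfrac{P_t G_t G_r \lambda^2}{(4\pi d)^2},$

while the Lognormal Path Loss model, in its log-normal shadowing form, gives the path loss in dB as

$PL(d) = PL(d_0) + 10\, n \log_{10}\!\left(\dfrac{d}{d_0}\right) + X_\sigma,$

where $n$ is the path loss exponent, $d_0$ is the reference distance, and $X_\sigma$ is a zero-mean Gaussian random variable (in dB) capturing shadowing about the mean.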
In addition, there are models designed around specific large-scale phenomena. The
2-Ray Ground Reflection model describes RF communications with exactly two paths: the
direct path and a single ground reflection. The 2-Ray Ground Reflection model is valid when
the distance between the transmitter and receiver is much greater than the square root of the
product of their heights. The Knife-Edge Diffraction Model describes losses in the shadow
of a single building, hill, or similar obstruction by modelling it as a single knife edge with
infinite width. Various versions of the Multiple Knife Edge model also exist. Finally, the
Bistatic Radar Equation models losses due to a single large scatterer via the object’s Radar
Cross Section, which is effectively the reflected signal strength relative to the signal that
would be reflected off of a perfect sphere with a volume of about 0.75 m³. The Bistatic
Radar Equation is valid when the scattering object is between, and sufficiently far from,
both the transmitter and receiver [6].
Many models have been designed to describe transmissions in common outdoor environments, but particularly common ones include the Longley-Rice model, the Okumura model,
the Walfisch-Bertoni model, and the wideband model. The Longley-Rice model describes
point-to-point transmissions in the 40 MHz to 100 GHz range over various terrains, and has
been extended to include attenuation factors for urban environments. The Okumura model
is a set of experimentally determined frequency and distance vs. attenuation curves along
with frequency-dependent correction factor curves, and is among the most popular urban
models. The Okumura model is valid from 150 MHz to 2 GHz and out to about 100 km.
The Walfisch-Bertoni model is based on the Free-Space Path Loss model, but includes loss
factors for diffraction and scattering due to buildings and considers the effects of multiple
rows of buildings. The wideband model operates based on the determination that the 2-Ray
Ground Reflection model is sufficient for LOS transmissions and the Lognormal Path-Loss
propagation model holds for obstructed environments [6].
Small-Scale Models
In small-scale models, channels are typically either flat or frequency selective and slow
or fast. In a flat-fading channel, all multipath propagations arrive during the period of a
single symbol, so there is no intersymbol interference, and transmission characteristics
are preserved. Conversely, frequency-selective channels are characterized by intersymbol
interference and the channel effects are frequency-dependent. Flat versus frequency selective
channels are dictated primarily by the position in the channel of the transmitter and receiver,
and the static or slowly varying channel characteristics. Slow versus fast fading channels
are determined by the dynamic properties of the transmitter, receiver, and channel itself.
Movement of the channel, emitter, and receiver generate Doppler spread. When the Doppler
spread is smaller than the channel bandwidth, channel characteristics are approximately
constant over several symbol periods, and the channel is considered to be slowly fading.
Conversely, the channel is considered fast when the symbol period is longer than the amount
of time the channel stays approximately constant [6]. Fast fading induces a lower-bound
error rate, increases the error rate overall, and degrades coherent detection schemes [10].
A frequency-selective, fast channel (which usually describes the underwater environment)
is the most adverse, wherein each multipath component arrives at different times and with
different phase and amplitude [6].
The Rayleigh and Rician models are the most commonly used models for small-scale
fading. The Rayleigh model most accurately describes flat multipath channels that do not
include a direct path. In this case, the strength of the received signals follows a Rayleigh
distribution. The Probability Density Function (PDF) of a Rayleigh distribution is $\frac{x}{\sigma^2} e^{-x^2/(2\sigma^2)}$, where $x$ is the signal strength and $\sigma^2$ is the signal variance. When the channel is flat,
but there is a direct path, then it is modeled by the Rician distribution. The direct path is
modeled as a continuous transmission onto which is added random fluctuations that are
primarily caused by multipath. The Rician distribution is a generalization of the Rayleigh
model: when the direct path component is set to zero, the Rician distribution reduces to
Rayleigh [6]. Another common small-scale model, the Nakagami model or m-distribution,
is a further generalization of the Rician distribution. When the m parameter is one, the
Nakagami model reduces to the Rayleigh model, and when m is between one and two, the
Nakagami distribution matches the Rician. Lower m values (e.g., 0.5 ≤ m ≤ 1) describe
higher-frequency, highly-fading channels, and as m goes to infinity, fading tends to zero [11].
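To make the distinction concrete, the short sketch below (illustrative only; names and parameter values are ours) draws envelope samples from the two models: Rayleigh as the magnitude of a zero-mean complex Gaussian, and Rician as the same with a deterministic line-of-sight component added.

```python
import numpy as np

rng = np.random.default_rng(0)

def rayleigh_envelope(n, sigma=1.0):
    """Envelope of diffuse multipath with no direct path (Rayleigh)."""
    diffuse = sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return np.abs(diffuse)

def rician_envelope(n, sigma=1.0, los=2.0):
    """Envelope when a direct (LOS) component of amplitude `los` is present
    (Rician); setting los = 0 recovers the Rayleigh case.  The Rician
    K-factor (LOS power over diffuse power) is los**2 / (2 * sigma**2)."""
    diffuse = sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return np.abs(los + diffuse)
```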
Equalization and synchronization solutions are often designed given the assumption of a
particular small scale fading model. The review of common models is valuable in order to
contextualize the background research.
into a narrow bandwidth and are subjected to significant latency and very complex mul-
tipath propagation, resulting in large error rates [13]. Two factors contribute primarily to
the differences between RF channels and UACs: propagation speed and variability of the
medium.
The ocean fluctuates more rapidly than the atmosphere, and this effect is amplified by the
comparatively slow symbol rate in underwater communications [14]. Channel character-
istics vary over the long term due to macro-oceanographic effects and seasonal changes.
Small scale changes in the short term result from minor perturbations in the environment,
causing phase and amplitude variations within a single symbol and adjusting the statistics
of received signals. Short-term variations in the wave guide adjust the sound speed profile,
which results in multipath propagation and non-constant levels of coherence between recep-
tions in the same location. Additionally, relative velocities of the emitter and receiver caused
by surge, sway, yaw, roll, and changes in depth cause path variations that can potentially
have effects within a symbol signal [15].
Electromagnetic waves travel at the speed of light, which is five orders of magnitude faster
than the speed of sound [16]. Additionally, acoustic wave propagation speed is variable—
dependent on temperature, salinity, and pressure—which can change drastically over short
distances in the underwater environment [13]. Comparatively, the speed of light is effectively
constant [17]. As a result, Doppler effects are more significant in the underwater channel than
for RF communications [18], since the propagation speed is closer to the velocities of the
transmitter, receiver, waves, and currents. In RF channels, Doppler shifts cause rotation in
the received phase and a carrier frequency shift, but in UACs, symbol duration expansion or
compression (depending on the direction of the relative motion) occurs as well [18], [19]. Just
as the speed of sound amplifies Doppler effects, it also makes multipath more severe. The relatively
slow propagation speed increases the multipath due to the transmission geometry and also
increases the severity of the effect by lengthening reverberation times [14]. As a result,
multipath can affect tens to hundreds of symbols in UACs, while in RF channels, interference
usually extends to less than five symbols [20]. Finally, there is often less correlation between
separate receivers receiving the same signal in the underwater environment compared to
RF communications [21].
2.3.2 Underwater Acoustic Channel Effects
Frequency selectivity, varying multipath, latency, and severe Doppler shift rank UACs
among the most adverse communications media [22]. While underwater acoustic commu-
nications are affected by the carrier frequency, varying sound speed profile, sea state, ocean
depth and bottom, and configuration and dynamics of the communication elements [23], the
primary detractors for underwater acoustic communications are transmission loss, noise,
multipath, and Doppler shift [22].
Transmission Loss
Transmission loss has two primary culprits: signal attenuation and spreading. Attenuation is
mainly caused by the absorption of acoustical energy by seawater and the conversion of that
energy into heat [21]. These losses are strongly positively correlated with both frequency
and range [24]. Spreading is the dispersion of acoustical energy as the wavefront expands,
which increases with distance [21]. Commonly, a direct path assumption is made, and
spreading is modeled as spherical, so losses are proportional to path length squared. Carrier
frequency determinations and available bandwidth are strongly influenced by transmission
losses [24].
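A common engineering approximation (not taken from the thesis) combines the two mechanisms as

$TL(r, f) \approx k \cdot 10 \log_{10}(r) + \alpha(f)\, r,$

where $r$ is the range, $k = 2$ for spherical spreading (direct path, deep water) or $k = 1$ for cylindrical spreading, and $\alpha(f)$ is the frequency-dependent absorption coefficient, with consistent units (e.g., $r$ in meters relative to a 1 m reference and $\alpha$ in dB per unit range).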
Noise
Noise inhibits signal reception, and high-intensity, short-duration impulsive noises can
entirely mask incoming signals [25]. Noises arise from both natural and man-made en-
vironmental sources and from the platform on which the receiver is mounted. This latter
phenomenon is termed self-noise. Environmental noise is dependent on frequency and loca-
tion, since noise emitters are generally narrowband, and near-shore, shallow environments
are more congested and noisier [24]. Man-made noise thus dominates in littoral regions,
while in a deep environment, natural noises dominate [22]. Man-made noise sources include
machinery (such as pumps, gears, and engines), ship traffic (such as propeller cavitation and hull noises), and industrial work (such as pile driving, grinding, and explosions) [21],
[26]. Natural noise sources include wind and sea state, bubbles (though these can have
non-natural sources as well), precipitation and sea spray, seismic events, and biological
emitters [26].
Self-noise is generated by the platform on which the receiver is mounted (e.g., by the
propeller or on-board machinery), and is transmitted to the receiver through the water or
through the platform’s structure. Additionally, self-noise can be generated by the turbulence
that results from the motion of the platform or receiver relative to the water. While all
turbulence, including that resulting from currents, causes noise, that noise is generally
low compared to ambient noise. However, the pressure changes caused by turbulence are
significantly greater than the radiated acoustical energy, so a pressure-sensitive hydrophone
may be affected by self-noise due to turbulence, even if there is little radiated noise [26].
Self-noise positively correlates with speed, so at sufficiently slow speeds, ambient noise
becomes the limiting factor [7].
Noise is often assumed to be white and Gaussian, but the frequency and position dependence
of noise, as well as the prevalence of multipath in the underwater acoustic environment,
undermines the validity of this assumption [11].
Multipath
The two fundamental causes of multipath are reflection and refraction. Which multipath mechanism is dominant depends on water depth, transmission frequency, and communication range, though the shallow-versus-deep distinction (generally delimited by about 100 meters of depth or the termination of the continental shelf) is the main determiner of the principal multipath mechanism [22]. In shallow water, reflections are the primary
source of multipath, while refraction is the dominant mechanism in deep water [24]. In the
shallow water case, an increase in transmitter gain does not necessarily improve reception,
and indeed, may have a deleterious effect [27]. In both cases, the causal reflectors and/or
refractors are dynamic, so each multipath component is time-variant, resulting in severe,
highly spatially dependent, and rapidly fluctuating fading [15].
In general, channel geometry and knowledge of the communication environment allows for
prior calculation of, and compensation for, deterministic macro-multipaths [22]. Addition-
ally, there are random micro-multipaths, which cause increased time-spreading variability.
In some cases, the micro-multipath effects can be modeled statistically [20]. Oceanic mul-
tipath results in severe, varying, and spatially-dependent intersymbol interference, phase
distortion, and time-spreading of the received signal, with this latter effect also being called
reverberation in the literature. The spatial dependence of ocean channel effects further
complicates transmission reception in mobile systems [24].
Doppler Effects
Doppler shift describes the apparent change in carrier frequency caused by relative motion
between the transmitter and receiver or of the propagation medium. This effect is determin-
istic if the relative motion and sound speed are known a priori. However, this information
is not generally precisely known, and compensation is even more difficult when the relative
velocities are non-constant [28].
The multipath propagation previously described also has an effect on the signal, causing Doppler
spread. First, different incidence angles at the receiver due to each multipath component
result in differing relative velocities. Additionally, the various Sound Speed Profiles (SSPs)
along the different paths adjust the speed of each path [22]. Finally, motion of the reflectors
and scatterers causing the multipath will induce their own Doppler variability. The result is
differential Doppler and time-scaling [9].
2.4.1 Detection
Signal arrival detection is a critical first step for communication systems. Detection provides
coarse synchronization and prevents wasted computation on signal-less acoustic energy.
Furthermore, improved detection algorithms reduce the necessary SNR for communication,
which improves system range and/or follow-on algorithm performance [30]. This step
is yet more important in UACs, since the hostile doubly spread channel increases the
difficulty of detection, equalization, synchronization, and data recovery [31]. The goal of
detection methodologies is to maximize accurate detection and minimize false alarms, but
in general, these two outcomes are not jointly optimizable [32]. Common detection methods
are envelope detection, matched filtering, binary hypothesis testing, and change detection.
A simple digital receiver designed by Woodward and Sari [33] band-filters received
energy to narrow the detection range to the desired frequencies, and then applies an en-
velope detector to determine signals of sufficient acoustic energy. Locke and White [34]
apply a matched filter via a fractional Fourier transform to detect linear frequency modulated
narrow-band signals. In [30], Austin expands the matched filter technique to coded wideband
signals with good autocorrelation and low cross-correlation properties such as Welch-Costas
and Gold codes. In [35], Ling et al. also review waveforms that have the desired auto- and
cross-correlation characteristics, commonly called Zero-Correlation Zone (ZCZ) sequences,
such as Frank sequences, for covert applications. Many of these ZCZ sequences have a fixed
length, making them brute-force detectable, so the authors recommend flexible waveforms
synthesized by the Weighted CAN (WeCAN) or Periodic-correlation CAN (PeCAN) algo-
rithms. Aparicio and Shimura [31] apply the matched filter technique to multiuser (Direct
Sequence Code Division Multiple Access (CDMA) [CDMA]) system through the use of
a sliding window and a preamble consisting of a common Zadoff-Chu sequence and user-
specific complementary set of sequences. Loham [32] and Howland [36] discuss detection
methods for Orthogonal Frequency-Division Multiplexing (OFDM) packets, which include
a short-long preamble structure. Detection is achieved only on the short preamble using
hypothesis testing on the correlation properties of multiple samples, the threshold for which
is determined based on a likelihood ratio test (for estimable probabilities) or the Neyman-
Pearson test (for a given false alarm probability constraint). Goh et al. [37] present the use
of both a second-order matched filter and a third-order correlation zero-lag matched filter,
each with hypothesis testing on different thresholds, to detect deterministic signals.
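As a generic sketch of the matched-filter idea (not a reproduction of any of the cited designs), the received stream is correlated against a known probe waveform and a detection is declared wherever the normalized correlation exceeds a threshold; how that threshold is chosen (likelihood ratio, Neyman-Pearson, etc.) is outside the sketch.

```python
import numpy as np

def matched_filter_detect(received, probe, threshold=0.5):
    """Flag sample offsets where `received` matches the known `probe`.

    Returns the offsets whose normalized correlation magnitude exceeds
    `threshold`, along with the full score sequence.
    """
    # Correlating against the probe is equivalent to filtering with its
    # time-reversed conjugate (the matched filter).
    mf_out = np.correlate(received, probe, mode="valid")
    window_energy = np.convolve(np.abs(received) ** 2,
                                np.ones(len(probe)), mode="valid")
    score = np.abs(mf_out) / np.maximum(
        np.linalg.norm(probe) * np.sqrt(window_energy), 1e-12)
    return np.flatnonzero(score > threshold), score
```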
To avoid the need for a known preamble, Yang et al. [38] discuss a detector for deterministic
or non-Gaussian signals in the presence of noise with a symmetrical probability density
function by detecting asymmetry in the higher-order statistics. Other change detection
techniques keep track of the ambient noise. Hoppe and Roan [39] introduce signal detection
in a multi-sensor system using principal component analysis of the covariance between two
channels. A single receiver hybrid detection technique is described by Lopatka et al. [40]
that starts with a learning period and then applies an adaptive threshold against which four
detector techniques are applied: impulse detection, speech detection, variance detection, and
histogram detection. The impulse detector is a short-duration envelop detector; the speech
detector operates on the peak-valley difference of the signal, the variance detector tracks the
time-varying signal components in narrow-band signals, and the histogram detector makes
decisions for wideband signals based on the distance metric compared to a background
noise model.
2.4.2 Synchronization
Synchronization is a broad term that may refer to the signal frame, carrier frequency, ref-
erence phase, or transmitted symbol [32]. Phase synchronization is also known as phase
tracking, which will be covered later. Symbol synchronization, or data detection, is central
to the discussion on equalizers. Frame and frequency synchronization are critical steps that
enable successful data decoding, but variable and large propagation delays and Doppler ef-
fects in the underwater environment significantly increase the difficulty [41]. This difficulty
is compounded by the multipath characteristics that obscure a principal signal arrival, under-
mining synchronization and, by extension, equalization [14]. Proper frame synchronization
enables accurate sampling of the waveform to determine data symbols. Without a sufficient
approximation of the start of the symbol period, the signal cannot be properly demodulated.
Improper synchronization results in failed receptions and/or more severe ISI [36].
The simple system designed by Woodward and Sari [33] uses a full-frame synchronization
code to align the transmitter and receiver clocks. Catipovic and Freitag [25] expand this
concept by setting an extended Delay-Locked Loop (DLL) to a known synchronization
waveform that is transmitted at a fixed interval and interpolating frame timing therefrom.
Alternatively, Aliesawi et al. [42] perform frame synchronization by using a chirp signal at
the start of each packet, as do Yurdakul and Senturk [43], though they recommend a chirp in
the same band but with lower-power than the data signal. In [22], Stojanovic advocates the
use of a short Barker sequence as a channel probe for frame synchronization. Loham [32]
and Howland [36] discuss cross-correlation and autocorrelation methods that use the IEEE
standard OFDM preambles for coarse frequency and timing synchronization along with a
pilot tone to prevent drift. Kung and Parhi [44] achieve joint timing synchronization and
equalization for OFDM using a Pseudo-Noise (PN) sequence as a preamble and maximum-
likelihood optimization in a sliding window. Yachil et al. [45] introduce an improved
synchronization system for Time Division Multiple Access (TDMA)/Frequency Division
Multiple Access (FDMA) RF communications using a preamble to jointly synchronize
timing and frequency. Similarly, Grotz et al. [46] use a preamble to jointly synchronize fre-
quency, time, and phase based on a maximum-likelihood optimization for a multi-frequency
TDMA system.
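As a generic illustration of preamble-based coarse frame synchronization (again not a reproduction of any cited scheme), the receiver can take the lag that maximizes the correlation with the known preamble as its frame-start estimate, leaving fine timing, frequency, and phase to later stages:

```python
import numpy as np

def coarse_frame_sync(received, preamble):
    """Return the sample offset at which the known preamble best aligns with
    the received stream, plus the peak correlation value."""
    corr = np.abs(np.correlate(received, preamble, mode="valid"))
    start = int(np.argmax(corr))   # estimated start-of-frame offset
    return start, corr[start]
```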
The receiver designed by Brady and Catipovic [47] achieves coarse synchronization by
maximum peak detection for the correlation of the reception and known packet head-
ers. Phase tracking and fine synchronization is subsequently achieved by the equalization
and interference cancellation algorithm. Similarly, in [48], Stojanovic et al. introduce an
adaptive, fractionally-spaced Decision-Feedback Equalizer (DFE) with second-order digital
Phase-Locked Loop (PLL) and DLL that jointly optimizes synchronization and equaliza-
tion. Coarse frame synchronization is achieved by means of a pre-defined channel probe.
Ali et al. [49] propose three frame synchronization algorithms. An optimal trellis-based
maximum a posteriori probability (MAP) algorithm operates on the full signal, while a
faster, lower-complexity windowing version is also provided. Additionally, a finite au-
tomaton combining header recovery and hypothesis testing is presented to achieve frame
synchronization sample-by-sample.
2.4.3 Equalization
Equalization mitigates the channel’s influence on the signal, including amplitude fluctua-
tions, phase and frequency shifts, intersymbol interference, and time delay. Equalization
techniques commonly use some combination of preambles, pilot tones, coding, and sta-
tistical analysis. Without equalization or compensation, most signaling schemes will not
work [14]. Equalizers generally fit into six categories: linear, decision feedback, adaptive,
blind, turbo, and probabilistic-detection [17].
Linear
A linear equalizer is a filter matched to a specific finite impulse response. Typically, linear
equalizers only work in benign channels, since they tend to amplify noise signals [50].
Since the ocean channel varies rapidly, linear equalizers are typically insufficient in this
environment. However, they are frequently used as the basis for adaptive algorithms, which
update the finite impulse response parameters as the channel changes.
Decision-Feedback
A DFE is an extension of the linear equalizer operating on the principle assumption that the
effects of a previously determined symbol can be estimated, detected, and compensated for
in future symbols. A DFE is effectively a two-step process. In the first section, a linear feed-
forward filter cancels channel effects. In the second section, a feedback filter uses previous
symbol decisions to compensate current ISI [15]. In practical applications, pure DFEs are
not sufficient to achieve necessary performance requirements, but they are commonly used
in adaptive equalizers [23].
DFEs tend to succumb to error propagation, wherein an estimate error propagates and
compounds on successive iterations. To combat this, Brady and Catipovic [47] developed
a soft-decision-based DFE for a specific multiuser acoustic local area network in which a
central receiver is able to track the round-trip transmission durations (which are expected
to verily slowly) for all transmitters in the system. In this system, user determination and
course synchronization are achieved via user-specific headers, which are followed by a
training sequence to initialize the estimated channel parameters. The algorithm uses a soft-
decision DFE on the received signal before using soft-decision interference suppression to
determine the number of interfering users, and the amplitude, Doppler, delay, and phase of
each user. This algorithm has the added benefit of overcoming the near-far problem—valid
signals that are masked by nearer or louder transmitters.
Adaptive
Adaptive equalizers are so called because the channel compensation parameters are itera-
tively updated as the signal is processed. Unlike DFEs, which update the equalizer output
based on previous decisions, an adaptive equalizer updates the equalizer itself. Adaptive
equalizers are commonly based on gradient (i.e., Least Mean Squares (LMS)) or
Kalman (i.e., Recursive Least Squares (RLS)) methods. LMS systems are generally
simpler and lower complexity than RLS equalizers, though RLS converges faster at the
expense of computational and memory costs [51]. Both RLS and LMS algorithms seek
to minimize the MSE, but different powers of the difference between the transmitted and
received signal can be used as the cost function (e.g., Least Mean Fourth) [52]. Adaptive
algorithms usually include known training sequences to learn the communication channel.
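The sketch below shows the textbook complex LMS update for a linear adaptive equalizer driven by a known training sequence (a generic illustration; the tap count, step size, and names are our choices, not taken from the cited works).

```python
import numpy as np

def lms_train(received, training, n_taps=8, mu=0.01):
    """Train linear equalizer taps with the LMS rule: nudge the taps along
    the negative gradient of the squared error between the equalizer output
    and the known training symbol."""
    w = np.zeros(n_taps, dtype=complex)                 # equalizer taps
    for k in range(n_taps - 1, len(training)):
        x = received[k - n_taps + 1:k + 1][::-1]        # newest sample first
        y = np.vdot(w, x)                               # output y = w^H x
        e = training[k] - y                             # error vs. known symbol
        w += mu * np.conj(e) * x                        # LMS tap update
    return w
```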
Kari et al. [52] developed an adaptive linear equalizer designed to handle the impulsive
noises common in shallow environments by minimizing a cost function based on the MSE
with an added regularizing logarithmic term. Youcef et al. [53] and Wang et al. [41] im-
plemented adaptive equalization for OFDM systems in the frequency domain, which is
useful when the time-spreading is large compared to symbol duration. In [54], Nott imple-
mented an adaptive DFE with forward error correction for communications with long-loiter
ocean surveillance gliders. In [48], Stojanovic et al. showed that equalizer performance is
improved when jointly estimated with synchronization parameters, and that spacing equal-
izer taps narrower than symbol-aligned removes symbol timing estimation requirements.
Aliesawi et al. [42] found adaptive chip (vice symbol) DFEs (jointly optimized with phase
recovery and coupled with an interference cancellation scheme) to be superior to a sim-
ilarly modified Rake receiver for Interleave Division Multiple Access (IDMA) multi-user
underwater acoustic communication systems. Calvo and Stojanovic [55] designed an adap-
tive algorithm for multiuser data detection in a CDMA system. Their algorithm is based
on Cyclic Coordinate Descent (CCD), wherein symbol estimates, channel parameters, and
carrier phase are adapted in turn (while holding the others constant) using the minimum
MSE criterion. Song and Badiey [56] implemented a time-reversal DFE, adapted using the
matching pursuit algorithm for sparse channels in a multiband system.
Blind
As previously mentioned, adaptive algorithms typically utilize a training sequence of known
symbols to achieve initial equalization. In the underwater environment, which is rapidly
changing, slow, and bandwidth constraining, training symbols are an inefficient (and po-
tentially infeasible) use of limited resources. Blind algorithms remedy this by performing
equalization without training information or prior knowledge of transmitted data or channel
characteristics for any user [57]; however, this utilization improvement comes at a cost
of slower convergence and reduced robustness to phase rotation [51]. Instead of finding a
best-fit solution for comparing a received training sequence to an expected sequence, blind
equalization techniques minimize the difference in statistics between equalized received
symbols and source sequences [58]. Blind techniques are sometimes used to initialize the
channel parameters for decision-directed systems, like DFEs [57]. In general, there are
four classifications of blind algorithms: Bussgang, polyspectra-based, cyclostationary, and
probabilistic [59]. Blind algorithms of the probabilistic variety combine channel estimation
schemes with probabilistic-detection symbol recovery, and as such, they will be discussed
under that section.
Turbo
Turbo equalizers leverage turbo codes, or other near Shannon-limit codes, to extract extrinsic
channel information not explicitly encoded in the transmitted data. By iterating between
an equalizer and decoder, equalizer outputs improve soft decoding decisions, which in
turn improve channel estimates. Aliesawi et al. [19] use a turbo equalizer in IDMA and
CDMA systems to achieve Multiple Access Interference (MAI) cancellation by iterating between a DFE and a Gaussian
Approximation-based interference cancellation decoder for the IDMA/CDMA spreading
codes. Liu and Song [23] use irregular low density parity check codes and iterate between
an adaptive DFE and a belief propagation decoder. Tong et al. [61] compare Superposition
Coded Modulation (SCM) and Bit-Interleaved Code Modulation (BICM) in turbo equalizer
structures with a focus on OFDM systems, which can incur harmful Peak-to-Average Power
Ratios (PAPRs). They develop a Gaussian Approximation-based soft compensation method
to overcome the effects of clipping (to reduce PAPR) with minimal BER increase in SCM
systems, and show that SCM achieves reduced computational complexity for approximately
commensurate performance compared to BICM.
Probabilistic-Detection
Probabilistic-detection algorithms are of two varieties, Viterbi and MAP. Viterbi equalizers
use the Maximum-Likelihood Sequence Estimation (MLSE) criterion to generate an optimal
data solution by minimizing the error probability for a symbol sequence. MAP equalizers
use the maximum a posteriori (MAP) criterion to optimally detect a symbol by minimizing
the BER. Both algorithms require some knowledge of the channel in order to generate the
statistics through which symbol decisions are made, so these algorithms are often paired
with a channel-estimation algorithm such as those already discussed. Probabilistic-detection
algorithms rely on accurate knowledge of the statistical distribution of the noise in the
channel, which is often assumed to be additive white Gaussian noise. Since the wrong noise
assumption deteriorates performance, and since the computational cost of these algorithms
can be excessive (exponential with the number of channel taps), probabilistic-detection
equalizers, while optimal, may not always be practical [50].
Feder and Catipovic [14] suggest an iterative maximum-likelihood algorithm for blind joint
channel estimation and data recovery on data blocks, where the block size is determined
by expected channel stability duration. The algorithm iterates between channel response
estimation assuming known data using the Expectation-Maximization algorithm and data
recovery assuming known channel parameters using a Viterbi algorithm. Antón-Haro et
al. [57], [59], compare two different adaptive probabilistic-detection algorithms (Multiuser
Adaptive Baum & Welch (MABW) and Multiuser Adaptive Viterbi (MAV)) coupled with
coherence checking for blind joint detection of near-far users in a Direct-Sequence CDMA
acoustic communication system. MABW is a MAP algorithm based on Hidden Markov
Models (HMMs), while MAV is a Viterbi-based MLSE algorithm. Both algorithms demon-
strate similar steady-state performance and robustness in practice, but MABW is slightly
more computationally complex with a longer convergence time, and MAV has larger mem-
ory requirements and a lower performance guarantee.
2.4.4 Compensation
Compensation techniques support equalization by accounting for specific channel effects,
such as multi-user interference, phase rotation, Doppler shift, and spatially dependent fading.
Additionally, coding can compensate for the residual error after data recovery algorithms are
applied.
Interference Cancellation
Interference cancellation algorithms suppress known sources of interference, such as those
caused by additional system users or known noise sources. Rather than treating additional
users as part of the channel noise, interference suppression techniques leverage the de-
terminism of interference signals to account for them specifically. The simplest methods
operate on a bank of matched filters, but optimal performance with near-far resistance can
be achieved by a maximum-likelihood detector (at an exponential complexity cost). Parallel
Interference Cancellation (PIC) and Successive Interference Cancellation (SIC) achieve the
same levels of success, but the processing delay in SIC increases with the number of users,
so PIC is often preferred in practical applications [55].
pre-processes overlapped segments of the received signal by consolidating and parameter-
izing all interference within each segment. Then, interference cancellation is applied on the
overlapped segments and iterated with equalization. Antón-Haro et al. [59] perform joint
equalization and multiuser interference suppression, improving the results of both, by iter-
ating between an adaptive equalizer and adaptive maximum-likelihood multiuser detection
algorithm.
Doppler Compensation
Most Doppler compensation involves first estimating the Doppler effects and then com-
pensating by resampling to account for frequency shift and symbol dilation. At the cost of
computation, sampling more frequently than the symbol rate (i.e., at the chip rate) can help
mitigate the effects of symbol compression and delay spread [22]. Common Doppler esti-
mation methods are interpolation, where shift is estimated between two known receptions
of fixed delay, and correlation, where many different shift values are attempted and the best
result is selected.
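A hedged sketch of the correlation approach follows (generic, not drawn from the cited implementations): candidate time-scaling factors are tried, the reception is resampled to undo each presumed dilation, and the factor giving the strongest match against a known probe is kept. A full implementation would also search over timing offsets.

```python
import numpy as np
from scipy.signal import resample

def estimate_doppler_scale(received, probe,
                           scales=np.linspace(0.99, 1.01, 41)):
    """Search candidate Doppler time-scaling factors (roughly 1 + v/c) and
    return the one whose compensated reception best matches the probe."""
    best_scale, best_score = 1.0, -np.inf
    for s in scales:
        # Resample to undo a presumed dilation by factor s.
        compensated = resample(received, int(round(len(received) / s)))
        n = min(len(compensated), len(probe))
        score = np.abs(np.vdot(probe[:n], compensated[:n]))
        if score > best_score:
            best_scale, best_score = s, score
    return best_scale
```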
and associated Doppler shift. Using the strongest path and compensating for the frequency
shift, correlation is performed on the preamble and maximized to determine the time delay.
Successively, the strongest path is removed and the process is repeated. Unlike all other
referenced implementations, which require known transmissions, Eynard and Laot [29]
demonstrate blind compensation of the Doppler effects on carrier frequency and timing in
a multichannel system by leveraging proportionality of the frequency and symbol period.
Phase Tracking
Equalization algorithms can achieve some measure of phase compensation, but it is usually
insufficient for adverse UACs, especially given the coherently modulated schemes necessary
to maximize efficiency [48]. Thus, additional phase tracking is necessary, with a Digital
PLL (DPLL) being the predominate method in the literature. A first-order DPLL has a
proportional component only, while a second-order DPLL adds an integral coefficient; both
are seen in proposed and applied systems.
Stojanovic et al. [48] propose joint optimization using minimum MSE of a fractionally-
spaced adaptive DFE and second-order DPLL, which is experimentally confirmed as the
most effective method by Zhong and Xiao-ling in [51]. To circumvent preambles, Antón-
Haro et al. [59] present a blind equalizer combined with a blind second-order DPLL, whose
update equation is based on the imaginary part of the blind equalizer.
Beamforming
Beamforming leverages steerable receiver arrays to clarify multipath by focusing recep-
tions in certain directions and ignoring unwanted receptions. Spatial diversity can have
the same effect, but requires sufficient distance between multiple receivers. Beamforming
breaks down as the transmission range increases and can ignore valid signals in multiuser
systems [24], and since propagation paths change over time, beamformers must be adaptive.
Howe et al. [8] describe an adaptive beamformer that seeks minimum MSE between a
reference PN signal and the beamformer output. This method requires periodic retraining
and a preamble that is half the frame. Mutual Coupling (MC) between elements in an array
tends to distort beamforming, so Huang and Balanis [64] investigate the extent of the effects
of MC on the minimum MSE-based adaptive beamformer. They found that while MC does
29
not affect the lower bound on MSE in the channel dominated by environmental noise, MC
compensation also does not improve the minimum MSE in the receiver noise dominated
case.
Spatial Diversity
Spatial diversity, or multi-channel, requires increased computational complexity and re-
ceivers that are sufficiently far apart to be uncorrelated, but in return, they can significantly
decrease error rates by compensating for spatially-dependent fading, clarifying multipath
without range restrictions, and reinforcing decision results without an increase in bandwidth
or signal energy [65]. An optimal receiver system would combine and jointly optimize beam-
forming, spatial diversity, and equalization [66]. The key to leveraging spatial diversity is the
mechanism of diversity combining, for which there are three primary methods: unweighted
summing, maximum output, and reliability weighting. This third method is the most robust,
but it is the most difficult to implement, since it requires a metric for the reliability of
each channel [65]. Further, the complexity of the reliability weighted combiner increases
with number of receivers, but the complexity can be reduced (with no appreciable loss of
performance) by selecting only the most significant receptions.
In [66], Stojanovic et al. design a jointly optimized receiver system and show that the
combination of beamforming and diversity combining achieves equivalent performance at
lower cost compared to pure diversity combining. Catipovic and Freitag [25] reduce the
necessary receiver separation by using optimal maximum-likelihood diversity combining.
Wibisono and Sasase [67] reduce receiver spacing even more, designing a trellis coded
diversity system using weighted sum combining that operates successfully even when the
receivers are correlated. Goldfeld and Wulich [68] improve on the optimal diversity reception
system (e.g., [66] and [25]) by including erasures-correction by means of a block code and
purposefully erasing unreliable bits in each channel. Focusing on highly reverberant shallow
channels, Bessios and Caimi [27] developed a jointly optimized CMA-based blind equalizer
that incorporates a summed spatial diversity combiner.
In [17], Aydogmus implements an adaptive RLS filter with two-channel diversity such that
one channel is used to determine symbols and the other is used to confirm. The CCD
multiuser detection strategy employed by Calvo and Stojanovic in [55] can be extended to
the multichannel environment for only a linear increase in complexity on the number of
30
receivers. The CCD method is applied to each receiver individually, the outputs of which
are optimally combined before symbol decision are made.
Channel Coding
Coding in UACs has several purposes, primarily error correction and user detection. In
general, channel codes reduce the number of bits that can be transmitted in a given time,
but improve overall performance by increasing the chance that transmitted symbols will be
correctly interpreted. For example, Hamming codes and convolutional codes allow error
correction, while PN codes in CDMA systems allow multiple users to operate on the same
frequency band [16].
Catipovic and Baggeroer [69] demonstrate the use of sequential decoding on long constraint
convolutional codes to improve error rate over Rayleigh fading channels, while in [68], Gold-
feld and Wulich showed that an erasures-correction block code outperforms forward error
correction via cyclic Bose, Chaudhuri, and Hocquenghem (BCH) code in a multichannel
system. Müller and Rohling [70] combine a convolutional code and Reed-Solomon (RS)
code to achieve Doppler-robust error correction encoding. The convolutional code uses soft-
decision decoding (by the Viterbi algorithm, for example) to correct individual bit errors,
while the RS accounts for burst errors. In [23], however, Liu and Song optimize irregular
Low-Density Parity-Check (LDPC) codes for specific channels and show that LDPC codes
can outperform Turbo codes, both of which provided better data recovery than convolution
codes and RS codes in underwater channels, at the cost of a high complexity turbo-type
equalizer. In general, code performance is proportional to diversity order. Hansson and
Aulin [71] improve the diversity order of trellis-coded systems through a bit-interleaved
constellation expansion technique termed channel symbol expansion diversity.
2.5 Summary
In this chapter, we first presented Passerieux’s steganographic method as implemented
in the frequency domain by Ferrao. Then, we introduced the characteristics of wireless
communications, including common effects and their sources, and large- and small-scale
fading models. Next, we described the peculiarities of the underwater acoustic channel,
including the differences between UACs and RF communications and the particular effects
31
that make the underwater environment especially hostile. Finally, we presented an array
of methods applied in both the acoustic and RF environments to achieve data recovery
at a receiver through signal detection, frame synchronization, and channel equalization.
In each case, different classes of solutions were presented, and various implementations
were introduced. It should be noted that while certain techniques are well-established
or show great promise for practical applications, applicable methods are constrained to
those that are interoperable with Ferrao’s implementation of Passerieux’s system and do
not undermine the inherently covert nature of the communications scheme. Thus, change-
sensitive detection and blind synchronization and equalization techniques are the most likely
candidates. Additionally, we can not necessarily assume the availability of multi-element,
beamforming, or spatially diverse receivers. We intend to further explore data recovery
methods in the context of steganographic communications and analyze their performance
and constraints in realistic environments. Specifically, we intend to thoroughly research the
following:
• Assuring that transmission recognition can occur only by application of the stegano-
graphic key, and not using knowledge merely of the steganographic system
• Transmission capacity improvement
• Optimal symbol recovery methods for our steganographic system in the underwater
acoustic environment
• The performance of the application of these various techniques at different ranges,
depths, and acoustic environments.
The remainder of this thesis will describe and analyze methods to achieve reliable data re-
covery, robust to the effects of a hostile acoustic channel, for our steganographic system—to
include comparison of competing techniques—and will provide analysis of the performance
results in such a system.
32
CHAPTER 3:
Methodology
3.1 Theory
Our work builds on Ferrao’s frequency domain implementation [1] of Passerieux’s stegano-
graphic method [2], [3], as described in Chapter 2. Ferrao operates on the premise that
the real components of the cross correlation in the frequency domain will tend positive or
negative depending on the symbol, which is encoded as a negative or positive amplitude
factor. Passerieux contends that when the signal is perfectly synchronized, the symbol (en-
coded in amplitude, phase, or both) will present in the phase of the cross correlation in
the time domain. We demonstrate how both of these systems break down mathematically.
Additionally, we notice that both Ferrao and Passerieux utilize a symbol encoding scheme
that is constant across each interval 𝑚 and a key that varies across each sub-interval 𝑝. We
posit that these two elements are mathematically interchangeable and that bit rate can be
improved at the expense of computation by performing a search across all possible symbols.
The computational increase is logarithmic with the number of symbols, since each bit can
be searched independently.
33
Ferrao provides a mathematical analysis of the frequency domain translation of Passerieux’s
time domain technique. While we leverage that work, we will not present it here; the
interested reader is referred to Ferrao’s thesis [1].
After the DFT is applied to a signal interval, a vector of complex values results, with each
complex value representing the presence of a given frequency in the original signal. We
represent a generic single frequency value with amplitude 𝐴 and phase 𝑎 as 𝐴𝑒 𝚥𝑎 ; specific
frequency components in an array are indicated by subscripts (e.g., 𝐴 𝑘 𝑒 𝚥𝑎 𝑘 ); and we use
brackets to indicate the array itself (e.g., 𝐴 [𝑘] 𝑒 𝚥𝑎 [𝑘 ] ). We also make use of the following
formula for phasor addition (given 𝐶𝑒 𝚥𝑐 = 𝐴𝑒 𝚥𝑎 + 𝐵𝑒 𝚥𝑏 )
𝐵 sin(𝑏−𝑎),
𝐴2 + 𝐵2 + 2𝐴𝐵 cos(𝑏 − 𝑎)𝑒 𝚥 (𝑎+atan2( 𝐴+𝐵 cos(𝑏−𝑎) ))
√︁
𝐶𝑒 𝚥𝑐 = (3.1)
𝐴𝑒 𝚥𝑎 · 𝐵𝑒 𝚥𝑏 = 𝐴𝐵 cos(𝑏 − 𝑎) (3.2)
and
𝐴𝑒 𝚥𝑎 · 𝐶𝑒 𝚥𝑐 = 𝐴𝐶 cos(𝑐 − 𝑎) (3.3)
𝐶𝑒 𝚥𝑐 · 𝐶𝑒 𝚥𝑐 = 𝐶 2 (3.4)
𝐶 2 = ( 𝐴𝑒 𝚥𝑎 + 𝐵𝑒 𝚥𝑏 ) · ( 𝐴𝑒 𝚥𝑎 + 𝐵𝑒 𝚥𝑏 )
= 𝐴𝑒 𝚥𝑎 · 𝐴𝑒 𝚥𝑎 + 𝐵𝑒 𝚥𝑏 · 𝐵𝑒 𝚥𝑏 + 2𝐴𝑒 𝚥𝑎 · 𝐵𝑒 𝚥𝑏
= 𝐴2 + 𝐵2 + 2𝐴𝐵 cos(𝑏 − 𝑎)
Thus,
√︁
𝐶= 𝐴2 + 𝐵2 + 2𝐴𝐵 cos(𝑏 − 𝑎) (3.5)
34
This gives the magnitude of the sum, but not the angle. The angle can be found by the
following series of equations. First, substituting 𝐶𝑒 𝚥𝑐 = 𝐴𝑒 𝚥𝑎 + 𝐵𝑒 𝚥𝑏 into Equation 3.3 gives
𝐴𝐶 cos(𝑐 − 𝑎) = 𝐴𝑒 𝚥𝑎 · ( 𝐴𝑒 𝚥𝑎 + 𝐵𝑒 𝚥𝑏 )
= 𝐴2 + 𝐴𝐵 cos(𝑏 − 𝑎)
𝐴 + 𝐵 cos(𝑏 − 𝑎)
cos(𝑐 − 𝑎) = (3.6)
𝐶
| 𝐴𝑒 𝚥𝑎 × 𝐵𝑒 𝚥𝑏 | = 𝐴𝐵 sin(𝑏 − 𝑎)
then
| 𝐴𝑒 𝚥𝑎 × 𝐶𝑒 𝚥𝑐 | = | 𝐴𝑒 𝚥𝑎 × 𝐴𝑒 𝚥𝑎 + 𝐴𝑒 𝚥𝑎 × 𝐵𝑒 𝚥𝑏 |
= | 𝐴𝑒 𝚥𝑎 × 𝐵𝑒 𝚥𝑏 |
𝐴𝐶 sin(𝑐 − 𝑎) = 𝐴𝐵 sin(𝑏 − 𝑎)
and
𝐵
sin(𝑏 − 𝑎)
sin(𝑐 − 𝑎) = (3.7)
𝐶
Dividing 3.7 by 3.6 and application of the arctangent (specifically the atan2 function, which
preserves the quadrant of the solution) solves for the angle, 𝑐
𝐵 sin(𝑏 − 𝑎),
𝑐 = 𝑎 + atan2 (3.8)
𝐴 + 𝐵 cos(𝑏 − 𝑎)
Using Equation 3.1, we can describe our steganographic method in terms of a generic
35
frequency component. Recall that the time domain original signal is divided into intervals
of length 𝑇 seconds and the DFT is applied to each 𝑇-second interval. Once in the frequency
domain, the interval is subdivided into 2𝑀 + 1 subintervals, with data encoded in 𝑀
subintervals, and 𝑀 + 1 subintervals acting as guard bands. We then represent a generic
frequency component in one of the data subintervals of the original signal as Ω𝑒 𝚥𝜔 . The
auxiliary signal is built from the time-reversed partner subinterval (e.g., if Ω𝑒 𝚥𝜔 is in
subinterval 𝑚, then the partner subinterval is 𝑀 − 𝑚 + 1.) Due to the symmetry of the DFT,
the auxiliary signal component is just the complex conjugate of the original, so Ω𝑒 − 𝚥𝜔 .
Applying a phase key (𝑎 𝑚 ), a gain term (𝛼), and a symbol (𝑏𝑒 𝚥 𝛽 ), the generic frequency
component in the signal to be transmitted (𝑇 𝑒 𝚥𝜏 ) is
𝑇 𝑒 𝚥𝜏 = 𝑂𝑒 𝚥𝜔 + 𝛼𝑏𝑂𝑒 𝚥𝑎 𝑚 𝑒 𝚥 𝛽 𝑒 − 𝚥𝜔 (3.10)
√︃ 𝛼𝑏𝑂 sin(𝑎𝑚 +𝛽−2𝜔),
𝚥 (𝜔+atan2( 𝑂+𝛼𝑏𝑂 ))
= 𝑂 2 + (𝛼𝑏𝑂) 2 + 2𝛼𝑏𝑂 2 cos(𝑎 𝑚 + 𝛽 − 2𝜔)𝑒 cos(𝑎𝑚 +𝛽−2𝜔)
Assuming no transfer function due to transmission effects and perfect receive-side synchro-
nization, the generic received frequency component (𝑅𝑒 𝚥 𝜌 ) is exactly equal to 𝑇 𝑒 𝚥𝜏 . The
approximate auxiliary function is then calculated as
ˆ
Ψ̂𝑒 𝚥 𝜓 = 𝛼𝑇 𝑒 − 𝚥𝜏 𝑒 𝚥𝑎 𝑚 (3.11)
Finally, the cross-correlation of the approximate auxiliary signal and the received signal is
ˆ
𝑍𝑒 𝚥𝜁 = Ψ̂𝑒 − 𝚥 𝜓 𝑇 𝑒 𝚥𝜏
= 𝛼𝑇 𝑒 𝚥𝜏 𝑒 − 𝚥𝑎 𝑚 𝑇 𝑒 𝚥𝜏
which, after substituting Equation 3.10 and some basic arithmetic, gives
In the real world, channel effects will always be present. These effects are modeled in the
frequency domain by a vector of amplitude changes and phase shifts, 𝐻 [𝑘] 𝑒 𝚥𝜃 [𝑘 ] . In terms
36
of a generic frequency component, the received signal after transmission becomes
√︃ 𝛼𝑏 sin(𝑎𝑚 +𝛽−2𝜔),
𝚥 (𝜔+𝜃+atan2( 1+𝛼𝑏 ))
𝑅𝑒 𝚥𝜌
= 𝑂𝐻 1 + 𝛼2 𝑏 2 + 2𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)𝑒 cos(𝑎𝑚 +𝛽−2𝜔) (3.13)
ˆ
Ψ̂𝑒 𝚥 𝜓 = 𝛼𝑅𝑒 − 𝚥 𝜌 𝑒 𝚥𝑎 𝑚 (3.14)
giving
ˆ
𝑍𝑒 𝚥𝜁 = Ψ̂𝑒 − 𝚥 𝜓 𝑅𝑒 𝚥 𝜌
= 𝛼𝑅 2 𝑒 𝚥 𝜌 𝑒 − 𝚥𝑎 𝑚 𝑒 𝚥 𝜌
= 𝛼𝑂 2 𝐻 2 (1 + 𝛼2 𝑏 2 + 2𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔))𝑒 𝚥 (2𝜌−𝑎 𝑚 ) (3.15)
In the methods outlined by Passerieux [2], [3] and Ferrao [1], the gain term (𝛼) is constant
across all intervals, the phase key (𝑎 𝑚 ) varies across intervals and sub-intervals, and the
symbol (𝑏𝑒 𝚥 𝛽 ) varies from interval to interval but is constant across all sub-intervals. We
posit that one can increase the symbol space by using a phase key that varies from interval
to interval but is constant through the interval, and a symbol that varies from sub-interval to
sub-interval. In this scheme, each interval represents an 𝑀-bit symbol, and the magnitude
b is a shared secret value that acts as an additional key.
3.2.1 Synchronization
Proper synchronization is critical to the efficacy of the steganographic method. According
to Passerieux [2], amplitude peaks in the time domain cross-correlation will be maximized
upon correct application of the steganographic key and proper synchronization. When proper
37
synchronization occurs, the division of the signal into T-second intervals begins at the exact
same point in time on the received signal as with the original signal. However, when varying
the synchronization delay on a signal with known symbol and key, the magnitude of the time
domain cross-correlation peak varies non-monotonically, indicating that the magnitude is
not suitable for a synchronization search. Figure 3.1 illustrates this phenomenon. The x-axis
is offset from synchronization in seconds, and the y-axis is peak cross-correlation magnitude.
Note that if cross-correlation peak were sufficient for synchronization, we would expect an
absolute maximum at 𝑥 = 0 (i.e., no delay offset, or perfect synchronization).
and
𝑚 [𝑘 ] + 𝛽 [𝑘] − 2𝜔 [𝑘] ),
𝛼 𝑏 sin(𝑎
[𝑘] [𝑘]
𝜏[𝑘] = 𝜔 [𝑘] + atan2 (3.17)
1 + 𝛼 [𝑘] 𝑏 [𝑘] cos(𝑎 𝑚 [𝑘 ] + 𝛽 [𝑘] − 2𝜔 [𝑘] )
(We note that here and throughout, vector operations are element-wise.) When synchro-
38
nization is off at the receiver (even assuming the absence of channel effects) 𝑇[𝑘] 𝑒 𝚥𝜏[𝑘 ] =
𝑅 [𝑘] 𝑒 𝚥 𝜌 [𝑘 ] does not hold true. When properly synchronized, 𝑅𝑖 𝑒 𝚥 𝜌𝑖 is a function of 𝑂 𝑖 𝑒 𝚥𝜔𝑖 .
However, when synchronization is off, the DFT is performed on a different segment of the
time domain signal, resulting in an amplitude and phase for the 𝑖th frequency component
that is completely independent from 𝑂 𝑖 𝑒 𝚥𝜔𝑖 . In this case, the cross-correlation result is
merely
𝚥 (2𝜌 [𝑘 ] −𝑎 𝑚 [𝑘 ] )
𝑍 [𝑘] 𝑒 𝚥𝜁 [𝑘 ] = 𝛼𝑅 2[𝑘] 𝑒 (3.18)
Passerieux indicates that one should expect peaks in the time domain cross-correlation
spaced at the interval length. This means we should expect a peak at 𝑧1 . Setting 𝑛 to 1 in
the inverse DFT equation gives
𝑁
1 ∑︁ 0
𝑧1 = 𝑍𝑘 𝑒 𝚥 𝑁 (3.20)
𝑁 𝑘=1
which is just the average magnitude of the frequency domain cross-correlation. Thus peak
magnitude is a function of the average power in the interval, rather than proper synchro-
nization.
1
𝑍𝑒 𝚥𝜁 = 𝛼2 𝑂 2[𝑘] ( + 𝛼𝑏 2[𝑘] + 2𝑏 [𝑘] cos(𝑎 [𝑘] + 𝛽 [𝑘] − 2𝜔 [𝑘] ))𝑒 𝚥 (2𝜌 [𝑘 ] −𝑎 [𝑘 ] ) (3.21)
𝛼
From the definition of the inverse DFT, we can select an arbitrary sample in time (𝑖) between
39
the peaks located at 1 and 𝑁 (i.e., 1 < 𝑖 < 𝑁). (Recall that the number of time samples, 𝑁,
is also the number of frequency domain frequency components.)
𝑁
1 ∑︁ 2 𝜋 𝑘 (𝑖−1)
𝑧𝑖 = 𝑍 𝑘 𝑒 𝚥𝜁 𝑘 𝑒 𝚥 𝑁 (3.22)
𝑁 𝑘=1
From Equation 3.18, we can see that for all 𝑘, 𝜁 𝑘 is 2𝜌+ 𝑎 𝑘 . Thus, the summation can be
rewritten in rectangular coordinates as
𝑁
1 ∑︁ 2 2 1 2𝜋𝑘 (𝑖 − 1)
𝑧𝑖 = 𝛼 𝑂 𝑘 ( + 𝛼𝑏 2𝑘 + 2𝑏 𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) cos(2𝜌 𝑘 + 𝑎 𝑘 + )
𝑁 𝑘=1 𝛼 𝑁
𝑁
1 ∑︁ 2 2 1 2𝜋𝑘 (𝑖 − 1)
+𝚥 𝛼 𝑂 𝑘 ( + 𝛼𝑏 2𝑘 + 2𝑏 𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) sin(2𝜌 𝑘 + 𝑎 𝑘 + )
𝑁 𝑘=1 𝛼 𝑁
(3.23)
1 2 2 1 2𝜋𝑘 (𝑖−1)
+ 𝛼𝑏 2𝑘 + 2𝑏 𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) sin(2𝜌 𝑘 + 𝑎 𝑘 +
Í𝑁
𝑘=1 𝛼 𝑂 𝑘 ( 𝛼 ),
𝑁 𝑁
𝜑𝑖 = atan2 1 2 2 1 + 𝛼𝑏 2𝑘 + 2𝑏 𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) cos(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) )
Í𝑁
𝑁 𝑘=1 𝛼 𝑂 𝑘 ( 𝛼
(3.24)
2 2 1 2𝜋𝑘 (𝑖−1)
Í𝑁
𝑘=1 𝛼 𝑂 𝑘 ( 𝛼 + 𝛼𝑏 2𝑘 + 2𝑏 𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) sin(2𝜌 𝑘 + 𝑎 𝑘 + 𝑁 ),
= atan2
2 2 1 + 𝛼𝑏 2𝑘 + 2𝑏 𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) cos(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) )
Í𝑁
𝑘=1 𝛼 𝑂 𝑘 ( 𝛼
(3.25)
Within each of the 2𝑀 + 1 intervals 𝑏 𝑘 , 𝛽 𝑘 , and 𝛼 are constant, and 𝑎 𝑘 is constant within
40
every subinterval. Then with distribution, we get
Í𝑁 2 2𝜋𝑘 (𝑖−1)
© 𝛼 𝑘=1 𝑂 𝑘 sin(2𝜌 𝑘 + 𝑎𝑘 + 𝑁 ) ª
+ 𝛼 𝑏 𝑘 𝑘=1 𝛼𝑏 𝑘 𝑂 𝑘 sin(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) )
2 2
Í𝑁 ®
®
+ 2𝑂 2𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 ) sin(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) ),
®
®
𝜑𝑖 = atan2 Í𝑁
2 2𝜋𝑘 (𝑖−1)
® (3.26)
𝛼 𝑘=1 𝑂 𝑘 cos(2𝜌 𝑘 + 𝑎 𝑘 + 𝑁 )
®
®
+ 𝛼2 𝑏 𝑘 𝑁𝑘=1 𝛼𝑏 𝑘 𝑂 2𝑘 cos(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) )
Í ®
®
®
« + 2𝑂 2𝑘 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 ) cos(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) ) ¬
𝑂 2𝑘 sin(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) )
Í𝑁
© 𝑘=1 ª
𝛼𝑏 𝑘 𝑁𝑘=1 𝑂 2𝑘 (𝛼𝑏 𝑘 + 2 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) sin(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) ),
Í
+ ®
= atan2 Í𝑁
®
𝑂 2𝑘 cos(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) )
®
𝑘=1 ®
®
𝛼𝑏 𝑘 𝑁𝑘=1 𝑂 2𝑘 (𝛼𝑏 𝑘 + 2 cos(𝑎 𝑘 + 𝛽 𝑘 − 2𝜔 𝑘 )) cos(2𝜌 𝑘 + 𝑎 𝑘 + 2𝜋𝑘𝑁(𝑖−1) )
Í
« + ¬
(3.27)
While one could use trigonometric identities to further factor out the constants sin(𝛽 𝑘 )
and cos(𝛽 𝑘 ) and simplify some components of the equation by setting 𝑏 𝑘 𝑒 𝚥 𝛽 𝑘 to 1 in the
guard intervals, this does not further clarify the solution. It is sufficient to gather from
Equation 3.27 that while 𝜑 is a function of the symbol 𝑏 𝑘 𝑒 𝚥 𝛽 𝑘 , it is clearly not proportional
to the symbol, as Passerieux claims. Moreover, since atan2 is not bijective and 𝜑 is a function
of multiple variables—with the argument of atan2 primarily driven by 𝑂 [𝑘] —knowledge of
the value of 𝜑 is not sufficient to resolve the encoded symbol.
We note that if the constant phase key and variable symbol encoding version is used, the
discussion is analogous and the conclusions are the same.
41
The various variables used in the equations of this chapter are summarized in Table 3.1.
ˆ 𝚥 𝜔ˆ
𝑂𝑒 Approximate original signal †
Ψ𝑒 𝚥𝜓 Auxiliary signal †
𝑇 𝑒 𝚥𝜏 Transmitted signal 𝑇 𝑒 𝚥𝜏 = 𝑂𝑒 𝚥𝜔 + Ψ𝑒 𝚥𝜓 †
𝐻𝑒 𝚥𝜃 Transfer function †
𝑅𝑒 𝚥 𝜌 Received Signal 𝑅𝑒 𝚥 𝜌 = 𝑇 𝑒 𝚥𝜏 𝐻𝑒 𝚥𝜃 †
𝑍𝑒 𝚥𝜁 Output of cross-correlation †
ˆ 𝚥 𝜁ˆ
𝑍𝑒 Approximate 𝑍 and 𝜁 †
42
Table 3.1 – continued from previous page
Variable Name Notes
𝛿𝜔 𝛿𝜔 = 𝜔ˆ − 𝜔
𝐾𝑂 ˆ
𝐾𝑂 = 𝑂/𝑂
𝜆ˆ Estimated 𝜆 𝜆ˆ is the numerically approximated inte-
gral of 𝜕𝜕𝜆𝑓
Λ DFT of 𝜆
𝐸𝑒 𝚥𝜀 Estimate error ˆ 𝚥 𝜔ˆ
𝐸𝑒 𝚥𝜀 = 𝑂𝑒 𝚥𝜔 − 𝑂𝑒 †
𝑓 Frequency
𝐵, 𝐶 Integration equation parameters In our implementation: 𝐵 = 𝜋/2, 𝐶 = 𝛾 𝑓
ℎ Integration equation parameter ℎ = log(𝜋𝛾𝑔 𝑁)/(𝛾𝑔 𝑁)
𝜇
𝜓(𝑢) Exponential transform 𝜓(𝑢) = 2 tanh(𝜋 sinh(𝑢))
𝜓 −1 (𝑢) Transform inverse 𝜓 −1 (𝑢) = arcsinh(2 arctanh(𝑢/𝜇)/𝜋)
2
𝜓 0 (𝑢) Transform derivative 𝜓 0 (𝑢) = 𝜇𝜋
2 sech (𝜋 sinh(𝑢)) cosh(𝑢)
1 Si(𝜋(𝑘−ℓ))
𝜎𝑘−ℓ Modified sine integral 𝜎𝑘−ℓ = 2 + 𝜋 , where Si(𝑥) is the
sine integral of 𝑥
𝑖, 𝑗, 𝑘, ℓ Iterators
S(𝑘, ℎ)(𝑢) Normalized sinc function S(𝑘, ℎ)(𝑢) = sin(𝜋𝑥)/(𝜋𝑥), where 𝑥 is
(𝑢 − 𝑘 ℎ)/ℎ
𝑁 DFT size Also, 𝑁𝜆 in Johnson’s derivation tech-
nique
𝑁𝐼 Numerical integration size
𝐿𝜆 Length of 𝜆 In Johnson’s derivation technique
𝜇 Scale factor
𝛼𝑓 , 𝛽𝑓 , 𝛾𝑓 Integration constraint constants Also given as 𝛼𝑔 , 𝛽𝑔 , and 𝛾𝑔 , respectively
𝑑 𝑓 , 𝜀𝑑 , 𝜀𝑐 Integration constants 𝑑 𝑓 also given as 𝑑𝑔
43
3.3.1 Recovery Directly from Received Signal
The most enticing method for data recovery is to draw the symbol directly from the received
signal. In the absence of channel effects, the base equation is Equation 3.10 simplifies to
√︃ 𝛼𝑏 sin(𝑎𝑚 +𝛽−2𝜔),
𝚥 (𝜔+atan2( 1+𝛼𝑏 ))
𝑅𝑒 𝚥𝜌
= 𝑇𝑒 𝚥𝜏
= 𝑂 1 + 𝛼2 𝑏 2 + 2𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)𝑒 cos(𝑎𝑚 +𝛽−2𝜔) (3.28)
We can readily solve for 𝑎 𝑚 from the magnitude of the received signal.
𝑅2 1 𝛼𝑏
𝑎 𝑚 = arccos( 2
− − ) − 𝛽 − 2𝜔 (3.29)
2𝛼𝑏𝑂 2𝛼𝑏 2
We cannot solve for 𝑎 𝑚 unless the values of 𝑂 and/or 𝜔 are known. The simplest mechanism
for handling this problem is to use known cover signals, but that would undermine the
steganographic intent of our scheme. The alternative is to estimate 𝑂 or 𝜔 with accuracy
sufficient to allow proper recovery of 𝑎 𝑚 . Depending on the values of 𝛼, 𝑏, 𝑎 𝑚 , and 𝛽, 𝑅𝑒 𝚥 𝜌
can be a reasonable approximation of 𝑂𝑒 𝚥𝜔 . A better approach uses the known values of
𝛼, b, and 𝛽 to make a closer estimation. We leverage the consistency of 𝛼, 𝑏, 𝑎 𝑚 , and 𝛽
across all frequencies in a subinterval to achieve data recovery via an analytical method
that retrieves the symbol from the received signal. We will utilize the follow equivalences:
𝛽𝑖 = 𝛽 𝑗 , 𝛼𝑖 𝑏𝑖 = 𝛼 𝑗 𝑏 𝑗 , and 𝑎𝑖 + 𝛽𝑖 = 𝑎 𝑗 + 𝛽 𝑗 for all 𝑖 and 𝑗 in a given data subinterval. We will
also use the following substitutions for notational simplicity: 𝛿 𝑘 = 𝜌 𝑘 −𝜔 𝑘 , 𝛾 𝑘 = 𝑎 𝑚 +𝛽−2𝜔 𝑘 ,
and 𝐾 𝑘 = 𝑅 2𝑘 /𝑂 2𝑘 .
We begin with Equation 3.31. We can rearrange and take the tangent of both sides to get
𝛼𝑏 sin(𝑎 𝑚 + 𝛽 − 2𝜔)
tan(𝜌 − 𝜔) = (3.32)
(1 + 𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔))
44
Which we can rearrange and square to get
We substitute 1 − cos2 𝛾 for sin2 𝛾, distribute and expand the multiplications, and rearrange
to isolate factors carrying a cos 𝛾 term.
𝛼2 𝑏 2 − tan2 𝛿 2 𝛼𝑏 cos2 𝛾
= 2 cos 𝛾 + 𝛼𝑏 cos 𝛾 +
𝛼𝑏 tan2 𝛿 tan2 𝛿
We subsequently solve for cos 𝛾 with the following series of algebraic adjustments
𝛼2 𝑏 2 𝛼𝑏
2
− 1 = 𝛼𝑏 cos2 𝛾(2 sec 𝛾 + 𝛼𝑏 + )
tan 𝛿 tan2 𝛿
𝛼2 𝑏 2
2 tan2 𝛿
−1
cos 𝛾 =
𝛼2 𝑏 2
2𝛼𝑏 sec 𝛾 + 𝛼2 𝑏 2 + tan2 𝛿
v
u
𝛼2 𝑏 2
u
−1
t
tan2 𝛿
cos 𝛾 = (3.33)
𝛼2 𝑏 2
2𝛼𝑏 sec 𝛾 + 𝛼2 𝑏 2 + tan2 𝛿
From Equation 3.30, we can substitute 2𝛼𝑏/(𝐾 − 1 − 𝛼2 𝑏 2 ) for sec 𝛾, and solve for 𝑎 𝑚 + 𝛽,
recalling 𝛾 = 𝑎 𝑚 + 𝛽 − 2𝜔.
√︄
cot2 𝛿 − (𝛼𝑏) −2
𝑎𝑚 + 𝛽 = ± 4
(3.34)
𝐾−1−𝛼2 𝑏 2
+ 1 + cot2 𝛿
If we assume that the values 𝜔𝑖−1 and 𝐾𝑖−1 are known when solving for 𝜔𝑖 and 𝐾𝑖 , we can
use 𝜔𝑖−1 and 𝐾𝑖−1 to solve for 𝑎 𝑚 + 𝛽 in iteration 𝑖. Since 𝑎 𝑚 + 𝛽 is now known, we can solve
45
for 𝜔𝑖 iteratively. We iterate from −𝜋 to 𝜋, with the step size determining the precision of
our estimated 𝜔𝑖 . We deem thousandths of radians to be sufficiently precise. We select 𝜔𝑖
to be the value of 𝜔ˆ that drives the following equation closest to zero.
𝛼𝑏 sin(𝑎 + 𝛽 − 2𝜔),
𝑚
𝜌𝑖 − 𝜔ˆ − atan2 (3.35)
1 + 𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
We subsequently solve for 𝐾𝑖 iteratively, searching for values of 𝐾ˆ such that 1+𝛼2 𝑏 2 −2|𝛼𝑏| ≤
𝐾ˆ ≤ 𝛼2 𝑏 2 + 2|𝛼𝑏|. We select as 𝐾𝑖 the 𝐾ˆ that drives the following equation nearest to zero.
𝐾 − 1 − 𝛼2 𝑏 2
cos 𝛾 = (3.37)
2𝛼𝑏
𝛼𝑏 sin 𝛾
tan 𝛿 =
𝛼𝑏(𝐾−1−𝛼2 𝑏 2 )
1+ 2𝛼𝑏
2𝛼𝑏
=
2 + 𝐾 − 1 − 𝛼2 𝑏 2
2𝛼𝑏
= (3.38)
𝐾 + 1 − 𝛼2 𝑏 2
We can square both sides and use the trigonometric identity sin2 𝑥 = 1 − cos2 𝑥 with
Equation 3.37 to get
−2
2
tan 𝛿 = (2𝛼𝑏) 2 2 2 2
1 − (𝐾 − 1 − 𝛼 𝑏 ) (2𝛼𝑏) (𝐾 + 1 − 𝛼2 𝑏 2 ) −2
4𝛼2 𝑏 2 − (𝐾 − 1 − 𝛼2 𝑏 2 ) 2
=
(𝐾 + 1 − 𝛼2 𝑏 2 ) 2
= (2𝛼𝑏 − 𝐾 + 1 + 𝛼2 𝑏 2 )(2𝛼𝑏 + 𝐾 − 1 − 𝛼2 𝑏 2 )(𝐾 + 1 − 𝛼2 𝑏 2 ) −2 (3.39)
46
This iterative method is preferred over a direct calculation of 𝐾 (for example, 𝐾 = 1 +
𝛼2 𝑏 2 + 2𝛼𝑏 cos 𝛾 or 𝐾 = 2𝛼𝑏 sin 𝛾/tan 𝛿 − 1 + 𝛼2 𝑏 2 ) because these equations for 𝐾 rely on
𝛾 = 𝑎 𝑚 + 𝛽 − 2𝜔. Since we calculate 𝜔 from 𝑎 𝑚 + 𝛽, any error in 𝑎 𝑚 + 𝛽 is exacerbated in
the direct calculation of 𝐾.
For each (𝐾 𝑘 , 𝜔 𝑘 ) pair for all 𝑘 in a subinterval, we test against all possible values 𝑎ˆ 𝑚 from
the finite set of keys. We select as the symbol 𝑎 𝑚 that 𝑎ˆ 𝑚 which results in least MSE within
the subinterval, where error is defined from Equation 3.30 as
Finally, while 𝑎 𝑔 ≠ 𝑎 𝑚 , 𝛽 is constant in both the guard and the subinterval, so we leverage
tan(𝛽𝑔 ) = tan(𝛽1 ) in the following way. We begin with Equation 3.32, which we can
rearrange to
tan 𝛿(1 + 𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)) = 𝛼𝑏 sin(𝑎 𝑚 + 𝛽 − 2𝜔)
tan 𝛿
+ tan 𝛿 = tan(𝑎 𝑚 + 𝛽 − 2𝜔)
𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
tan( 𝐴) − tan(𝐵)
Since tan( 𝐴 − 𝐵) = (3.42)
1 + tan( 𝐴) tan(𝐵)
tan(𝛽) − tan(2𝜔 − 𝑎 𝑚 ) 1
= tan(𝛿)(1 + )
1 + tan(𝛽) tan(2𝜔 − 𝑎 𝑚 ) 𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
47
1
tan(𝛽) − tan(2𝜔 − 𝑎 𝑚 ) = tan(𝛿)(1 + )(1 + tan(𝛽) tan(2𝜔 − 𝑎 𝑚 ))
𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
tan(𝛿)
tan(𝛽) = tan(𝛿) + + tan(𝛿) tan(𝛽) tan(2𝜔 − 𝑎 𝑚 )
𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
tan(2𝜔 − 𝑎 𝑚 )
+ tan(𝛿) tan(𝛽) + tan(2𝜔 − 𝑎 𝑚 )
𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
tan(2𝜔 − 𝑎 𝑚 )
tan(𝛽)(1 − tan(𝛿) tan(2𝜔 − 𝑎 𝑚 ) − tan(𝛿) )=
𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
tan(𝛿)
tan(𝛿) + + tan(2𝜔 − 𝑎 𝑚 )
𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
tan(2𝜔 − 𝑎 𝑚 )
tan(𝛽)(𝛼𝑏 cot(𝛿) − 𝛼𝑏 tan(2𝜔 − 𝑎 𝑚 ) − )=
cos(𝑎 𝑚 + 𝛽 − 2𝜔)
𝛼𝑏 + sec(𝑎 𝑚 + 𝛽 − 2𝜔) + 𝛼𝑏 tan(2𝜔 − 𝑎 𝑚 ) cot(𝛿)
Since tan(𝛽𝑔 ) = tan(𝛽𝑖 ), and since 2𝜔 − 𝑎 𝑚 = 𝛽 − 𝛾, and setting 𝑎 𝑚 = 0 in the guard interval,
Notably, 𝛾1 is the only unknown in Equation 3.44, and we can solve for it iteratively to an
arbitrary precision (at the expense of computation) since it is bounded by (−𝜋, 𝜋]. With 𝛾1
known, we solve for 𝐾1 and 𝜔1 from Equations 3.30 and 3.31, respectively.
48
3.3.2 Recovery via Numerical Approximations of Calculus Functions
We attempt a numerical recovery solution by first recalling that the transmitted signal for a
generic frequency component is defined by Equation 3.10 as
𝑇 𝑒 𝚥𝜏 = 𝑂𝑒 𝚥𝜔 + 𝛼𝑏𝑂𝑒 𝚥𝑎 𝑚 𝑒 𝚥 𝛽 𝑒 − 𝚥𝜔
ˆ 𝚥 𝜔ˆ ≠ 𝑇 𝑒 𝚥𝜏
𝑂𝑒
and
ˆ 𝚥 𝜔ˆ ≠ 𝛼𝑏𝑂𝑒 𝚥𝑎 𝑚 𝑒 𝚥 𝛽 𝑒 − 𝚥𝜔
𝑂𝑒
√︄
2
2 2 ˆ 𝐸 2𝐸 𝚥 (𝑎 −𝜔+𝜔+atan2(
ˆ 𝐸 sin( 𝜀−𝛾−𝜔) ,
))
= 𝛼 𝑏 𝑂𝑂 1 + + cos(𝜀 − 𝛾 − 𝜔)𝑒 𝑚 𝛼𝑏𝑂+𝐸 cos( 𝜀−𝛾−𝜔)
𝛼𝑏𝑂 𝛼𝑏𝑂
(3.48)
We use the notation 𝑍˜ for the magnitude of this cross-correlation to differentiate from 𝑍
used elsewhere. We then use the definition of 𝐸𝑒 𝚥𝜀 to calculate
𝑂ˆ sin( 𝜔−𝜔),
√︃ ˆ
𝚥𝜀 𝚥𝜔 ˆ 𝚥 𝜔ˆ 2 ˆ 2 ˆ 𝚥 (𝜔−atan2( 𝑂+𝑂ˆ cos( 𝜔−𝜔) ))
𝐸𝑒 = 𝑂𝑒 − 𝑂𝑒 = 𝑂 + 𝑂 − 2𝑂 𝑂 cos( 𝜔ˆ − 𝜔)𝑒 ˆ (3.49)
49
and substitute it into Equation 3.48 to get
v
u
u
u
u 1
(𝛼2 𝑏 2 + 1 + 𝐾𝑂2 − 2𝐾𝑂 cos 𝛿𝜔 )
u
u
√︁ u
u
𝑍˜ = 𝛼𝑏𝑂 𝑂ˆ 𝛼𝑏𝐾𝑂 𝛼𝑏𝐾
u
t 𝑂
√︃ (3.50)
+ 2 𝐾𝑂−2 + 1 − 2 cos 𝛿𝜔 𝐾𝑂 sin 𝛿 𝜔 ,
𝐾𝑂 cos(𝛾 + atan2( 1+𝐾𝑂 cos 𝛿 𝜔 ))
and 𝐾𝑂 sin 𝛿 𝜔 ,
−𝐸 sin(𝛾 + atan2( 1+𝐾𝑂 cos 𝛿 𝜔
)),
𝜆 = 𝑎 𝑚 + 𝛿𝜔 + atan2 𝐾𝑂 sin 𝛿 𝜔 ,
(3.51)
𝛼𝑏𝑂 + 𝐸 cos(𝛾 + atan2( 1+𝐾𝑂 cos 𝛿 𝜔
))
𝑂ˆ
where 𝛿𝜔 = 𝜔ˆ − 𝜔 and 𝐾𝑂 = 𝑂.
The phase term vector 𝜆 [𝑘] can then be considered sample points of a function of the
constant (within each subinterval) 𝑎 𝑚 and values that vary with frequency. When we take
the derivative of 𝜆 with respect to frequency, then 𝜕𝑎𝜕 𝑓 = 0, while the other terms remain.
𝑚
Subsequently integrating with respect to frequency gives 𝜆ˆ [𝑘] , which can be subtracted
from 𝜆 [𝑘] to reveal 𝑎 𝑚 . This method assumes that a derivative for 𝜆 exists and sufficient
accuracy from numerical indefinite integration is attainable. We use the derivation technique
presented by Steven G. Johnson [72] and the double exponential sinc integration method
given by Tanaka et al. [73]. Given the symmetry of the Fast Fourier Transform (FFT), we
perform our derivation-integration technique only on the positive frequency bins and negate
the result for the negative reflection.
Johnson’s derivation technique [72] is fairly straightforward. First, use the FFT to compute
Λ from 𝜆, which will have a length 𝑁𝜆 that is even (in our case). For 0 ≤ 𝑘 < 𝑁𝜆 /2,
multiply Λ 𝑘 by 2𝜋𝑘𝚤/𝐿 𝜆 , and set Λ 𝑘 to zero at 𝑘 = 𝑁𝜆 /2. For 𝑁𝜆 /2 < 𝑘 < 𝑁𝜆 , multiply Λ 𝑘
by 2𝜋(𝑘 − 𝑁𝜆 )𝚤/𝐿 𝜆 . For our purposes, 𝐿 𝜆 is the size of the domain of 𝜆. Finally, take the
inverse FFT of the modified Λ 𝑘 to compute 𝜆0.
50
The formula for integration given by Tanaka et al. [73] is
∫ 𝑥 𝑁𝐼
1 −1
∑︁
𝑓 (𝑡)d𝑡 = [tanh(𝐵 sinh(𝐶𝜓 (𝑥))) + 1] ℎ 𝑓 (𝜓(𝑘 ℎ))𝜓 0 (𝑘 ℎ)
−1 2 𝑘=−𝑁 𝐼
𝑁
" 𝑁
𝐼
∑︁ ∑︁ 𝐼
𝑐 0 𝑁𝐼
− log(𝑐𝑓 00 𝑁 )
where 𝑂 (𝑒 𝑓 𝐼 ) is an error term, which we can assume to be negligible for sufficiently
large 𝑁 𝐼 (e.g., 𝑁 𝐼 ≥ 1000), where 𝑁 𝐼 is the number of numerical integration iterations. The
function 𝜓(𝑢) is the double exponential transform, for which we use 𝜇 tanh( 𝜋2 sinh(𝑢)). For
2 2
our transform, 𝜓 0 (𝑢) is 𝜇𝜋 −1 𝑢
2 sech (𝜋 sinh(𝑢)) cosh(𝑢), and 𝜓 (𝑢) is arcsinh( 𝜋 arctanh( 𝜇 )),
as given by Takahasi and Mori [74]. We define 𝜇 as a scaling factor, which we set to the
maximum modified frequency bin in order to map (−∞, ∞) to our frequency band. A scaling
factor of 1 would map (−∞, ∞) to (-1,1).
∀𝑥 ∈ <, | 𝑓 (𝑥)| ≤ 𝛼 𝑓 exp(−𝛽 𝑓 exp(𝛾 𝑓 |𝑥|)), for some positive numbers 𝛼 𝑓 , 𝛽 𝑓 , and 𝛾 𝑓 .
(3.53)
We select 𝛾 𝑓 = 1/12000, 𝛽 𝑓 = 𝜋/2 − 10−10 , and 𝛼 𝑓 = 1039 max 𝜆0, which maintains the
51
constraint of Equation 3.53. We select a 𝑑 𝑓 = 12000𝜋, so that 𝛾 𝑓 𝑑 𝑓 = 𝜋/2, in which case
Tanaka et al. [73] give 𝐵 = 𝜋/2, 𝐶 = 𝛾 𝑓 , 𝛽𝑔 = 𝛽 𝑓 , 𝛾𝑔 = 𝛾 𝑓 , and 𝑑𝑔 = 2𝛾𝜋 𝑓 − 𝜀 𝑑 , where
𝜀 𝑑 is any positive number such that 𝑑𝑔 > 0. We set 𝜀 𝑑 to 1. Error is minimized when
𝛾 𝑓 𝑑 𝑓 = 𝜋/2 and 𝛽 𝑓 is as large as possible. The value ℎ is defined by
log(𝜋𝛾𝑔 𝑁 𝐼 )
We select 𝜀 𝑐 = (𝑑𝑔 − 𝛽𝑔 ) so that ℎ = . (3.55)
𝛾𝑔 𝑁 𝐼
We compute the numerical indefinite integration technique given by Tanaka et al. for all 𝑥
such that 𝑥 equals a frequency in a data interval in order to generate 𝜆ˆ [𝑘] . Subtracting 𝜆ˆ [𝑘]
from the actual 𝜆 [𝑘] should give 𝑎 𝑚 in each of the data intervals, from which we recover the
symbols.
52
√︃
𝑅 [𝑘] = 𝑂 [𝑘] 𝐻 [𝑘] 1 + 𝛼2 𝑏 2 + 2𝛼𝑏 cos(𝑎 [𝑘] + 𝛽 − 2𝜔 [𝑘] ) (3.57)
and 𝛼𝑏 sin(𝑎 + 𝛽 − 2𝜔 ),
[𝑘] [𝑘]
𝜌 [𝑘] = 𝜔 [𝑘] + 𝜃 [𝑘] + atan2 (3.58)
1 + 𝛼𝑏 cos(𝑎 [𝑘] + 𝛽 − 2𝜔 [𝑘] )
We will assume the ability to achieve perfect estimation in interval 𝑖 in order to validate
symbol recovery by dividing 𝐻𝑖 𝑒 𝚥𝜃 𝑖 out of 𝑅𝑖 𝑒 𝚥 𝜌𝑖 .
In the second solution method, the presence of a transfer function affects the calculation of
𝐸𝑒 𝚥𝜀 . We must now define the residual error as
where
v
u
u
u
u
u 𝐻 2𝐾𝑂 𝐻 cos 𝛾 𝐾𝑂
+ + +
√︂ u
2𝐾𝑂 u
u
𝐸 = 𝛼𝑏𝑂𝐻 t 2𝐾𝑂 𝛼𝑏 𝛼𝑏𝐻
u 𝐾𝑂 2𝛼𝑏𝐻 (3.60)
𝛼𝑏𝐻 √︃
− 𝛼21𝑏2 + 1 + 2 cos 𝛾 𝛼𝑏 sin 𝛾,
𝛼𝑏 cos(−𝛿𝜔 + 𝜃 + atan2( 1+𝛼𝑏 cos 𝛾 ))
and
𝛼𝑏 sin 𝛾,
𝜀 =𝜔 + 𝜃 + atan2( 1+𝛼𝑏 cos 𝛾 )
𝛼𝑏 sin 𝛾,
sin(−𝛿𝜔 + 𝜃 + atan2( 1+𝛼𝑏 cos 𝛾 )),
− atan2 √︁ (3.61)
𝐻 2 2 𝛼𝑏 sin 𝛾,
𝐾𝑂 1 + 𝛼 𝑏 + 2𝛼𝑏 cos 𝛾 + cos(−𝛿𝜔 + 𝜃 + atan2( 1+𝛼𝑏 cos 𝛾 ))
53
Clearly, 𝜀 varies with frequency since it is dependent on other frequency-varying variables.
Thus, when we differentiate and reintegrate 𝜆 with respect to frequency, 𝜀 will be included
ˆ So when we subtract 𝜆ˆ from 𝜆, we still reveal 𝑎 𝑚 .
in 𝜆.
When the original signal is known, determination of 𝐻𝑒 𝚥𝜃 is trivial. From Equations 3.30
and 3.31,
𝑅
𝐻 = √︁ (3.62)
𝑂 1 + 𝛼2 𝑏 2 + 2𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
and 𝛼𝑏 sin(𝑎 + 𝛽 − 2𝜔),
𝑚
𝜃 = 𝜌 − 𝜔 − atan2 (3.63)
1 + 𝛼𝑏 cos(𝑎 𝑚 + 𝛽 − 2𝜔)
3.5 Implementation
In order to test the performance of our symbol embedding and recovery schemes, we
implemented a series of modules in MATLAB R2017b [76].
3.5.1 Modules
Two main modules perform the embedding and extraction, while two minor modules sim-
ulate transmission and synchronization.
Embedding
The function stegembed.m implements the operations described in Section 2.1.2. The in-
puts to stegembed.m are coverAudio, T, alpha, fl, delay, and requestedSyms. The
base audio, coverAudio, must be in uncompressed .wav format. The float inputs T, alpha,
and delay, and positive integer input requestedSyms are meta-parameters for the period,
auxiliary signal gain, embedding start point, and the number of symbols to encode, respec-
tively. The positive integer fl is a helper parameter that defines the lowest frequency for
MMPE. The outputs of the embedding function are x, y, fs, a, start_index, stop_index,
s, symspace, and rstart. The time domain cover audio, x, is returned, which is merely
the first channel of the coverAudio input, while y is the time domain signal to be transmit-
ted with embedded steganography symbols (the number of embedded bits depends on the
requestedSyms parameter and the size of T relative to the duration of the cover.) fs is the
sampling frequency, as determined by reading the cover audio, and rstart is the delay
54
input converted from seconds to number of samples. The matrix outputs a, start_index,
stop_index, s, and symspace are the key, data interval starting indices, data interval end
indices, embedded symbols (i.e., ground truth), and symbol space, respectively. In our im-
plementation, the symbols are encoded in phase and the symbol magnitude is set to 1. As
in Ferrao’s implementation [1], stegembed.m always operates on a 2048 Hz band centered
about 2500 Hz.
Extraction
We have three extraction functions, one for each of the methods presented in Section 3.3.
All three modules have the same output, syms, which is the recovered symbols. The in-
puts to passerieuxextractMethod1.m are yrc, x, fs, T, alpha, a, symspace, rstart,
start_index, stop_index, Hinit, nKeys, and s. As in stegembed.m, T and alpha are
float inputs for period and auxiliary signal gain, respectively, while x, fs, a, symspace,
rstart, start_index, stop_index, and s are directly the outputs of stegembed.m. The
positive integer nKeys is the number of possible symbols, including unique synchroniza-
tion symbols that are not used for data transmission. yrc is the received time domain signal
with embedded symbols after transmission through a simulated channel, while Hinit is
the initial estimate of the channel transfer function. passerieuxextractMethod3.m takes
exactly the same inputs, while passerieuxextractMethod2.m takes no additional inputs
but does not require x, Hinit, or s.
55
3.6 Summary
In this chapter, we discussed symbol recovery in Ferrao’s implementation of Passerieux’s
steganographic scheme from a mathematical perspective. We presented basic equations, and
disproved the synchronization and recovery solutions asserted by Passerieux. Additionally,
we presented three information recovery solutions. The first was an analytical method
derived from frequency domain equations. The second was a numerical method built on
approximations of derivation and integration from discrete samples. The final method
presented was recovery by comparison to a known original signal. We further discussed the
effects of an adverse channel on our recovery solutions and described our implementation
modules. In Chapter 4, we discuss our design parameters and present our results.
56
CHAPTER 4:
Results and Analysis
4.1 Design
We test our recovery algorithms via two simulated channels—a relatively benign deep
water channel and a comparatively adverse shallow channel. We use the same channels as
Ferrao [1]. The deep channel has a maximum depth of 1000 m and a maximum range of
1000 m, while the shallow channel has a maximum depth of 100 m and a maximum range
of 3000 m. We test recovery every 40 m while permuting through transmitter and receiver
depths 1 m, 20 m, and 60 m in the shallow channel and 1 m, 180 m, 500 m depths in the deep
channel. For each of the three methods, we assess recovery performance when encoding
a symbol space of size 2, 4, 7, 12, 23, 32, and 60, respectively, in each period. Based on
Ferrao’s results [1], we maintain a period (𝑇) of 0.5 seconds, a gain (𝛼) of 0.32, and SNR
of 50 dB.
While a single run of our numerical recovery method (Section 3.3.2) is imperceptably slower
than a single run of either the analytical (Section 3.3.1) or comparative (Section 3.3.3)
methods, performing the number of iterations necessary to test our second recovery method
on the same scale as the others was infeasible in practice. As a result, we modified the
experiment design to test recovery every 80 m in both channels for the numerical recovery
method. Additionally, in the shallow channel we hold transmitter depth constant at 60m
while iterating through receiver depths of 1 m, 20m, and 60 m, and in the deep channel we
limit our transmitter and receiver depths to 1 m and 500 m. All other parameters remain
unchanged from the other experiments.
57
4.2 Results
Since we modified the mechanism by which symbols are encoded, it is necessary to
first ensure that the steganographic characteristics of the scheme are not degraded. We
then present symbol recovery performance. For our analyses, we use the first channel
of an uncompressed .wav recording of sperm whale vocalizations sampled at 48 kHz,
SpermWhaleNormalClicks.wav obtained from the National Oceanic and Atmospheric Ad-
ministration (NOAA) Southwest Fisheries Science Center catalog of cetacean sounds [77].
-100
Power (dB)
15
-110
-120
10
-130
5 -140
-150
0
0.2 0.4 0.6 0.8 1
Time (minutes)
Frequency (kHz)
-100 -100
15 15
Power (dB)
-120 -120
10 10
-130 -130
5 -140 5 -140
-150 -150
0 0
0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1
Time (minutes) Time (minutes)
Figure 4.1. Spectrograph of (a) the Unmodied Cover Signal, (b) the Sig-
nal with Binary Symbols Embedded, and (c) the Signal with 6-bit Symbols
Embedded.
The spectrographs are clearly effectively identical, but we quantify the statistical difference
58
between the signals via common measures of “hiddenness" of a steganographic signal—
MSE, Root Mean Square Error (RMSE), SNR, and PSNR.
MSE or RMSE
MSE is a measure of the difference between the steganographic signal and the cover signal,
where RMSE is merely the square root of MSE. If the cover signal is given by 𝑦 [𝑘] and the
steganographic signal is given by 𝑥 [𝑘] , both of length 𝑁, the MSE is
𝑁
1 ∑︁
MSE = (𝑥𝑖 − 𝑦𝑖 ) 2 (4.1)
𝑁 𝑖=1
Figure 4.2 shows how average RMSE varies with the size of the symbol space. Maximum
RMSE is 3.474132𝑥10−4 and RMSE does not increase with symbol space, indicating that
our new encoding technique is no less covert than Ferrao’s, nor do we sacrifice covertness
for increased symbol transmission.
3.48
Average RMSE (x10 4)
3.46
3.44
3.42
3.40
0 10 20 30 40 50 60
No. of Symbols
59
SNR and PSNR
SNR and PSNR are also variations on MSE, where in this case “Noise” in Signal-to-Noise
Ratio is the distortion caused by modifying the signal [78]. Since 𝑦 [𝑘] is normalized so that
the maximum possible amplitude is 1, PSNR is
2
𝑦𝑚𝑎𝑥 1
PSNR = 10 log10 = 10 log10 . (4.2)
MSE MSE
SNR is defined as Í𝑁 2
𝑖=1 𝑦𝑖
SNR = 10 log10 . (4.3)
MSE
As a rule, we desire SNR to be greater than 30 dB for imperceptible modification, while
SNR less than 20 dB risks our method’s LPI/LPD characteristics. The primary driver of
SNR is the gain term 𝛼. Our choice of 𝛼 = 0.32 maintains SNR above 20 dB, but a gain
value sufficiently small to raise average SNR above 30 dB (𝛼 / 0.18) undermines symbol
recovery [1].
Figure 4.3 shows how average PSNR and SNR vary with the size of the symbol space. As
with Figure 4.2, this figure reinforces that our encoding technique maintains comparable
covertness to Ferrao’s method and enables increased transmission rates without impacting
LPI/LPD.
60
(a) Average PSNR vs. Symbol Space (b) Average SNR vs. Symbol Space
69.5 23.5
69.4 23.4
69.3 23.3
Average PSNR (dB)
69.1 23.1
69.0 23.0
0 10 20 30 40 50 60 0 10 20 30 40 50 60
No. of Symbols No. of Symbols
Figure 4.3. Variation of Average SNR and PSNR with the Number of En-
codable Symbols.
The traditional steganography metrics in our embedding scheme are on the same order as
Ferrao’s results, indicating that our technique does not negatively affect the covertness of
the system, allowing achievement of increased bit rates without sacrificing LPI/LPD.
Histograms
A histogram of signal phase provides a convenient visualization of the relative frequency
with which frequency values appear in the signal, and can provide a “fingerprint” for a
given signal or period. For LPI/LPD, we expect the histogram for the steganographic signal
to be very similar to the cover signal. Conversely, we would need some uniqueness in the
distribution that is mappable to the embedded symbol in order to achieve recovery through
statistical means.
Figure 4.4 presents the histograms for the phase within the Least Significant Bit (LSB)
subinterval for five arbitrary periods, comparing the original, transmitted, received, and
cross-correlation signals. The cross-correlation signal is presented in both the frequency
domain and time domain. Each histogram within a row represents the same arbitrary
subinterval, period, and channel configuration. The symbol value in the given subinterval
61
is shown by a red vertical line. The dotted vertical blue line gives the mean value in the
subinterval, while the dashed vertical blue line displays the mean value for the full period.
Phase is bounded by the interval (−𝜋, 𝜋], the histogram bin widths are 0.1323 radians, and
the histogram bin heights are normalized so that the maximum possible value is 1.
Statistical Distributions of Phase
Original Transmitted Received Cross-Correlation (FD) Cross-Correlation (TD)
2.35619
2.35619
0.05 0.05 0.05 0.05 0.05
0 0 0 0 0
-2 0 2 -2 0 2 -2 0 2 -2 0 2 -2 0 2
-1.0472
0 0 0 0 0
-2 0 2 -2 0 2 -2 0 2 -2 0 2 -2 0 2
-1.0472
0 0 0 0 0
-2 0 2 -2 0 2 -2 0 2 -2 0 2 -2 0 2
-1.0472
0 0 0 0 0
-2 0 2 -2 0 2 -2 0 2 -2 0 2 -2 0 2
2.35619
2.35619
0.05 0.05 0.05 0.05 0.05
0 0 0 0 0
-2 0 2 -2 0 2 -2 0 2 -2 0 2 -2 0 2
Original Transmitted Received Cross-Correlation (FD) Cross-Correlation (TD)
Figure 4.5 presents histograms of amplitude within an arbitrary data subinterval for five
arbitrary periods of the original, transmitted, received, and cross-correlation signals. As
with the histograms of phases, the cross-correlation signal is presented in both frequency
domain and time domain, and each histogram within a row represents the same arbitrary
subinterval, period, and channel configuration. A dotted vertical blue line gives the mean
value in the subinterval, and a dashed vertical blue line displays the mean value for the full
period. There is not a convenient analogue of symbol value in signal amplitude, so there is no
visual presentation thereof, but the histograms are labeled with their relative symbols. The
histogram bin heights are normalized so that the maximum possible value is 1. For ease of
comparison, the axis limits are the same across all histograms with the notable exception of
the time domain cross-correlation signal. The results for the time domain cross-correlation
amplitude histogram are so different in scale that it has a unique set of axis limits.
62
Statistical Distributions of Amplitude
Original Transmitted Received Cross-Correlation (FD) Cross-Correlation (TD)
2.35619
2.35619
0.05 0.05 0.05 0.05
0.5
0 0 0 0 0
0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.005 0.01 0.015 0.02
2.35619
2.35619
0.05 0.05 0.05 0.05
0.5
0 0 0 0 0
0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.005 0.01 0.015 0.02
-1.0472
0.5
0 0 0 0 0
0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.005 0.01 0.015 0.02
-1.0472
0.5
0 0 0 0 0
0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.005 0.01 0.015 0.02
-1.0472
0.5
0 0 0 0 0
0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.5 1 1.5 0 0.005 0.01 0.015 0.02
Original Transmitted Received Cross-Correlation (FD) Cross-Correlation (TD)
Figure 4.6 and Figure 4.7 are the same as Figure 4.4 and Figure 4.5, respectively, except that
they show the histogram of the Most Significant Bit (MSB) frequency band for a symbol
space of 60. There are fewer bins because each successive bit in our system is two-thirds the
size of the preceding. This results in more noticeable differences among the histograms, but
these differences do not correspond to particular bit values nor do they appear to indicate
statistical anomalies that would reveal our steganographic presence.
63
Figure 4.6. Histograms of MSB Phase Values in the Cover, Steganographic,
Received, and Cross-Correlated Signals.
64
We highlight in the comparative histograms the relative similarity of the cover, transmitted,
and received signals, though effects of the noise floor on the received signal can be seen.
The lack of unique characteristics between symbol values (i.e., absence of distinguishing
features down a column) demonstrates that statistical recovery or detection methods are
unlikely to be successful. Notably, the mean of phase is effectively zero in every case. This
is due to the bound on potential phase values—any shift or change in phase affects the
interval equally, as values near the edges of the interval circle around to the opposite side
of the interval, rather than exceeding the bounds (i.e., modulus 2𝜋 with the result shifted by
−𝜋.) Taken together, the lack of distinguishing features among the histograms of amplitude
and phase indicate that knowledge of the steganographic scheme alone is highly unlikely to
be sufficient to detect the transmission, much less determine the embedded symbols.
65
Table 4.1. Deep Channel Experiment Parameters.
Parameter Values
Symbol Space 2, 4, 7, 12, 23, 32, 60
T, 𝛼, SNR 0.5 s, 0.32, 50 dB
Method Analytical, Comparative
Transmitter Depth 1 m, 180 m, 500 m
Receiver Depth 1 m 180 m, 500 m
Range 1 m to 1000 m, in 40 meter increments
Method Numerical
Transmitter Depth 1 m, 500 m
Receiver Depth 1 m, 500 m
Range 1 m to 1000 m, in 80 meter increments
symbols is shown by different colors, with each recovery method having a different line
style. Table 4.2 summarizes the recovery performances averaged across range.
Symbol Recovery vs Range in Deep Channel
100
Symbol Space
80 2
4
7
12
23
60 32
Recovery [%]
60
Recovery Method
Analytical
40 Comparative
Numerical
20
0
0 200 400 600 800 1000
Range [m]
Our analytical recovery method performs within about 1% of the comparative method,
66
Table 4.2. Average Symbol Recovery Performance in a Model Deep Channel.
achieving a maximum goodput of 11.73 bps when synchronization and equalization are
assumed. The numerical recovery method, on the other hand, on average performs the
same as random guessing. Our assumption of perfect knowledge of the transfer function
in each period reverses the combined channel effects that normally cause increased signal
degradation as a function of distance. As a result, the otherwise expected reduction of
symbol recovery performance with distance is not reflected in Figure 4.8.
67
Table 4.3. Shallow Channel Experiment Parameters.
Parameter Values
Symbol Space 2, 4, 7, 12, 23, 32, 60
T, 𝛼, SNR 0.5 s, 0.32, 50 dB
Method Analytical, Comparative
Transmitter Depth 1 m, 20 m, 60 m
Receiver Depth 1 m, 20 m, 60 m
Range 1 m to 3000 m, in 40 meter increments
Method Numerical
Transmitter Depth 60 m
Receiver Depth 1 m, 20m, 60 m
Range 1 m to 3000 m, in 80 meter increments
model, the notional sound speed is 1500 m/s, and the channel has a bandwidth of 4096 Hz
centered on 2500 Hz. Identically to the deep channel experiment, for each symbol space,
transmitter depth, and receiver depth combination, we perform the embedding procedure,
simulate the received signal based on the channel characteristics, and then apply extraction
with each recovery method. Table 4.3 summarizes the experiment’s simulation parameters.
As with the deep channel model, in Figure 4.9 we average symbol recovery across all
transmitter and receiver depths in order to highlight the effect of recovery method and
number of encoded symbols on recovery performance. The graph shows recovery percentage
vs. range for each transmitter depth. The number of encoded symbols is shown by different
colors, with each recovery method having a different line style. Table 4.4 summarizes the
recovery performances averaged across range.
Even in the more adverse channel, our analytical recovery method maintains a similar
performance to the comparative method, staying within about 10% of comparative recovery.
As in the deep channel experiment, the numerical recovery method performs no better than
random guessing. Once again, our assumption of perfect equalization negates the factors
that normally cause reduction of symbol recovery performance as a function of distance,
resulting in nearly constant recovery with range as shown in Figure 4.9.
68
Symbol Recovery vs Range in Shallow Channel
100
Symbol Space
80 2
4
7
12
23
60 32
Recovery [%]
60
Recovery Method
Analytical
40 Comparative
Numerical
20
0
0 500 1000 1500 2000 2500 3000
Range [m]
When synchronization and equalization are assumed, recovery is apparently dictated by the
number of bits per symbol. In our encoding scheme, the MSB is also the smallest interval
in the subdivision scheme, and therefore the most prone to error. In general, it seems that as
more possible symbols require the most significant bit, performance degrades. (Note: the
minimum bits in our scheme is four, and the two additional higher value symbols [e.g., 60
and 61 in the 60-symbol encoding] are generated and reserved for synchronization.) While
the assumptions of synchronization and equalization are by no means trivial and will need
to be solved for an operational system, our results nonetheless indicate that our analytical
recovery method could contribute to a promising steganographic communication scheme.
Equalization Assumptions
Notably, neither Figure 4.8 nor Figure 4.9 show a decrease in recovery performance as
range increases. While this behavior would normally be expected, our assumption of per-
fect equalization counteracts range-dependent adverse channel effects such as scattering,
absorption, and spreading. Figure 4.10 and Figure 4.11 show how different equalization
assumptions affect recovery in our example deep channel and shallow channel, respectively.
For this case study, we use the parameters shown in Table 4.5.
Figure 4.10 shows recovery percentage versus range in the deep channel for each of several
69
Table 4.4. Average Symbol Recovery Performance in a Model Shallow Chan-
nel.
70
Table 4.5. Equalization Assumption Comparison Parameters.
Parameter Values
Symbol Space, T, 𝛼, SNR 4, 0.5 s, 0.32, 50 dB
Method Analytical
Equalization Assumptions None, 𝑒 , 𝐻𝑖 , 𝑒
𝑗 𝜃 𝑖 𝑗 𝜃 𝑖−1 , 𝐻𝑖−1 , 𝐻𝑖−1 𝑒 𝑗 𝜃 𝑖 , 𝐻𝑖 𝑒 𝑗 𝜃 𝑖−1 , 𝐻𝑖−1 𝑒 𝑗 𝜃 𝑖−1
Channel Deep Channel
Transmitter Depth 180 m
Receiver Depth 180 m
Range 1 m to 1000 m, in 25 meter increments
Channel Shallow Channel
Transmitter Depth 60 m
Receiver Depth 60 m
Range 1 m to 3000 m, in 25 meter increments
and
• phase and amplitude equalization based on the transfer function for the previous
symbol only (𝐻𝑖−1 𝑒 𝚥𝜃 𝑖−1 ).
The effects of each assumption are shown by different combinations of line-style and color.
This shows the importance of correct phase equalization on symbol recovery. While correct
amplitude equalization has an almost negligible positive effect on recovery, correct phase
equalization is necessary for any successful transmission over appreciable distance. Inter-
estingly, equalization based on the transfer function in the previous period shows a constant
poor performance—even under-performing no equalization at all. This behavior is due to
the random noise generated as part of the channel simulation, which guarantees that the
phases within each period will be different.
Figure 4.11 shows recovery percentage vs. range in the shallow channel for each of our
equalization assumptions. We use the same set of assumptions and the same color scheme
to graph their affects.
71
100
80 Transfer Functions
None
ej i
Hi
ej i 1
Hi 1
60 Hi 1ej i
Recovery [%]
Hiej i 1
Hi 1ej i 1
40
20
0
0 200 400 600 800 1000
Range [m]
phase equalization is not merely necessary for adequate performance, but excellent perfor-
mance over great distance is possible so long as phase equalization is achieved. Moreover,
these graphs make visually apparent the increased volatility of the shallow channel versus
the deep-water channel.
Bit Rate
The theoretical upper bound on the number of bits (𝑀) in our scheme is given by
2𝐵−1
b(b( Í ∗ 2048 ∗ 𝑇) ∗ 0.25) = 1 (4.4)
3𝐵−1 𝐵𝑘=1 [(2/3) 𝑘 ]
where 𝐵 is the number of bits and 𝑇 is the period duration. This equation ensures that the
guard interval is at least of size 1, which gives a maximum bit rate of 12 bps for a 0.5 s
period. As we see in our results, the actual bound is lower in practice since we determine
the correct bit value by MSE. As the number of bits increases, the size of the smallest data
interval shrinks, and recovery performance degrades as the sample size becomes too small to
overcome individual errors. However, Equation 4.4 indicates that both data rate and recovery
performance theoretically could be improved by increasing the period. Figure 4.12 validates
72
100
80 Transfer Functions
None
ej i
Hi
ej i 1
Hi 1
60 Hi 1ej i
Recovery [%]
Hiej i 1
Hi 1ej i 1
40
20
0
0 500 1000 1500 2000 2500 3000
Range [m]
this prediction to a degree. The Deep Channel graph gives results using the open-ocean
model and the Shallow Channel graph shows results for the shallow channel model. Only
the analytical recovery method is used, and different periods are represented by different
line styles, with the number of encoded symbols shown by different colors.
73
Deep Channel Shallow Channel
100 100
Symbol Space
80 80 2
4
7
12
23
60 60 32
Recovery [%]
60
Recovery Method
T=0.25
40 40 T=0.50
T=1.00
20 20
0 0
0 200 400 600 800 1000 0 500 1000 1500 2000 2500 3000
Range [m] Range [m]
While recovery performance is in general improved by increasing the period, the improve-
ment is insufficient to surmount the loss in overall transmission rate due to decrease in
symbol rate.
4.3 Summary
In this chapter, we described the design parameters of our experiments, presented the
steganographic security of our embedding scheme, reviewed the results of simulated com-
munications through various model channels, and briefly discussed the effects of equal-
ization assumptions and bit rate considerations in our system. We found that correct syn-
chronization and phase equalization is necessary for adequate performance; however, when
synchronization and equalization are achieved, high rates of recovery are possible at long
ranges. We also showed that longer periods could theoretically result in higher transmission
rates. In Chapter 5, we summarize our findings and present opportunities for future research.
74
CHAPTER 5:
Conclusions and Future Work
While correct synchronization and phase equalization is necessary for adequate perfor-
mance, it is possible to achieve bit rate improvement better than eight bps while maintain-
ing desirable LPI/LPD characteristics—but we did not achieve BER less than 10-3 . Once
assumptions of synchronization and equalization are applied, we are able to achieve suc-
cessful recovery approaching 12 bps at ranges of 1 km or more without negatively affecting
steganographic covertness, though better bit rates are theoretically achievable by increasing
the period.
75
The idea that the indefinite integral of the numerical derivative of the cross-correlation phase
existed and that such an integral could be numerically approximated was not supported by
our research. Since the cross-correlation function generates values at equally spaced sample
points, the numerical derivative is merely a collection of linear local approximations joined
by discontinuities, where the discontinuities are aligned to the sample points. When we
integrate numerically, our sample points are the discontinuity points. Moreover, since we
are specifically interested in the integral within a finite interval, the integral must necessarily
be definite.
76
that maximizes bit rate while maintaining sufficient steganographic covertness.
5. Developing error correction and OSI Layer 2 and 3 protocols to enable interoperability
with existing equipment.
6. Modifying the recovery algorithm to encode bits using Quadrature Phase Shift Keying
(QPSK) or Asymmetric Phase Shift Keying (APSK), and evaluating the operational
and security tradeoffs.
7. Rigorously evaluating the security of our system via steganalysis and information
theory techniques.
8. Implementing and testing a complete steganographic system with receiver and trans-
mitter.
9. Performing real-time testing in actual underwater environments.
10. Developing software designed for usability by communications systems operators.
5.3 Conclusion
In this thesis, we showed that recovery of symbols encoded in an LPI/LPD acoustic steganography scheme is possible based on locally constant variables (i.e., without knowledge of the cover signal). However, the success of our recovery system relies on accurate synchronization and equalization techniques, which we merely assumed. Thus, in order to make this system practical, significant work remains to develop synchronization and equalization solutions.