
A Novel Family of Robust Hyperbolic Arctan Adaptive Filtering Algorithms for Impulsive Noise Environments


Abstract
In this work we propose a novel family of adaptive filtering algorithms based on the hyperbolic arctan function. In essence, the proposed algorithms reshape the optimality criterion of conventional cost functions as a function of the error value, making them robust against heavy-tailed noise distributions. For this family of algorithms, we use a general theoretical framework for transient and steady-state performance analysis, and derive the steady-state excess mean square error (EMSE) and the conditions for convergence and stability. The proposed algorithms are applied in practical simulation studies to examine their performance.
Introduction
The most widely used adaptive filtering algorithm is the least-mean-square (LMS) algorithm [1], owing to its convexity, stability, computational simplicity, and optimality under the Gaussian assumption. Generally, LMS and its relatives, the normalized least-mean-square (NLMS) and variable step-size LMS (VSS-LMS) algorithms, use the second-order statistic of the error signal, the mean-square error (MSE), as the cost function, which is recognized as the default tool in Gaussian environments [2, 3]. It is known that cost functions based on higher (even) powers of the error, e.g., the least mean fourth (LMF) algorithm, provide a better trade-off between convergence speed and steady-state performance than LMS, as shown in [4], but suffer from convergence and stability problems. Unfortunately, in many real-world applications signals often follow non-Gaussian, impulsive (heavy-tailed) distributions, where the performance of algorithms based on second- or higher-order powers of the error degrades severely.
To address the heavy-tailed process problem, a variety of robust optimization criteria and cost functions have been studied in the literature. Commonly used cost functions can be broadly classified into three groups. First, cost functions based on fractional lower-order statistics (FLOS) moments [5], which mainly include the least mean p-th power (LMP) of the error and the mean absolute error (MAE) [6, 7]. Second, cost functions based on information theoretic learning (ITL); these include the generalized maximum correntropy criterion (GMCC) cost function [8] and the kernel risk-sensitive loss (KRSL) cost function [9]. Third, cost functions based on logarithmic and exponential forms; these include the logarithmic hyperbolic cosine adaptive filter (LHCAF) [10], the least logarithmic absolute difference (LLAD) algorithm [11], and the least mean logarithmic square (LMLS) algorithm [11]. The algorithm based on the MAE updates its weight coefficient vector using only the sign of the error signal, resulting in effective suppression of large outliers and high robustness to impulsive noise. However, it exhibits a slow convergence rate, a large steady-state error, and inferior performance in the absence of impulsive noise. The mixed-norm cost function (robust Huber cost function, RHF) [12] is a hybrid of the MSE and the MAE of the system output error. The RHF down-weights the coefficient updates only for heavy-tailed errors; otherwise it behaves like an MSE filter. The best performance of the RHF depends on a threshold parameter that must be well optimized in order to determine the transition between MSE and MAE; this may require operating several different algorithms in parallel, which is not feasible in some real-world applications. The GMCC cost function is a nonlinear similarity measure defined in kernel space. The performance surface of the GMCC-based algorithm is sharp around the optimal solution but flat away from it, which can lead to performance degradation [13]. The authors in [9] introduced the KRSL cost function as an alternative similarity measure; it provides a more convex performance surface, leading to faster convergence and better solution accuracy while maintaining robustness to heavy-tailed impulsive noise, compared to correntropy-based algorithms. The logarithmic cost function defined in [11] is developed from regret approaches [14, 15]; it achieves a better trade-off between transient and steady-state performance by intrinsically combining higher and lower orders of the error into a single continuous update function. For example, the least logarithmic absolute difference (LLAD) algorithm improves on the LMS, giving performance similar to a hybrid of MSE and MAE in impulsive noise environments. Recently, several works based on the logarithmic hyperbolic cosine (LHC) cost function, e.g., the logarithmic hyperbolic cosine adaptive filter (LHCAF) and the least lncosh (LIncosh) algorithm, have been proposed; these are robust to large errors, smooth for small errors, and improve the accuracy of adaptive filters [10, 16]. However, the algorithms based on the LHC cost function converge to a high steady-state misadjustment.
The work in [17, Table 1] classified the gradient function shapes of different cost functions into three groups. The first class comprises V-shaped algorithms, e.g., the least-mean logarithmic square (LMLS) algorithm [11] and the least mean fourth (LMF) family [4], among others. The second class comprises Λ-shaped algorithms, e.g., the sign algorithm (SA), the least-mean p-norm (LMP) algorithm [18], the LLAD algorithm, and the RHF algorithms. The third class comprises M-shaped algorithms, e.g., the GMCC algorithm and the new family of robust M-shaped error weighted (NFRMS) algorithms [17]. That work demonstrated that algorithms based on M-shaped and Λ-shaped gradient functions are robust against impulsive noise.
In this work we introduce a novel family of adaptive filters based on an arctan cost function that is robust towards outliers. To demonstrate the robustness of the proposed hyperbolic arctan adaptive filtering algorithms, we also introduce parameterized least absolute difference error (VP-LAD) algorithms within this family.
Family of robust hyperbolic arctan adaptive filter (FRHATAF)
Signals and Model
A. Wide-Sense Cyclostationary Stochastic Processes

Definition 1: The process $u[i]$ is said to be a second-order (wide-sense) cyclostationary signal (WSCS) with period $T$ if and only if its mean and autocorrelation function are periodic with period $T$ [19], i.e.:

$E\{u[i]\} = E\{u[i+mT]\}$ for all $i \in (-\infty, \infty)$    (1)

$E\{u[i+\tau]\,u^H[i]\} = E\{u[i+mT+\tau]\,u^H[i]\}$ for all $i, \tau \in (-\infty, \infty)$    (2)

where $T$ and $m$ are integers.

Here we adopt the model used in [x] to define the input signal $u[i]$ as

$u[i] = \sigma_u[i]\, s[i]$    (3)

where $\sigma_u[i]$ is a deterministic periodic sequence with period $T$ and $s[i]$ is a zero-mean random sequence with unit variance. We note that $s[i]$ is often defined as a coloured Gaussian sequence, as in [x]. Hence, $u[i]$ can be viewed as a discrete-time wide-sense white Gaussian cyclostationary process with autocorrelation matrix $R_u$ given by

$R_u = E[u[i]u^T[i]] = \mathrm{Diag}\{\sigma_u^2[i], \sigma_u^2[i-1], \ldots, \sigma_u^2[i-M+1]\}$    (4)

We adopt the common model for the periodic sequence, $\sigma_u^2[i] = \beta\left(1 + \gamma\sin(2\pi i/T)\right)$, with $|\gamma| < 1$ and $\beta > 0$ being the amplitude and scale factors, respectively, as in [x]. The degree of cyclostationarity of the input can be classified as weak, moderate, or strong for $M \ll T$, $M \approx T$, and $M \gg T$, respectively, depending on how the variation period compares to the filter memory, i.e., to the coefficient vector length $M$.
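As a minimal illustrative sketch (not part of the original work), the following Python snippet generates an input of this form; the choices of β, γ, T, and of a white Gaussian s[i] are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def wscs_input(n, T=100, beta=1.0, gamma=0.5):
    """Generate u[i] = sigma_u[i] * s[i], eq. (3), with periodic variance
    sigma_u^2[i] = beta * (1 + gamma * sin(2*pi*i/T)), eq. (4)."""
    i = np.arange(n)
    sigma2 = beta * (1.0 + gamma * np.sin(2.0 * np.pi * i / T))
    s = rng.standard_normal(n)          # zero-mean, unit-variance s[i]
    return np.sqrt(sigma2) * s

u = wscs_input(2000)                    # one WSCS input realization
```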
General framework of the stochastic gradient adaptive filtering algorithm
In order to derive the general framework, we consider the system identification model of an unknown time-variant system depicted in Fig. 1. The relation between the scalar desired signal $d[i]$ and the $M \times 1$ input (regressor) vector $u[i]$ is characterized by the linear model

$d[i] = u^T[i]\, w^o + n[i]$    (5)

where $w^o$ is the unknown optimal coefficient vector, to be defined later in (7), and $n[i]$ represents the perturbation noise; the disturbance term $n[i]$ is uncorrelated with $u[i]$. We define the error signal

$\epsilon[i] = d[i] - w^T[i]\, u[i] = \epsilon_a[i] + n[i]$    (6)

where the a priori estimation error $\epsilon_a[i]$ is defined by

$\epsilon_a[i] = \tilde{w}^T[i]\, u[i]$    (7a)

The term $\epsilon_a[i]$ measures the distance between $u^T[i]w^o$ and $w^T[i]u[i]$, i.e., it measures how close the estimator $w^T[i]u[i]$ is to the optimal linear estimator of $d[i]$. The weight-error vector is defined by

$\tilde{w}[i] = w^o - w[i]$    (7b)

where $w[i]$ is the filter coefficient vector to be estimated. Furthermore, the time-varying parameter vector $w^o$ of the system is modelled as a random walk process [18]:

$w^o[i+1] = w^o[i] + h[i]$    (7c)

where the random perturbation $h[i]$ is a zero-mean white Gaussian vector with covariance matrix $E\{h[i]h^T[i]\} = \sigma_h^2 I$, and is assumed to be statistically independent of $u[i]$ and $n[i]$.

Basically, we aim to determine accurate estimates of the coefficient vector $w[i]$. In order to pursue this objective, and to facilitate the ensuing performance analysis, we first derive a generalized mathematical framework for the family of stochastic gradient adaptation algorithms. To that end, we use the form of the error $\epsilon[i]$ defined in (6) to design algorithms of the form [1]

$w[i+1] = w[i] - \mu\,\check{\nabla}_{w[i]}$    (8)

where $\mu$ is the step-size parameter and $\check{\nabla}_{w[i]}$ is the gradient, with respect to the coefficient vector $w[i]$, of an error cost function $\phi(\epsilon[i])$, called the error criterion.

Examining the gradient $\check{\nabla}_{w[i]}$ used in (8), we obtain

$\check{\nabla}_{w[i]} = \frac{\partial \phi(\epsilon[i])}{\partial w[i]} = \phi'(\epsilon[i])\,\frac{\partial \epsilon[i]}{\partial w[i]} = -\phi'(\epsilon[i])\, u[i]$    (9)

in which $\phi'(\cdot)$ denotes the first-order derivative of $\phi(\cdot)$. Letting $f(\epsilon[i]) = \phi'(\epsilon[i])$, we may define the stochastic gradient algorithms as [1, 20]

$w[i+1] = w[i] + \mu f(\epsilon[i])\, u[i]$, with $\epsilon[i] = d[i] - w^T[i]\, u[i]$    (10)

It is noteworthy that (10) represents a generalized form of algorithms based on both mean-square and non-mean-square error criteria. However, this form of adaptive algorithm has been studied less frequently in the signal processing community, mainly because of the mathematical difficulties associated with its analysis. The authors in [9, 17, 20] examined this family of adaptive filtering algorithms and showed that it can improve performance, especially when the perturbing noise distribution is non-Gaussian, which motivates further study of such algorithms. Therefore, in this work we adopt the family of algorithms defined in (10) to propose a novel family of robust hyperbolic arctan adaptive filtering algorithms (FRHATAF) that are effective under impulsive (heavy-tailed) interference. To derive the proposed algorithms, we use a non-mean-square error criterion, namely the arctan function. In particular, we propose two categories of FRHATAF algorithms, depending on how the cost function is defined: 1) standard robust hyperbolic arctan adaptive filtering algorithms (SRHATAF), and 2) robust hyperbolic arctan adaptive filtering algorithms (RHATAF). It is worth noting that in this work we define robustness as the insensitivity of the algorithms to the impulsive noise encountered in practical applications.
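To make the generic form (10) concrete, here is a minimal Python sketch of a stochastic gradient adaptive filter in the system identification setting; the error nonlinearity f(·) is passed in as a function, and the construction of the regressor from a scalar input stream is an illustrative assumption.

```python
import numpy as np

def sg_adaptive_filter(u, d, M, mu, f):
    """Generic stochastic gradient filter, eq. (10):
    w[i+1] = w[i] + mu * f(eps[i]) * u[i],  eps[i] = d[i] - w[i]^T u[i]."""
    w = np.zeros(M)
    eps = np.zeros(len(d))
    for i in range(M - 1, len(d)):
        ui = u[i - M + 1:i + 1][::-1]   # length-M regressor vector
        eps[i] = d[i] - w @ ui          # a priori output error
        w = w + mu * f(eps[i]) * ui     # update shaped by the nonlinearity f
    return w, eps

# The classical LMS filter is recovered with f(e) = e, and the sign
# algorithm with f(e) = np.sign(e).
```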

Proposed Algorithm derivation


1. Framework of cost function of SRHATAF
Inspired by the framework defined in (9), we define a non-mean-square error criterion, called the generalized hyperbolic arctan cost function, as

$\phi(\epsilon[i]) = E\left[\tan^{-1}(k(\epsilon[i]))\right]$    (11)

where $k(\epsilon[i])$ is a conventional cost function based on the error $\epsilon[i]$, e.g., $|\epsilon[i]|$ or $|\epsilon[i]|^2$. Generally, we consider the conventional cost function $k(\epsilon[i]) = |\epsilon[i]|^p$, where $p > 0$ corresponds to the shape parameter. Thus, the generalized hyperbolic arctan cost function in (11) becomes

$\phi(\epsilon[i]) = \frac{1}{p\alpha} E\left[\tan^{-1}(\alpha|\epsilon[i]|^p)\right]$    (12)

where $\alpha > 0$ is the smoothing factor. Applying the stochastic gradient process of (9) to the cost function given in (12), and substituting into (10), the SRHATAF algorithms can be defined as

$w[i+1] = w[i] + \mu R(\epsilon[i])\, u[i]$    (13)

$R(\epsilon[i]) = \frac{|\epsilon[i]|^{p-1}\operatorname{sign}(\epsilon[i])}{1 + \delta|\epsilon[i]|^{2p}}$

where, after differentiation, the constant $\delta = \alpha^2$.
Some observations on (13):
a) When $p = 1$, (13) becomes

$w[i+1] = w[i] + \mu\,\frac{\operatorname{sign}(\epsilon[i])}{1 + \delta|\epsilon[i]|^2}\, u[i]$    (14)

which is called the standard robust hyperbolic arctan least absolute difference (SRHAT-LAD) algorithm.
b) When $p = 2$, (13) becomes

$w[i+1] = w[i] + \mu\,\frac{\epsilon[i]}{1 + \delta|\epsilon[i]|^4}\, u[i]$    (15)

which is called the standard robust hyperbolic arctan least mean square (SRHAT-LMS) algorithm.
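A sketch of the SRHATAF error nonlinearity R(ϵ[i]) of (13) is given below; plugged into the generic loop above with p = 1 or p = 2, it yields the SRHAT-LAD update (14) and the SRHAT-LMS update (15). The parameter values are illustrative assumptions.

```python
import numpy as np

def make_R(p=1.0, delta=0.5):
    """SRHATAF gradient function, eq. (13):
    R(e) = |e|^(p-1) * sign(e) / (1 + delta * |e|^(2p))."""
    def R(e):
        return abs(e) ** (p - 1.0) * np.sign(e) / (1.0 + delta * abs(e) ** (2.0 * p))
    return R

srhat_lad = make_R(p=1.0)   # update (14)
srhat_lms = make_R(p=2.0)   # update (15)
# e.g. w, eps = sg_adaptive_filter(u, d, M=16, mu=0.01, f=srhat_lms)
```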
A. Normalized Updates of SRHATAF
In this part we introduce normalized update equations for the SRHATAF algorithms with respect to the input signal $u[i]$, in order to improve the performance of the proposed algorithms. We redefine the cost function (12) as

$\phi(\epsilon[i]) = \frac{1}{p\alpha} E\left[\tan^{-1}\!\left(\frac{\alpha|\epsilon[i]|^p}{\|u[i]\|^p}\right)\right]$    (16)

The gradient is given by

$\phi'(\epsilon[i]) = \frac{|\epsilon[i]|^{p-1}\operatorname{sign}(\epsilon[i])}{\|u[i]\|^p + \delta|\epsilon[i]|^{2p}/\|u[i]\|^p}$    (17)

Thus, the SRHAT-NLAD algorithm can be described as

$w[i+1] = w[i] + \mu\,\frac{\operatorname{sign}(\epsilon[i])}{\|u[i]\| + \delta|\epsilon[i]|^2/\|u[i]\|}\, u[i]$    (18)

and the SRHAT-NLMS algorithm as

$w[i+1] = w[i] + \mu\,\frac{\epsilon[i]}{\|u[i]\|^2 + \delta|\epsilon[i]|^4/\|u[i]\|^2}\, u[i]$    (19)

It is worth pointing out that the expression in (19) was also proposed as the Arctan-NLMS (Arc-NLMS) algorithm in [6].
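Because the normalized nonlinearity depends on ‖u[i]‖, it is written below as a standalone loop rather than a plain f(·); this is a sketch of the SRHAT-NLMS update (19), and the small regularizer eps_reg is an added assumption to guard against division by zero.

```python
import numpy as np

def srhat_nlms(u, d, M, mu, delta=0.5, eps_reg=1e-8):
    """SRHAT-NLMS sketch, eq. (19): update gain normalized by ||u[i]||^2."""
    w = np.zeros(M)
    for i in range(M - 1, len(d)):
        ui = u[i - M + 1:i + 1][::-1]
        e = d[i] - w @ ui
        norm2 = ui @ ui + eps_reg                      # ||u[i]||^2
        w = w + mu * e / (norm2 + delta * e ** 4 / norm2) * ui
    return w
```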
[Figure: cost function J(w) and gradient function F_p(ϵ[i]) of the FRATAF family for p = 1, ..., 5, plotted against the error ϵ[i].]
2. Framework of cost function of RHATAF
Alternatively, the cost function defined in (11) can also be defined as

$\phi(\epsilon[i]) = E\left[\left|\tan^{-1}(\delta\epsilon[i])\right|^p\right]$    (20)

Thus, a stochastic gradient adaptive filtering algorithm based on (20) can be derived as

$w[i+1] = w[i] - \mu\,\frac{\partial}{\partial w[i]}\phi(\epsilon[i])$

$w[i+1] = w[i] + \mu F(\epsilon[i])\, u[i]$    (21)

$F(\epsilon[i]) = \frac{\left|\tan^{-1}(\delta\epsilon[i])\right|^{p-1}\operatorname{sign}\!\left(\tan^{-1}(\delta\epsilon[i])\right)}{1 + \delta|\epsilon[i]|^2}$

Equivalently, (21) can be rewritten as

$w[i+1] = w[i] + \eta[i]\tan^{-1}(\delta\epsilon[i])\, u[i]$    (22)

$\eta[i] = \mu\,\frac{\left|\tan^{-1}(\delta\epsilon[i])\right|^{p-2}}{1 + \delta|\epsilon[i]|^2}$

Some observations on (21):

When $p = 1$, (21) becomes

$w[i+1] = w[i] + \frac{\mu}{1 + \delta|\epsilon[i]|^2}\operatorname{sign}\!\left(\tan^{-1}(\delta\epsilon[i])\right) u[i]$    (23)

which is called the robust hyperbolic arctan least absolute difference (RHAT-LAD) algorithm.

When $p = 2$, (21) becomes

$w[i+1] = w[i] + \mu\,\frac{\tan^{-1}(\delta\epsilon[i])}{1 + \delta|\epsilon[i]|^2}\, u[i]$    (24)

which is called the robust hyperbolic arctan least mean square (RHAT-LMS) algorithm.
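Analogously, here is a sketch of the RHATAF nonlinearity F(ϵ[i]) of (21), usable with the generic loop above; p = 1 gives the RHAT-LAD update (23) and p = 2 gives the RHAT-LMS update (24). Parameter values are again assumptions.

```python
import numpy as np

def make_F(p=1.0, delta=0.5):
    """RHATAF gradient function, eq. (21):
    F(e) = |atan(delta*e)|^(p-1) * sign(atan(delta*e)) / (1 + delta*|e|^2)."""
    def F(e):
        t = np.arctan(delta * e)
        return abs(t) ** (p - 1.0) * np.sign(t) / (1.0 + delta * e ** 2)
    return F

rhat_lad = make_F(p=1.0)    # update (23)
rhat_lms = make_F(p=2.0)    # update (24)
```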


However, for simplicity and ease of analysis, the algorithm expression in (22) can be related to the hyperbolic tanh(·) algorithm introduced in [] by using the following relation:

$\tan^{-1}(x[i]) \le \frac{\pi}{2}\tanh(x[i])$    (25)

Both sides of (25) become equal as the amplitude of $x$ tends to large values, which is very convenient when the analysis is performed under the assumption of an impulsive noise environment. Thus, applying (25) to (22), we obtain

$w[i+1] = w[i] + \eta[i]\tanh(\delta[i]\epsilon[i])\, u[i]$    (26)

$\eta[i] = \mu\,\frac{\pi}{2}\,\frac{\left|\tanh(\delta\epsilon[i])\right|^{p-2}}{1 + \delta[i]|\epsilon[i]|^2}$

$\delta[i+1] = \delta[i]\gamma + (1-\gamma)e^{-\delta^2[i]}\delta[i]$

It is easy to notice that the expression in (26) is similar to the tanh algorithm of [], with a variable step-size equal to $\eta[i]$. The transient and steady-state performance of the tanh algorithm has been studied under Gaussian and non-Gaussian noise [10], including its stabilization for the case of large initial weight errors.
Remark 1

Based on the behavior depicted in Fig. 1, it is easy to note that when $|\epsilon[i]| \to \infty$, both terms $R(\epsilon[i])$ and $F(\epsilon[i])$ tend to 0, regardless of the value of the power $p$. Thus, for large error values, e.g., under heavy-tailed noise, the filter coefficients change very little. This demonstrates the robustness of the FRHATAF algorithms against impulsive noise and outliers that cause high error peaks.
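This vanishing-gradient behavior is easy to verify numerically; in the following self-contained check, the choices p = 2 and δ = 0.5 are arbitrary.

```python
import numpy as np

R2 = lambda e: abs(e) / (1.0 + 0.5 * abs(e) ** 4)          # R(e) of (13), p = 2
F2 = lambda e: np.arctan(0.5 * e) / (1.0 + 0.5 * e ** 2)   # F(e) of (21), p = 2
for e in (0.1, 1.0, 10.0, 1e3, 1e6):
    print(f"{e:>9.1e}  R={R2(e):.3e}  F={F2(e):.3e}")
# Both gradients decay toward zero as |e| grows, so a single impulsive
# error sample produces only a negligible coefficient update.
```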
Remark 2
Through the proposed SRHATAF algorithms we can, in general, improve the convergence behavior associated with different conventional objective functions. In environments where algorithms based on higher-order error powers, e.g., the LMS and LMF algorithms, converge poorly (impulsive noise environments), the corresponding SRHAT-LAD and SRHAT-LMS algorithms remain robust against impulsive noise: SRHAT-LAD adds robustness relative to the LMS algorithm, and SRHAT-LMS overcomes the stability issues of the LMF algorithm. Moreover, with a well-optimized $\delta$, a further improvement in convergence performance can be obtained.

Performance analysis
We study the convergence conditions of the filter coefficient vector $w[i]$, i.e., mean convergence and mean-square convergence, and the excess mean-square error (EMSE) of the adaptive filter, and finally determine the range of step-size values over which the coefficient error vector remains bounded. All analysis is performed under the assumption of a white Gaussian input environment.
Statistical Assumptions

Assumption 1. The input regressor $u[i]$ is a zero-mean independent and identically distributed (i.i.d.) Gaussian random vector, and the perturbation noise $n[i]$ is also a zero-mean i.i.d. Gaussian random variable, uncorrelated with all $u[i]$.

Assumption 2. The a priori error $\epsilon_a[i]$ is assumed to be Gaussian, and to be jointly Gaussian with the weighted a priori error $\epsilon_a^{\Sigma}[i]$ for any constant matrix $\Sigma$, where $\epsilon_a^{\Sigma}[i] = \tilde{w}^H \Sigma\, u[i]$ and $\Sigma$ is a symmetric positive-definite weighting matrix. This assumption holds for long filter lengths $M$ and sufficiently small step-sizes $\mu$. Also, $\epsilon_a[i]$ is independent of $n[i]$.

Assumption 3. We assume that there exists an optimal solution vector $w^o$ such that the desired signal can be written as $d[i] = u^T[i]\,w^o + n[i]$.

In this work we perform the analysis only for the standard FRHATAF.


Transient Mean Behavior
In this work we conduct the analysis based on the Taylor series expansion approach for the function $f(\cdot)$. It is important to note that, throughout the analysis, we use the generalized function $f(\epsilon[i])$ to represent both the $R(\epsilon[i])$ and $F(\epsilon[i])$ functions in (13) and (21), respectively.

Definition: An adaptive filter with weight coefficient error vector $\tilde{w}[i]$ of length $M$ is said to be mean convergent if $\lim_{i\to\infty} E\{\tilde{w}[i]\} = 0_{M\times 1}$ [1, 21].

Based on the general update equation of the stochastic gradient algorithm defined in (10), and using the definition of $\tilde{w}[i]$ in (7b), it follows that

$\tilde{w}[i+1] = \tilde{w}[i] - \mu f(\epsilon[i])\, u[i]$    (27)

Mean Behavior of SRHATAF

Based on the update equation (13), (27) can be written as

$\tilde{w}[i+1] = \tilde{w}[i] - \mu\,\frac{|\epsilon[i]|^{p-1}\operatorname{sign}(\epsilon[i])}{1 + \delta|\epsilon[i]|^{2p}}\, u[i]$    (28)

Mean Behavior of SRHAT-LAD

Taking the statistical expectation of (28) with $p = 1$, we have

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu E\left[\frac{\operatorname{sign}(\epsilon[i])}{1 + \delta|\epsilon[i]|^2}\, u[i]\right]$    (29)

If we assume that $\epsilon[i]$ is a zero-mean Gaussian variable,

$E\left[\frac{\operatorname{sign}(\epsilon[i])}{1 + \delta|\epsilon[i]|^2}\, u[i]\right] = E\left[\frac{1}{1 + \delta|\epsilon[i]|^2}\right] E\left[\operatorname{sign}(\epsilon[i])\, u[i]\right]$

The first factor on the right-hand side of the last expression can be evaluated as

$E\left[\frac{1}{1 + \delta|\epsilon[i]|^2}\right] = \frac{1}{\sqrt{2\pi\delta}\,\sigma_\epsilon}\int_{-\infty}^{\infty}\frac{1}{1+x^2}\exp(-\lambda x^2)\, dx$    (30)

where $x = \sqrt{\delta}\,\epsilon[i]$ and $\lambda = \frac{1}{2\delta\sigma_\epsilon^2}$. The integral in (30) is solved exactly in [11] as

$\beta_1 = E\left[\frac{1}{1 + \delta|\epsilon[i]|^2}\right] = \sqrt{\pi\lambda}\exp(\lambda)\left[1 - \operatorname{erf}(\sqrt{\lambda})\right]$    (31)

where $\operatorname{erf}(\sqrt{\lambda}) = \frac{2}{\sqrt{\pi}}\int_0^{\sqrt{\lambda}}\exp(-t^2)\, dt$. The second factor can be evaluated as

$E\left[\operatorname{sign}(\epsilon[i])\, u[i]\right] = \sqrt{\frac{2}{\pi}}\frac{1}{\sigma_\epsilon}\, R_u\, E[\tilde{w}[i]]$    (32)

Substituting (31) and (32) into (29), we obtain

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu\sqrt{\frac{2}{\pi}}\frac{1}{\sigma_\epsilon}\beta_1 R_u\, E[\tilde{w}[i]] = \left(I - \mu\sqrt{\frac{2}{\pi}}\frac{1}{\sigma_\epsilon}\beta_1 R_u\right) E[\tilde{w}[i]]$    (33)

Now it is easy to see that $E[\tilde{w}[i]]$ converges if we choose the step-size to satisfy

$0 < \mu \le \frac{\sqrt{2\pi}\,\sigma_\epsilon}{\beta_1\,\lambda_{\max}(R_u)}$    (34)
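As a worked numerical check of (31) and (34) (a sketch; the values of σ_ϵ, δ, and λ_max(R_u) are assumed), β₁ and the resulting step-size bound can be computed with the standard error function. Note that exp(λ) overflows for very large λ, so this direct form is only suitable for moderate λ.

```python
import math

def beta1(lam):
    """beta_1 = sqrt(pi*lam) * exp(lam) * (1 - erf(sqrt(lam))), eq. (31)."""
    return math.sqrt(math.pi * lam) * math.exp(lam) * (1.0 - math.erf(math.sqrt(lam)))

sigma_eps, delta, lam_max = 0.5, 0.5, 2.0          # assumed values
lam = 1.0 / (2.0 * delta * sigma_eps ** 2)         # lambda = 1/(2*delta*sigma_eps^2)
mu_max = math.sqrt(2.0 * math.pi) * sigma_eps / (beta1(lam) * lam_max)  # bound (34)
print(beta1(lam), mu_max)                          # beta_1 ~ 0.906, mu_max ~ 0.69
```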

Mean Behavior of SRHAT-LMS

Taking the statistical expectation of (28) with $p = 2$, we have

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu E\left[\frac{\epsilon[i]}{1 + \delta|\epsilon[i]|^4}\, u[i]\right]$    (35)

It is known that as $i \to \infty$, $E[|\epsilon[i]|^2] \to \sigma_n^2$, which means that $E[|\epsilon[i]|^4]$ converges to a value on the order of the square of the noise variance, $(\sigma_n^2)^2$. Thus, in this case:

- If the noise is Gaussian with small variance, (35) can be approximated by

$E\left[\frac{\epsilon[i]}{1 + \delta|\epsilon[i]|^4}\, u[i]\right] \approx E\left[\epsilon[i]\, u[i]\right]$    (36a)

- If the noise is non-Gaussian with a heavy-tailed distribution, (35) can be approximated by

$E\left[\frac{\epsilon[i]}{1 + \delta|\epsilon[i]|^4}\, u[i]\right] \approx E\left[\operatorname{sech}^2(\delta\epsilon[i])\,\epsilon[i]\, u[i]\right]$    (36b)

where in the latter case, for large noise amplitudes, $E\left[\frac{1}{1 + \delta|\epsilon[i]|^4}\right] \approx E\left[\operatorname{sech}^2(\delta\epsilon[i])\right] \approx 0$.

Therefore, using the results in (36a) and (36b), (35) is approximated by

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu\,\psi[i]\, R_u\, E[\tilde{w}[i]] = \left(I - \mu\,\psi[i]\, R_u\right) E[\tilde{w}[i]]$    (37)

Now $E[\tilde{w}[i]]$ converges if we choose the step-size to satisfy

$0 < \mu \le \frac{1}{\psi[i]\,\lambda_{\max}(R_u)}$    (38)

where $\psi[i] = E[\operatorname{sech}^2(\delta\epsilon[i])]$ under the non-Gaussian noise assumption, and $\psi[i] = 1$ under the Gaussian noise assumption.

Mean Behavior of RHATAF

Based on the update equation (26), (27) can be written as

$\tilde{w}[i+1] = \tilde{w}[i] - \mu_t\,\frac{\left|\tanh(\delta\epsilon[i])\right|^{p-1}\operatorname{sign}\!\left(\tanh(\delta\epsilon[i])\right)}{1 + \delta|\epsilon[i]|^2}\, u[i]$    (39)

where $\mu_t = \frac{\pi}{2}\mu$.
Mean Behavior of RHAT-LAD
Taking the statistical expectation of (39) with $p = 1$, we have

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu_t E\left[\frac{\operatorname{sign}\!\left(\tanh(\delta\epsilon[i])\right)}{1 + \delta|\epsilon[i]|^2}\, u[i]\right]$    (40)

To facilitate the analysis, we use the fact that

$\operatorname{sign}\!\left(\tanh(\delta\epsilon[i])\right) = \operatorname{sign}(\delta\epsilon[i])$

Thus, (40) becomes

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu_t E\left[\frac{1}{1 + \delta|\epsilon[i]|^2}\operatorname{sign}(\delta\epsilon[i])\, u[i]\right]$    (41)

The right-hand side of (41) is evaluated exactly as in (31) and (32), which yields

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu_t\sqrt{\frac{2}{\pi}}\frac{1}{\sigma_\epsilon}\beta_1 R_u\, E[\tilde{w}[i]] = E[\tilde{w}[i]] - \mu\sqrt{\frac{\pi}{2}}\frac{1}{\sigma_\epsilon}\beta_1 R_u\, E[\tilde{w}[i]] = \left(I - \mu\sqrt{\frac{\pi}{2}}\frac{1}{\sigma_\epsilon}\beta_1 R_u\right) E[\tilde{w}[i]]$    (42)

Now $E[\tilde{w}[i]]$ converges if we choose the step-size to satisfy

$0 < \mu \le \sqrt{\frac{8}{\pi}}\,\frac{\sigma_\epsilon}{\beta_1\,\lambda_{\max}(R_u)}$    (43)
Mean Behavior of RHAT-LMS
Taking the statistical expectation of (39) with $p = 2$, we have

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu_t E\left[\frac{\tanh(\delta\epsilon[i])}{1 + \delta|\epsilon[i]|^2}\, u[i]\right]$    (44)

To evaluate (44), we use the Taylor series expansion of the nonlinear function $g(\epsilon[i])$ with respect to $\epsilon_a[i]$ around the noise value $n[i]$, which can be written as [9, 20]

$g(\epsilon[i]) = g(\epsilon_a[i] + n[i]) = g(n[i]) + g'(n[i])\,\epsilon_a[i] + \frac{1}{2}g''(n[i])\,(\epsilon_a[i])^2 + O(\epsilon_a^3[i])$    (45)

$E\left[(g(\epsilon[i]))^2\right] \approx E\left[(g(n[i]))^2\right] + E\left[\left(g''(n[i])\,g(n[i]) + |g'(n[i])|^2\right)\epsilon_a^2[i]\right]$    (46)

where $g'(n[i])$ and $g''(n[i])$ are the first- and second-order derivatives of the function $g(\cdot)$, and $O(\epsilon_a^3[i])$ denotes the third- and higher-order terms. The Taylor expansion in (45) applies to the class of functions $g(\cdot)$ that are differentiable up to third order over all $\epsilon[i]$ [9].

Thus, letting $g(\epsilon[i]) = \tanh(\delta\epsilon[i])$ in (44) and applying (45), we obtain

$E[\tilde{w}[i+1]] = E[\tilde{w}[i]] - \mu_t E\left[\delta\operatorname{sech}^2(\delta n[i])\right]\beta_1 R_u\, E[\tilde{w}[i]] = \left(I - \mu_t E\left[\delta\operatorname{sech}^2(\delta n[i])\right]\beta_1 R_u\right) E[\tilde{w}[i]]$    (47)

Mean convergence for different noise distributions can be evaluated from (47). For example, if the noise distribution is known and uniform over the range $[-b, b]$, (47) evaluates as [10]

$E[\tilde{w}[i+1]] = \left(I - \mu_t\,\frac{\tanh(\delta b)}{b}\,\beta_1 R_u\right) E[\tilde{w}[i]]$    (48)

Now $E[\tilde{w}[i]]$ converges if we choose the step-size to satisfy

$0 < \mu \le \frac{4b}{\pi\tanh(\delta b)\,\beta_1\,\lambda_{\max}(R_u)}$    (49)

Mean-square Convergence Behavior

In order to evaluate the mean-square convergence of the FRHATAF algorithms, we take the expression determined in (27), multiply it by its transpose, ignore terms of order $(\epsilon_a[i])^2$, invoke Assumptions 1, 2 and 3, and apply the expectation operator, obtaining

$E\left[\tilde{w}[i+1]\tilde{w}^T[i+1]\right] = E\left\{\left(\tilde{w}[i] - \mu f(n[i])u[i]\right)\left(\tilde{w}[i] - \mu f(n[i])u[i]\right)^T\right\}$

$K[i+1] = K[i] - \mu E\left[\epsilon_a[i] f(n[i])\right] - \mu E\left[f(n[i])\epsilon_a[i]\right] + \mu^2 E\left[\|u[i]\|^2\right] E\left[f^2(n[i])\right]$    (50)

where $K[i+1] = E\left[\tilde{w}[i+1]\tilde{w}^T[i+1]\right]$.
Mean-square Convergence of SRHAT-LAD
To derive the mean-square deviation (MSD) of SRHAT-LAD, we recall (14); we have

$E\left[f(n[i])\epsilon_a[i]\right] = E\left[\frac{1}{1 + \delta|\epsilon[i]|^2}\operatorname{sign}(\delta\epsilon[i])\,\epsilon_a[i]\right] = \sqrt{\frac{2}{\pi}}\frac{1}{\sigma_\epsilon}\beta_1 R_u K[i]$    (51)

$E\left[f^2(n[i])\right] = E\left[\left(\frac{\operatorname{sign}(\epsilon[i])}{1 + \delta|\epsilon[i]|^2}\right)^2\right] = \frac{1}{\sigma_\epsilon^2} E\left[\frac{|\epsilon[i]|^2}{\left(1 + \delta|\epsilon[i]|^2\right)^2}\right]$    (52)

$\frac{1}{\sigma_\epsilon^2} E\left[\frac{|\epsilon[i]|^2}{\left(1 + \delta|\epsilon[i]|^2\right)^2}\right] = \frac{1}{\sigma_\epsilon^2} E\left[-\frac{\partial}{\partial\delta}\frac{1}{1 + \delta|\epsilon[i]|^2}\right] = -\frac{1}{\sigma_\epsilon^2}\frac{\partial}{\partial\delta} E\left[\frac{1}{1 + \delta|\epsilon[i]|^2}\right]$

where the interchange of differentiation and expectation in the last step uses the property in [11], since $\vartheta(\epsilon[i],\delta) \triangleq \frac{1}{1 + \delta|\epsilon[i]|^2}$ and $\frac{\partial\vartheta(\epsilon[i],\delta)}{\partial\delta}$ are both continuous in $\mathbb{R}^2$. Thus, using the result in (31), and after some mathematical manipulations, (52) becomes

$E\left[f^2(n[i])\right] = (2\lambda + 1)\left\{\lambda\sqrt{\pi\lambda}\exp(\lambda)\left[1 - \operatorname{erf}(\sqrt{\lambda})\right]\right\} - 2\lambda^2$    (53)

Now substituting the results in (51) and (53) into (50), we obtain the MSD recursion of SRHAT-LAD as

$K[i+1] = \left(1 - 2\mu\sqrt{\frac{2}{\pi}}\frac{1}{\sigma_\epsilon}\beta_1 R_u\right)K[i] + \mu^2 M\sigma_u^2\left[(2\lambda + 1)\left\{\lambda\sqrt{\pi\lambda}\exp(\lambda)\left[1 - \operatorname{erf}(\sqrt{\lambda})\right]\right\} - 2\lambda^2\right]$    (54)
Mean-square Convergence of SRHAT-LMS
To derive the mean-square deviation (MSD) of SRHAT-LMS, we recall (15); we have

$E\left[f(n[i])\epsilon_a[i]\right] = E\left[\frac{\epsilon[i]}{1 + \delta|\epsilon[i]|^4}\,\epsilon_a[i]\right] \approx \psi[i]\, R_u K[i]$    (55)

where $\psi[i]$ is defined as in (38).

$E\left[f^2(n[i])\right] = E\left[\frac{|\epsilon[i]|^2}{\left(1 + \delta|\epsilon[i]|^4\right)^2}\right]$    (56)

To evaluate (56), we consider the following cases:

- Under the assumption of non-Gaussian (heavy-tailed) noise, (56) can be evaluated as

$E\left[f^2(n[i])\right] = E\left[\frac{|\epsilon[i]|^2}{\left(1 + \delta|\epsilon[i]|^4\right)^2}\right] \le E\left[\sqrt{\frac{4}{\pi}}\operatorname{sech}^2(\delta\epsilon[i])\right] E\left[|\epsilon[i]|^2\right]$    (57a)

- Under the assumption of Gaussian noise (with small variance), (56) can be evaluated as

$E\left[f^2(n[i])\right] = E\left[\frac{|\epsilon[i]|^2}{\left(1 + \delta|\epsilon[i]|^4\right)^2}\right] \approx E\left[|\epsilon[i]|^2\right]$    (57b)

Now substituting the results in (55) and (57a)/(57b) into (50), the MSD recursion of SRHAT-LMS follows as

$K[i+1] = \left(1 - 2\mu\,\psi[i] R_u\right)K[i] + \mu^2 M\sigma_u^2\, E\left[f^2(n[i])\right]$    (58)

with $E\left[f^2(n[i])\right]$ given by (57a) or (57b).

Mean-square Convergence of RHAT-LAD

To derive the mean-square deviation (MSD) of RHAT-LAD, we recall (26) with $p = 1$; we have

$E\left[f(n[i])\epsilon_a[i]\right] = E\left[\frac{1}{1 + \delta|\epsilon[i]|^2}\operatorname{sign}(\delta\epsilon[i])\,\epsilon_a[i]\right] = \sqrt{\frac{2}{\pi}}\frac{1}{\sigma_\epsilon}\beta_1 R_u K[i]$

$E\left[f^2(n[i])\right] = E\left[\left(\frac{1}{1 + \delta|\epsilon[i]|^2}\right)^2\right] = (2\lambda + 1)\left\{\lambda\sqrt{\pi\lambda}\exp(\lambda)\left[1 - \operatorname{erf}(\sqrt{\lambda})\right]\right\} - 2\lambda^2$

Thus, the MSD recursion of RHAT-LAD is defined as

$K[i+1] = \left(1 - 2\mu\sqrt{\frac{\pi}{2}}\frac{1}{\sigma_\epsilon}\beta_1 R_u\right)K[i] + \mu^2\left(\frac{\pi}{2}\right)^2 M\sigma_u^2\left[(2\lambda + 1)\left\{\lambda\sqrt{\pi\lambda}\exp(\lambda)\left[1 - \operatorname{erf}(\sqrt{\lambda})\right]\right\} - 2\lambda^2\right]$    (59)
Mean-square Convergence of RHAT-LMS
To derive the mean-square deviation (MSD) of RHAT-LMS, we recall (26) with $p = 2$; we have

$E\left[f(n[i])\epsilon_a[i]\right] = E\left[\frac{\tanh(\delta\epsilon[i])}{1 + \delta|\epsilon[i]|^2}\,\epsilon_a[i]\right] = E\left[\delta\operatorname{sech}^2(\delta n[i])\right]\beta_1 R_u K[i]$    (60)

The expectation in (60) was evaluated in (47).

$E\left[f^2(n[i])\right] = E\left[\left(\frac{1}{1 + \delta|\epsilon[i]|^2}\right)^2\right] E\left[\left(\tanh(\delta\epsilon[i])\right)^2\right]$    (61)

To evaluate (61), let $g(\epsilon[i]) = \tanh(\delta\epsilon[i])$ and apply (46); we obtain

$E\left[\tanh^2(\delta\epsilon[i])\right] = E\left[\delta\tanh^2(\delta n[i])\right] + E\left[\delta^2\operatorname{sech}^4(\delta n[i]) - \delta^2\tanh^2(\delta n[i])\operatorname{sech}^2(\delta n[i])\right] R_u K[i]$

Thus,

$E\left[f^2(n[i])\right] = E\left[\delta\tanh^2(\delta n[i])\right] + E\left[\delta^2\operatorname{sech}^4(\delta n[i]) - \delta^2\tanh^2(\delta n[i])\operatorname{sech}^2(\delta n[i])\right] R_u K[i]$    (62)

Collecting the above results, the MSD recursion of RHAT-LMS becomes

$K[i+1] = \left[1 - 2\mu\frac{\pi}{2} E\left[g'(\epsilon[i])\right]\beta_1 R_u\right]K[i] + \mu^2\left(\frac{\pi}{2}\right)^2\beta_2\, E\left[g''(\epsilon[i])g(\epsilon[i]) + |g'(\epsilon[i])|^2\right]\left[R_u \otimes R_u\right]K[i] + \mu^2\left(\frac{\pi}{2}\right)^2\beta_2\, E\left[g^2(n[i])\right] M\sigma_u^2$    (63)

where

$\beta_1 = \sqrt{\pi\lambda}\exp(\lambda)\left[1 - \operatorname{erf}(\sqrt{\lambda})\right]$

$\beta_2 = (2\lambda + 1)\left\{\lambda\sqrt{\pi\lambda}\exp(\lambda)\left[1 - \operatorname{erf}(\sqrt{\lambda})\right]\right\} - 2\lambda^2$

$E\left[g'(\epsilon[i])\right] = E\left[\delta\operatorname{sech}^2(\delta n[i])\right]$

$E\left[g''(\epsilon[i])g(\epsilon[i]) + |g'(\epsilon[i])|^2\right] = E\left[\delta\tanh^2(\delta n[i])\right] + E\left[\delta^2\operatorname{sech}^4(\delta n[i]) - \delta^2\tanh^2(\delta n[i])\operatorname{sech}^2(\delta n[i])\right]$

[1] A. H. Sayed, Adaptive filters. John Wiley & Sons, 2011.


[2] E. Eweda, N. J. Bershad, and J. C. M. Bermudez, "Stochastic analysis of the LMS and NLMS
algorithms for cyclostationary white Gaussian and non-Gaussian inputs," IEEE Transactions
on Signal Processing, vol. 66, no. 18, pp. 4753-4765, 2018.
[3] H.-C. Huang and J. Lee, "A new variable step-size NLMS algorithm and its performance
analysis," IEEE Transactions on Signal Processing, vol. 60, no. 4, pp. 2055-2060, 2011.
[4] E. Walach and B. Widrow, "The least mean fourth (LMF) adaptive algorithm and its family," IEEE Transactions on Information Theory, vol. 30, no. 2, pp. 275-283, 1984.
[5] S. P. Talebi, S. Werner, and D. P. Mandic, "Distributed adaptive filtering of α-stable signals," IEEE Signal Processing Letters, vol. 25, no. 10, pp. 1450-1454, 2018.
[6] W. Gao and J. Chen, "Transient analysis of signed LMS algorithms with cyclostationary
colored Gaussian inputs," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67,
no. 12, pp. 3562-3566, 2020.
[7] E. Eweda, "Transient performance degradation of the LMS, RLS, sign, signed regressor, and
sign-sign algorithms with data correlation," IEEE Transactions on Circuits and Systems II:
Analog and Digital Signal Processing, vol. 46, no. 8, pp. 1055-1062, 1999.
[8] J. Zhao and H. Zhang, "Kernel recursive generalized maximum correntropy," IEEE Signal
Processing Letters, vol. 24, no. 12, pp. 1832-1836, 2017.
[9] B. Chen, L. Xing, B. Xu, H. Zhao, N. Zheng, and J. C. Principe, "Kernel risk-sensitive loss:
definition, properties and application to robust adaptive filtering," IEEE Transactions on
Signal Processing, vol. 65, no. 11, pp. 2888-2901, 2017.
[10] S. Wang, W. Wang, K. Xiong, H. H. Iu, and K. T. Chi, "Logarithmic hyperbolic cosine adaptive
filter and its performance analysis," IEEE Transactions on Systems, Man, and Cybernetics:
Systems, 2019.
[11] M. O. Sayin, N. D. Vanli, and S. S. Kozat, "A novel family of adaptive filtering algorithms based
on the logarithmic cost," IEEE Transactions on signal processing, vol. 62, no. 17, pp. 4411-
4424, 2014.
[12] P. Petrus, "Robust Huber adaptive filter," IEEE Transactions on Signal Processing, vol. 47, no.
4, pp. 1129-1133, 1999.
[13] B. Chen, L. Xing, H. Zhao, N. Zheng, and J. C. Principe, "Generalized correntropy for robust adaptive filtering," IEEE Transactions on Signal Processing, vol. 64, no. 13, pp. 3376-3387, 2016.
[14] A. C. Singer, S. S. Kozat, and M. Feder, "Universal linear least squares prediction: Upper and
lower bounds," IEEE Transactions on Information Theory, vol. 48, no. 8, pp. 2354-2362, 2002.
[15] S. S. Kozat and A. C. Singer, "Universal switching linear least squares prediction," IEEE
Transactions on Signal Processing, vol. 56, no. 1, pp. 189-204, 2007.
[16] C. Liu and M. Jiang, "Robust adaptive filter with lncosh cost," Signal Processing, vol. 168, p.
107348, 2020.
[17] S. Zhang, W. X. Zheng, J. Zhang, and H. Han, "A family of robust M-shaped error weighted least mean square algorithms: performance analysis and echo cancellation application," IEEE Access, vol. 5, pp. 14716-14727, 2017.
[18] O. Arikan, A. E. Cetin, and E. Erzin, "Adaptive filtering for non-Gaussian stable processes,"
IEEE Signal Processing Letters, vol. 1, no. 11, pp. 163-165, 1994.
[19] W. A. Gardner, Cyclostationarity in Communications and Signal Processing. Yountville, CA: Statistical Signal Processing, Inc., 1994.
[20] S. C. Douglas and T.-Y. Meng, "Stochastic gradient adaptation under general error criteria," IEEE Transactions on Signal Processing, vol. 42, no. 6, pp. 1335-1351, 1994.
[21] N. Shlezinger and K. Todros, "Performance analysis of LMS filters with non-Gaussian
cyclostationary signals," Signal Processing, vol. 154, pp. 260-271, 2019.
