Paper_IWSHM

This document compares two methodologies for online data normalization in structural health monitoring (SHM): multiple linear regression followed by principal component analysis (MLR-PCA) and cointegration (COI). Both methods were applied to a prestressed concrete cable-stayed bridge using 3½ years of continuous data, demonstrating robust results and reasonable sensitivity to damage. Performance indicators such as false positives and damage sensitivity ratios were evaluated, showing that both methodologies effectively distinguish between damaged and undamaged states when trained adequately.


Title: SHM based damage detection using cointegration and linear multivariate data analysis: performance comparison based on a real case study

Authors: Emanuel Sousa Tomé, Mário Pimentel, Joaquim Figueiras

ABSTRACT

Two alternative methodologies for online data normalization are described and
compared: multiple linear regression followed by principal component analysis
(MLR-PCA) and cointegration (COI). While the former has been used for some time in
the scope of SHM, the latter was only recently introduced to the analysis of SHM data.
In both cases the statistical classification is performed using the Hotelling T² statistic.
The developed algorithms are applied to a prestressed concrete cable-stayed bridge
for which 3½ years of continuous data are available. Three performance indicators are
used to compare the two methodologies: one is the number of false positives (incorrectly
predicted damage events) and the other two are related to the sensitivity to damage.
Several damage scenarios involving small section loss in the stay-cables are simulated by
corrupting the measured (real) time series with the structural response to the damage
events obtained from a finite element model of the bridge. It is shown that both
methodologies can provide robust results and reasonable sensitivity to damage.

INTRODUCTION

Notwithstanding the large number of civil engineering structures equipped with
structural health monitoring (SHM) systems, few cases can be found where damage
identification methods have been applied and validated jointly with a quantitative
evaluation of the damage intensities that can be signaled. This quantification is relevant
for infrastructure managers and provides the means to transform the data gathered by
SHM systems into useful information for supporting the decision-making process
related to the maintenance and conservation strategies.
Data-based approaches for damage detection usually rely on two steps: structural
response modelling and unsupervised statistical classification. In the first step, the
normal structural response due to the normal environmental and operational variations
(EOVs) is removed from the data, a process designated as data normalization. Only
afterwards can statistical classification tools be applied. This paper discusses two distinct
approaches for data normalization of SHM data.
_____________
Emanuel Sousa Tomé, CONSTRUCT-LABEST, University of Porto, Faculty of Engineering.
Mário Pimentel, CONSTRUCT-LABEST, University of Porto, Faculty of Engineering.
Joaquim Figueiras, CONSTRUCT-LABEST, University of Porto, Faculty of Engineering.
DATA NORMALIZATION ALGORITHMS

MULTILINEAR REGRESSION AND PRINCIPAL COMPONENT ANALYSIS (MLR-PCA)

The multilinear regression (MLR) model is expressed by [1]:

ˆ E
Y  XU (1)
MLR

where Y is an n-by-m matrix of the dependent variables, n being the number of
observations and m the number of dependent variables, X is an n-by-(r+1) matrix with
the corresponding n values of the r selected predictor variables, Û is an (r+1)-by-m
matrix with the estimated model parameters, and E_MLR is an n-by-m matrix with the
residuals of the MLR model. The estimates of the model parameters (Û) are obtained
through the least squares method and are given by:

$$\hat{\mathbf{U}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y} \qquad (2)$$

The predictors having an absolute value of the Pearson correlation coefficient with
the dependent variable below a pre-established threshold are not used in the model.
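
The screening and fitting steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `fit_mlr` and `mlr_residuals` are hypothetical, the column of ones for the intercept is an assumption suggested by the (r+1) dimension of X, and the Pearson screening is assumed to be applied separately for each dependent variable.

```python
import numpy as np

def fit_mlr(X_raw, Y, threshold=0.4):
    """Least-squares fit in the spirit of Eq. (2), with Pearson-based screening.

    X_raw: (n, r) predictors; Y: (n, m) dependent variables.
    Returns U_hat with shape (r + 1, m); coefficients of screened-out
    predictors are left at zero (an assumption of this sketch).
    """
    n, r = X_raw.shape
    m = Y.shape[1]
    U_hat = np.zeros((r + 1, m))
    for j in range(m):
        # Pearson correlation of each predictor with dependent variable j
        corr = np.array([np.corrcoef(X_raw[:, i], Y[:, j])[0, 1] for i in range(r)])
        keep = np.flatnonzero(np.abs(corr) >= threshold)
        # Design matrix: intercept column plus the retained predictors
        X = np.column_stack([np.ones(n), X_raw[:, keep]])
        coef, *_ = np.linalg.lstsq(X, Y[:, j], rcond=None)
        U_hat[0, j] = coef[0]
        U_hat[keep + 1, j] = coef[1:]
    return U_hat

def mlr_residuals(X_raw, Y, U_hat):
    """Residuals E_MLR = Y - X U_hat of Eq. (1)."""
    X = np.column_stack([np.ones(len(X_raw)), X_raw])
    return Y - X @ U_hat
```

`np.linalg.lstsq` is used instead of forming the inverse in Eq. (2) explicitly; both give the same estimate when X has full column rank, and the former is numerically safer.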
Principal Component Analysis (PCA) [1] is a statistical method that uses an
orthogonal transformation to convert a set of initially correlated variables into a set of
linearly uncorrelated variables. Considering an n-by-m matrix Y with the original
variables, where m is the number of sensors and n is the number of observations in time,
the m uncorrelated principal component scores, Z , are determined by:

$$\mathbf{Z} = \mathbf{Y}\,\mathbf{T} \qquad (3)$$

where T is the m-by-m orthonormal transformation matrix. The covariance matrix of
the original variables in the training period, Σ, is related to the covariance matrix of the
principal component scores, Λ, by:

$$\boldsymbol{\Sigma} = \mathbf{T}\,\boldsymbol{\Lambda}\,\mathbf{T}^T \qquad (4)$$

The matrices T and Λ are obtained by the singular value decomposition of the
covariance matrix Σ of the original variables. The columns of T are the singular vectors
and the diagonal matrix Λ contains the singular values of the matrix Σ in descending
order. The singular values stored in Λ are the variances of the components of Z.
Moreover, the matrix Λ can be split into a block storing the first p singular values and
a block containing the remaining m-p singular values, which are not relevant to
explain the variability of Y. The scores associated with these remaining components,
𝐙̂, are expected to be insensitive to the EOVs and can be used as damage sensitive
features.
The PCA model can be applied directly to the sensor readings or, as proposed by
Magalhães et al. [2], to the residuals of the MLR model. In the latter case, it is here
designated as MLR-PCA. The MLR model is used to remove the effects of the measured
actions on the bridge, such as the temperature, from the data. The PCA is used to
suppress the environmental and operational effects not removed by the MLR model,
namely the remaining temperature effects and the long-term behaviour due to
rheological effects of the concrete.
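
As a rough sketch of the PCA step applied to the MLR residuals, the snippet below takes the SVD of the training covariance matrix, chooses p from a cumulative variance fraction, and keeps the scores of the remaining m-p components as features. The names `fit_pca` and `minor_scores` are hypothetical, and centring the data on the training mean before projecting is an assumption of this sketch (Eq. (3) does not state it explicitly).

```python
import numpy as np

def fit_pca(E_train, variance_fraction=0.80):
    """Return the training mean, the loadings and variances of the m - p
    discarded (minor) components, and the number p of removed components."""
    mean = E_train.mean(axis=0)
    Sigma = np.cov(E_train - mean, rowvar=False)
    # Sigma is symmetric positive semi-definite, so its SVD yields T and the
    # singular values (component variances) in descending order, as in Eq. (4)
    T, lam, _ = np.linalg.svd(Sigma)
    cumulative = np.cumsum(lam) / np.sum(lam)
    # Smallest p whose cumulative variance reaches the requested fraction
    p = min(int(np.searchsorted(cumulative, variance_fraction)) + 1, len(lam))
    return mean, T[:, p:], lam[p:], p

def minor_scores(E, mean, T_minor):
    """Project (new) data onto the minor components: the damage sensitive features."""
    return (E - mean) @ T_minor
```

New observations are projected with the mean and loadings estimated in the training period, so the features keep the same meaning over time.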

COINTEGRATION (COI)

Cointegration analysis has recently been used as an output-only approach to
suppress the EOVs in SHM [3, 4]. The basic idea behind cointegration is to establish
relationships between non-stationary time series in order to create a stationary residual.
A common way to describe non-stationary processes is by means of the order of
integration. A non-stationary process that becomes stationary after differencing d
times is said to be integrated of order d and is denoted as I(d). The order of integration
can be determined by means of a unit root test, such as the augmented Dickey-Fuller
(ADF) test [5].
A set of M I(1) time series $\mathbf{y}_t = [y_{1,t}, y_{2,t}, \ldots, y_{M,t}]^T$ is cointegrated if there is a linear
combination of them that is stationary, that is, if there is a vector $\boldsymbol{\beta} = [\beta_1, \beta_2, \ldots, \beta_M]^T$
such that:

$$\boldsymbol{\beta}^T \mathbf{y}_t = \beta_1 y_{1,t} + \beta_2 y_{2,t} + \ldots + \beta_M y_{M,t} = z_t \sim I(0) \qquad (5)$$

where z_t are the cointegration residuals. For the time series to be cointegrated, they
must have shared/common trends and the same order of integration [3]. Since y_t is
M-dimensional, there may be N_r ≤ M-1 linearly independent cointegrating vectors and the
cointegration relation given by equation (5) can be generalized to:

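$$\mathbf{B}^T \mathbf{y}_t = \begin{bmatrix} \boldsymbol{\beta}_1^T \mathbf{y}_t \\ \vdots \\ \boldsymbol{\beta}_{N_r}^T \mathbf{y}_t \end{bmatrix} = \begin{bmatrix} z_{1,t} \\ \vdots \\ z_{N_r,t} \end{bmatrix} \sim I(0) \qquad (6)$$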

where the M-by-N_r matrix Β is the cointegration matrix. The maximum-likelihood
multivariate Johansen procedure [6] is adopted to estimate the cointegration vectors. One
important step is the choice of the number of lags k in the underlying Vector
Error Correction Model. In this work, k was chosen using the stationarity-based
approach proposed by Dao et al. [7], which allows the process to be automated.
After the cointegration matrix is determined, only the vectors producing stationary
cointegration residuals are retained and used to project new data into the cointegration
space. These are the damage sensitive features that are used in this model. The
determination of the number of cointegration vectors to retain is made by means of a
likelihood ratio statistic test proposed by Johansen, the trace test [6], for which a
significance level of 5% was adopted. Details about the implemented algorithm can be
found in reference [8].
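
To make the idea behind equation (5) concrete, the toy example below builds two synthetic I(1) series that share a common stochastic trend and shows that a suitable linear combination removes it. Note that this is not the Johansen procedure adopted in the paper: for brevity the cointegrating vector is estimated with a simple least-squares regression (an Engle-Granger-style first step), and all series and parameter values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Common stochastic trend (a random walk, hence I(1)), standing in for an EOV
trend = np.cumsum(rng.normal(size=n))
y1 = 1.0 * trend + rng.normal(scale=0.5, size=n)
y2 = 2.0 * trend + rng.normal(scale=0.5, size=n)

# Regress y2 on y1 (with intercept c) to estimate the cointegrating relation
A = np.column_stack([np.ones(n), y1])
(c, b), *_ = np.linalg.lstsq(A, y2, rcond=None)

# Cointegrating vector beta = [-b, 1] and residual z_t = beta^T y_t - c
beta = np.array([-b, 1.0])
z = beta @ np.vstack([y1, y2]) - c

print(f"estimated slope b = {b:.3f}")
print(f"std of y2 = {np.std(y2):.2f}, std of residual z = {np.std(z):.2f}")
```

The residual z no longer follows the common trend, which is the property exploited when the cointegration residuals are used as damage sensitive features. In practice its stationarity would still have to be confirmed with a unit root test.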

CLASSIFICATION

After the data is normalised, a control chart is used to track abnormal values, which
can be related to the presence of damage. The Hotelling T² statistic condenses all
model features into a scalar indicator, working in this context as a damage indicator:

$$T^2 = r\,(\bar{\mathbf{x}} - \bar{\bar{\mathbf{x}}})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \bar{\bar{\mathbf{x}}}) \qquad (7)$$

where r is the number of observations considered (window size), x̄ is the average of the
observations inside the window, x̿ is the process average when it is in control and S is
the process covariance matrix, the latter two being estimated using the data from the
training period.
The control limits define the accepted process variability. In the present paper, r = 1
was adopted. Therefore, the lower control limit is zero and the upper control limit (UCL)
is computed from:

$$UCL = \frac{m(s+1)(s-1)}{s^2 - s\,m}\, F_{m,\,s-m}(\alpha) \qquad (8)$$

where F_{m,s-m}(α) is the α percentage point of the F distribution with m and s-m
degrees of freedom, m being the number of features and s the number of subgroups (or
windows) collected during the training period. In the MLR-PCA method, m equals the
number of sensors minus the number of principal components extracted from the data.
In the COI method, m is the number of cointegration residuals.
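
A minimal sketch of equations (7) and (8) for r = 1 is given below, assuming NumPy and SciPy are available. The function names are hypothetical, and α is passed as a probability (0.9999 for the 99.99% level used later in the paper), which is an interpretation of the paper's notation rather than something it states.

```python
import numpy as np
from scipy.stats import f

def hotelling_t2(features, mean, S_inv):
    """T^2 of Eq. (7) with r = 1, evaluated row by row.

    features: (n, m) damage sensitive features; mean: (m,) in-control mean;
    S_inv: (m, m) inverse of the training covariance matrix.
    """
    d = features - mean
    return np.einsum("ij,jk,ik->i", d, S_inv, d)

def upper_control_limit(m, s, alpha=0.9999):
    """UCL of Eq. (8) for m features and s training observations."""
    return m * (s + 1) * (s - 1) / (s**2 - s * m) * f.ppf(alpha, m, s - m)

# Usage with synthetic in-control features (values invented for illustration)
rng = np.random.default_rng(1)
train = rng.normal(size=(500, 3))
mean = train.mean(axis=0)
S_inv = np.linalg.inv(np.cov(train, rowvar=False))
t2_new = hotelling_t2(rng.normal(size=(200, 3)), mean, S_inv)
ucl = upper_control_limit(m=3, s=500)
print(f"UCL = {ucl:.2f}, points above UCL: {np.sum(t2_new > ucl)}")
```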

PERFORMANCE CRITERIA

The data normalisation ability of the proposed methodologies is evaluated by
means of the number of false positives, that is, the number of points above the UCL
when the structure is in control and no damage was introduced in the bridge. The
sensitivity to damage is evaluated using two other indicators. The first is the ratio
between the mean values of the T² statistic in the damaged and undamaged states (RU).
The second is the ratio between the mean value of the T² statistic in the damaged state
and the UCL (RLα).

mean Tdamaged
2
 mean Tdamaged
2

RU  RL  (9)
mean Tundamaged
2
 UCL( )

Large RU values mean that there is a clear distinction between the undamaged and
damaged states. Values of RLα >1.0 mean that on average the points are above the UCL.
When the data is not well-normalised, low values of RU and high values of RLα may be
obtained. The best data normalisation methodology will be that simultaneously
providing a lower number of false positives and higher values of both RU and RLα.
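
The three indicators can be computed directly from the T² series, as in the short sketch below. The function name and the toy numbers are purely illustrative.

```python
import numpy as np

def performance_indicators(t2_undamaged, t2_damaged, ucl):
    """Return (percentage of false positives, RU, RL) following Eq. (9)."""
    false_positive_pct = 100.0 * np.mean(t2_undamaged > ucl)
    r_u = np.mean(t2_damaged) / np.mean(t2_undamaged)
    r_l = np.mean(t2_damaged) / ucl
    return false_positive_pct, r_u, r_l

# Toy example: one of four undamaged points exceeds the UCL
fp, r_u, r_l = performance_indicators(
    np.array([1.0, 2.0, 3.0, 10.0]), np.array([20.0, 40.0]), ucl=8.0
)
print(fp, r_u, r_l)
```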

CORGO BRIDGE

The Corgo Bridge is a concrete cable-stayed bridge with a post-tensioned box-girder
deck, a 300 m long central span and a central suspension system containing four
symmetric semi-fans of 22 stay cables each (see Figure 1). The bridge is located in
northern Portugal and opened to traffic in September 2013. It is equipped with
a permanent SHM system, of which only a subset of the data is used in the
present paper, namely the concrete temperatures and the stay-cable forces. Ten of the
88 stay cables are equipped with accelerometers and the forces are estimated using the
taut-string theory. The resonant frequencies of the stay-cables are obtained using the
Peak-Picking method, with the auto-spectra determined from acceleration time series
with a duration of 30 min. The concrete temperatures correspond to those measured in
two representative cross-sections: one in the box-girder and another in the piers. The
temperature readings in one of the stay-cables were also considered.
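
For reference, the taut-string relation between a resonant frequency and the cable force can be written as a one-line function. The sketch assumes the classical taut-string model (bending stiffness and sag neglected), for which the n-th natural frequency is f_n = (n / 2L) sqrt(T / m); the function name, argument names and numerical values are hypothetical and are not data from the Corgo Bridge.

```python
def taut_string_force(freq_hz, mode, length_m, mass_per_length_kg_m):
    """Cable force in newtons from the taut-string theory: T = 4 m L^2 (f_n / n)^2."""
    return 4.0 * mass_per_length_kg_m * length_m**2 * (freq_hz / mode) ** 2

# Illustrative values only: a 100 m cable of 50 kg/m with a 1 Hz first mode
force = taut_string_force(freq_hz=1.0, mode=1, length_m=100.0, mass_per_length_kg_m=50.0)
print(f"estimated force = {force / 1e3:.0f} kN")
```

Since the force scales with the square of the frequency, a small relative error in the identified frequency roughly doubles in the estimated force.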

Figure 1. Side elevation of the Corgo Bridge (300 m central span) with the identification of the instrumented stay-cables.

EXPERIMENTAL TIME SERIES AND DAMAGE SIMULATION

The system has been operating since January 2015. The collected data is pre-processed
using the Interquartile Range Analysis algorithm [9] and then daily averaged. The daily
averaged temperatures in the deck and forces in one of the longest stay-cables are shown
in Figure 2 as examples.

Figure 2. Samples of the daily averaged time series of the collected data. Left: average of the 4
temperature sensors in the deck cross-section. Right: forces in the stay-cable T18C20.

The drift in the stay-cable force due to the time-dependent effects of concrete and
prestressing steel is clearly noticeable. A previous study [10] has shown that, from the
available temperature readings, only the average of the 4 sensors in each of the
monitored cross-sections (deck and pier) could be used in the MLR model, jointly with
the temperature in the stay-cable. Therefore, the MLR model contains 3 predictor
variables and 10 dependent variables corresponding to the forces in the monitored
stay-cables. The COI model uses only the 10 time series of the stay-cable forces, being an
output-only (or latent-variable) model.
Damage scenarios involving the cross-section area reduction of the stay-cables are
numerically simulated using a previously validated finite element model of the bridge
[10]. The data being acquired by the SHM system is superposed with the structural
response due to damage events.
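
The superposition step can be pictured as adding a damage-induced increment to the measured series from the damage onset onwards. The sketch below assumes a constant increment per sensor, which is a simplification made here for illustration; in the paper the increments come from the finite element model, and the function name and values are hypothetical.

```python
import numpy as np

def superpose_damage(measured, delta_damage, onset_index):
    """Add the simulated response to damage to the measured time series.

    measured: (n, m) measured stay-cable forces; delta_damage: (m,) force
    variation in each monitored cable caused by the damage scenario;
    onset_index: first observation affected by the damage.
    """
    corrupted = np.array(measured, dtype=float, copy=True)
    corrupted[onset_index:] += delta_damage
    return corrupted

# Illustrative values only: 5 daily observations of 2 cables, damage from day 3
measured = np.zeros((5, 2))
corrupted = superpose_damage(measured, np.array([-1.0, 0.5]), onset_index=3)
```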

COMPARISON OF APPROACHES

Normalization of the data

The application of the MLR-PCA method requires the definition of the minimum
threshold for the Pearson correlation coefficient between each predictor and dependent
variable, and of the cumulative percentage of the variance defining the number p of
principal components to be removed from the data. The results presented herein were
obtained using a threshold of 0.4 for the absolute value of the Pearson correlation
coefficient and 80% of the variance. For a discussion refer to Sousa Tomé et al. [11].
The application of the COI algorithm does not require any further user input.

Figure 3. False positives (UCL computed for α = 99.99%) as a function of the training period size.

Figure 4. Hotelling T² control charts for the MLR-PCA (top) and COI (bottom) models, with training
periods of 365 days (left) and 730 days (right). Black dots correspond to the undamaged state and red
crosses to the simulated damaged scenario (10% of area loss in the stay cable T19C19). UCL for α=99.99%.

The variation of the percentage of false positives with the size of the training period
is shown in Figure 3 and provides an overall insight into the efficiency of the
corresponding data normalization. The increasing training periods start on 1 January
2015 and the performance indicators are always calculated for the 18-month period
starting in January 2017. For each training period the models were automatically
adjusted following the procedure outlined above. Therefore, the number of principal
components in the MLR-PCA and the number of cointegration residuals in the COI vary
with the training data. The UCL was defined using the 99.99% significance level. For
training periods longer than 172 days all the stay-cable force time series are integrated
of order 1, which means this is the minimum training period that can be used in the COI
model. It is interesting to notice the sharp reduction in the number of false positives as
the training period approaches one year. The control charts for training periods of 1 and
2 years are shown in Figure 4. The black dots correspond to the real data and the red
crosses to the data corrupted by the damage of 10% cross-section loss in the stay-cable
T19C19.

Damage detection sensitivity

The sensitivity to the damage event defined in the previous section can be evaluated
in Figure 5 for increasing training period sizes. The RU values show that both models
can clearly distinguish this damage provided that the training period exceeds one year.
As concluded before, for smaller training periods the data is not normalized and damage
cannot be discerned. The RL values well above 1 confirm that both methods can
unambiguously detect this damage.
Area reductions from 0% up to 100% are individually considered in all stay-cables.
Assuming that damage is detected when RL≥1.0, the minimum detectable damage for
each stay cable is presented in Figure 6 for a training period of 2 years.

Figure 5. Performance indicators RU and RL (for α=99.99%) as a function of the training period size.
Considered damage scenario: 10% of area reduction in the stay cable T19C19.

Figure 6. Minimum detectable damage in each stay cable considering that a damage is unambiguously
detected when RLα≥1. UCL computed using a significance level α=99.99%.
The short stay-cables anchored close to the pier are those where damage is
more difficult to detect. The performance of both models is comparable, with a small
advantage for the COI. Moreover, besides being more consistent from the theoretical
point of view, the COI model is an output-only method and requires fewer user-defined
parameters.

CONCLUSIONS

Two online methodologies were systematized for data normalization and damage
detection. Both were shown to reasonably normalize the data being acquired in a
cable-stayed bridge provided that the training period is longer than one year (or one year
and a half in the case of the COI). Concerning damage sensitivity, and for the analyzed
dataset, both the MLR-PCA and COI models provided equivalent results, with an edge
for the COI, which has the additional advantage of being an output-only method.

ACKNOWLEDGEMENTS

The authors would like to acknowledge the bridge owner, AutoEstradas
XXI/Globalvia Transmontana, and NewMENSUS, Lda. This work was financially
supported by: UID/ECI/04708/2019 – CONSTRUCT – Institute of R&D in Structures
and Construction funded by national funds through the FCT/MCTES (PIDDAC);
POCI-01-0145-FEDER-031355 – S4Bridges. The support by FCT through the PhD
grant SFRH/BD/91536/2012 attributed to the first author is gratefully acknowledged.

REFERENCES
1. Johnson R.A., Wichern D.W. 2013. “Applied Multivariate Statistical Analysis,” 6th ed. Harlow:
Pearson.
2. Magalhães F, Cunha A, Caetano E. 2012. “Vibration based structural health monitoring of an arch
bridge: From automated OMA to damage detection,” Mechanical Systems and Signal Processing,
28:212-228.
3. Cross E.J., Worden K., Chen Q. 2011. “Cointegration: a novel approach for the removal of
environmental trends in structural health monitoring data,” Proceedings of the Royal Society A:
Mathematical, Physical and Engineering Sciences, 467: 2712-2732.
4. Dao P.B., Staszewski W.J. 2013. “Cointegration approach for temperature effect compensation in
Lamb-wave-based damage detection,” Smart Materials and Structures, 22.
5. Dickey D.A., Fuller W.A. 1981. “Likelihood Ratio Statistics for Autoregressive Time Series with a
Unit Root,” Econometrica, 49: 1057-1072.
6. Johansen S. 1988. “Statistical analysis of cointegration vectors,” Journal of Economic Dynamics and
Control, 12: 231-254.
7. Dao P.B., Staszewski W.J., Klepka A. 2017. “Stationarity-Based Approach for the Selection of Lag
Length in Cointegration Analysis Used for Structural Damage Detection,” Computer-Aided Civil and
Infrastructure Engineering, 32: 138-153.
8. Sousa Tomé E. 2019. “Smart structural health monitoring applied to the management and
conservation of bridges,” Doctoral thesis, University of Porto, Faculty of Engineering.
9. Posenato D., Kripakaran P., Inaudi D., Smith I.F.C. 2010. “Methodologies for model-free data
interpretation of civil engineering structures”, Computers & Structures, 88: 467-482.
10. Sousa Tomé E., Pimentel M., Figueiras, J. 2018. “Structural response of a concrete cable-stayed
bridge under thermal loads”, Engineering Structures, 176: 653-672.
11. Sousa Tomé E., Pimentel M., Figueiras, J. 2019. “Online early damage detection and localisation
using multivariate data analysis – application to a cable-stayed bridge”, Structural Control and Health
Monitoring, Accepted for publication.
