Paper_IWSHM
Two alternative methodologies for online data normalization are described and
compared: multiple linear regression followed by principal component analysis
(MLR-PCA) and cointegration (COI). While the former has been used for some time in the
scope of SHM, the latter was only recently introduced to the analysis of SHM data. In
both cases the statistical classification is performed using the Hotelling T² statistic.
The developed algorithms are applied to a prestressed concrete cable-stayed bridge
for which 3½ years of continuous data are available. Three performance indicators are
used to compare the two methodologies: one is the number of false positives (incorrectly
predicted damage events) and the other two are related to the sensitivity to damage.
Several damage scenarios involving small section loss in the stay-cables are simulated by
corrupting the measured (real) time series with the structural response to the damage
events obtained from a finite element model of the bridge. It is shown that both
methodologies can provide robust results and reasonable sensitivity to damage.
INTRODUCTION
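$$\mathbf{Y} = \mathbf{X}\hat{\mathbf{U}} + \mathbf{E} \qquad (1)$$
$$\hat{\mathbf{U}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y} \qquad (2)$$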
The predictors having an absolute value of the Pearson correlation coefficient with
the dependent variable below a pre-established threshold are not used in the model.
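The least-squares fit with correlation screening can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the explicit intercept column and the synthetic data are assumptions made for the example, and the 0.4 default mirrors the threshold quoted later in the paper.

```python
import numpy as np

def fit_mlr(X, Y, threshold=0.4):
    """Least-squares fit per dependent variable with Pearson screening.

    X: (n, k) predictors, Y: (n, m) dependent variables.
    Returns U_hat with shape (k + 1, m); row 0 is an intercept (an assumption
    of this sketch) and coefficients of screened-out predictors stay at zero.
    """
    n, k = X.shape
    m = Y.shape[1]
    U_hat = np.zeros((k + 1, m))
    for j in range(m):
        # Pearson correlation of each predictor with dependent variable j.
        r = np.array([np.corrcoef(X[:, i], Y[:, j])[0, 1] for i in range(k)])
        keep = np.flatnonzero(np.abs(r) >= threshold)
        A = np.column_stack([np.ones(n), X[:, keep]])
        coef, *_ = np.linalg.lstsq(A, Y[:, j], rcond=None)
        U_hat[0, j] = coef[0]
        U_hat[keep + 1, j] = coef[1:]
    return U_hat

def mlr_residuals(X, Y, U_hat):
    """Residuals E = Y - [1, X] U_hat."""
    A = np.column_stack([np.ones(len(X)), X])
    return Y - A @ U_hat

# Synthetic example: each output depends on a single predictor.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
Y = np.column_stack([
    2.0 * X[:, 0] + 0.1 * rng.normal(size=500),
    -1.5 * X[:, 1] + 0.1 * rng.normal(size=500),
])
U_hat = fit_mlr(X, Y)
E = mlr_residuals(X, Y, U_hat)
```

Screening is done separately for each dependent variable, so different outputs may end up with different predictor subsets.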
Principal Component Analysis (PCA) [1] is a statistical method that uses an
orthogonal transformation to convert a set of initially correlated variables into a set of
linearly uncorrelated variables. Considering an n-by-m matrix Y with the original
variables, where m is the number of sensors and n is the number of observations in time,
the m uncorrelated principal component scores, Z , are determined by:
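$$\mathbf{Z} = \mathbf{Y}\,\mathbf{T} \qquad (3)$$
$$\boldsymbol{\Sigma} = \mathbf{T}\,\boldsymbol{\Lambda}\,\mathbf{T}^T \qquad (4)$$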
where the matrices T and Λ are obtained by the singular value decomposition of the
covariance matrix Σ of the original variables. The columns of T are the singular vectors
and the diagonal matrix Λ contains the singular values of the matrix Σ in descending
order. The singular values stored in Λ are the variances of the components of Z.
Moreover, the matrix Λ can be split into a block storing the first p singular values and
a block storing the remaining m-p singular values, which are not relevant to
explain the variability of Y. The matrix 𝐙̂ of scores associated with these remaining
m-p components is expected to be insensitive to the environmental and operational
variations (EOVs) and can be used as damage sensitive features.
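As a hedged illustration of equations (3) and (4), the sketch below computes the principal component scores with NumPy and keeps the last m-p columns as residual features. Mean-centring the data before projection and the function name `pca_residual_scores` are assumptions of this example; the synthetic data set has a single dominant common trend.

```python
import numpy as np

def pca_residual_scores(Y, p):
    """Return the scores of the last m - p principal components.

    Y: (n, m) observations. The data are mean-centred first (an assumption
    of this sketch) so that the scores have zero mean.
    """
    Yc = Y - Y.mean(axis=0)
    Sigma = np.cov(Yc, rowvar=False)      # (m, m) covariance matrix
    T, lam, _ = np.linalg.svd(Sigma)      # singular values in descending order
    Z = Yc @ T                            # equation (3)
    return Z[:, p:], T, lam

# Synthetic example: four sensors driven by one common trend plus noise.
rng = np.random.default_rng(1)
trend = rng.normal(size=(1000, 1))
Y = trend @ np.array([[1.0, 0.8, -0.6, 0.5]]) + 0.05 * rng.normal(size=(1000, 4))
Z_hat, T, lam = pca_residual_scores(Y, p=1)
```

With p = 1 the dominant trend is absorbed by the first component, and the variances of the three remaining score columns equal the last three singular values.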
The PCA model can be applied directly to the sensor readings or, as proposed by
Magalhães et al. [2], to the residuals of the MLR model. In the latter case, it is here
designated as MLR-PCA. The MLR model is used to remove the effects of the measured
actions on the bridge, such as the temperature, from the data. The PCA is used to
suppress the environmental and operational effects not removed by the MLR model,
namely the remaining temperature effects and the long-term behaviour due to
rheological effects of the concrete.
COINTEGRATION (COI)
where $z_t$ are the cointegration residuals. For the time series to be cointegrated, they
must have shared (common) trends and the same order of integration [3]. Since $y_t$ is
M-dimensional, there may be $N_r \le M-1$ linearly independent cointegrating vectors and the
cointegration relation given by equation (5) can be generalized to:
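$$\mathbf{B}^T \mathbf{y}_t = \begin{bmatrix} \boldsymbol{\beta}_1^T \mathbf{y}_t \\ \vdots \\ \boldsymbol{\beta}_{N_r}^T \mathbf{y}_t \end{bmatrix} = \begin{bmatrix} z_{1,t} \\ \vdots \\ z_{N_r,t} \end{bmatrix} \sim I(0) \qquad (6)$$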
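The projection in equation (6) can be illustrated with two synthetic series that share a random-walk trend. In this sketch the cointegrating vector is known by construction, which is an assumption made purely for illustration; in practice the matrix of cointegrating vectors would be estimated from the training data (for instance with the Johansen procedure [6]).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Common stochastic trend: a random walk, i.e. an I(1) process.
common_trend = np.cumsum(rng.normal(size=n))

# Two non-stationary series driven by the same trend (M = 2).
y = np.column_stack([
    1.0 * common_trend + 0.1 * rng.normal(size=n),
    2.0 * common_trend + 0.1 * rng.normal(size=n),
])

# Hypothetical cointegrating vector (N_r = 1), known here by construction:
# 2 * y1 - y2 cancels the common trend.
B = np.array([[2.0], [-1.0]])

# Cointegration residuals; row t holds (B^T y_t)^T.
z = y @ B
```

The residual `z` contains only the combined measurement noise, so its spread stays bounded while each individual series wanders with the trend.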
CLASSIFICATION
After the data is normalised, a control chart is used to track abnormal values, which
can be related to the presence of damage. The Hotelling T² statistic condenses all
model features into a scalar indicator, working in this context as a damage indicator:
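$$T^2 = r\,(\bar{\mathbf{x}} - \bar{\bar{\mathbf{x}}})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \bar{\bar{\mathbf{x}}}) \qquad (7)$$
where r is the number of observations considered (window size), $\bar{\mathbf{x}}$ is the average
of the observations inside the window, $\bar{\bar{\mathbf{x}}}$ is the process average when it is in
control and $\mathbf{S}$ is the process covariance matrix, with $\bar{\bar{\mathbf{x}}}$ and $\mathbf{S}$
estimated using the data from the training period.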
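A minimal sketch of equation (7) is given below. The use of non-overlapping windows and the function name `hotelling_t2` are assumptions of this example; with r = 1 every observation forms its own window.

```python
import numpy as np

def hotelling_t2(features, train_mean, train_cov, r=1):
    """Hotelling T^2 per window of r consecutive observations.

    features: (n, m) damage sensitive features.
    train_mean, train_cov: in-control mean (m,) and covariance (m, m).
    Windows are taken as non-overlapping (an assumption of this sketch);
    trailing observations that do not fill a window are dropped.
    """
    n_win = features.shape[0] // r
    windows = features[: n_win * r].reshape(n_win, r, -1)
    d = windows.mean(axis=1) - train_mean          # window mean minus process mean
    S_inv = np.linalg.inv(train_cov)
    # Quadratic form d_i^T S^-1 d_i for every window i.
    return r * np.einsum("ij,jk,ik->i", d, S_inv, d)

# Example: estimate the in-control statistics on a synthetic training set.
rng = np.random.default_rng(3)
train = rng.normal(size=(730, 4))
mu = train.mean(axis=0)
S = np.cov(train, rowvar=False)
t2_train = hotelling_t2(train, mu, S, r=1)
```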
The control limits define the accepted process variability. In the present paper, r = 1
was adopted. Therefore, the lower control limit is zero and the upper control limit (UCL)
is computed from:
$$UCL = \frac{m(s+1)(s-1)}{s^2 - s\,m}\, F_{m,\,s-m}(\alpha) \qquad (8)$$
where $F_{m,\,s-m}(\alpha)$ is the α percentage point of the F distribution with m and s-m
degrees of freedom, m is the number of features and s is the number of subgroups (or
windows) collected during the training period. In the MLR-PCA method, m equals the
number of sensors minus the number of principal components extracted from the data.
In the COI method, m is the number of cointegration residuals.
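Equation (8) can be evaluated with SciPy as sketched below. The function name and the values m = 4 and s = 730 are illustrative assumptions, not figures taken from the paper; α is passed as a probability (0.9999 for the 99.99% level used later).

```python
from scipy.stats import chi2, f

def hotelling_ucl(m, s, alpha=0.9999):
    """Upper control limit of equation (8) for individual observations (r = 1).

    m: number of features, s: number of training subgroups (requires s > m).
    """
    factor = m * (s + 1) * (s - 1) / (s ** 2 - s * m)
    return factor * f.ppf(alpha, m, s - m)

# Illustrative values only: 4 features and a 730-day training period.
ucl_730 = hotelling_ucl(m=4, s=730)

# Sanity check: for a very long training period the limit approaches the
# chi-square percentage point with m degrees of freedom.
ucl_large = hotelling_ucl(m=4, s=10 ** 6)
chi2_limit = chi2.ppf(0.9999, 4)
```

Shorter training periods give a larger UCL, reflecting the extra uncertainty in the estimated mean and covariance.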
PERFORMANCE CRITERIA
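$$R_U = \frac{\mathrm{mean}\left(T^2_{damaged}\right)}{\mathrm{mean}\left(T^2_{undamaged}\right)} \qquad R_{L\alpha} = \frac{\mathrm{mean}\left(T^2_{damaged}\right)}{UCL(\alpha)} \qquad (9)$$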
Large RU values mean that there is a clear distinction between the undamaged and
damaged states. Values of RLα >1.0 mean that on average the points are above the UCL.
When the data is not well-normalised, low values of RU and high values of RLα may be
obtained. The best data normalisation methodology is the one that simultaneously
provides fewer false positives and higher values of both RU and RLα.
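The three indicators can be computed directly from the T² series, as in the sketch below. The function name and the toy numbers are assumptions of this example; false positives are reported here as a fraction of the undamaged points.

```python
import numpy as np

def performance_indicators(t2_undamaged, t2_damaged, ucl):
    """Return (false positive fraction, R_U, R_L_alpha) following equation (9)."""
    t2_undamaged = np.asarray(t2_undamaged, dtype=float)
    t2_damaged = np.asarray(t2_damaged, dtype=float)
    false_positives = np.mean(t2_undamaged > ucl)   # undamaged points above the UCL
    r_u = t2_damaged.mean() / t2_undamaged.mean()
    r_l = t2_damaged.mean() / ucl
    return false_positives, r_u, r_l

# Toy values chosen only to make the arithmetic easy to follow.
fp, r_u, r_l = performance_indicators(
    t2_undamaged=[1.0, 2.0, 3.0, 2.0],
    t2_damaged=[20.0, 30.0, 40.0, 30.0],
    ucl=25.0,
)
# fp = 0.0, r_u = 30 / 2 = 15.0, r_l = 30 / 25 = 1.2
```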
CORGO BRIDGE
Figure 1. Side elevation of the Corgo Bridge with the identification of the instrumented stay-cables.
The system has been operating since January 2015. The collected data is pre-processed
using the Interquartile Range Analysis algorithm [9] and then daily averaged. The daily
averaged temperatures in the deck and forces in one of the longest stay-cables are shown
in Figure 2 as examples.
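The pre-processing step can be sketched as below. This is only a generic illustration: it uses a global 1.5×IQR fence, which may differ from the Interquartile Range Analysis algorithm of [9], and the function names, the fence factor and the number of samples per day are all assumptions.

```python
import numpy as np

def iqr_mask(x, k=1.5):
    """True where x lies inside [Q1 - k*IQR, Q3 + k*IQR] (generic IQR rule)."""
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    return (x >= q1 - k * iqr) & (x <= q3 + k * iqr)

def daily_average(x, samples_per_day):
    """Average consecutive blocks of samples_per_day readings, ignoring NaNs."""
    n_days = len(x) // samples_per_day
    days = x[: n_days * samples_per_day].reshape(n_days, samples_per_day)
    return np.nanmean(days, axis=1)

# Two hypothetical days with four readings each; 50.0 is an obvious outlier.
raw = np.array([10.0, 10.2, 9.9, 50.0, 10.1, 10.0, 9.8, 10.3])
clean = np.where(iqr_mask(raw), raw, np.nan)   # outliers replaced by NaN
daily = daily_average(clean, samples_per_day=4)
```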
Figure 2. Samples of the daily averaged time series of the collected data. Left: average of the 4
temperature sensors in the deck cross-section. Right: forces in the stay-cable T18C20.
The drift in the stay-cable force due to the time-dependent effects of concrete and
prestressing steel is clearly noticeable. A previous study [10] has shown that from the
available temperature readings, only the average of the 4 sensors in each of the
monitored cross-sections (deck and pier) could be used in the MLR model, jointly with
the temperature in the stay-cable. Therefore, the MLR model contains 3 predictor
variables and 10 dependent variables corresponding to the forces in the monitored stay-
cables. The COI model uses only the 10 time series of the stay-cable forces, being an
output-only (or latent-variable) model.
Damage scenarios involving the cross-section area reduction of the stay-cables are
numerically simulated using a previously validated finite element model of the bridge
[10]. The data being acquired by the SHM system is superposed with the structural
response due to damage events.
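The superposition can be sketched as follows, assuming that each damage event acts as a constant shift of the stay-cable forces from its onset onwards. The step-change assumption, the function name and the numbers are hypothetical; the force changes would come from the finite element model.

```python
import numpy as np

def superpose_damage(measured, fe_delta, onset):
    """Add the FE-predicted force change to the measured series after onset.

    measured: (n, m) measured time series (left unmodified).
    fe_delta: (m,) change in each monitored quantity caused by the damage.
    onset: index of the first observation affected by the damage.
    """
    corrupted = np.array(measured, dtype=float, copy=True)
    corrupted[onset:] += fe_delta
    return corrupted

# Hypothetical example: two cables, damage introduced at the fourth observation.
measured = np.full((6, 2), 100.0)
fe_delta = np.array([-2.0, 0.5])
damaged = superpose_damage(measured, fe_delta, onset=3)
```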
COMPARISON OF APPROACHES
The application of the MLR-PCA method requires the definition of the minimum
threshold for the Pearson correlation coefficient between each predictor and dependent
variable and the cumulative percentage of the variance defining the number p of
principal components to be removed from the data. The results presented herein were
obtained using a threshold of 0.4 for the absolute value of the Pearson correlation
coefficient and 80% of the variance. For a discussion refer to Sousa Tomé et al. [11].
The application of the COI algorithm does not require any further user input.
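One way to turn the cumulative-variance criterion into a number of components is sketched below. Taking the smallest p whose cumulative share reaches the target is an assumption of this example, as are the function name and the singular values used.

```python
import numpy as np

def n_components_for_variance(singular_values, target=0.80):
    """Smallest p whose first p singular values reach the target variance share.

    singular_values must be sorted in descending order, as stored in Λ.
    """
    lam = np.asarray(singular_values, dtype=float)
    cumulative = np.cumsum(lam) / lam.sum()
    # Index of the first cumulative share >= target, converted to a count.
    return int(np.searchsorted(cumulative, target) + 1)

# Hypothetical singular values: cumulative shares are 0.60, 0.90, 0.95, 1.00.
p = n_components_for_variance([6.0, 3.0, 0.5, 0.5], target=0.80)
```

In this example p = 2, so the last two components would be kept as damage sensitive features.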
Figure 3. False positives (UCL computed for α = 99.99%) as a function of the training period size.
[Figure 4 panels: MLR-PCA with 365-day and 730-day training periods; COI with 365-day and 730-day training periods.]
Figure 4. Hotelling T² control charts. Black dots correspond to the undamaged state and red crosses to
the simulated damage scenario (10% of area loss in the stay cable T19C19). UCL for α=99.99%.
The variation of the percentage of false positives with the size of the training period
is shown in Figure 3 and provides an overall insight into the effectiveness of the
corresponding data normalization. The increasing training periods start on 1 January
2015 and the performance indicators are always calculated for the 18-month period
starting in January 2017. For each training period the models were automatically
adjusted following the procedure outlined above. Therefore, the number of principal
components in the MLR-PCA and the number of cointegration residuals in the COI vary
with the training data. The UCL was defined using the 99.99% significance level. Above
172 days all the stay-cable force time-series are integrated of order 1, which means this
is the minimum training period that can be used in the COI model. It is interesting to
notice the sharp reduction in the number of false positives as the training period
approaches one year. The control charts for training periods of 1 and 2 years are shown
in Figure 4. The black dots correspond to the real data and the red crosses to the data
corrupted by the damage of 10% cross-section loss in the stay-cable T19C19.
The sensitivity to the damage event defined in the previous section can be evaluated
in Figure 5 for increasing training period sizes. The RU values show that both models
can clearly distinguish this damage provided that the training period exceeds one year.
As concluded before, for smaller training periods the data is not normalized and damage
cannot be discerned. The RL values well above 1 confirm that both methods can
unambiguously detect this damage.
Area reductions from 0% up to 100% are individually considered in all stay-cables.
Assuming that damage is detected when RL≥1.0, the minimum detectable damage for
each stay cable is presented in Figure 6 for a training period of 2 years.
Figure 5. Performance indicators RU and RL (for α=99.99%) as a function of the training period size.
Considered damage scenario: 10% of area reduction in the stay cable T19C19.
Figure 6. Minimum detectable damage in each stay cable considering that a damage is unambiguously
detected when RLα≥1. UCL computed using a significance level α=99.99%.
The short stay-cables anchored close to the pier are those where damage is
more difficult to detect. The performance of both models is comparable, with a small
advantage for the COI. Moreover, besides being more consistent from the theoretical
point of view, the COI model is an output-only method and requires fewer user-defined
parameters.
CONCLUSIONS
Two online methodologies were systematized for data normalization and damage
detection. Both were shown to reasonably normalize the data acquired on a cable-stayed
bridge provided that the training period is longer than one year (or one year and
a half in the case of the COI). Concerning damage sensitivity, and for the analyzed
dataset, both the MLR-PCA and COI models provided equivalent results, with an edge
for the COI, which has the additional advantage of being an output-only method.
ACKNOWLEDGEMENTS
REFERENCES
1. Johnson R.A., Wichern D.W. 2013. “Applied Multivariate Statistical Analysis,” 6th ed. Harlow:
Pearson.
2. Magalhães F., Cunha A., Caetano E. 2012. “Vibration based structural health monitoring of an arch
bridge: From automated OMA to damage detection,” Mechanical Systems and Signal Processing,
28:212-228.
3. Cross E.J., Worden K., Chen Q. 2011. “Cointegration: a novel approach for the removal of
environmental trends in structural health monitoring data,” Proceedings of the Royal Society A:
Mathematical, Physical and Engineering Sciences, 467: 2712-2732.
4. Dao P.B., Staszewski W.J. 2013. “Cointegration approach for temperature effect compensation in
Lamb-wave-based damage detection,” Smart Materials and Structures, 22.
5. Dickey D.A., Fuller W.A. 1981. “Likelihood Ratio Statistics for Autoregressive Time Series with a
Unit Root,” Econometrica, 49: 1057-1072.
6. Johansen S. 1988. “Statistical analysis of cointegration vectors,” Journal of Economic Dynamics and
Control, 12: 231-254.
7. Dao P.B., Staszewski W.J., Klepka A. 2017. “Stationarity-Based Approach for the Selection of Lag
Length in Cointegration Analysis Used for Structural Damage Detection,” Computer-Aided Civil and
Infrastructure Engineering, 32: 138-153.
8. Sousa Tomé E. 2019. “Smart structural health monitoring applied to the management and
conservation of bridges,” Doctoral thesis, University of Porto, Faculty of Engineering.
9. Posenato D., Kripakaran P., Inaudi D., Smith I.F.C. 2010. “Methodologies for model-free data
interpretation of civil engineering structures,” Computers & Structures, 88: 467-482.
10. Sousa Tomé E., Pimentel M., Figueiras J. 2018. “Structural response of a concrete cable-stayed
bridge under thermal loads,” Engineering Structures, 176: 653-672.
11. Sousa Tomé E., Pimentel M., Figueiras J. 2019. “Online early damage detection and localisation
using multivariate data analysis – application to a cable-stayed bridge,” Structural Control and Health
Monitoring, Accepted for publication.