Vibration Event Recognition Using SST-Based Φ-OTDR System
Ruixu Yao 1,2,† , Jun Li 1,2, *,† , Jiarui Zhang 1,2 and Yinshang Wei 1
1 School of Safety Science and Engineering, Xi’an University of Science and Technology, Xi’an 710054, China;
[email protected] (R.Y.); [email protected] (J.Z.)
2 Shaanxi Provincial Key Laboratory of Coal Fire Disaster Prevention, Xi’an 710054, China
* Correspondence: [email protected]; Tel.: +86-13635695983
† These authors contributed equally to this work.
Abstract: We propose a method based on the Synchrosqueezing Transform (SST) for vibration event
analysis and identification in Phase-Sensitive Optical Time-Domain Reflectometry (Φ-OTDR) systems.
SST offers high time-frequency resolution and retains phase information, which helps distinguish and
enhance different vibration events. As experimental data, we use six tap events of different intensities
and six other event types, and we also test the effect of signal attenuation. We use Visual Geometry
Group (VGG), Vision Transformer (ViT), and Residual Network (ResNet) models as deep classifiers for the
SST-transformed data. The results show that our method outperforms methods based on the Continuous
Wavelet Transform (CWT) and Short-Time Fourier Transform (STFT), with ResNet being the best classifier.
Our method achieves a high recognition rate under different signal strengths, event types, and
attenuation levels, which demonstrates its value for Φ-OTDR systems.
1. Introduction
A distributed fiber optic sensing system uses an optical fiber as the sensor, enabling
real-time monitoring of temperature [1], vibration, sound, and other physical quantities
along the fiber. Such systems have the advantages of wide coverage, high sensitivity, and
strong anti-interference ability, and are widely used in oil and gas pipelines, bridges,
tunnels, seismology, and other fields [2].
Citation: Yao, R.; Li, J.; Zhang, J.; Wei, Y. Vibration Event Recognition Using SST-Based Φ-OTDR System. Sensors 2023, 23, 8773. https://doi.org/10.3390/s23218773
Academic Editor: Aldo Minardo
Received: 3 October 2023; Revised: 16 October 2023; Accepted: 19 October 2023; Published: 27 October 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

However, distributed fiber optic sensing systems also face some challenges, one of which is
how to effectively process and classify the signals collected from optical fibers to identify
different types of events and anomalies; this has become a focal point of concern within
the field of distributed fiber optic intrusion detection.

To solve this problem, many researchers have proposed different signal processing and
classification methods. Time-domain [3] methods obtain a single piece of information and,
therefore, fewer event types can be accurately identified. Frequency-domain [4] feature analysis
methods can obtain the intrinsic spectral characteristics of the signal, but they may ignore
the time-domain information of non-stationary signals that are constantly changing over
time. Therefore, the time-frequency domain characterization method is the most commonly
used approach, which transforms the signal at different times and frequencies to extract
the local features of the signal.

Many methods based on time-frequency domain features have been proposed, such
as wavelet decomposition, STFT, etc. In 2015 [5], Wu et al. used wavelet decomposition
to extract the spectral distribution as a feature vector and combined it with a Back Propagation
(BP) neural network to classify the events of environmental noise, system drift, and
man-made intrusion, with an accuracy of 89.19%. In 2018 [6], Xu C et al. used the spectral
subtraction method and STFT to obtain the spectral map, and then inputted it into a Convolutional
Neural Network (CNN) for automatic feature extraction and classification. The
recognition rate of the four events of digging, walking, vehicle overtaking, and sabotage
exceeded 90% in each case. In 2022 [7], Xu W et al. calculated the difference between two
adjacent Rayleigh Back Scattering (RBS) traces and normalized the difference results into
images. The images were labeled and then recognized and localized by You Only Look Once
(YOLO) into five states: calm state, rigid collision with the ground, impact on the protection
network, vibration of the protection network, and cutting of the protection network; a
recognition rate of 96.14% was attained. In 2023 [8], Du X et al. performed wavelet transform
and feature extraction on the original signal, and then optimized the feature vector using
a sample feature correction algorithm. A Support Vector Machine (SVM) was then used to
recognize watering, climbing, knocking, and pressure, as well as a spurious perturbation event.
The SVM achieved a 98.5% overall recognition rate and a 99.8% recognition rate for the spurious
perturbation event. All of these methods, to varying degrees, quickly localized intrusion
events by classifying several explicit events, which is a significant improvement for the
security of monitoring systems. However, these methods also have some limitations,
such as restricted time-frequency resolution and ignored phase information.
This paper presents a novel method for signal processing and classification based on SST
that overcomes these limitations. CWT [9] is a time-frequency analysis
tool that transforms signals at different scales and translations to extract local features
of the signals. SST [10] is a reassignment technique that sharpens the time-frequency
representation produced by CWT by squeezing its coefficients toward the instantaneous
frequency, while still allowing the signal to be reconstructed. In this paper, we
perform signal extraction and reconstruction of vibration signals using SST and use the
phase information of CWT to enhance the feature representation of the signal. A dataset
of vibration signals generated in the laboratory is used, which contains 12 categories of
signals generated by taps of different strengths and by different events. We also consider
the effect of signal attenuation on the classification performance and process and classify
the attenuated weak signals. We compare the classification effect of three kinds of time-frequency
maps, namely STFT, CWT, and SST, with the VGG [11], ViT [12], and ResNet [13]
algorithms. The experimental results show that our method can stably and effectively
improve the classification performance and provides a new way to apply the
Φ-OTDR system.
Figure 1 shows the structure of the setup. The device uses a BG-PL-1550-20-3K narrow
linewidth laser (NLL), which has a linewidth of 3 kHz and produces high-quality pulse light. The
NLL operates at a wavelength of 1550.12 nm and a power of 20 mW, with a pulse width of
100 ns and a repetition rate of 1 kHz. The light from the NLL passes through an isolator
(IO), which has an isolation degree of 40 dB and prevents the back-reflected light in the
fiber from affecting the laser. The IO is an IO-FS50 model from Thorlabs. After passing
through an acousto-optic modulator (AOM), the light is amplified by an erbium-doped
fiber amplifier (EDFA) to enhance the sensitivity and dynamic range of the system. The
AOM is a CETC44-AOM-200M-5V-1W model from China Electronics Technology Group
44 Institute, with a working frequency of 200 MHz. The EDFA is an AEDFA-13-M model
from Poffer Optoelectronics Technology, with a gain of 15 dB. The photodetector (PIN)
receives the light signals reflected or scattered from the sensing fiber and converts them into
electrical signals. The PIN is a PDA10CS model from Thorlabs, with a response wavelength
range of 800 nm to 1700 nm. The data acquisition card (DAQ) collects and processes the
signal after photoelectric conversion, using a PCIe-9802DC model customized by Jianyi
Technology, with a maximum sampling rate of 250 MSa/s. The sensing fiber and buried
cable are Corning SMF-28 single-mode fibers, with a core optical refractive index of 1.4.
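As a quick sanity check on these parameters, the two-point spatial resolution implied by the 100 ns probe pulse and the maximum unambiguous range implied by the 1 kHz repetition rate can be computed from the standard Φ-OTDR relations. This is an illustrative sketch; the paper does not state these derived figures, and it uses the round refractive index n = 1.4 given in the text:

```python
# Derived Φ-OTDR figures from the stated pulse parameters (illustrative).
c = 3.0e8      # speed of light in vacuum, m/s
n = 1.4        # fiber core refractive index (value given in the text)
tau = 100e-9   # probe pulse width, s
f_rep = 1e3    # pulse repetition rate, Hz

delta_L = c * tau / (2 * n)   # two-point spatial resolution, ~10.7 m
L_max = c / (2 * n * f_rep)   # maximum unambiguous range, ~107 km

print(f"spatial resolution ~ {delta_L:.1f} m, max range ~ {L_max / 1e3:.0f} km")
```

The roughly 107 km unambiguous range comfortably covers the 11 km sensing fiber used in the experiments.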
Figure 1. Structure of the Φ-OTDR system.
The purpose of this paper is to introduce experiments conducted in a laboratory
environment to collect and analyze different types of signals applied to the sensing
optical fiber cable. The cable has a length of 11 km and was divided into two sections: the
front section and the rear section. The front section was wrapped around wheels and left
exposed to the air, while the rear section was buried 30 cm deep in soil. The first task of
the experiment was to detect a phone vibration (PV) signal within the front section of the
optical fiber. This was achieved by placing a mobile phone on the cable to create vibrations,
and then detecting the signal from the beginning of the fiber optic cable over a distance of
2–4 km. For the second task, the rear section of the optical fiber was used to detect eleven
other signals. These signals were generated by various activities performed on or within the
soil, such as tapping, stamping, chiseling, mixing, watering, and digging. Sampling points
from 9.94–10 km were used for this purpose. The tapping signal varied in intensity based
on factors such as force, frequency, and even the gender of the individual performing the
action. Taps performed by females were categorized as low force, while those performed
by males were considered strong force. The time interval between each tap ranged from 1 s
to 3 s, representing fast, normal, and slow tapping speeds, respectively. The stamping (ST)
signal was generated by an individual forcefully stepping on the soil. The chiseling (CH)
signal was created by inserting a chisel into the soil. The mixing
(MI) signal resulted from three simultaneous actions on the soil: tapping, stamping, and
chiseling. The watering (WA) signal was produced by pouring water into the soil. Lastly,
the digging (HSZ) signal was generated using a Hyrule shovel to dig into the soil. Various
devices and instruments were utilized throughout this experiment to generate, transmit,
receive, and process these signals within the optical fiber cable. A schematic diagram of
this experimental setup can be seen in Figure 1.

2.2. SST Principle

To realize signal processing and classification of the Φ-OTDR system, we introduce
a method based on SST, a reassignment technique that sharpens the CWT time-frequency
representation by squeezing its coefficients toward the instantaneous frequency while still
allowing the signal to be reconstructed.

Daubechies et al., among the founders of wavelet analysis, proposed SST as an empirical
mode decomposition tool in 2011 [10]. SST rearranges the time-frequency coefficients
through the synchronous compression operator, which shifts the time-frequency distribution
of the signal at any point of the time-frequency plane to the center of gravity of the
energy, enhancing the energy concentration of the instantaneous frequency. Therefore,
the core of SST is the estimation of the instantaneous frequency. It can better resolve the
time-frequency ambiguity inherent in traditional time-frequency analysis methods. As a
special rearrangement, SST can not only sharpen the time-frequency representation, but
also recover the signal.

Generally speaking, the Morlet wavelet function is more suitable for analyzing periodic
or quasi-periodic signals, while the generalized Morse wavelet (GMW) function is
more suitable for analyzing signals with sharp changes or sudden events. Therefore, in
this paper we used the GMW function. We padded the signal by reflection, which extends
its length to the nearest power of 2, and we set the threshold of the synchrosqueezing
operator to 0.05.

As shown in Equation (1), SST improves the time-frequency resolution by compressively
rearranging the CWT coefficients in the frequency direction and rewriting them on the
time-frequency plane. The CWT coefficients Ψ_f(a, b) of the signal f can be written in the
frequency domain as

Ψ_f(a, b) = (1/2π) ∫ f̂(ξ) a^{1/2} ψ̂*(aξ) e^{ibξ} dξ   (1)
where ξ is the angular frequency and f̂(ξ) is the Fourier transform of the signal f. Wherever
the wavelet transform coefficients at a point (a, b) are non-zero, the instantaneous
frequency of the signal f is ψ_f(a, b), as shown in Equation (2):

ψ_f(a, b) = −i (Ψ_f(a, b))^{−1} ∂Ψ_f(a, b)/∂b   (2)
A synchrosqueezing transformation is then performed to create a mapping from the
point (b, a) to (b, ψ_f(a, b)); in this way Ψ_f(a, b) is transferred from the time-scale plane
to the time-frequency plane, yielding a new time-frequency map.
The frequency variable ψ, the scale variable a, and the translation factor b are then
discretized: Ψ_f(a, b) is computed only at the discrete scales a_k satisfying a_k − a_{k−1} = (Δa)_k,
and the corresponding SST output T_f(ψ_l, b) is accumulated over the frequency bins ψ_l
satisfying ψ_l − ψ_{l−1} = Δψ, each bin collecting contributions from the interval
[ψ_l − Δψ/2, ψ_l + Δψ/2].
The final expression of SST is shown in Equation (3):

T_f(ψ_l, b) = (Δψ)^{−1} Σ_{a_k : |ψ_f(a_k, b) − ψ_l| ≤ Δψ/2} Ψ_f(a_k, b) a_k^{−3/2} (Δa)_k   (3)
Figure 2. Time-frequency diagrams of three different analog signals. (a) Diagrams of 2.75 Hz raw
signal, CWT signal, and SST signal; (b) diagrams of 47.17 Hz raw signal, CWT signal, and SST signal;
(c) diagrams of 91.08 Hz raw signal, CWT signal, and SST signal.
Figure 3. Flow of the recognition algorithm.

3. Experimental Analysis
3.1. Dataset
In order to verify the classification performance of our method in complex situations,
we collected data on vibration events of different strengths and types in the laboratory as
the actual vibration event dataset. Six types of tapping with different strengths were
collected for this paper, namely tap slow force (TSF), tap fast force (TFF), tap normal force
(TNF), tap slow little force (TSLF), tap fast little force (TFLF), and tap normal little force
(TNLF). These tapping events were generated by applying taps of different strengths at
different locations and in different directions on the sensing fiber to simulate the behavior
of different intruders. In order to increase the complexity of the environment, we also
collected data from six different types of events, including HSZ, WA, MI, ST, PV, and CH.
These events were generated by performing different activities near or above the sensing
fiber to simulate different environmental disturbances. We uniformly cut the signals every 3 s
according to the latest time of the complete occurrence of each event as a sample signal, and
then we performed CWT, SST, and STFT on the sample signals to generate time-frequency
maps. Finally, we categorized and organized the time-frequency map dataset. As shown
in Tables 1 and 2, we collected a total of 3840 sample signals, with 1920 sample signals for
each category, including 320 sample signals for each type of percussive vibration event.
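The 3 s cutting step can be sketched as follows. This is an illustrative helper, not the authors' code; the sampling rate `fs` is an assumed parameter:

```python
import numpy as np

def segment_samples(signal, fs, win_s=3.0):
    """Cut a 1-D recording into non-overlapping win_s-second sample signals.
    Any trailing remainder shorter than one window is dropped."""
    n = int(round(win_s * fs))
    m = len(signal) // n
    return signal[:m * n].reshape(m, n)
```

Each resulting row would then be passed through CWT, SST, or STFT to produce one time-frequency map.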
The original waveform images of the 12 kinds of data are shown in Figure 4. There are
obvious differences in the wave peaks, frequencies, and morphology of the different data.
Based on these differences, this paper uses time-frequency analysis to extract the features
of the signals and classify and identify them. CH has three raised wave peaks with a
maximum intensity at 0.1, which is similar to the MI vibration image. Both of them
experience a sudden increase in wave peaks, but the intensity of MI is greater than that of
CH at around 0.4, and its number of wave peaks is higher (there are four), which indicates
that both CH and MI are generated by intermittent events. HSZ exhibits one raised wave
with a maximum intensity around 0.1, indicating that HSZ is generated by a smaller shock
event. PV exhibits a continuously increasing wave with a maximum intensity of 0.24, and
ST displays a raised wave with a maximum intensity of 0.075.
As shown in Figure 5, the vibration images of the attenuated signals for 12 datasets
illustrate the effect of the attenuation process on the signal strength and characteristics.
The figures show that the signal amplitudes have slightly diminished compared to the
original data, but the overall characteristics have not changed much. This suggests that the
attenuation process preserves the quality of the data.
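The attenuation step can be illustrated with a small helper. This is a hedged sketch, not the paper's procedure: the paper applies an exponential attenuation (see Figure 9) but does not give its exact formula, so the loss coefficient and distance below are hypothetical parameters:

```python
import numpy as np

def attenuate(signal, loss_db_per_km, distance_km):
    """Scale a vibration sample by a fiber-loss factor to emulate a more
    distant event; the exponential form and parameters are assumptions."""
    loss_db = loss_db_per_km * distance_km
    return np.asarray(signal) * 10.0 ** (-loss_db / 20.0)
```

For example, `attenuate(x, 0.2, 10.0)` weakens `x` by 2 dB while leaving its shape unchanged, consistent with the observation above that amplitudes shrink but overall characteristics are preserved.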
Figure 4. Waveform images of the original vibration signals. (a) CH; (b) HSZ; (c) MI; (d) PV; (e) ST;
(f) WA; (g) TFF; (h) TNF; (i) TSF; (j) TFLF; (k) TNLF; (l) TSLF.
Figure 5. Waveform images of the attenuated vibration signals.
As part of this study, events were recognized by deep classification networks, and
evaluation metrics such as accuracy, F1-score, and time were used to measure the
classification performance of the models under the different time-frequency analysis methods.
As seen in Table 3 and Figure 7, for the six signals with different intensities, the SST ResNet
model has the highest recognition accuracy of 99.48%, which is the best performance, and the
CWT ResNet model is the second best with 98.18% recognition accuracy. For the recognition
accuracy of the six intensity signals under the six background signals, SST ResNet
was the best performing model with 96.74% recognition accuracy, followed by CWT ResNet
with 94.66% recognition accuracy. These results indicate that the SST ResNet model
is able to effectively distinguish different intensities and types of percussive vibration
events with high robustness and generalization ability.
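For reference, the accuracy and macro-averaged F1-score reported here can be computed as in this small sketch (our own illustrative helpers, not the authors' evaluation code):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted event class is correct."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1s = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return float(np.mean(f1s))
```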
Different time-frequency analysis methods and different deep network models show
obvious differences in classification performance. Among the three time-frequency analysis
methods, namely CWT, SST, and STFT, SST was found to attain the best classification,
followed by CWT, with STFT attaining the worst results. This indicates that SST is the
most suitable time-frequency analysis method because it can retain the phase information
of the signal and eliminate the interference of the cross terms, which improves the
time-frequency resolution and classification performance. Among the three deep network
models, namely VGG, ViT, and ResNet, the classification performance of ResNet was found
to outperform the others, followed by ViT, with VGG attaining the worst results. This
indicates that ResNet is the most suitable deep network model because it enables fast
localization and recognition of image data, has a deeper network structure and fewer
parameters compared to VGG [16], and differs from ViT [17] in the way it handles
representations, mixing local and global information at both the lower and higher levels. In
this paper, we believe that the combination of SST and ResNet is the optimal signal
processing and classification method because they can make full use of the time-frequency
characteristics of the signal and have high computational efficiency and classification accuracy.

Figure 7. Signal test classification results for the original data. (a) Model Train Accuracy;
(b) Model Test Accuracy; (c) Model Train Loss; (d) Model Test Loss.

Table 3. Comparison of test results.

Classification Method | Six Intensities: Accuracy% | F1-Score | Time/s | All: Accuracy% | F1-Score | Time/s
STFT VGG    | 94.79 | 0.95 | 0.28 | 88.28 | 0.89 | 0.29
CWT VGG     | 96.61 | 0.97 | 0.28 | 91.15 | 0.91 | 0.29
SST VGG     | 97.66 | 0.98 | 0.29 | 94.14 | 0.94 | 0.29
STFT ViT    | 96.88 | 0.97 | 0.05 | 91.28 | 0.91 | 0.05
CWT ViT     | 97.14 | 0.97 | 0.05 | 91.93 | 0.92 | 0.05
SST ViT     | 97.40 | 0.97 | 0.05 | 93.10 | 0.93 | 0.05
STFT ResNet | 96.61 | 0.97 | 0.09 | 92.32 | 0.93 | 0.09
CWT ResNet  | 98.18 | 0.98 | 0.09 | 94.66 | 0.95 | 0.09
SST ResNet  | 99.48 | 0.99 | 0.09 | 96.74 | 0.97 | 0.09

Figure 8 illustrates the confusion matrix of the original data, showing the classification
results and errors of the different deep network models with respect to the different events.
The most confusing events were mainly CH, WA, PV, TSLF, and TFLF. These events
were characterized by smaller amplitude, lower frequency, or were simply more unstable,
so they were not distinctive enough in the time-frequency domain and were easily confused
with other events. In order to improve the classification accuracy of these events, this paper
suggests using the SST ResNet model, as it can utilize the phase information of the signals
to enhance their feature representation and it has high time-frequency resolution and
classification performance. As can be seen in Figure 8, the SST ResNet model attained
higher classification accuracies than the other models for all of these events and attained
the best overall classification accuracy, i.e., 96.74%.
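A confusion matrix of the kind shown in Figure 8 can be tallied as follows (an illustrative helper, not the authors' code; any event-label ordering is an assumption):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions per (true class, predicted class) pair.
    Rows are true event classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

Per-event accuracy is then the diagonal divided by the row sums, which is how the per-event figures discussed above can be read off the matrix.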
Figure 9. Exponential attenuation signal graph.
From Table 4 and Figure 11, it can be observed that the classification performance
of different time-frequency analysis methods and different deep network models on the
fading dataset has obvious differences. For the six intensity signals, the SST ResNet model
attained the highest recognition accuracy, i.e., 98.70%, and the CWT ResNet model was
the second best, with a recognition accuracy of 97.14%. For the recognition accuracy of the
six intensity signals under the six background signals, SST ResNet performed best with
95.18% recognition accuracy, followed by CWT ResNet with 94.40% recognition accuracy.
These results indicate that the SST ResNet model is able to effectively differentiate between
different intensities and types of percussive vibration events and has high robustness and
generalization ability even in the presence of signal attenuation.
For the three models, the overall recognition rate and speed are consistent with the
characteristics observed on the original data. This indicates that the performance of the
different deep network models on the attenuated dataset is consistent with their performance
on the original dataset, although there are some differences. With respect to the attenuated
vibration data, SST was found to attain the best classification performance under the ResNet
model, although the overall recognition rate was still lower than for the unattenuated
original data. This suggests that signal attenuation has a certain impact on the classification
performance and needs to be considered for compensation or adjustment in practical applications.

Figure 12 presents the confusion matrix of the attenuated data, which illustrates the
effect of signal attenuation on the classification performance of the different models. As
shown in the figure, signal attenuation introduces more confusion among the events,
especially for those with smaller amplitudes, such as WA and PV. This is because signal
attenuation lowers the SNR and amplitude of the signal, which are crucial features for
distinguishing different types of events. The proposed model relies on the amplitude and
frequency characteristics of the signal to classify the events, but these characteristics may
be distorted or submerged by the noise and interference in a field environment. Therefore,
the proposed model may not achieve the same high recognition rate as in the laboratory
setting. Among the tested models, the SST ResNet model was found to attain the smallest
classification error on the attenuated data, followed by the CWT ResNet model. The other
models attained larger classification errors, indicating that they are more sensitive to
signal attenuation.
4.4.Conclusions
Conclusions
The purpose of this paper is to explore the use of time-frequency diagrams and deep learning models to identify and analyze the vibration signals collected in distributed fiber optic monitoring systems. In this study, vibration signals of different types and intensities were first collected in the laboratory, including six kinds of tapping signals with different strengths and six types of vibrations from different events. Three deep learning models (VGG, ViT, and ResNet) were then used to classify the three kinds of time-frequency diagrams (CWT, SST, and STFT), respectively, and their results were compared. The results show that the classification accuracy of the ResNet model on the SST time-frequency diagrams is the highest, reaching 96.74%. Finally, in order to simulate the signals at different distances, the data were attenuated, and the results show that the classification accuracy remained as high as 95.18%, which indicates that effective identification can be carried out even at a certain distance from the optical fiber. The method proposed in this paper thus provides a new idea and technical means for distributed fiber monitoring.
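The front end of the pipeline summarized above can be sketched briefly. The code below is illustrative only: it builds normalised time-frequency images from two mock event signals using the STFT as a stand-in (the paper's actual pipeline computes SST diagrams, e.g., with a dedicated synchrosqueezing package such as ssqueezepy) and shows the class-dependent frequency structure that a CNN such as ResNet would learn to separate:

```python
import numpy as np
from scipy.signal import stft

fs = 1000                                    # sampling rate (illustrative)
t = np.arange(0, 1.0, 1 / fs)

# Two mock vibration classes standing in for real Φ-OTDR events.
rub = np.sin(2 * np.pi * 20 * t)                      # sustained low-frequency event
tap = np.exp(-30 * t) * np.sin(2 * np.pi * 120 * t)   # short high-frequency tap

def tf_image(x):
    """Signal -> normalised time-frequency magnitude image.

    STFT is used here as a stand-in; the paper's method computes an
    SST diagram at this step before feeding the image to VGG/ViT/ResNet.
    """
    _, _, Z = stft(x, fs=fs, nperseg=128)
    img = np.abs(Z)
    return img / (img.max() + 1e-12)

img_rub, img_tap = tf_image(rub), tf_image(tap)

# The row (frequency bin) holding the dominant energy differs between
# the two event images; this is the cue the deep classifier exploits.
row_rub = np.unravel_index(img_rub.argmax(), img_rub.shape)[0]
row_tap = np.unravel_index(img_tap.argmax(), img_tap.shape)[0]
print(img_rub.shape, row_rub, row_tap)
```

In the full system these images would be resized to the classifier's input resolution and passed through a standard ResNet; the SST's sharper ridges give that classifier cleaner inputs than the STFT stand-in used here.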
Author Contributions: Conceptualization, R.Y. and J.L.; methodology, R.Y. and J.L.; software, R.Y.
and J.L.; validation, R.Y. and J.L.; formal analysis, R.Y. and J.L.; investigation, R.Y.; resources, R.Y.
and J.Z.; writing-original draft preparation, R.Y. and J.L.; writing-review and editing, R.Y. and J.L.;
visualization, R.Y. and J.L.; supervision, Y.W.; project administration, J.L.; funding acquisition, J.L. All
authors have read and agreed to the published version of the manuscript.
Funding: This work was supported in part by the National Key R&D Program of China (2021YFE0105000)
and in part by the Yulin Science and Technology Bureau Project under Grant CXY-2020 029.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Acknowledgments: The authors would like to express their gratitude for the funding and for the
support of the reviewers, as well as for the editors’ insightful comments.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Adeel, M.; Shang, C.; Zhu, K.; Lu, C. Nuisance alarm reduction: Using a correlation based algorithm above differential signals in
direct detected phase-OTDR systems. Opt. Express 2019, 27, 7685–7698. [CrossRef]
2. Wu, H.; Chen, J.; Liu, X.; Xiao, Y.; Wang, M.; Zheng, Y.; Rao, Y. One-dimensional CNN-based intelligent recognition of vibrations
in pipeline monitoring with DAS. J. Lightwave Technol. 2019, 37, 4359–4366. [CrossRef]
3. Mahmoud, S.S.; Katsifolis, J. Elimination of rain-induced nuisance alarms in distributed fiber optic perimeter intrusion detection
systems. Proc. SPIE 2009, 7316, 731604.
4. Cao, C.; Fan, X.Y.; Liu, Q.W.; He, Z.Y. Practical pattern recognition system for distributed optical fiber intrusion monitoring
system based on phase-sensitive coherent Φ-OTDR. In Proceedings of the Asia Communications and Photonics Conference,
Kowloon, Hong Kong, 19–23 November 2015; ASu2A.145.
5. Wu, H.J.; Xiao, S.K.; Li, X.Y.; Wang, Z.N.; Xu, J.W.; Rao, Y.J. Separation and determination of the disturbing signals in phase-
sensitive optical time domain reflectometry (Φ-OTDR). J. Lightwave Technol. 2015, 33, 3156–3162. [CrossRef]
6. Xu, C.; Guan, J.; Bao, M.; Lu, J.; Ye, W. Pattern recognition based on time-frequency analysis and convolutional neural networks
for vibrational events in Φ-OTDR. Opt. Eng. 2018, 57, 016103. [CrossRef]
7. Xu, W.; Yu, F.; Liu, S.; Xiao, D.; Hu, J.; Zhao, F.; Lin, W.; Wang, G.; Shen, X.; Wang, W.; et al. Real-time multi-class disturbance
detection for Φ-OTDR based on YOLO algorithm. Sensors 2022, 22, 1994. [CrossRef] [PubMed]
8. Du, X.; Jia, M.; Huang, S.; Sun, Z.; Tian, Y.; Chai, Q.; Li, W.; Zhang, J. Event identification based on sample feature correction
algorithm for Φ-OTDR. Meas. Sci. Technol. 2023, 34, 085120. [CrossRef]
9. Grossmann, A.; Morlet, J. Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM J. Math. Anal. 1984, 15, 723–736. [CrossRef]
10. Daubechies, I.; Lu, J.; Wu, H.-T. Synchrosqueezed wavelet transforms: An empirical mode decomposition-like tool. Appl. Comput.
Harmon. Anal. 2011, 30, 243–261. [CrossRef]
11. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
12. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.;
Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
13. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
14. Wang, Z.; Lu, B.; Ye, Q.; Cai, H. Recent progress in distributed fiber acoustic sensing with Φ-OTDR. Sensors 2020, 20, 6594.
[CrossRef] [PubMed]
Sensors 2023, 23, 8773 21 of 21
15. Xu, C.; Guan, J.; Bao, M.; Lu, J.; Ye, W. Pattern recognition based on enhanced multifeature parameters for vibration events in
Φ-OTDR distributed optical fiber sensing system. Microw. Opt. Technol. Lett. 2017, 59, 3134–3141. [CrossRef]
16. Sharma, S.; Guleria, K. Deep learning models for image classification: Comparison and applications. In Proceedings of the 2022
2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida,
India, 28–29 April 2022; pp. 1733–1738.
17. Raghu, M.; Unterthiner, T.; Kornblith, S.; Zhang, C.; Dosovitskiy, A. Do vision transformers see like convolutional neural
networks? Adv. Neural Inf. Process. Syst. 2021, 34, 12116–12128.
18. Sha, Z.; Feng, H.; Shi, Y.; Zhang, W.; Zeng, Z. Phase-sensitive Φ-OTDR with 75-km single-end sensing distance based on RP-EDF
amplification. IEEE Photonics Technol. Lett. 2017, 29, 1308–1311. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.