
A Comparison of Deep Learning Architectures for Spacecraft Anomaly Detection

Daniel Lakey, Tim Schlippe
IU International University of Applied Sciences
[email protected], [email protected]

arXiv:2403.12864v1 [cs.LG] 19 Mar 2024

Abstract—Spacecraft operations are highly critical, demanding impeccable reliability and safety. Ensuring the optimal performance of a spacecraft requires the early detection and mitigation of anomalies, which could otherwise result in unit or mission failures. With the advent of deep learning, a surge of interest has been seen in leveraging these sophisticated algorithms for anomaly detection in space operations. Our study aims to compare the efficacy of various deep learning architectures in detecting anomalies in spacecraft data. The deep learning models under investigation include Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Transformer-based architectures. Each of these models was trained and validated using a comprehensive dataset sourced from multiple spacecraft missions, encompassing diverse operational scenarios and anomaly types. We also present a novel approach to the rapid assignment of spacecraft telemetry data sets to discrete clusters, based on the statistical characteristics of the signal. This clustering allows us to compare different deep learning architectures to different types of data signal behaviour. Initial results indicate that while CNNs excel in identifying spatial patterns and may be effective for some classes of spacecraft data, LSTMs and RNNs show a marked proficiency in capturing temporal anomalies seen in time-series spacecraft telemetry. The Transformer-based architectures, given their ability to focus on both local and global contexts, have showcased promising results, especially in scenarios where anomalies are subtle and span longer durations. Additionally, considerations such as computational efficiency, ease of deployment, and real-time processing capabilities were evaluated. While CNNs and LSTMs demonstrated a balance between accuracy and computational demands, Transformer architectures, though highly accurate, require significant computational resources. In conclusion, the choice of deep learning architecture for spacecraft anomaly detection is highly contingent on the nature of the data, the type of anomalies, and operational constraints. This comparative study provides a foundation for space agencies and researchers to make informed decisions in the integration of deep learning techniques for ensuring spacecraft safety and reliability.

TABLE OF CONTENTS
1. INTRODUCTION
2. RELATED WORK
3. EXPERIMENTAL SETUP
4. EXPERIMENTS AND RESULTS
5. CONCLUSION AND FUTURE WORK
REFERENCES
BIOGRAPHY

©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

1. INTRODUCTION

The field of space exploration has seen significant advancements in recent decades, characterised by the increasing sophistication of spacecraft and the expanding complexity of missions. As mankind expands its presence in outer space, precise and dependable data from spacecraft systems has become of utmost significance. Time series data, a sequential arrangement of data points organised in chronological order, holds major significance in the domain of spacecraft telemetry. The state, health, and performance of spacecraft systems are reflected in telemetry data, which allows for the analysis of both regular and potentially abnormal operations [1].

Anomalies observed in spacecraft telemetry data are unanticipated occurrences that pose potential risks, as they depart significantly from the predicted operational patterns of the system. The quick detection and identification of these abnormalities is of paramount importance in order to avert catastrophic failures, limit risks, and guarantee the durability of space missions. According to [2], the prompt identification and effective detection of these anomalies by operational engineers play a crucial role in enhancing efficiency, minimising expenses, and improving safety. As the complexity of spacecraft continues to advance, there is a corresponding growth in the variety of telemetry parameters associated with them, and conventional, manual or simple "out-of-limits" techniques are becoming ever more difficult to apply for the purpose of identifying anomalies [3].

In recent years, there has been considerable focus on the advancement of anomaly detection techniques for satellite telemetry data. Numerous advanced algorithms and strategies have been proposed by prominent organisations such as NASA [4], ESA [3], and CNES [5] to tackle this task. Every approach possesses its own set of advantages and disadvantages. There is a clear trend towards deep learning approaches over statistical methods due to their ability to synthesise the complex multivariate, temporally-connected data inherent to spacecraft telemetry [6]. The objective of this paper is to investigate and assess different methodologies for anomaly identification in order to determine the most optimal and efficient approach for analysing spacecraft telemetry.

Our work pioneers several notable contributions to the domain of spacecraft anomaly detection, presenting advancements that enhance the understanding of deep learning in this field. Firstly, it unfolds a comprehensive side-by-side comparison of multiple deep learning model architectures, shedding light on their effectiveness in detecting anomalies in spacecraft telemetry. This comparison is distinctively valuable as it incorporates models that, to our knowledge, have not been previously applied to spacecraft anomalies, thereby opening new avenues for exploration and implementation. Secondly, we introduce an innovative unsupervised
mechanism to cluster spacecraft telemetry into like-types, using statistical methods, which allows for a more granular and nuanced understanding of telemetry data. Thirdly, our study unveils insights into the comparative performance of different deep learning models across the identified clusters, providing guidance for selecting the most suitable model based on the specific type of telemetry data. These diverse contributions collectively elevate the current state of research in spacecraft anomaly detection, offering robust and refined tools and methodologies for practical applications and future explorations.

Nomenclature

The following terms are used in this work.

Telemetry Channel: A specific pathway or conduit used for transmitting telemetry data [7], for example from a specific sensor on the spacecraft. A telemetry channel consists of one or more parameters.

Parameter: A measurement within a telemetry channel. This may be an analogue reading such as temperature or current, a discrete numerical value or a binary status. A telemetry time series is made up of many samples of a number of parameters representing a number of telemetry channels.

Dataset: A collection of data points or individual pieces of information, usually organised in tabular form, where rows represent individual records and columns represent attributes or variables of the data. The dataset used in our study is from [4] and contains data for 82 telemetry channels.

Cluster: A grouping of data points or items in a dataset that share similar characteristics or properties, typically identified through various methods of cluster analysis, allowing for the study of relationships and patterns within the data [8].

2. RELATED WORK

This section describes related work investigated as part of our study, focusing especially on popular deep learning architectures, many of which have not been applied to the problem of anomaly detection in spacecraft telemetry data.

Data for Spacecraft Anomaly Detection

Modern spacecraft have many thousands of telemetry channels [9], and this "huge" [10] amount of data is more than can be monitored by human operators. Within these channels, actual instances of anomalies are rare. By design a spacecraft is a robust machine, fault tolerant and extensively tested to ensure that anomalies do not occur [11]. For example, a study of seven different spacecraft over more than a decade yielded fewer than 200 critical anomalies [12].

Spacecraft anomaly detection is a particularly challenging field due to the sparsity of publicly available datasets for training. Indeed, of all the studies listed in our work, only [4] make their data available, and even then with implementation-specific details hidden through scaling and normalisation. This has led to their dataset becoming a benchmark for further studies, such as [9], and consequently we used it in our experiments.

The dataset provided by [4] comprises 82 telemetry channels taken from the Soil Moisture Active Passive (SMAP) [13] spacecraft and the "Curiosity" Mars Science Laboratory (MSL) [14] spacecraft. In Section 3, we will describe this dataset in the context of our experimental setup.

Approaches for Spacecraft Anomaly Detection

A typical approach, for example followed by [3], [4], [5], and [9], is the use of deep learning models to perform regression-based forecasting on a time series and identify anomalies by comparing predictions to real values received from the spacecraft. The central concept is "to reconstruct the telemetry sequence based on training data, and anomalies are identified if the reconstruction errors exceed a given threshold." [15], as illustrated in Figure 1. "The idea is to use past telemetry describing normal spacecraft behaviour in order to learn a reference model to which can be compared most recent data in order to detect potential anomalies." [5]. Multivariate models are used to capture spatial and temporal linkages between separate telemetry channels [16].

Figure 1. Anomaly Detection with Forecasting and Thresholding (actual data, forecast and threshold plotted over time, with a point anomaly and a collective anomaly marked)

Whilst effective, this approach relies on the selection of some threshold value beyond which the reconstruction error is considered anomalous. [4] propose "Telemanom", an "unsupervised and nonparametric anomaly thresholding approach" where the anomaly detector dynamically learns the error value corresponding to an anomaly for a particular time series. They report excellent F1 scores for the anomaly detection, as synthesised in Table 1, which we use as our baseline.

Table 1. Telemanom F1 Scores

SMAP    MSL     Total
85.5%   79.3%   83.6%

There exist many types of architecture for deep learning, many of which have been tuned specifically to time series prediction, for example [17]. We selected six state-of-the-art families of architecture for further investigation. Additionally, we investigated two hybrid architectures comprised of a combination of two or more model types as suggested by [18].

Chosen Deep Learning Architectures for our Study

The following sections briefly review the chosen deep learning architectures used in our study. In particular, it is noted whether previous work has tried these in the domain of spacecraft anomaly detection and which are novel to this
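The forecast-and-threshold concept described above can be sketched in a few lines. Note that this is purely illustrative: Telemanom learns its threshold dynamically and nonparametrically, whereas the fixed threshold, synthetic signal and injected anomaly below are assumptions made for the sake of a minimal example.

```python
# Minimal sketch of forecasting-based anomaly detection: flag time steps
# where the absolute forecast error exceeds a threshold.
import numpy as np

def flag_anomalies(actual: np.ndarray, forecast: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean mask: True where |actual - forecast| > threshold."""
    return np.abs(actual - forecast) > threshold

t = np.linspace(0, 10, 200)
forecast = np.sin(t)             # stands in for a learned model's prediction
actual = np.sin(t)
actual[120:125] += 2.0           # injected collective anomaly (illustrative)

mask = flag_anomalies(actual, forecast, threshold=0.5)
```

In a real pipeline the forecast would come from the trained model and the threshold from Telemanom's dynamic thresholding rather than a constant.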
task.

Multilevel Wavelet Decomposition—[19] introduced their Multilevel Wavelet Decomposition Network (mWDN) for anomaly detection. mWDN leverages the benefits of wavelet transformation in conjunction with a deep learning model to analyse time series data, with a specific emphasis on interpretability. Wavelet transformation is a powerful mathematical tool often used for analysing different frequency components in time series data, which makes it highly suitable for anomaly detection in varied applications such as high-frequency signals [20] and power converters [21]. To the best of our knowledge, an mWDN has yet to be applied to spacecraft telemetry anomaly detection.

Multi-Layer Perceptron (MLP)—The gMLP [22], or gated Multi-Layer Perceptron, is a type of artificial neural network model designed to be competitive with Transformer models but with a more straightforward architecture. It relies more on feedforward layers and less on attention mechanisms. A gMLP utilises Spatial Gating Units (SGU), a central component that enables information exchange between different positions in the sequence, allowing the model to capture dependencies between different parts of the input. MLPs have been used for anomaly detection in fields as varied as water treatment [23] and rogue trading [24], as well as spacecraft anomaly detection [25].

Transformer—Transformers, originally proposed in [26], are a type of neural network architecture that has become the foundation for most state-of-the-art models in natural language processing, and they are increasingly being used in various domains like time series analysis and image processing. Transformers use a mechanism called self-attention that allows each element in the input sequence to consider other elements in the sequence when producing its output, weighting each one differently depending on the learnt relationships. Transformers are the subject of much active research into anomaly detection, such as [27], [28] and [29]. The implementation of the Transformer architecture investigated in our study is TimeSeriesTransformer (TST) [30], which tunes the architecture specifically for multivariate time series data, of which spacecraft telemetry is an extreme example owing to the potentially very large number of parameters to consider [31].

Convolutional Neural Network (CNN)—Convolutional Neural Networks (CNNs) are a class of deep learning models primarily developed for analysing visual imagery, renowned for their ability to learn hierarchical features from input data [32]. In the context of spacecraft anomaly detection, CNNs can be utilised to process multivariate time series data generated by spacecraft sensors [33], enabling the identification of anomalous patterns or events indicative of potential faults, malfunctions, or other abnormalities in spacecraft systems [34]. By learning both spatial and temporal features in the data, CNNs can aid in early and accurate detection of anomalies [35]. As a popular architecture in deep learning, there are many implementations of interest. Four are selected here, including the "classics" ResNet [36], [37] and Fully Convolutional Network (FCN) [37], in addition to some implementations specifically tailored to time series: XceptionTime [38] and InceptionTime [39]. For InceptionTime, we chose two implementations from tsai: MultiInceptionTimePlus and InceptionTimeXLPlus. The former is an ensemble method with multiple internal models, whereas the latter contains a large number of parameters.

Recurrent Neural Network (RNN)—Recurrent Neural Networks (RNNs) are a category of neural networks specialised for processing sequential data, enabling the modelling of temporal features within the sequences. Our study includes two RNN variants. Long Short-Term Memory (LSTM) [40] units are a variant of RNNs designed to mitigate the vanishing and exploding gradient problems inherent in basic RNNs, allowing them to learn long-range behaviours within the data. The model used in our baseline study [4] is LSTM-based. Gated Recurrent Units (GRU) [41] are another variant of RNNs, similar to LSTMs but with a simpler structure, designed to capture dependencies for sequences of varied lengths. GRUs have been proven to perform comparably to LSTMs on certain tasks [42] but with reduced computational requirements, offering an efficient alternative for sequence modelling. This may make them of particular use in spacecraft anomaly detection [43], where the cost of training the more complex models may be prohibitive.

Hybrid Models—To leverage the advantages of different deep learning architectures, many previous studies have considered hybrid models [44], [18]. LSTM/Transformer models are quite popular, for example [45] and [46], and in our study we include two such models in the suite of tested architectures, TransformerLSTM [47], [48] and LSTMAttention [30], [26], [48]. Other studies [49] have considered a hybrid FCN/Transformer model, combining the spatial learning abilities of CNNs with the sequence learning of Transformers, therefore we include LSTM FCN [48] in our test set. Other studies such as [50] go further still, combining CNN, RNN and Transformer architectures in one model.

We are unaware of the use of hybrid deep learning models for spacecraft anomaly detection, although there have been studies in the area of hybrid machine learning such as [15] and [51].

3. EXPERIMENTAL SETUP

This section describes the details of the experimental setup used for the comparison of deep learning approaches.

Implementation

We employed the Telemanom [4] anomaly detection framework for conducting experiments on spacecraft anomaly detection. The original implementation of Telemanom uses a Tensorflow-based LSTM model as a default, designed to recognize anomalous patterns in time series data relevant to spacecraft telemetry. We replace this LSTM with alternative architectures as described in Section 2.

We retain the dynamic thresholding and anomaly detection algorithms of Telemanom whilst replacing the time-series forecasting models with a variety of new architectures. Thus, we can clearly demonstrate the differences due to the architecture alone. To this end, the default LSTM was replaced with various models provided by the tsai library [48], a PyTorch and fastai-based collection of time-series deep learning architectures [52]. The tsai library implements a wide selection of state-of-the-art models optimised for time-series data. In order to keep the model code generic and easy to fit to a variety of different model architectures, the tsai "Plus" implementations of the above architectures were selected due to their common interfaces. Detailed documentation regarding the particulars of implementation can be found in [48].
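The swap-the-forecaster idea described above — keep the thresholding fixed and vary only the model behind a common interface — might be organised along the following lines. All names here (the registry, the forecaster classes, `make_model`) are illustrative assumptions, not Telemanom or tsai APIs.

```python
# Sketch: a registry of interchangeable forecasting models behind one
# common interface, so the surrounding anomaly-detection code is unchanged.
ARCHITECTURES: dict[str, type] = {}

def register(name: str):
    """Class decorator: file a forecaster under a lookup name."""
    def wrap(cls):
        ARCHITECTURES[name] = cls
        return cls
    return wrap

@register("LSTM")
class LSTMForecaster:
    def fit(self, series): ...        # train on anomaly-free data
    def forecast(self, window): ...   # predict the next value(s)

@register("TST")
class TransformerForecaster:
    def fit(self, series): ...
    def forecast(self, window): ...

def make_model(name: str):
    """Instantiate the chosen forecaster; thresholding stays identical."""
    return ARCHITECTURES[name]()
```

The common `fit`/`forecast` interface mirrors the role the tsai "Plus" models' shared interface plays in the study.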
Following the approach taken in [4], one model is trained per telemetry channel. Our study compares thirteen different architectures, leading to 82 × 13 = 1,066 trained models overall.

The models were trained utilizing the "fit one cycle" method [53], a technique noted for its efficacy in training deep learning models efficiently and reliably. The experiment endeavoured to keep the setup fair and comparable; thus, hyperparameter tuning was predominantly confined to ensuring that the RNN-based architectures possessed at least equivalent depth to the default LSTM implemented in Telemanom. Apart from this modification, we retained the default hyperparameters provided by the tsai and fastai models to maintain the integrity of the comparative analysis, on the basis that the defaults are anyway sensible [54]. Furthermore, due to the large number of trained models, hyperparameter tuning was infeasible in any case.

Early experience during model training showed that model performance was very sensitive to the learning rate. In order to negate these effects, we applied the learning rate reduction scheme ReduceLROnPlateau, provided by the fastai framework [52], to each model. The callback reduces the learning rate on each epoch if the training loss metrics are unchanging between consecutive epochs. This has given good results in studies such as [55], but at the cost of longer training times.

The computational environment for the experiments was provisioned on a virtual machine, equipped with 8 CPU cores (Intel Xeon Platinum 8260 CPU @ 2.40GHz) and 16GB of RAM. No hardware acceleration or GPUs were available.

Due to commercial, legal and security considerations, there are very few well-labelled spacecraft anomaly datasets available to the public. The "SMAP/MSL" dataset provided by [4] is a dataset used in other studies into autonomous detection of spacecraft anomalies (i.e. [9], an LSTM-based study, and [56], a CNN-based approach). This dataset consists of curated telemetry streams from NASA's Soil Moisture Active Passive (SMAP) [13] and Mars Science Laboratory "Curiosity rover" (MSL) [14] missions. We selected this as the dataset for our study because it offers a good baseline against which to compare our results.

Table 2. SMAP/MSL Dataset Statistics, from [4]

                            SMAP     MSL     Total
Total anomaly sequences     69       36      105
Point anomalies             43       19      62
Contextual anomalies        26       17      43
Unique telemetry channels   55       27      82
Input dimensions            25       55      -
Telemetry values evaluated  429,735  66,709  496,444

The data in [4] has been scaled to between (-1, 1) and anonymised. "Model input data also includes one-hot encoded information about commands that were sent or received by specific spacecraft modules in a given time window." [4]. This results in a collection of 82 multivariate data sets, with around 100 labelled anomalies in total across all data sets, as detailed in Table 2. Each telemetry channel is a multivariate time series of one target parameter and additional parameters to be used as contextual information. The target parameter is the time series to be forecast, in which anomalies are to be detected.

The data was pre-split by [4] into "train" sets of anomaly-free data to establish the nominal conditions and "test" sets, one per telemetry channel, which contain the labelled anomalies. We used the same split as in the original study in order to have comparable results.

Data Clustering

Initial inspection of the telemetry channels showed that different telemetry channels had varying general characteristics, such as "spiky" or "flat". We wanted to investigate the link between the characteristics of the telemetry channels and the best performing deep learning model architecture, and whether specific architectures work better for certain types of data. Manual classification is not feasible due to the number of telemetry channels, so our idea was to use an unsupervised clustering approach.

To associate the telemetry channels into clusters, each of which represents a particular set of characteristics, the method uses the standard central moments (mean, standard deviation, skewness and kurtosis) calculated for the target parameter of each telemetry channel using SciPy [57]. NaN¹ values are set to 0. Therefore each telemetry channel is represented by a single four-dimensional vector. We applied K-Means clustering [58] to these four-dimensional vectors, as illustrated in Figure 2 and further elaborated in Listing 1.

¹ Not a Number - used to signify an arithmetic error

Figure 2. Clustering Process (time series input → per-channel statistics [mean, std, skew, kurtosis] → K-Means → cluster output, visualised via PCA)

The handling of NaN values is required for the statistics skewness and kurtosis because some telemetry channels contain parameter values with no variance ("flat", in Table 5); these are forced to zeros. Skewness measures the asymmetry of the probability distribution. For a constant data series, skewness is not defined, as skewness presupposes that there is variance in the data. Kurtosis measures the "tailedness" of the distribution. For a constant series, like skewness, kurtosis
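The per-channel feature extraction and clustering described above can be sketched as follows. The channel names and synthetic signals are illustrative stand-ins for the real telemetry, and scikit-learn's KMeans is assumed as the K-Means implementation:

```python
# Sketch: summarise each channel's target parameter as a 4-D vector of
# central moments (NaNs forced to zero), then cluster with K-Means.
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.cluster import KMeans

def channel_features(target: np.ndarray) -> np.ndarray:
    """Four central moments; skew/kurtosis of a constant 'flat' channel
    are NaN and get forced to zero, as in the method above."""
    vec = np.array([np.mean(target), np.std(target),
                    skew(target), kurtosis(target)])
    return np.nan_to_num(vec, nan=0.0)

rng = np.random.default_rng(0)
channels = {                              # illustrative stand-in channels
    "binary": rng.choice([-1.0, 1.0], size=500),
    "flat": np.zeros(500),
    "oscillating": 0.2 * np.sin(np.linspace(0, 20, 500)),
}
X = np.vstack([channel_features(v) for v in channels.values()])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
```

One four-dimensional vector per channel keeps the clustering step cheap regardless of how long each time series is, which is why all 82 channels can be processed in under a second.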
is also not defined, because kurtosis measures the outliers and a constant series has none. Mean and standard deviation are defined in the case of a constant value, so do not need to be treated for NaNs. Skewness and kurtosis are calculable for non-flat telemetry channels and give a better summary of the data than mean and standard deviation alone.

Our clustering focuses on the training data set, without anomalies, so as to identify what the "normal" behaviour of the parameter is, as summarised by the shape of the curve. In spacecraft operations this is the more likely scenario, as often data channels have yet to experience an anomaly [9], [5].

For each telemetry channel i:
    Extract target parameter p_i from i
    Calculate central moments of p_i:
        [mean, standard deviation, skew, kurtosis] => vector_i
    Set any value (vector_ij = NaN) => 0
    Add vector_i to list
Apply K-Means to list => n clusters

Listing 1. Time Series Clustering Pseudo-Code

The "elbow method" [59] is a heuristic to find an optimal number of clusters by looking for a change in slope. For the [4] data, the method indicated that 5 clusters of data types would be an optimal solution, as shown in Figure 3. The change in slope at k = 5 is clear. Distortion is an indication of how well the clusters fit, and k is the number of clusters. Lower values of k would suggest insufficiently separated clusters, whereas greater values would indicate overly split clusters. This result is dataset-specific, and may not reflect all spacecraft telemetry channels; however, the K-Means method is portable to other data sets and fast: all 82 channels were processed in under a second.

Figure 3. Elbow Method used to Determine Optimum Number of Clusters (distortion versus number of clusters k)

The resulting clusters are shown in Figure 4. Each dot represents a telemetry channel target parameter, and the clusters are grouped by colour. Principal component analysis (PCA) [60] has been used to reduce the number of dimensions from 4 to 2 for the purposes of visualisation. Despite being few in number, the telemetry channels comprising clusters 3 ("Spiky") and 4 ("Complex") are clearly separated, with the remaining clusters being closer together yet still distinct.

Figure 4. Resulting Data Clusters when k = 5 (2D PCA of the cluster vectors, principal component 1 versus principal component 2)

The outcome of this investigation is illustrated in Figure 5, and can be described as the following "types" of telemetry data according to the behaviour of the target data channel (that is, the one to be predicted by the forecasting model):

• Cluster 0 "Binary": the values alternate between one of two values. When scaled to (-1,1), this shows as large spikes across the full range. There are 43 data channels in this cluster.
• Cluster 1 "Flat": the value is not expected to change at all. There are 21 data channels in this cluster.
• Cluster 2 "Oscillating": similar to "Flat", but the value oscillates around a certain value rather than being fixed. There are 11 data channels in this cluster.
• Cluster 3 "Spiky": occasional large changes in the data are expected and normal. These represent a particular challenge for univariate models, as the cause of a spike can only be determined from additional data. There are 2 data channels in this cluster.
• Cluster 4 "Complex": a combination of the other data types. There are 2 data channels in this cluster.

In addition to reporting the results per model architecture trained on all telemetry channels, the best results per (model architecture, cluster) combination will also be given. This will inform whether certain architectures work better on certain types of data (cluster), or whether there is a "one-size-fits-all" universal solution which is applicable to all types of data behaviour.

4. EXPERIMENTS AND RESULTS

The results of the 13 different models (described in Section 2) show considerable differences in training times, ranging from a few hours to over one full day, as shown in Table 3. However, the performance of the models does not scale with processing time (Table 3). The performance is measured by two key metrics: the F1 (%) score considering the number of anomalies correctly detected ("F1 anomaly"), and the F1 score of the number of time points correctly labelled as occurring within anomalies ("F1 time point").

Table 3 shows, per model architecture implementation, the
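The elbow heuristic described above can be sketched as follows; the synthetic, well-separated feature vectors stand in for the 82 real channel vectors (an illustrative assumption), so the elbow here falls at k = 3 rather than the paper's k = 5:

```python
# Sketch of the elbow method: fit K-Means for a range of k and look for
# the k after which distortion (inertia) stops dropping sharply.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Three well-separated synthetic blobs in the 4-D moment space.
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(30, 4))
               for c in (0.0, 5.0, 10.0)])

distortion = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
              for k in range(1, 7)}
# Distortion decreases as k grows; the change in slope marks the elbow.
```

Plotting `distortion` against k reproduces the kind of curve shown in Figure 3, with the slope change at the number of genuinely distinct groups.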
First Member of Each Cluster
Cluster 0 "Binary", Channel E-1 Cluster 1 "Flat", Channel T-5
1.0
0.5
0.0
0.5
1.0
0 500 1000 1500 2000 2500 3000 0 500 1000 1500 2000
Cluster 2 "Oscillating", Channel G-4 Cluster 3 "Spiky", Channel G-3
1.0
0.5
0.0
0.5
1.0
0 500 1000 1500 2000 2500 0 500 1000 1500 2000 2500
Cluster 4 "Complex", Channel M-7
1.0
0.5
0.0
0.5
1.0
0 250 500 750 1000 1250 1500
Figure 5. Data Types per Cluster

total training time and the average training time per channel. Model Performance
True positive (TP), False positive (FP) and False negative The best performing model architecture in our study is
(FN) values are also given per anomaly.
the CNN-based XceptionTimePlus implementation, with F1
anomaly score of 69.9%. This is lower than the tuned
It is expected that “F1 time point” will not be very high, as results from the Telemanom study (Table 1) but represents
the nature of the threshold-based anomaly detector means that
data points either side of a labelled anomaly may not be de- a 6% better performance than the worst performing model
here, FCNPlus. It is noteworthy that the best and worst
tected as anomalous themselves, even though a domain expert performing models are both of the CNN architecture families
would label them as such. Nevertheless, it gives an indication
of the overall model performance when determining if any (XceptionTimePlus 69.9%, FCNPlus 63.2%). This suggests
that there is no intrinsic advantage of CNN-based models in
given data point is anomalous. This is illustrated in Fig- general.
ure 6, whereby a predicted anomaly and actual anomaly may
share few actual data point yet nevertheless be considered The hybrid TransformerLSTMPlus (69.6%) and RNN-based
a successful detection of an anomaly. That is, any overlap GRUPlus (69.1%) show similar performance although with
of predicted anomaly and actual anomaly is considered a
detection, no matter how small (how few data points are vastly different training times.
correctly labelled). This is the metric used in [4] and we
retain it to allow direct comparison of results between their Given the overall good performance of XceptionTimePlus
(69.9%) and the relatively low training time, this architecture
study and ours. would be our recommendation for a general purpose anomaly
The F1 score pertaining to the detected anomalies (“F1 detector, as an initial investigation, before extensive effort it
applied to tuning.
anomaly”) is more significant in terms of perceived anomaly
detection performance by the spacecraft operator [4], [9].
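The overlap-counting rule described above can be sketched in a few lines. This is our own illustration, not code from the study; the interval representation and function names are assumptions:

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) time indices of an anomalous window, inclusive

def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one time point."""
    return a[0] <= b[1] and b[0] <= a[1]

def anomaly_f1(predicted: List[Interval], actual: List[Interval]) -> float:
    """Per-anomaly F1: any overlap between a predicted and a true anomaly
    counts as a detection, however few time points are shared."""
    tp = sum(1 for t in actual if any(overlaps(p, t) for p in predicted))
    fn = len(actual) - tp
    fp = sum(1 for p in predicted if not any(overlaps(p, t) for t in actual))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# A prediction that only clips the edge of the true event still counts as a detection:
print(anomaly_f1(predicted=[(95, 101)], actual=[(100, 140)]))  # 1.0
```

The per-time-point F1, by contrast, would penalise the same prediction heavily, since only two of the 41 anomalous time points are covered.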
Figure 6. Anomaly detection metrics, per anomaly versus per time point. Adapted from [61] and [62]. (A, unaware of temporal adjacency: a prediction that is close to, but disjoint from, the ground-truth event should be rewarded. B, unaware of the event duration: each of two events should have equal importance, irrespective of its duration. Predictions are compared against truth in terms of true positives (TP), false negatives (FN) and false positives (FP) over time.)
Training Performance

Spacecraft parameters number in the tens to hundreds of thousands per spacecraft, so the effectiveness of the models in terms of training is of critical importance. Training a large number of slow-to-train models is thus potentially prohibitive, and models would need regular retraining as the spacecraft ages or its environment changes. We introduce an ‘F1 per second’ (F1/s) metric to provide a comprehensive view of our models’ performance, taking into account both their accuracy and computational efficiency. In essence, it indicates the F1 score achieved for every second of training. A higher value implies that the model not only performs well but can be trained quickly, making it both effective and efficient.

Figure 7 shows that the CNN-based models ResNetPlus and XceptionTimePlus are the most efficient, whereas the “large” (many model parameters) CNN-based InceptionTimeXLPlus and the RNN/Transformer hybrid LSTMAttentionPlus offer the worst performance/training time trade-off. Generally, the Transformer-based architectures all suffer from a performance/training time trade-off. The difference in training times is stark: from 2.5 minutes per channel for XceptionTimePlus to over 18.5 minutes for TransformerLSTMPlus, whereas their F1 scores are nearly identical (69.9% vs 69.6%).

Figure 7. F1 score per training second, per model architecture.

Best Architecture per Cluster

A further level of analysis into the results provides insight into the best performing models per type of telemetry data, as determined by the clusters shown in Figure 5. To calculate the F1 anomaly scores per time point and per labelled anomaly, the true positives, false positives and false negatives (in each case) were summed across the relevant cluster.

All (architecture, cluster) pairs were ranked by F1 anomaly score, to determine the best performing model for each cluster. The architectures identified as best and second-best performing for each cluster are given in Table 4.

It is notable that the best performing architecture identified in Table 3 (XceptionTimePlus, 69.9%) is not the best in each data type cluster. The CNN-based MultiInceptionTimePlus and XceptionTimePlus models perform best in the “Binary” (65.2%) and “Flat”/“Oscillating” clusters (86.3%, 72.0%) respectively, suggesting that the spatial awareness of these models is particularly useful in identifying anomalies in these cases.

With fewer telemetry channels, two apiece, the Spiky and Complex clusters are more difficult to assess in general terms due to the low number of examples in the data set. Despite being an older architecture, the MLP-based gMLP performs best for the “Spiky” (100%) and “Complex” cases (100%), outperforming all other architectures. More examples of these clusters are needed before a recommendation can be made on these specific telemetry data types, but it is instructive to note the differences in performance. With an average F1 score of 84.7%, the best performing models per cluster collectively outperform any single model by nearly 15% (absolute).

As an ensemble approach to the general anomaly detection problem, taking the best performing architecture per data type greatly increases the performance overall. The average F1 score of 84.7% surpasses the 83.6% achieved by the baseline study, Telemanom [4].
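The F1/s computation is straightforward. As a sketch (our own code, assuming the average per-channel training time from Table 3 as the denominator, which matches the scale of values shown in Figure 7):

```python
def to_seconds(mm_ss: str) -> int:
    """Convert an 'mm:ss' average per-channel training time to seconds."""
    mm, ss = mm_ss.split(":")
    return int(mm) * 60 + int(ss)

def f1_per_second(f1_percent: float, avg_time: str) -> float:
    """F1 score (as a fraction) achieved per second of training."""
    return (f1_percent / 100.0) / to_seconds(avg_time)

# Average per-channel times and F1 anomaly scores from Table 3:
print(round(f1_per_second(69.89, "02:39"), 4))  # XceptionTimePlus    -> 0.0044
print(round(f1_per_second(69.57, "18:30"), 4))  # TransformerLSTMPlus -> 0.0006
```

The two models achieve nearly identical F1 scores, yet XceptionTimePlus delivers roughly seven times more F1 per second of training.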
Table 3. Results per Architecture

Architecture             Total Time      Avg Time per TM    F1 (%)       TP        FP        FN        F1 (%)
                         (dd hh:mm:ss)   Channel (mm:ss)    Time Point   Anomaly   Anomaly   Anomaly   Anomaly
XceptionTimePlus         00 03:34:27     02:39              37.42        65        16        40        69.89
TransformerLSTMPlus      01 00:58:58     18:30              36.69        64        15        41        69.57
GRUPlus                  00 03:32:16     02:37              32.71        66        20        39        69.11
ResNetPlus               00 03:02:39     02:15              38.36        63        15        42        68.85
MultiInceptionTimePlus   00 05:32:11     04:06              36.71        60        10        45        68.57
LSTM FCNPlus             00 23:15:57     17:14              32.83        66        23        39        68.04
LSTMPlus                 01 02:21:02     09:46              35.10        67        25        38        68.02
mWDNPlus                 00 05:53:12     04:22              34.99        63        21        42        66.67
gMLP                     00 16:06:24     11:56              32.44        63        21        42        66.67
TSTPlus                  00 13:29:22     10:00              34.94        57        16        48        64.04
InceptionTimeXLPlus      01 05:38:24     21:57              33.32        64        31        41        64.00
LSTMAttentionPlus        01 05:41:37     22:00              35.35        62        29        43        63.27
FCNPlus                  01 00:57:41     09:15              36.29        61        27        44        63.21
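The “F1 (%) Anomaly” column in Table 3 follows directly from the per-anomaly TP/FP/FN counts via F1 = 2·TP / (2·TP + FP + FN), the harmonic mean of precision and recall. A quick check of two rows:

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 (%) = 100 * 2*TP / (2*TP + FP + FN)."""
    return 100.0 * 2 * tp / (2 * tp + fp + fn)

# Reproduce the "F1 (%) Anomaly" column from Table 3:
print(round(f1_from_counts(65, 16, 40), 2))  # XceptionTimePlus -> 69.89
print(round(f1_from_counts(61, 27, 44), 2))  # FCNPlus          -> 63.21
```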

Table 4. Results per Cluster

Architecture                 Cluster       F1 (%) Time Point   F1 (%) Anomaly
MultiInceptionTimePlus       Binary        31.65               65.22
ResNetPlus                   Binary        34.02               64.58
gMLP                         Complex       43.24               100.00
LSTM FCNPlus                 Complex       42.87               100.00
XceptionTimePlus             Flat          44.10               86.27
TransformerLSTMPlus          Flat          41.00               85.19
XceptionTimePlus             Oscillating   27.50               72.00
ResNetPlus                   Oscillating   26.87               64.00
gMLP                         Spiky         48.87               100.00
MultiInceptionTimePlus       Spiky         35.08               85.71
Average (best per cluster)   -             -                   84.7
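The 84.7% ensemble figure can be reproduced directly from Table 4 as the unweighted mean of the best per-cluster “F1 (%) Anomaly” scores (our own verification sketch):

```python
# Best per-cluster F1 anomaly scores from Table 4:
best_per_cluster = {
    "Binary": 65.22,       # MultiInceptionTimePlus
    "Complex": 100.00,     # gMLP
    "Flat": 86.27,         # XceptionTimePlus
    "Oscillating": 72.00,  # XceptionTimePlus
    "Spiky": 100.00,       # gMLP
}

average_f1 = sum(best_per_cluster.values()) / len(best_per_cluster)
print(round(average_f1, 1))  # 84.7, vs 69.9 for the best single model
```

Note the mean weights each cluster equally, regardless of how many channels or anomalies it contains.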
5. CONCLUSION AND FUTURE WORK

In conclusion, the insights derived from our study have shown innovative advancements in spacecraft anomaly detection, laying a robust foundation for future explorations and discoveries in this domain.

Conclusion

In this work, we performed a comparative study of diverse deep learning model architectures, with the goal of assessing their efficacy in spacecraft anomaly detection. Our findings revealed that XceptionTimePlus (69.9%) exhibited the best performance among all the models assessed in the study, across all telemetry channels. However, it is important to note that the overall performance was not on par with the outcomes demonstrated in [4]. A contributing factor to this is the conscious decision to refrain from hyperparameter optimisation in order to preserve the framework defaults and allow direct relative comparisons. Nevertheless, our study provides valuable insights into which families of deep learning architecture perform well, and which do not.

Furthermore, due to constraints in computational resources it was not possible to follow standard optimisation strategies such as grid search, which runs many iterations of the model to explore the hyperparameter space. With some models taking more than a day to run once (e.g. LSTMAttentionPlus at 1 day and 5 hours), it is infeasible to run the large number of iterations required.

In addition to this, our research illuminated that different deep learning model architectures exhibit varying degrees of proficiency depending on the nature of the data, be it “spiky”, “flat”, “complex”, “oscillating”, or “binary”. We introduced a clustering methodology in this paper, facilitating the efficient allocation of spacecraft telemetry channels into distinct clusters contingent on the inherent statistical properties of the data, based on the shape of the curve. This approach has not only advanced our understanding but has also paved the way for more sophisticated ensemble models, built from individual models that are each optimised for disparate data types. This ensemble approach was able to exceed the performance of the baseline study (84.7% vs 83.6%), despite using unoptimised models.

Future Work

The work in our study has suggested new possibilities and directions for future research. A natural extension of this work would be the exploration of ensemble models that are proficiently optimised to accommodate various data types, leveraging the clustering methodology introduced in this paper. Furthermore, a meticulous exploration of the hyperparameter space will be pivotal to harness the maximal potential of the models, thereby advancing the state-of-the-art in spacecraft anomaly detection.

As described above, the individual models were not individually optimised; rather, they used the defaults from the respective frameworks fastai and tsai. The success of the clustering approach suggests an alternative to the one-model-for-all approach seen in other studies ([3], [4], [9]): that of creating a set of optimised hyperparameters per data type (spiky, binary, etc.).

Additionally, current anomaly detection approaches (e.g. [3], [4], [5], [9]) rely predominantly on forecasting models to deduce nominal behavior, identifying anomalies through a comparative analysis of predictions against predetermined thresholds. A promising avenue for future research would be the application of deep learning classification techniques, which could potentially offer a direct assessment of the telemetry channels without relying on thresholds.

REFERENCES

[1] A. Zacchei, S. Fogliani, M. Maris, L. Popa, N. Lama, M. Türler, R. Rohlfs, N. Morisset, M. Malaspina, and F. Pasian, “Housekeeping and science telemetry: the case of Planck/LFI,” Memorie della Supplementi, pp. 331–334, 2003.

[2] S. Guan, B. Zhao, Z. Dong, M. Gao, and Z. He, “GTAD: graph and temporal neural network for multivariate time series anomaly detection,” Entropy, vol. 24, no. 6, p. 759, 2022. [Online]. Available: https://www.mdpi.com/1099-4300/24/6/759

[3] J. M. Heras and A. Donati, “Enhanced telemetry monitoring with novelty detection,” AI Magazine, vol. 35, no. 4, pp. 37–46, 2014.

[4] K. Hundman, V. Constantinou, C. Laporte, I. Colwell, and T. Soderstrom, “Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding,” arXiv, 2018.

[5] B. Pilastre, L. Boussouf, S. D’Escrivan, and J.-Y. Tourneret, “Anomaly detection in mixed telemetry data using a sparse representation and dictionary learning,” Signal Processing, vol. 168, p. 107320, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0165168419303731

[6] S. Schmidl, P. Wenig, and T. Papenbrock, “Anomaly detection in time series: A comprehensive evaluation,” Proc. VLDB Endow., vol. 15, no. 9, pp. 1779–1797, May 2022. [Online]. Available: https://doi.org/10.14778/3538598.3538602

[7] R. C. J. Chapman, G. Critchlow, and H. Mann, Command and Telemetry Systems. NASA, 1963.

[8] M. Omran, A. Engelbrecht, and A. Salman, “An overview of clustering methods,” Intell. Data Anal., vol. 11, pp. 583–605, 2007.

[9] S. Baireddy, S. R. Desai, J. L. Mathieson, R. H. Foster, M. W. Chan, M. L. Comer, and E. J. Delp, “Spacecraft time-series anomaly detection using transfer learning,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021, pp. 1951–1960.

[10] T. Yairi, T. Oda, Y. Nakajima, N. Miura, and N. Takata, “Evaluation testing of learning-based telemetry monitoring and anomaly detection system in SDS-4 operation,” in Proceedings of the International Symposium on Artificial Intelligence, Robotics and Automation in Space (i-SAIRAS), 2014.

[11] P. Fortescue, G. Swinerd, and J. Stark, Spacecraft Systems Engineering, 4th ed. Nashville, TN: John Wiley & Sons, 2011.

[12] R. R. Lutz and I. C. Mikulski, “Empirical analysis of safety-critical anomalies during operations,” IEEE Transactions on Software Engineering, vol. 30, no. 3, pp. 172–180, 2004.

[13] P. O’Neill, D. Entekhabi, E. Njoku, and K. Kellogg, “The NASA Soil Moisture Active Passive (SMAP) mission: Overview,” in 2010 IEEE International Geoscience and Remote Sensing Symposium, 2010, pp. 3236–3239.

[14] A. R. Vasavada, “Mission overview and scientific contributions from the Mars Science Laboratory Curiosity rover after eight years of surface operations,” Space Science Reviews, vol. 218, no. 3, Apr. 2022. [Online]. Available: https://doi.org/10.1007/s11214-022-00882-7

[15] J. He, Z. Cheng, and B. Guo, “Anomaly detection in satellite telemetry data using a sparse feature-based method,” Sensors, vol. 22, no. 17, 2022. [Online]. Available: https://www.mdpi.com/1424-8220/22/17/6358

[16] K. Chakraborty, K. Mehrotra, C. K. Mohan, and S. Ranka, “Forecasting the behavior of multivariate time series using neural networks,” Neural Networks, vol. 5, no. 6, pp. 961–970, 1992.

[17] D. Walther, J. Viehweg, J. Haueisen, and P. Mäder, “A systematic comparison of deep learning methods for EEG time series analysis,” Frontiers in Neuroinformatics, vol. 17, 2023. [Online]. Available: https://www.frontiersin.org/articles/10.3389/fninf.2023.1067095

[18] C. S. Han and K. M. Lee, “Hybrid deep learning model for time series anomaly detection,” in RACS ’23: Proceedings of the 2023 International Conference on Research in Adaptive and Convergent Systems. New York, NY, USA: Association for Computing Machinery, 2023. [Online]. Available: https://doi.org/10.1145/3599957.3606232

[19] J. Wang, Z. Wang, J. Li, and J. Wu, “Multilevel wavelet decomposition network for interpretable time series analysis,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ser. KDD ’18. New York, NY, USA: Association for Computing Machinery, 2018, pp. 2437–2446. [Online]. Available: https://doi.org/10.1145/3219819.3220060

[20] G. Michau, G. Frusque, and O. Fink, “Fully learnable deep wavelet transform for unsupervised monitoring of high-frequency time series,” Proceedings of the National Academy of Sciences, vol. 119, no. 8, Feb. 2022. [Online]. Available: https://doi.org/10.1073/pnas.2106598119

[21] S. Ye and F. Zhang, “Unsupervised anomaly detection for multilevel converters based on wavelet transform and variational autoencoders,” in 2022 IEEE Energy Conversion Congress and Exposition (ECCE), 2022, pp. 1–6.

[22] H. Liu, Z. Dai, D. R. So, and Q. V. Le, “Pay attention to MLPs,” 2021.

[23] G. Raman MR, N. Somu, and A. Mathur, “A multilayer perceptron model for anomaly detection in water treatment plants,” International Journal of Critical Infrastructure Protection, vol. 31, p. 100393, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1874548220300573

[24] E. Hedström and P. Wang, “Anomaly detection using a deep learning multi-layer perceptron to mitigate the risk of rogue trading,” Ph.D. dissertation, KTH, School of Electrical Engineering and Computer Science (EECS), 2021. [Online]. Available: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301948

[25] P. Bernal-Mencia, K. Doerksen, and C. Yap, “Machine learning for early satellite anomaly detection,” Proceedings of the Small Satellite Conference, 2021.

[26] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” 2023.

[27] J. Kim, H. Kang, and P. Kang, “Time-series anomaly detection with stacked transformer representations and 1D convolutional network,” Engineering Applications of Artificial Intelligence, vol. 120, p. 105964, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0952197623001483

[28] Y. Jeong, E. Yang, J. H. Ryu, I. Park, and M. Kang, “AnomalyBERT: Self-supervised transformer for time series anomaly detection using data degradation scheme,” 2023.

[29] J. Xu, H. Wu, J. Wang, and M. Long, “Anomaly transformer: Time series anomaly detection with association discrepancy,” in International Conference on Learning Representations, 2022. [Online]. Available: https://openreview.net/forum?id=LzQQ89U1qm

[30] G. Zerveas, S. Jayaraman, D. Patel, A. Bhamidipaty, and C. Eickhoff, “A transformer-based framework for multivariate time series representation learning,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, ser. KDD ’21. New York, NY, USA: Association for Computing Machinery, 2021, pp. 2114–2124. [Online]. Available: https://doi.org/10.1145/3447548.3467401

[31] H. Meng, Y. Zhang, Y. Li, and H. Zhao, “Spacecraft anomaly detection via transformer reconstruction error,” in ICASSE 2019: Proceedings of the International Conference on Aerospace System Science and Engineering 2019, 2020. [Online]. Available: https://api.semanticscholar.org/CorpusID:214396765

[32] Y. LeCun, B. E. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. E. Hubbard, and L. D. Jackel, “Handwritten digit recognition with a back-propagation network,” in NIPS, 1989. [Online]. Available: https://api.semanticscholar.org/CorpusID:2542741

[33] Y. Song, J. Yu, D. Tang, J. Yang, L. Kong, and X. Li, “Anomaly detection in spacecraft telemetry data using graph convolution networks,” in 2022 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), 2022, pp. 1–6.

[34] M. Tennberg and L. Ekeroot, “Anomaly detection on satellite time-series,” Ph.D. dissertation, Uppsala University, 2021. [Online]. Available: https://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-446292

[35] H. Fanaee-T and J. Gama, “Tensor-based anomaly detection: An interdisciplinary survey,” Knowledge-Based Systems, vol. 98, pp. 130–147, 2016. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0950705116000472

[36] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” 2015.

[37] Z. Wang, W. Yan, and T. Oates, “Time series classification from scratch with deep neural networks: A strong baseline,” 2016.

[38] E. Rahimian, S. Zabihi, S. F. Atashzar, A. Asif, and A. Mohammadi, “XceptionTime: A novel deep architecture based on depthwise separable convolutions for hand gesture classification,” 2019.

[39] H. I. Fawaz, B. Lucas, G. Forestier, C. Pelletier, D. F. Schmidt, J. Weber, G. I. Webb, L. Idoumghar, P.-A. Muller, and F. Petitjean, “InceptionTime: Finding AlexNet for time series classification,” Data Mining and Knowledge Discovery, vol. 34, no. 6, pp. 1936–1962, Sep. 2020. [Online]. Available: https://doi.org/10.1007/s10618-020-00710-y

[40] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[41] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” 2014.

[42] R. Cahuantzi, X. Chen, and S. Güttel, “A comparison of LSTM and GRU networks for learning symbolic sequences,” 2023.

[43] G. Xiang and R. Lin, “Robust anomaly detection for multivariate data of spacecraft through recurrent neural networks and extreme value theory,” IEEE Access, vol. 9, pp. 167447–167457, 2021.

[44] S. Lin, R. Clark, R. Birke, S. Schönborn, N. Trigoni, and S. Roberts, “Anomaly detection for time series using VAE-LSTM hybrid model,” in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 4322–4326.

[45] F. Andayani, L. B. Theng, M. T. Tsun, and C. Chua, “Hybrid LSTM-transformer model for emotion recognition from speech audio files,” IEEE Access, vol. 10, pp. 36018–36027, 2022.

[46] Z. Zeng, V. T. Pham, H. Xu, Y. Khassanov, E. S. Chng, C. Ni, and B. Ma, “Leveraging text data using hybrid transformer-LSTM based end-to-end ASR in transfer learning,” 2020.

[47] B. Urazalinov, “Parkinson’s freezing of gait prediction,” 2023. [Online]. Available: https://www.kaggle.com/competitions/tlvmc-parkinsons-freezing-gait-prediction/discussion/416026

[48] I. Oguiza, “tsai - a state-of-the-art deep learning library for time series and sequential data,” GitHub, 2022. [Online]. Available: https://github.com/timeseriesAI/tsai

[49] E. Sanderson and B. J. Matuszewski, “FCN-transformer feature fusion for polyp segmentation,” in Medical Image Understanding and Analysis, G. Yang, A. Aviles-Rivero, M. Roberts, and C.-B. Schönlieb, Eds. Cham: Springer International Publishing, 2022, pp. 892–907.

[50] E. M. Al-Ali, Y. Hajji, Y. Said, M. Hleili, A. M. Alanzi, A. H. Laatar, and M. Atri, “Solar energy production forecasting based on a hybrid CNN-LSTM-transformer model,” Mathematics, vol. 11, no. 3, 2023. [Online]. Available: https://www.mdpi.com/2227-7390/11/3/676

[51] Z. Xu, Z. Cheng, and B. Guo, “A hybrid data-driven framework for satellite telemetry data anomaly detection,” Acta Astronautica, vol. 205, pp. 281–294, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0094576523000590

[52] J. Howard and S. Gugger, “Fastai: A layered API for deep learning,” Information, vol. 11, no. 2, p. 108, Feb. 2020. [Online]. Available: https://doi.org/10.3390/info11020108
[53] L. N. Smith, “A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay,” 2018.

[54] J. Howard and S. Gugger, “fast.ai - fastai A Layered API for Deep Learning,” https://www.fast.ai/posts/2020-02-13-fastai-A-Layered-API-for-Deep-Learning.html, 2021, [Accessed 02-10-2023].

[55] A. Al-Kababji, F. Bensaali, and S. P. Dakua, “Scheduling techniques for liver segmentation: ReduceLROnPlateau vs OneCycleLR,” 2022.

[56] L. Liu, L. Tian, Z. Kang, and T. Wan, “Spacecraft anomaly detection with attention temporal convolution network,” 2023.

[57] P. Virtanen, R. Gommers, and T. E. Oliphant, “SciPy 1.0: fundamental algorithms for scientific computing in Python,” Nature Methods, vol. 17, no. 3, pp. 261–272, Feb. 2020. [Online]. Available: https://doi.org/10.1038/s41592-019-0686-2

[58] A. M. Ikotun, A. E. Ezugwu, L. Abualigah, B. Abuhaija, and J. Heming, “K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data,” Information Sciences, vol. 622, pp. 178–210, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0020025522014633

[59] D. Marutho, S. Hendra Handaka, E. Wijaya, and Muljono, “The determination of cluster number at k-mean using elbow method and purity evaluation on headline news,” in 2018 International Seminar on Application for Technology of Information and Communication, 2018, pp. 533–538.

[60] A. Maćkiewicz and W. Ratajczak, “Principal components analysis (PCA),” Computers & Geosciences, vol. 19, no. 3, pp. 303–342, 1993. [Online]. Available: https://www.sciencedirect.com/science/article/pii/009830049390090R

[61] A. Huet, J. M. Navarro, and D. Rossi, “Local evaluation of time series anomaly detection algorithms,” in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. ACM, Aug. 2022. [Online]. Available: https://doi.org/10.1145/3534678.3539339

[62] N. Tatbul, T. J. Lee, S. Zdonik, M. Alam, and J. Gottschlich, “Precision and recall for time series,” in Advances in Neural Information Processing Systems, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, Eds., vol. 31. Curran Associates, Inc., 2018. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2018/file/8f468c873a32bb0619eaeb2050ba45d1-Paper.pdf

BIOGRAPHY

Daniel Lakey received his BSc degree in Computer Science in 2003 from Cardiff University, and is completing an MSc in Data Science with IU International University of Applied Sciences. Working with the European Space Agency, Daniel has been deeply involved with interplanetary exploration missions since 2006. Since 2013 Daniel has been a Spacecraft Operations Engineer on the ESA/NASA Solar Orbiter mission, with a particular focus on anomaly investigation and resolution. orcid.org/0000-0002-8198-7892

Tim Schlippe is a professor of Artificial Intelligence at IU International University of Applied Sciences and CEO of the company Silicon Surfer. Prof. Dr. Schlippe has in-depth knowledge in the fields of artificial intelligence, machine learning, natural language processing, multilingual speech recognition/synthesis, machine translation, language modeling, computer-aided translation, and entrepreneurship, which can be seen in his numerous publications at international conferences in these areas. orcid.org/0000-0002-9462-8610