Approaches and Applications of Early Classification
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TAI.2020.3027279, IEEE Transactions on Artificial Intelligence
Abstract: Early classification of time series has been extensively studied for minimizing class prediction delay in time-sensitive applications such as medical diagnosis and industrial process monitoring. A primary task of an early classification approach is to classify an incomplete time series as soon as possible with some desired level of accuracy. Recent years have witnessed several approaches for early classification of time series. As most approaches have solved the early classification problem using a diverse set of strategies, it becomes very important to make a thorough review of existing solutions. These solutions have demonstrated reasonable performance on a wide range of applications including human activity recognition, gene expression based health diagnosis, and industrial monitoring. In this paper, we present a systematic review of the current literature on early classification approaches for both univariate and multivariate time series. We divide the existing approaches into four exclusive categories based on their proposed solution strategies. The four categories are prefix based, shapelet based, model based, and miscellaneous approaches. We discuss the applications of early classification and provide a quick summary of the current literature with future research directions.

Impact statement: Early classification is mainly an extension of classification with an ability to classify a time series using limited data points. It is true that one can achieve better accuracy if one waits for more data points, but opportunities for early interventions could equally be missed. In a pandemic situation such as COVID-19, early detection of an infected person becomes more desirable to curb the spread of the virus and possibly save lives. Early classification of gas (e.g., methyl isocyanate) leakage can help to avoid life-threatening consequences on human beings. Early classification techniques have been successfully applied to solve many time-critical problems related to medical diagnosis and industrial monitoring. This paper provides a systematic review of the current literature on these early classification approaches for time series data, along with their potential applications. It also suggests some promising directions for further work in this area.

The authors are with the Department of Computer Science and Engineering, Indian Institute of Technology (BHU) Varanasi, 221005, India (e-mail: [email protected]; [email protected]; [email protected]; [email protected])
2691-4581 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: University of South Australia. Downloaded on October 06,2020 at 20:17:24 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

Due to the advancement of energy-efficient, small size, and low cost embedded devices, time series data has received unprecedented attention in several fields of research, to name a few, healthcare [1]–[3], finance [4], [5], speech and activity recognition [6]–[8], and so on [9]–[11]. The time series has an inherent temporal dependency among its attributes (data points), which allows the researchers to analyze the behavior of any process over time. Moreover, the time series has a natural property to satisfy human eagerness of visualizing the structure (or shape) of data [12]. Numerous algorithms have been developed to study various aspects of the time series such as forecasting [13], clustering [14], and classification [15]. The forecasting algorithms attempt to predict future data points of the time series [13]. Next, the clustering algorithms aim to partition the unlabeled time series instances into a suitable number of groups based on their similarities [14]. Finally, the classification algorithms attempt to predict the class label of an unlabeled time series by learning a mapping between training instances and their labels [15], [16].

Time Series Classification (TSC) has been a topic of great interest since the availability of labeled dataset repositories such as UCR [17] and UCI [18]. Consequently, a large number of TSC algorithms have emerged by introducing efficient and cutting-edge strategies for distinguishing classes. Authors in [16], [19], [20] focused on instance based learning where the class label of a testing time series is predicted based on a similarity measure. Dynamic Time Warping (DTW) [21] and its variations [16], [20] with 1-Nearest Neighbors (1-NN) have been extensively used as similarity measures in the instance based TSC algorithms.

Recently, deep learning based TSC algorithms, discussed in [22], have also demonstrated significant progress in time series classification. Two robust TSC algorithms are proposed in [23] and [24], by using ResNet and Convolutional Neural Network (CNN) framework, respectively. The authors in [25] developed a reservoir computing approach for generating a new representation of Multivariate Time Series (MTS). The approach is incorporated into recurrent neural networks to avoid the computational cost of back propagation during classification. In [26], the authors proposed a multivariate TSC approach by combining two deep learning models, Long Short-Term Memory (LSTM) and Fully Convolutional Network (FCN), with an attention mechanism. Two recent studies [27], [28] employed generative adversarial networks for TSC by modeling the temporal dynamics of the data.

The main objective of TSC algorithms is to maximize the accuracy of the classifier using complete time series. However, in time-sensitive applications such as gas leakage detection [29], earthquake [30], and electricity demand prediction [31], it is desirable to maximize the earliness by classifying an incomplete time series. A classification approach that aims to classify an incomplete time series is referred to as early classification [32]–[34]. Xing et al. [32] stated that the earliness can only be achieved at the cost of accuracy. They indicated that the main challenge before an early classification approach is to optimize the balance between two conflicting objectives, i.e., accuracy and earliness. One of the first known
approaches for early classification of time series was proposed in [35], after which several researchers have put their efforts in this direction and published a large number of research articles at renowned venues. After doing an exhaustive search, we found a minor survey in [36], which included only a handful of existing early classification approaches and did not provide any categorization.

This paper presents a systematic review of the early classification approaches for both univariate and multivariate time series data. At first, we discuss the potential applications that motivated the researchers to work in the early classification of time series. Section II provides the detail of the applications. In Section III, we explain the research methodology for searching, filtering, and selection of the reviewed papers. Section IV discusses the fundamentals of early classification approaches and their categorization. We categorize the approaches into four groups based on the solution strategies that the researchers adopted for early classification. The included approaches in the proposed categories are detailed in four subsequent sections. Finally, Section IX summarizes the review by discussing the challenges of the solution approaches and their recommendations for future work.

II. APPLICATIONS OF EARLY CLASSIFICATION

In data mining and machine learning, early classification of time series has received significant attention as it can solve time-critical problems in many areas including medical, industry, and transportation. Literature indicates numerous applications of early classification of time series. Some of the important applications are illustrated in Fig. 1 and also discussed in detail as follows.

[Fig. 1 panels: Human activity classification (walking, running, sitting on sofa); Industrial process monitoring (gas leakage, bearing faults); Quality monitoring (wafer, olive oil, coffee); Others (bird identification, leaf identification, vowel recognition); Medical diagnostic (gene expression, ECG, asthma); Intelligent transportation (road surface, traffic flow); Electricity usage monitoring (home appliances, load forecasting).]
Fig. 1: Applications of the early classification of time series.

1) Human activity classification: With the availability of multi-modal sensors in smartphones and wearables, people can easily monitor their daily routine activities such as walking, running, eating, and so on. The early classification of human activities helps to minimize the response time of the system and in turn, improves the user experience [8]. The researchers in [8] attempted to classify various complex human activities such as sitting on sofa, sitting on floor, standing while talking, walking upstairs, and eating, using the sensor-generated MTS. The studies in [37]–[39] focused on identifying human actions such as pick up, chicken dance, golf swing, etc., using an incomplete MTS of motion angles. The authors in [40] differentiated between the normal and dangerous human behavior by classifying the sequential patterns of the activities such as walking followed by lying down or falling. In the studies [41], [42], the authors have attempted to classify 19 different human activities using only partial time series.

2) Medical diagnostic: A primary motivation of the work in [3], [43]–[51] is to develop early classification approaches for medical diagnosis of diseases such as asthma, viral infection, abnormal Electrocardiogram (ECG), etc. Early diagnosis of these diseases can significantly minimize the consequences on patient health and assist the doctors in treatment. Gene expression has been used to study the viral infection on patients, drug response on the disease, and patient recovery [43]–[45]. Early detection of asthma can help to prevent life-threatening risk and further to provide rapid relief [49]. The study in [51] focused on predicting the right time for transferring the patient to Intensive Care Unit (ICU), using the MTS of physiological measurements such as temperature and respiratory rate. Further, ECG is also a time series of electrical signals that are generated from the heart's activity. Early classification of ECG [3], [46], [47], [52] helps to diagnose abnormal heartbeat at the earliest, reducing the risk of heart failure.

3) Industrial process monitoring: With the advancements in sensor technology, monitoring the industrial processes has become convenient and effortless by using the sensors. The sensors generate time series, which is to be classified for knowing the status of the operation. The authors in [29], [35], [42], [53]–[56] are motivated to build early classification based solutions for industrial problems by using sensory data. In chemical industries, even a minor leakage can cause hazardous effects on the crew members' health [29]. Early classification not only reduces the health risks but also minimizes the maintenance cost by ensuring smooth operations all the time. In particular, an electronic nose is developed using gas sensors in [29], to smell the gas odor. It generates an MTS, which needs to be classified as early as possible to detect any leakage. In [42], the authors attempted to detect problems such as pump leakage, reduced pressure, and inefficient operation, in a hydraulic system. Early detection of these problems can significantly lower the maintenance cost. Early identification of instrumentation failure in nuclear power plants can avert hazardous consequences [35], [56], [57].

4) Intelligent transportation: As modern vehicles are equipped with several sensors, it becomes easy to monitor the behavior of the driver, the road surface condition, and the traffic flow, by using the generated sensory data. If the driver is overspeeding or intoxicated, early classification of the driving pattern can reduce the chances of an accident that may occur due to delayed classification. The study [10] attempted to early classify the type of road surface by using sensors such as accelerometer, light, temperature, etc. Such early classification of the road surfaces helps to choose an alternative path if the surface condition is poor, e.g., bumpy or rough. The studies in [37], [42] focused on identifying weekdays using traffic flow MTS. It helps to forecast the road traffic for that particular day.
5) Quality monitoring: It is a process of ensuring product quality based on preset standards. With the help of sensors or spectroscopy, the quality of a product can be ensured during its manufacturing. Early detection of a low-quality product is desirable to avoid negative consequences on the production. In semiconductor industries, checking the quality of silicon wafers is a crucial task. Early classification of wafer quality [3], [47]–[50], [52], [58], [59] can help to minimize the maintenance cost while ensuring smooth operations all the time. The geographical origin of olive oils can be distinguished using time series data obtained from spectroscopy [60], [61]. The work in [55], [62] attempted to classify two types of coffee beans, i.e., Robusta and Arabica. Further, the quality of beef can also be ensured using spectroscopy [56].

6) Electricity usage monitoring: Awareness about electricity usage helps the consumers avoid unnecessary wastage of energy and curb the monthly electricity bill. Early classification of the currently running appliances can reduce usage by turning them off during peak hours. The work in [39] has successfully classified several household appliances including air conditioner, washing machine, microwave oven, and so on. The studies in [54], [56], [59], [63] have utilized the electricity usage data for distinguishing the seasons from April to September and from October to March.

7) Others: The authors in [59] conducted a case study to identify a bird by using only 20% of the time series obtained from its chirping sound. The work in [64] distinguished major Indian rivers based on the time series of water quality parameters. Early prediction of the stock market (IBEX 35) can help to plan better strategies for investment [62]. In addition, early classification has shown good performance on Japanese vowel recognition [52], [65], Australian sign language [35], [37], [58], and leaf identification [55], [57], [62].

III. RESEARCH PROTOCOL

In order to conduct a systematic review of early classification approaches for time series, we followed a review style similar to [66], [67] and developed a research protocol for searching, filtering, and selection of the reviewed papers. The protocol consists of the following steps:
• Search strategy: To search the papers from standard databases using relevant queries.
• Inclusion criteria: To filter out the related papers.
• Selection process: To select the final set of papers included in this review.
This review gives a quick understanding of the notable contributions that have been made over the years in the area of early classification of time series.

1) Search strategy: At first, we form query terms that can fetch most of the relevant papers from the standard databases. For this review, we selected the following databases: IEEE Xplore (IX), Science Direct (SD), and Google Scholar (GS). These databases cover almost every aspect of research in the engineering field including computer science and biomedical. Table I shows different query terms and the number of papers fetched from the selected databases. As we intended to find the maximum number of papers that have studied early classification of time series, we formed three different search queries at varying levels of granularity while keeping the term early in common. The search results include the papers from the last 20 years (i.e., from 2000 to 2020).

TABLE I: Search query terms and the papers fetched from the databases.
Query | Query terms (in title) | Fetched papers | Final papers
Query 1 | “early classification” AND “time series” | IX = 14, SD = 02, GS = 53 | 69 (19 duplicate)
Query 2 | “early detection” OR “early prediction” OR “early recognition” AND “ongoing” OR “time series” OR “sequence” | IX = 13, SD = 13, GS = 90 | 116 (44 duplicate)
Query 3 | “early detection” OR “early prediction” OR “early recognition” OR “early classification” AND “temporal patterns” OR “observations” | IX = 06, SD = 01, GS = 21 | 28 (08 duplicate)
Total | | | 213

2) Inclusion criteria: A paper is included in this review only if it meets the following criteria:
• It must be written in English only.
• It should be a book chapter, conference, or journal paper.
• The early classification approach in the paper should essentially be developed for time series data only.
• The data points in the time series should be of numeric type. However, it may be a transformed version of any other type of data (e.g., image or video).

3) Selection process: Once the papers are fetched using the three aforementioned query terms, we first remove the duplicate papers and then review the title and abstract of the remaining ones. After reading the title and abstract, several papers are filtered out as they do not meet the inclusion criteria. Later, the rest of the papers are subjected to full-text reading and some of them are also removed at this stage. Finally, we also include some relevant papers from the references. Fig. 2 illustrates the selection process with the number of filtered papers at different stages.

IV. FUNDAMENTALS AND CATEGORIZATION OF EARLY CLASSIFICATION APPROACHES

In this section, we first discuss the fundamentals of time series that are prerequisite for acquiring a sound understanding of various early classification approaches. Later, we present the categorization of the approaches.

A. Fundamentals

This subsection defines the terminologies and notations used in this paper.
[Fig. 2 stages: 63 papers; after reading full-text: 63 − 27 = 36; found relevant from references: 36 + 10 = 46 included papers.]
Fig. 2: Illustration of the paper selection process.

1) Time series: It is defined as a sequence of $T$ ordered observations typically taken at equally spaced time intervals [68], where $T$ denotes the length of the complete time series. A time series is denoted as $X^d = \{X_1, X_2, \cdots, X_T\}$, where $d$ is the dimension and $X_i \in \mathbb{R}^d$ for $1 \le i \le T$. If $d = 1$ then the time series is referred to as univariate, otherwise multivariate. If the time series is a dimension of an MTS then it can be referred to as a component [8], [10]. In general, a time series is univariate unless it is explicitly mentioned as multivariate.

2) Time series classification: It refers to predicting the class label of a time series by constructing a classifier using a labeled training dataset [15]. Let $D$ be a training dataset consisting of $N$ instances as $N$ pairs of time series $X$ and their class labels $y$. The time series classifier learns a mapping function $H : X \to y$. The classifier can predict the class label of a testing time series $X' \notin D$ only if it is complete, i.e., the length of $X'$ should be the same as that of the training instances [69].

3) Early classification of time series: According to [55], early classification is an extension of the traditional classification with the ability to classify an unlabeled incomplete time series. In other words, an early classifier is able to classify a testing time series with $t$ data points only, where $t \le T$. Early classification is desirable in the applications where data collection is costly or late prediction causes hazardous consequences [10]. Intuitively, an early classifier may take a more informed decision about the class label if more data points are available in the testing time series [63], but waiting for them delays the decision. Therefore, the researchers focused on optimizing the accuracy of prediction with minimum delay (or maximum earliness). Further, the early classification of time series is analogous to a case of missing features with the constraint that the features are missing only because of unavailability of data points [63]. In the context of early classification, a testing time series can be referred to as incomplete or incoming. Fig. 3 illustrates an early classification framework for predicting the class label of an incomplete time series $X'$.

4) Earliness: It is an important measure to evaluate the effectiveness of the early classifiers. Suppose the early classifier uses $t$ data points of a testing time series during classification. The earliness is then defined as $E = \frac{T - t}{T} \times 100$, where $T$ is the length of the complete time series [8]. For example, a decision taken after $t = 30$ of $T = 120$ data points has $E = 75$. The earliness is also called timeliness [70].

5) Prefix: In [33], the prefix of a time series $X$ is given as the subsequence $X[1, t] = X[1], X[2], \cdots, X[t]$, where $t$ denotes the length of the prefix. The training dataset is said to be in prefix space if it contains only the prefixes of the time series with their associated class labels.

6) Shapelet: It is defined as a quadruple $S = (s, l, \delta, y)$, where $s$ is a subsequence of length $l$, $\delta$ is the distance threshold, and $y$ is the associated class label [43], [60]. The distance threshold $\delta$ is usually learned using the training instances and it is used to find whether the shapelet is matched with any subsequence of the testing time series.

7) Interpretability: It mainly refers to how convincing the classification results are to the domain experts. In medical applications, the adoption of any early classification approach heavily relies on its interpretability [34]. The authors in [3], [34], [45], [60] assert that a short segment of the time series is more convincing and helpful than the time series itself if such a segment contains class discriminatory patterns.

8) Reliability: It expresses the guarantee that the probability of the early predicted class label of an incomplete time series meets a user-specified threshold [63], [70]. Reliability is a crucial parameter to ensure the minimum required accuracy in early classification. It is also termed as uncertainty estimate or confidence measure in different studies [59], [60].

B. Categorization of early classification approaches

This work categorizes the early classification approaches (discussed in the selected papers) into meaningful groups, to better understand their differences and similarities. We believe that one of the most meaningful ways to categorize these approaches is by the strategies that they have adopted to achieve the earliness. We broadly categorize the approaches into four major groups, as shown in Fig. 4. The summary of the included papers in different groups is given in Table II.

1) Prefix based early classification: The strategy is to learn a minimum prefix length of the time series using the training instances and utilize it to classify a testing time series. During training, a set of $T$ classifiers, one for each prefix
space, are constructed. The classifier that achieves a desired level of stability with minimum prefix length is considered as the early classifier, and the corresponding prefix length is called the Minimum Prediction Length (MPL) [33], [50], [69] or Minimum Required Length (MRL) [8], [10], [41], [42], [64]. This early classifier can classify an incoming time series as soon as MPL data points are available.

2) Shapelet based early classification: A family of early classification approaches [3], [34], [43], [45], [47]–[49], [51], [60], [71]–[73] focused on obtaining a set of key shapelets from the training dataset and utilizing them as class discriminatory features of the time series. As there are many shapelets in the dataset, the different approaches attempted to select only those shapelets that can provide maximum earliness and uniquely manifest the class label. These selected shapelets are matched with the incoming time series, and the class label of the best matched shapelet is assigned to the time series.

3) Model based early classification: Another set of early classification approaches [29], [46], [55], [56], [62], [65], [70], [74] proposed mathematical models based on conditional probabilities. The approaches obtain these conditional probabilities by either fitting a discriminative classifier or using generative classifiers on the training data. Some of these early classification approaches have also developed a cost-based trigger function to make reliable predictions.

4) Miscellaneous approaches: The early classification approaches that do not fit any of the above mentioned categories are included here. Some of these approaches employed deep learning techniques [58], [61], reinforcement learning [75], [76], and so on [35], [37], [44], [77].

Fig. 4: Categorization of the early classification approaches. [MPL: Minimum Prediction Length, RNN: Reverse Nearest Neighbor]

V. PREFIX BASED EARLY CLASSIFICATION

This section discusses the prefix based early classification approaches in detail. The first notable prefix based approach was proposed in [32]. The authors in [32] introduced two novel methods, Sequential Classification Rule (SCR) and Generalized Sequential Decision Tree (GSDT), for early classification of symbolic sequences. For a given training dataset, the SCR method first extracts a large number of sequential rules from the different lengths of prefix spaces and then selects the top-k rules based on their support and prediction accuracy. These selected rules are used for the early classification.

We split the prefix based approaches into two groups according to their MPL computation methods. In the first group, the approaches [33], [50], [69] developed a concept of Reverse Nearest Neighbor (RNN) to compute the MPL of time series. In the second group of approaches [8], [10], [59], [64], the authors employed a probabilistic classifier first to obtain posterior class probabilities and then utilized them for MPL computation.

A. MPL computation using RNN

We first discuss the concept of RNN for the time series data and then describe the approaches that have been using RNN for MPL computation. Let $D$ be a labeled time series dataset with $N$ instances of length $T$. According to [33], the RNN of a time series $X \in D$ is the set of those time series which have $X$ among their nearest neighbors. It is mathematically given as

$$RNN^t(X) = \{X' \in D \mid X \in NN^t(X')\},$$

where $t$ is the length of $X$ in the prefix space and $t = T$ in the full-length space.

To compute the MPL of the time series $X$, the authors of [33] compare the RNN of the full-length space with the RNN of the prefix spaces. The MPL of $X$ is set to $t$ if the following conditions are satisfied: (1) $RNN^t(X) = RNN^T(X) \neq \emptyset$, (2) $RNN^{t-1}(X) \neq RNN^T(X)$, and (3) $RNN^{t'}(X) = RNN^T(X)$, where $t \le t' \le T$. Further, if $RNN^T(X) = \emptyset$ then the MPL of $X$ is equal to $T$. The above conditions check the stability of the RNN using the prefix of $X$ with length $t$.

Xing et al. [33] developed two different algorithms, Early 1-NN and Early Classification of Time Series (ECTS), for Univariate Time Series (UTS) data. The Early 1-NN algorithm computes the MPLs for all the time series of the training dataset. These computed MPLs are first arranged in ascending order and then used for early classification of the testing time series $X'$. Let $m$ be the least value of the computed MPLs. As soon as the number of data points in $X'$ becomes equal to $m$, Early 1-NN starts its classification. It first computes the 1-NN of $X'$ with $m$ data points as follows

$$NN^m(X') = \underset{X \in D_{mpl}}{\operatorname{argmin}} \{dist(X'[1, m], X[1, m])\}, \tag{1}$$
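TABLE II: Summary of the early classification approaches for time series.

| Paper | Classifier used | Datasets for experimental evaluation | Type of time series | Category |
|---|---|---|---|---|
| [32] | Decision tree and rule-based classifier | ECG [17], synthetic control [18], DNA sequence [18] | UTS | Prefix based early classification |
| [33] | 1-NN | 7 UCR datasets | UTS | Prefix based early classification |
| [69] | 1-NN | 7 UCR datasets | UTS | Prefix based early classification |
| [59] | Gaussian Process (GP) classifier | 45 UCR datasets | UTS | Prefix based early classification |
| [64] | GP classifier | River dataset [78] | UTS | Prefix based early classification |
| [50] | 1-NN | Wafer and ECG [79], character trajectories [18], robot execution failures [18] | MTS | Prefix based early classification |
| [8] | GP classifier | Human activity classification (collected), NTU RGB+D [80], daily and sports activities [18], heterogeneity human activity recognition [18] | MTS | Prefix based early classification |
| [10] | GP and Hidden Markov Model (HMM) classifiers | Road surface classification (collected), PEMS-SF [18], heterogeneity human activity recognition [18], gas mixtures detection [18] | MTS | Prefix based early classification |
| [42] | GP classifier | Hydraulic system monitoring [18], PEMS-SF [18], daily and sports activities [18] | MTS | Prefix based early classification |
| [34] | Closest shapelet using Euclidean Distance (ED) | 7 UCR datasets | UTS | Shapelet based early classification |
| [60] | Closest shapelet using ED | 20 UCR datasets | UTS | Shapelet based early classification |
| [71] | CNN | 12 UCR datasets | UTS | Shapelet based early classification |
| [72] | Closest shapelet using ED | 35 UCR datasets | UTS | Shapelet based early classification |
| [73] | Closest shapelet using Trend-based ED | 16 UCR datasets | UTS | Shapelet based early classification |
| [43] | Closest multivariate shapelet using ED | 8 gene expression datasets [81], [82] | MTS | Shapelet based early classification |
| [47] | Closest multivariate shapelet using ED | Wafer and ECG [79], 2 synthetic datasets | MTS | Shapelet based early classification |
| [45] | Closest multivariate shapelet using ED | 2 gene expression datasets [81], ECG [83] | MTS | Shapelet based early classification |
| [48] | Rule based and Query By Committee (QBC) classifiers | Wafer and ECG [79], 2 synthetic datasets | MTS | Shapelet based early classification |
| [49] | Decision tree | Gene expression dataset [82], Wafer and ECG [79], robot execution failures [18] | MTS | Shapelet based early classification |
| [52] | Closest multivariate shapelet using ED | Wafer and ECG [79], character trajectories [18], Japanese vowels [18], uWaveGestureLibrary [18] | MTS | Shapelet based early classification |
| [51] | Decision tree and random forest | ICU data of 2127 patients (collected) | MTS | Shapelet based early classification |
| [70] | Quadratic Discriminant Analysis (QDA) | 1 synthetic and 4 UCR datasets | UTS | Model based early classification |
| [63] | Linear Support Vector Machines (SVM) and Local QDA | 15 UCR datasets | UTS | Model based early classification |
| [46] | Naive Bayes and Multi Layer Perceptron | TwoLeadECG [17] | UTS | Model based early classification |
| [65] | HMM and iHMM | Japanese vowel speaker [18] | UTS | Model based early classification |
| [62] | GP and SVM | 45 UCR datasets, IBEX35 stock | UTS | Model based early classification |
| [40] | Linear SVM | CBF [17], control charts [18], character trajectories [18], localization data for person activity [18] (after preprocessing) | UTS | Model based early classification |
| [54] | SVM | 76 UCR datasets | UTS | Model based early classification |
| [55] | GP and SVM | 45 UCR datasets | UTS | Model based early classification |
| [56] | GP | 45 UCR datasets | UTS | Model based early classification |
| [74] | GP, SVM, and Naive Bayes | 15 UCR datasets | UTS | Model based early classification |
| [39] | Two-tier classifier using variants of SVM, DTW | 45 UCR datasets, PLAID [84], ACS-F1 [85] | UTS | Model based early classification |
| [29] | SVM | Gas dataset (collected) | MTS | Model based early classification |
| [35] | Adaboost ensemble classifier | CBF [17], control charts [18], trace [17], auslan [18] | UTS | Miscellaneous approaches |
| [75] | Reinforcement learning agent | 3 UCR datasets | UTS | Miscellaneous approaches |
| [61] | Combination of CNN and LSTM | 46 UCR datasets | UTS | Miscellaneous approaches |
| [53] | SVM and Neural network | Bearing faults dataset | UTS | Miscellaneous approaches |
| [76] | Deep Q-Network | Living organisms dataset | UTS | Miscellaneous approaches |
| [77] | Hybrid model using HMM and SVM | 5 gene expression datasets [82] | MTS | Miscellaneous approaches |
| [37] | Stochastic process | Auslan [18], PEMS-SF [18], motion capture [86] | MTS | Miscellaneous approaches |
| [38] | Stochastic process | Motion capture [86], NTU RGB+D [80], UT Kinect-Action [87] | MTS | Miscellaneous approaches |
| [58] | Combination of CNN and LSTM | Wafer and ECG [79], auslan [18] | MTS | Miscellaneous approaches |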
where $D_{mpl}$ is the dataset of those time series whose MPLs are at most m. The function dist(·) computes the Euclidean distance between the time series.

Early 1-NN has two major drawbacks: i) each time series can have a different MPL, and ii) the computed MPLs are short and not robust enough due to the overfitting problem of 1-NN. To overcome these drawbacks, the ECTS algorithm [33] first clusters the time series based on their similarities in full-length space and then computes only one MPL for each cluster. In [69], the authors presented an extension of ECTS, called Relaxed ECTS, to find shorter MPLs. It relaxed the stability condition of RNN while computing MPLs for the clusters.

In [50], the authors proposed an MTS Early Classification based on PAA (MTSECP) approach, where PAA stands for Piecewise Aggregate Approximation [88]. MTSECP first applies a center sequence method [89] to transform each MTS instance of the dataset into a UTS and then reduces the length of the transformed UTS by using the PAA method.

B. MPL computation using posterior probabilities

Apart from RNN, some researchers have also utilized the posterior probabilities for MPL computation of time series. This group of early classification approaches computes a class discriminative MPL for each class label of the dataset. For a given training dataset, these approaches fit a probabilistic classifier in the prefix space of length t, where 1 ≤ t ≤ T. The probabilistic classifier provides posterior class probabilities for each time series of the training dataset. The class discriminative MPL for the class label y is set to t if

$$ A_y^t \geq \alpha \cdot A_y^T, \tag{2} $$

where $A_y^t$ and $A_y^T$ are the training accuracy for class label y in the prefix space of length t and in the full-length space, respectively. The parameter α denotes a desired level of accuracy of the early classification and 0 < α ≤ 1. Fig. 5 shows an example of the discriminative MPLs for five different classes along the progress of time series.

[Figure 5: timeline with time points t1, t2, t3, t4, T; the classes y5, y4, {y1, y3}, and y2 are marked in that order along the timeline.]
Fig. 5: Illustration of the discriminative MPLs for five different class labels, i.e., $y_1, y_2, \cdots, y_5$.

Mori et al. [59] proposed an Early Classification framework based on DIscriminativeness and REliability (ECDIRE) of the classes over time. ECDIRE employed a GP classifier [90] to compute the class discriminative MPLs. It also computes a threshold for each class label to ensure the reliability of the predictions. The threshold for the class label y is computed as

$$ \theta_{t,y} = \min_{X \in D_y} \{ p_1^t(X) - p_2^t(X) \}, \tag{3} $$

where $p_1^t(X)$ and $p_2^t(X)$ denote the first and second highest posterior probabilities for a training time series X using the prefix of length t, respectively. The dataset $D_y$ consists of the time series that are correctly classified in class y.

In [64], the authors developed a game theoretic approach for early classification of Indian rivers using the time series of water quality parameters such as pH value, turbidity, dissolved oxygen, etc. They formulated an optimization problem that helps to compute the class-wise MPLs while maintaining α accuracy.

The authors in [10], [41], [42] attempted to classify an incoming MTS as early as possible with at least α accuracy. They focused on handling a special type of MTS collected by sensors with different sampling rates. The proposed approaches [10], [41], [42] first estimate the class-wise MPLs for each component (i.e., time series) separately and then develop a class forwarding method to early classify an incoming MTS using the computed MPLs. On the other hand, the approach [42] proposed a divide-and-conquer based method to handle the components with different sampling rates.

Gupta et al. [8] extended the concept of early classification to MTS with faulty or unreliable components. They proposed a Fault-tolerant Early Classification of MTS (FECM) approach to classify an ongoing human activity by using the MTS of unreliable sensors. FECM first identifies the faulty components using an Auto Regressive Integrated Moving Average (ARIMA) model [91] whose parameters are learned from the training instances. A utility function is also developed to optimize the tradeoff between accuracy $A_t$ and earliness E, as given below

$$ U(X[1, t]) = \frac{2 \times A_t \times E}{A_t + E}. \tag{4} $$

The accuracy $A_t$ is computed using the confusion matrix obtained by applying k-means clustering on the training dataset. Next, the MPL of a time series X is computed as

$$ MPL(X) = \operatorname*{argmax}_{1 \leq t \leq T} \{ U(X[1, t]) \}. \tag{5} $$

Finally, FECM employed the kernel density estimation method for learning the class-wise MPLs.

C. Critical analysis

The prefix based early classification approaches are simple and easy to understand. However, due to a lack of interpretability in the early classification results, these approaches are not suitable for medical applications. As we already categorized the approaches based on their similarities, we now review them based on the following parameters: strength, limitation, and major concern. These parameters are sufficient to make a critical or insightful analysis of an early classification approach. Table III presents the analysis of prefix based approaches using the above mentioned parameters.

VI. SHAPELET BASED EARLY CLASSIFICATION

This section presents a detailed review of the approaches that have used shapelets for the early classification of time series. The authors in [68], [92] have successfully implemented the idea of shapelets for time series classification, which motivated many researchers to utilize shapelets for achieving earliness in the classification.
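To make the class discriminative MPL rule of Eq. (2) in Section V-B concrete, the following Python sketch picks, for each class, the first prefix length whose training accuracy reaches α times the full-length accuracy. The function name `class_discriminative_mpl` and the input layout (one list of per-prefix training accuracies per class) are assumptions made for illustration only. Choosing the smallest qualifying t is also an assumption; the reviewed approaches obtain the accuracies from a probabilistic classifier and may apply further checks that are not modeled here.

```python
from typing import Dict, Sequence


def class_discriminative_mpl(
    acc_by_prefix: Dict[str, Sequence[float]], alpha: float
) -> Dict[str, int]:
    """Return, per class label y, the smallest t with A_y^t >= alpha * A_y^T.

    acc_by_prefix[y][t - 1] holds the training accuracy of class y when only
    the first t points are used; the last entry is the full-length accuracy.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    mpl = {}
    for label, acc in acc_by_prefix.items():
        full_length_acc = acc[-1]
        # t = T always satisfies Eq. (2), so the search cannot fail.
        mpl[label] = next(
            t for t, a in enumerate(acc, start=1) if a >= alpha * full_length_acc
        )
    return mpl


# Illustrative (made-up) accuracies for two classes over T = 5 prefix lengths.
accuracies = {
    "y1": [0.50, 0.70, 0.90, 0.92, 0.95],
    "y2": [0.30, 0.40, 0.50, 0.85, 0.90],
}
print(class_discriminative_mpl(accuracies, alpha=0.9))  # {'y1': 3, 'y2': 4}
```

With α = 0.9, class y1 becomes discriminable after three points and class y2 after four, which mirrors the class-by-class timeline idea of Fig. 5.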
TABLE III: Analysis of prefix based early classification approaches using their strength, limitation, and major concern.

| Paper | Strength | Limitation | Major Concern |
|---|---|---|---|
| [32] | Robust entropy-based utility measure | Time series has to be discretized properly | Symbolic representation of time series |
| [33] | Simple model construction with reliability assurance | Cluster separation cannot be guaranteed for small datasets | MPLs computation using hierarchical clustering |
| [69] | Simple and effective model with 1-NN | Overfitting problem for small datasets | Relaxation of RNN stability condition |
| [59] | Class discriminative MPLs with reliability threshold | No check for stability while learning MPLs | Avoiding unnecessary predictions before the availability of sufficient data |
| [64] | Game theory based tradeoff optimization | Need to set several parameters, which requires domain knowledge | Computation of class-wise MPLs using probabilistic classifier |
| [50] | Ability to handle MTS using center sequence transformation | Approximation of segments loses identifiable information of the classes | Transformation of MTS into UTS |
| [8] | Robust to faulty components of MTS | Number of samples in the faulty components should be equal to non-faulty | Selection of relevant components of MTS |
| [10] | Capable of classifying MTS with varying length of components | Component of highest sampling rate is required to have full length before starting the prediction | Designing of the class forwarding method to incorporate correlation |
| [42] | Ability to handle MTS generated from the sensors of different sampling rate | No check for stability while learning MPLs | Utilization of correlation |
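The utility based MPL selection of Eqs. (4) and (5) in Section V-B can be sketched in a few lines of Python. This is a minimal illustration, not the FECM implementation: the function names are hypothetical, the per-prefix accuracies are assumed to be given, and the earliness of a prefix of length t is assumed to be E = (T − t)/T, a definition that this section does not state.

```python
from typing import Sequence


def utility(accuracy: float, earliness: float) -> float:
    """Harmonic-mean style utility of Eq. (4)."""
    if accuracy + earliness == 0:
        return 0.0  # avoid division by zero when both terms vanish
    return 2 * accuracy * earliness / (accuracy + earliness)


def mpl_by_utility(acc_by_prefix: Sequence[float]) -> int:
    """Return the prefix length t (1-based) that maximizes Eq. (4), as in Eq. (5).

    acc_by_prefix[t - 1] is the accuracy A_t for prefix length t.
    Earliness is ASSUMED to be (T - t) / T for illustration.
    """
    total_length = len(acc_by_prefix)
    best_t, best_u = 1, -1.0
    for t, acc in enumerate(acc_by_prefix, start=1):
        earliness = (total_length - t) / total_length
        u = utility(acc, earliness)
        if u > best_u:  # ties keep the earlier prefix
            best_t, best_u = t, u
    return best_t


print(mpl_by_utility([0.2, 0.6, 0.9, 0.95, 1.0]))  # 2
```

In this made-up example the utility peaks at t = 2: later prefixes are more accurate, but the loss in earliness outweighs the gain, which is the tradeoff the harmonic mean is meant to capture.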
For a given training dataset, the early classification approaches first extract all possible subsequences (segments) of the time series with different lengths and then evaluate their quality and earliness to obtain a set of key shapelets. Let S = {s, l, δ, y} be a shapelet as discussed in the fundamentals. The distance d between the shapelet S and a time series X is computed as

$$ d = \min_{s' \sqsubseteq X,\ |s'| = |s|} dist(s, s'), \tag{6} $$

where the symbol ⊑ is used to select a subsequence from the set of all subsequences of X. The authors in the existing approaches [3], [43], [45], [47], [48], [60] have developed different methods for computing the distance threshold δ. The shapelets are filtered based on their utility to obtain the key shapelets, which are later used to early classify an incomplete time series.

An example of the early classification using shapelets is illustrated in Fig. 6. The class label of the shapelet S is assigned to X′ if the distance d between X′ and S is less than its pre-computed threshold δ. The shapelet based approaches are divided into two groups according to the key shapelet selection methods.

[Figure 6: a shapelet S = (s, l, δ, c) is compared with the incomplete time series X′ at distance d; if d ≤ δ, class c is assigned to X′.]
Fig. 6: Early classification of the time series using shapelet.

A. Key shapelet selection using utility measure

The first work to address the early classification problem using shapelets is presented in [34]. The authors developed an approach called Early Distinctive Shapelet Classification (EDSC), which utilizes the local distinctive subsequences as shapelets for the early classification. EDSC consists of two major steps: feature extraction and feature selection. In the former step, it first finds all local distinctive subsequences from the training dataset and then computes a distance threshold δ for each subsequence. Next, in the feature selection step, the authors select key shapelets based on their utility. In EDSC, the utility of the shapelet S is computed using its precision P and weighted recall $R_w$, as given below

$$ U(S) = \frac{2 \times P(S) \times R_w(S)}{P(S) + R_w(S)}. $$

The precision P(S) captures the class distinctive ability of the shapelet on the training dataset. On the other hand, the weighted recall $R_w(S)$ captures the earliness and frequency of shapelets in the training instances.

Ghalwash et al. [60] presented an extension of EDSC called Modified EDSC with Uncertainty (MEDSC-U) estimate. The uncertainty estimate indicates the confidence level with which the prediction decision is made, and if it is less than some user-defined confidence level then the decision may be delayed even after a shapelet is matched. The work in [73] also introduced an Improved version of EDSC (IEDSC) with a trend-based Euclidean distance.

In [43], the authors utilized shapelets for early classification of gene expression data. A Multivariate Shapelets Detection (MSD) method is proposed to classify an incoming MTS by extracting the key shapelets from the training dataset. MSD finds several multivariate shapelets from all dimensions of the MTS with the same start and end points.

Lin et al. [49] developed a Reliable EArly ClassifiCaTion (REACT) approach for MTS where some of the components are categorical along with numerical. REACT first discretizes the categorical time series and then generates their shapelets. It employed a concept of Equivalence Classes Mining [93] to avoid redundant shapelets. Finally, a change-point based distance measure is proposed in [72] to compute the similarity between the time series and shapelet. Besides that, deep learning techniques [71] are also employed for extracting the multi-scale shapelets based on a cost function.

B. Key shapelet selection using clustering

He et al. [47] attempted to solve the imbalanced class problem of ECG classification, where the training instances in the abnormal class are far fewer than those in the normal class. The authors addressed this problem in the framework of early classification and proposed a solution approach called Early Prediction on Imbalanced MTS (EPIMTS). At first, the candidate shapelets are clustered using the Silhouette Index method [94]. Later, the shapelets in the clusters are ranked according to a Generalized Extended F-Measure (GEFM). The shapelet with maximum rank is used to represent the respective cluster. For a shapelet S, GEFM is computed as

$$ GEFM(S) = \frac{1}{w_0 / E(S) + w_1 / P(S) + w_2 / R(S)}, \tag{7} $$

[...] of a time series X is X[t] if

$$ X[t] > X[t-1] \ \&\ X[t] > X[t+1] $$

or

$$ X[t] < X[t-1] \ \&\ X[t] < X[t+1], $$

where 1 ≤ t ≤ T. Next, the turning point of X is X[t] if the following condition holds.

$$ (X[t+1] - 2X[t] + X[t-1]) > \sum_{i=1}^{T} \frac{X[i] - X[i-1]}{T - 1}. $$

TABLE IV: Analysis of shapelet based early classification approaches using their strength, limitation, and major concern.

| Paper | Strength | Limitation | Major Concern |
|---|---|---|---|
| [34] | Highly interpretable shapelets | No assurance of classification accuracy during training | Utility based shapelet selection |
| [60] | Uncertainty estimates with highly interpretable shapelets | Generates a huge number of candidate shapelets of varying length | Computation of shapelet rank by incorporating accuracy and earliness |
| [71] | Multi-scale deep features (shapelets) of time series | Need to set several parameters at each layer of the model | Automatic feature extraction using deep learning models |
| [72] | Robust to distance information noise while extracting shapelets | Extracted shapelets tend to lose natural interpretability | Computation of distance between shapelet and time series in change-point space |
| [73] | Diverse and highly interpretable shapelets | Computationally inefficient due to large number of shapelets | Developing trend based Euclidean distance to incorporate diversity |
| [43] | Multivariate shapelets with high utility | Extracted shapelets cannot have variable length of dimensions | Separate distance threshold along each dimension of MTS |
| [47] | Ability to handle imbalanced distribution of the instances among classes | Limited to binary classification | Clustering based core shapelet selection |
| [45] | Ability to identify relevant dimensions of MTS | Data points of time series must be obtained at regular intervals | Formulation of convex optimization problem for key shapelet selection |
| [48] | Utilization of internal relationship among the dimensions | Quality of shapelets heavily depends on the employed clustering method | Evaluation strategy to check the quality of shapelets |
| [49] | Capable of classifying a time series with categorical samples | Computationally inefficient | Pattern discovery using sequential and simultaneous combinations of shapelets |
| [52] | Highly interpretable shapelets with confidence estimates about early prediction | Limited to binary classification | Key-point based shapelet extraction |
| [51] | Ability to classify asynchronous MTS | No assurance of reliability and need to set several parameters | Inclusion of short-term trend in features |
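The shapelet matching step of Eq. (6) and Fig. 6 can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the code of any reviewed approach: the function names and the `(values, delta, label)` tuple layout are hypothetical, dist(·) is taken to be the plain Euclidean distance, the first matching shapelet decides the label, and the d ≤ δ test follows Fig. 6.

```python
import math
from typing import List, Optional, Sequence, Tuple

Shapelet = Tuple[Sequence[float], float, str]  # (values s, threshold delta, class label)


def shapelet_distance(shapelet: Sequence[float], series: Sequence[float]) -> float:
    """Eq. (6): minimum Euclidean distance between s and any equal-length
    subsequence of the series; infinite if the series is still too short."""
    length = len(shapelet)
    if len(series) < length:
        return math.inf
    return min(
        math.dist(shapelet, series[i:i + length])
        for i in range(len(series) - length + 1)
    )


def early_classify(prefix: Sequence[float], shapelets: List[Shapelet]) -> Optional[str]:
    """Return the label of the first shapelet with d <= delta, or None to
    signal that more data points should be awaited."""
    for values, delta, label in shapelets:
        if shapelet_distance(values, prefix) <= delta:
            return label
    return None


key_shapelets = [([1.0, 2.0, 3.0], 0.2, "a")]
print(early_classify([0.0, 0.0], key_shapelets))                  # None
print(early_classify([0.0, 0.0, 1.0, 2.0, 3.1], key_shapelets))   # a
```

Returning `None` for an unmatched prefix reflects the early classification setting: the decision is simply postponed until a longer prefix of the incoming time series is available.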
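As a small worked illustration of the GEFM ranking in Eq. (7), the sketch below scores shapelets from their earliness E(S), precision P(S), and recall R(S) and picks the best one in a cluster. The function names, the dictionary layout, the example weights, and the convention of returning 0 when any of the three measures is zero are assumptions made for this example; they are not taken from EPIMTS [47].

```python
from typing import Dict, Tuple


def gefm(earliness: float, precision: float, recall: float,
         weights: Tuple[float, float, float]) -> float:
    """Eq. (7): weighted harmonic-style combination of E(S), P(S), and R(S)."""
    w0, w1, w2 = weights
    if min(earliness, precision, recall) <= 0:
        return 0.0  # assumed convention to avoid division by zero
    return 1.0 / (w0 / earliness + w1 / precision + w2 / recall)


def cluster_representative(cluster: Dict[str, Tuple[float, float, float]],
                           weights: Tuple[float, float, float]) -> str:
    """Return the name of the shapelet with the highest GEFM in one cluster.

    cluster maps a (hypothetical) shapelet name to its (E, P, R) triple.
    """
    return max(cluster, key=lambda name: gefm(*cluster[name], weights))


example_cluster = {"s1": (0.9, 0.5, 0.5), "s2": (0.5, 0.9, 0.9)}
print(cluster_representative(example_cluster, weights=(0.2, 0.4, 0.4)))  # s2
```

Because each measure appears in a denominator, a shapelet that is weak on any single criterion is penalized strongly, so the representative tends to be balanced across earliness, precision, and recall.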
[48], [52] utilize the correlation among the components of the shapelets, which improved their earliness and interpretability to a great extent. Further, a detailed analysis of the shapelet based approaches is presented in Table IV.

VII. MODEL BASED EARLY CLASSIFICATION

This section discusses the model based early classification approaches for time series data. Unlike prefix and shapelet based approaches, the model based approaches [55], [57], [63], [70] formulate a mathematical model to optimize the tradeoff between earliness and reliability. We divide these approaches into the following two groups based on the type of adopted classifier.

A. Using discriminative classifier

In [29], the authors developed an ensemble model to recognize the type of gas using an incomplete 8-dimensional time series generated from a sensor-based nose. The ensemble model consists of a set of probabilistic classifiers with a reject option that allows them to express their doubt about the reliability of the predicted class label. The probabilistic classifier assigns a class label y′ to the incomplete time series X′ as given below

$$ y' = \operatorname*{argmax}_{y \in \{y_1, y_2, \cdots, y_k\}} \{ p(y|X') \}. \tag{8} $$

If p(y′|X′) is close to 0.5 then the classifier chooses the reject option to express its doubt about the class label y′. Another work in [40] focused on minimizing response time to obtain earliness in the classification. This work developed an empirical function that optimizes the earliness with a high degree of confidence.

Dachraoui et al. [46] proposed a non-myopic early classification approach, where the term non-myopic means that, at each time step t, the classifier estimates an optimal time $\tau^*$ to give an assurance of a reliable prediction in the future. For an incomplete time series $X'_t$ with t data points, the optimal time $\tau^*$ is calculated using the following expression

$$ \tau^* = \operatorname*{argmin}_{\tau \in \{0, 1, \cdots, T-t\}} f_\tau(X'_t), $$

where the function $f_\tau(X'_t)$ estimates an expected cost of the future time step t + τ. The authors in [54] pointed out two weaknesses of the work [46]: i) assumption of low intra-cluster variability, and ii) clustering with the complete time series. In [54], two different algorithms (NoCluster and 2Step) are introduced to overcome these weaknesses while preserving the adaptive and non-myopic properties of the classifier.

Mori et al. [57] proposed an EarlyOpt framework that formulates a stopping rule using the two highest posterior probabilities obtained from the classifiers. The main objective of EarlyOpt is to minimize the cost of prediction by satisfying the stopping rule. The authors also conducted a case study for the IBEX 35 stock market in [62]. In another work [55], they developed different stopping rules by using the class-wise posterior probabilities. Besides that, the authors in [56], [74] computed a confidence-based threshold to indicate the data sufficiency for making the early prediction.

The authors in [39] introduced a two-tier early classification approach based on the master-slave paradigm. In the first tier, the slave classifier computes posterior probabilities for each class label of the dataset and constructs a feature vector for each training time series. Let $p_{1:k}(X)$ be a set of k posterior probabilities obtained for the training time series X. The feature vector for X is given as

$$ F_X = \{ y(X), p_{1:k}(X), \Delta_X \}, \tag{9} $$

where y(X) is the most probable class label and $\Delta_X$ is the difference between the first and second highest posterior probabilities. The feature vector is passed to the master classifier. In the second tier, the master classifier checks the reliability of the probable class label and makes the decision.

B. Using generative classifier

The authors in [63] formulated a decision rule to classify an incomplete time series with some predefined reliability threshold. Two classifiers, linear SVM and QDA [95], combined with the formulated decision rule, are adopted to provide the desired level of accuracy in the early classification. An Early QDA model is proposed in [70] assuming that the training instances have a Gaussian distribution. This assumption helps to estimate the parameters (i.e., mean and covariance) easily from the training data.

Antonucci et al. [65] developed a generative model based approach for early recognition of Japanese vowel speakers using their speech time series data. The proposed approach employed an imprecise HMM (iHMM) [96] to compute the likelihood of intervals of the incoming time series with respect to the training instances. For a reliable prediction, a class label is assigned to the time series only if the ratio of the two highest likelihoods is greater than a predefined threshold.

C. Critical analysis

We found two exciting approaches [46], [54] addressing the early classification problem with the non-myopic property. However, the computational complexity of these approaches is very high during classification. We observed that the generative classifier based early classification approaches are more complicated than those based on discriminative classifiers. This work also analyzed the model based approaches by their strength, limitation, and major concern, as summarized in Table V.

VIII. MISCELLANEOUS APPROACHES

This section covers the early classification approaches that do not meet the inclusion criteria of other categories. We split the included approaches into the following two groups: with and without tradeoff.
TABLE V: Analysis of model based early classification approaches using their strength, limitation, and major concern.

| Paper | Strength | Limitation | Major Concern |
|---|---|---|---|
| [70] | Guarantee of the desired level of accuracy with earliness | Several iterations are required for setting an appropriate value of reliability threshold | Construction of the optimal decision rules |
| [46] | Capable to estimate future time step where correct prediction is expected | Cost function heavily relies on the clustering accuracy | Designing of the cost-based trigger function |
| [65] | Ability to work on streaming data | High domain dependency | Likelihood based similarity measure between the time series |
| [62] | Simple and effective model | Non-convex cost function | Stopping rules using selected posterior probabilities |
| [40] | Ensemble classifier with reject option | Computationally inefficient | Formulation of the convex optimization problem for training |
| [54] | Adaptive to the testing time series | Impose unnecessary computations due to non-myopic property | Designing of the trigger function |
| [55] | Highly relevant cost functions | Non-convex cost function | Stopping rules using all posterior probabilities |
| [56] | Effective model with the adaptive learning of important parameters | Single confidence threshold is not sufficient for generalization | Fusion of multiple classifiers for making the early decision |
| [74] | Class-wise safeguard points with the assurance of desirable accuracy | No check for stability while discovering safeguard points | Avoidance of premature predictions |
| [39] | Robust to varying start time of events | Difficult to find a suitable interval length to capture identifiable patterns | Master-slave architecture for finding an optimal decision time |
| [29] | Stable ensemble classifier | Limited to work with probabilistic classifiers only | Providing a reject option to express the doubt about reliability of prediction |
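Two of the posterior based mechanisms from Section VII-A lend themselves to a short sketch: the reject option around Eq. (8) and the first-tier feature vector of Eq. (9). The Python code below is an illustration under assumptions, not the method of [29] or [39]: the function names are hypothetical, the class label in the feature vector is encoded as an index, and the parameter `band`, which decides when a posterior counts as "close to 0.5", is an invented tuning knob.

```python
from typing import Dict, Optional, Sequence, Tuple


def predict_with_reject(posteriors: Dict[str, float], band: float = 0.1) -> Optional[str]:
    """Eq. (8) with a reject option: return the most probable label, or None
    when its posterior lies within `band` of 0.5 (hypothetical closeness rule)."""
    label = max(posteriors, key=posteriors.get)
    if abs(posteriors[label] - 0.5) <= band:
        return None  # the classifier expresses doubt and abstains
    return label


def master_features(posteriors: Sequence[float]) -> Tuple[int, Tuple[float, ...], float]:
    """Eq. (9): (index of most probable class, all k posteriors, margin between
    the two highest posteriors). Requires at least two classes."""
    ranked = sorted(posteriors, reverse=True)
    label_index = list(posteriors).index(ranked[0])
    margin = ranked[0] - ranked[1]
    return label_index, tuple(posteriors), margin


print(predict_with_reject({"a": 0.55, "b": 0.45}))  # None
print(predict_with_reject({"a": 0.90, "b": 0.10}))  # a
```

In a two-tier design, the tuple returned by `master_features` would be the input on which a second classifier learns whether the first-tier label is reliable enough to be released.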
A. With tradeoff

The authors in [75], [76] introduced a reinforcement learning based early classification framework using a Deep Q-Network (DQN) [97] agent. The framework uses a reward function to keep a balance between accuracy and earliness. The DQN agent learns an optimal decision-making strategy during training, which helps it pick a suitable action after receiving an observation in an incoming time series during testing.

In another work [61], the authors developed a deep neural network based early classification framework that focused on optimizing the tradeoff by estimating the stopping decision probabilities at all time stamps of the time series. They formulated a new loss function to compute the loss of the classifier when a class label y is predicted for an incomplete time series $X'_t$ using the first t data points. The loss at time t is computed as

$$ L_t(X'_t, y; \beta) = \beta L_a(X'_t, y) + (1 - \beta) L_e(t), \tag{10} $$

where β is a tradeoff parameter to control the weights of the accuracy loss $L_a(\cdot)$ and the earliness loss $L_e(\cdot)$.

B. Without tradeoff

The first work that mentioned the early classification of time series is presented in [35]. The authors aimed to classify the incomplete time series, but they did not attempt to optimize the tradeoff between accuracy and earliness. In [35], the time series is represented by its states such as increase, decrease, stay, over the time, and so on. The authors first divided the time series into segments such that each segment captures only one particular state. Later, each segment is replaced by a predicate indicating the presence of the state in terms of True or False. Finally, the availability of the predicates in the time series is used for the early classification.

In [44], [77], hybrid early classification models are presented by combining a generative model with a discriminative model. At first, HMM classifiers are trained over short segments (shapelets) of the time series to learn the distribution of patterns in the training data. Next, the trained classifiers generate an array of log likelihoods for the disjoint shapelets of the time series, which is passed as a feature vector for training the discriminative classifiers.

The work in [37], [38] employed a stochastic process, called the Point Process model, to capture the temporal dynamics of different components of MTS. At first, the temporal dynamics of each component is extracted independently and then the sequential cue that occurs among the components is computed to capture the temporal order of events.

Recently, Huang et al. [58] proposed a Multi-Domain Deep Neural Network (MDDNN) based early classification framework for MTS. MDDNN employed two widely used deep learning techniques, CNN and LSTM. It first truncates the training MTS up to a fixed time step and then gives it as input to a CNN layer, which is followed by another CNN layer and LSTM layers. Frequency domain features are also calculated from the truncated MTS. Another work in [53] computed a vector of statistical features from the incomplete time series for its classification.

Furthermore, the authors in [98]–[100] worked on video data for early recognition of an ongoing activity by extracting the time series features. In particular, the work in [98] first represented the human activity as histograms of spatio-temporal features obtained from the sequence of video frames. Later, these histograms are used to classify an ongoing activity. The authors in [99] encoded the video frames into histograms of oriented velocity, which are later used as features for early classification of the human activity. Hoai et al. [100] introduced a maximum-margin framework for early detection of facial expressions by using the time series data of the partially executed events.
TABLE VI: Analysis of miscellaneous early classification approaches based on their strength, limitation, and major concern.

| Paper | Strength | Limitation | Major Concern |
|---|---|---|---|
| [35] | Simple and effective model | No guarantee about the earliness | Improvement of the classification accuracy |
| [75] | Ability to classify UTS, MTS and symbolic sequence | Difficult to understand the early classification problem as reinforcement learning | Designing reward function to make balance between accuracy and earliness |
| [61] | Scalability to large dataset | Difficult to select appropriate values for the parameters | Designing stopping rule using deep learning based posterior probabilities |
| [53] | Computationally efficient model | Statistical features lose the temporal information of the time series | Computing relevant features from the time series data |
| [76] | Support to online learning | Complex formulation with large number of parameters | Designing reliable reward function |
| [77] | Highly accurate model for the gene expression MTS | Computationally inefficient due to large number of segments | Combining generative and discriminative models |
| [38] | Ability to capture identifiable patterns of an ongoing time series | No assurance of the reliable prediction | Utilization of the correlation |
| [58] | Capable to extract features automatically | Unable to utilize correlation among the components of MTS | Combining deep learning techniques for the early prediction of class label |
C. Critical analysis
A primary objective of an early classification approach is to build a classifier that can provide earliness while maintaining a desired level of reliability or accuracy. However, some approaches [35], [37], [38], [44], [58], [77] do not ensure reliability, although they are able to classify an incomplete time series. Recently, the researchers in [58], [61], [75], [76] have successfully employed reinforcement learning and deep learning techniques for early classification. These approaches have opened a new direction for further research in this area. Analysis of the miscellaneous early classification approaches is presented in Table VI.

IX. DISCUSSION
With the presented categorization of early classification approaches, one can get a quick understanding of the notable contributions that have been made over the years. After reviewing the literature, we found that most of the early classification approaches have appeared after ECTS [33]. Although the authors in [35] attempted to achieve earliness well before ECTS, they did not maintain the reliability of class prediction. We therefore discussed such approaches in the without tradeoff category. Next, we discuss the challenges and recommendations of the reviewed studies.

A. Challenges
After reviewing the included studies, we observed that the researchers have encountered four major challenges while developing early classification approaches for time series data. These challenges are discussed below:
1) Tradeoff optimization: The most critical challenge for an early classifier is to optimize the tradeoff between accuracy and earliness. The studies in [8], [10], [41], [42], [59], [64] have attempted to address this challenge by learning a minimum required length of time series while maintaining a desired level of accuracy during training. The researchers in [55], [57], [62] introduced stopping rules to decide the right time for early classification of an incomplete time series. Such a decision is evaluated through a cost function that ensures the balance between accuracy and earliness. A game theory model is adopted in [41] for tradeoff optimization. Recently, deep learning techniques and reinforcement learning frameworks have also been employed for optimizing the tradeoff [61], [75], [76].
2) Interpretability: In order to improve the adaptability of an early classification approach, the results should be interpretable enough to convince the domain experts, and thus interpretability becomes a challenge for the researchers. The studies in [34], [43], [45], [47], [52], [60], [71], [73] have primarily focused on extracting interpretable features (shapelets) from the time series. In particular, the authors in [34], [60] selected the shapelets (subsequences) based on their utility and interpretability. The work in [73] developed a trend-based distance measure to find a diverse set of interpretable shapelets. In [51], the authors extracted shapelets with short-term trends to improve the interpretability of results while classifying the irregular time series of physiological measurements.
3) Reliability: It is a confidence measure that guarantees a certain level of accuracy in the early classification. Without the assurance of reliability, an early classification algorithm cannot be used in real-world applications such as medical diagnosis and industrial process monitoring. In studies [43], [56], [59], [60], [63], [70], [72], [74], [76], the authors have attempted to improve the reliability of early prediction. The authors in [72] developed a confidence area based criterion to ensure the reliability of early prediction. The approaches in [43], [60] learned a confidence threshold for each extracted key shapelet. The learned threshold is used to indicate the uncertainty estimate (reliability) of the shapelet for achieving the earliness. The prefix based approaches in [8], [10], [32], [33], [59], [69] also ensure reliability while learning the minimum prediction length, but they did not mention it explicitly.
4) Correlation: There exists an inherent correlation among the components of MTS, which can help find identifiable patterns at an early stage. However, it is challenging to incorporate such correlation in the classification. The studies in [37], [38] employed point process models to classify the MTS early by using its temporal dynamics and sequence cue patterns. These patterns are able to capture the correlation among the components of MTS. In [10], [41], [42], the researchers have introduced class forwarding methods to incorporate the correlation during early classification.
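To make the accuracy and earliness tradeoff discussed under the challenges above more concrete, the following is a minimal sketch of a confidence-threshold stopping rule together with a weighted-sum cost. It is an illustration only and does not reproduce the method of any reviewed study. The names early_classify, tradeoff_cost, and toy_proba, the threshold value, and the weight alpha are all assumptions introduced here, and the weighted-sum form of the cost is assumed for simplicity.

```python
from typing import Callable, List, Sequence, Tuple


def early_classify(
    series: Sequence[float],
    predict_proba: Callable[[Sequence[float]], List[float]],
    threshold: float = 0.8,
) -> Tuple[int, int]:
    """Return (predicted label, number of points used).

    Hypothetical stopping rule: read the series one point at a time and
    stop as soon as the top class probability reaches `threshold`.
    If that never happens, classify on the full series.
    """
    if not series:
        raise ValueError("series must contain at least one point")
    for t in range(1, len(series) + 1):
        probs = predict_proba(series[:t])
        label = max(range(len(probs)), key=lambda k: probs[k])
        if probs[label] >= threshold or t == len(series):
            return label, t
    raise AssertionError("unreachable: the loop always returns")


def tradeoff_cost(
    correct: bool, length_used: int, full_length: int, alpha: float = 0.8
) -> float:
    """Assumed weighted-sum cost: alpha weights the error term and
    (1 - alpha) weights the fraction of the series that was observed."""
    error = 0.0 if correct else 1.0
    earliness = length_used / full_length
    return alpha * error + (1.0 - alpha) * earliness


def toy_proba(prefix: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained probabilistic classifier:
    evidence for class 1 grows with the sum of the observed prefix."""
    p1 = min(1.0, max(0.0, sum(prefix) / 4.0))
    return [1.0 - p1, p1]


if __name__ == "__main__":
    series = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    label, used = early_classify(series, toy_proba, threshold=0.8)
    cost = tradeoff_cost(label == 1, used, len(series), alpha=0.8)
    print(label, used, round(cost, 4))
```

Raising the threshold in this sketch delays the decision in exchange for more confident predictions, while lowering it does the opposite. The weight alpha expresses how much a misclassification costs relative to waiting for more data.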
B. Recommendations for future work
We briefly discuss some of the recommendations that have been provided by the reviewed studies. These recommendations can guide future research in this area and help mitigate the limitations of the existing literature.
1) Prefix based approaches: The studies in [32], [33], [69] suggested employing probabilistic classifiers for obtaining shorter and effective MPLs. They also recommended exploring early classification methods for streaming time series with multiple class labels. A feature subset selection can be incorporated to improve the reliability of MPLs [50]. The work in [59] suggested using an informative uncertainty measure in the learning phase for accommodating additional knowledge of the classes. The desired level of accuracy can be determined automatically from knowledge of the application domain [42]. The study in [8] recommended extracting relevant features from the MTS for its early classification.
2) Shapelet based approaches: Neural network models, especially LSTM, can deal with time series data in a more natural way [71]. The studies in [34], [73] suggested incorporating better similarity measures for improving the effectiveness of the feature selection step. Fourier and wavelet transform techniques can be employed to obtain a useful combination of the features [49]. The authors in [43] suggested incorporating the concept of maximal closed shapelets for pruning the redundant and smaller shapelets. The study in [48] suggested utilizing the sequential relationship between the components of MTS shapelets for achieving better earliness.
3) Model based approaches: The study in [46] suggested using the training instances for predicting the future decision cost without using any clustering method. The work in [62] recommended the use of genetic algorithms to automatically learn the shape of the stopping rule from the training data. Early classification can also be formulated as a multi-objective optimization problem without specifying the desired level of accuracy [55]. The study in [54] suggested employing time series specific classifiers to develop an efficient cost function for early classification.
4) Miscellaneous approaches: A dynamic adjustment strategy is recommended in [75], [76] for setting the reward function parameters in the reinforcement learning-based early classification approach. The studies in [37], [38] suggested exploring domain-dependent density functions to capture the structure of time series data. A hybrid model can be developed to incorporate statistical features with the incomplete time series during its early classification [53]. The work in [58] recommended improving the interpretability of the neurons in neural networks for addressing the imbalanced distribution of the training instances.

REFERENCES
[1] B. Liu, J. Li, C. Chen, W. Tan, Q. Chen, and M. Zhou, “Efficient motif discovery for large-scale time series in healthcare,” IEEE Transactions on Industrial Informatics, vol. 11, no. 3, pp. 583–590, 2015.
[2] G. Chen, G. Lu, W. Shang, and Z. Xie, “Automated change-point detection of eeg signals based on structural time-series analysis,” IEEE Access, vol. 7, pp. 180 168–180 180, 2019.
[3] G. He, Y. Duan, G. Zhou, and L. Wang, “Early classification on multivariate time series with core features,” in Proc. DEXA, 2014, pp. 410–422.
[4] S. M. Idrees, M. A. Alam, and P. Agarwal, “A prediction approach for stock market volatility based on time series data,” IEEE Access, vol. 7, pp. 17 287–17 298, 2019.
[5] J. Yin, Y.-W. Si, and Z. Gong, “Financial time series segmentation based on turning points,” in Proc. ICSSE, 2011, pp. 394–399.
[6] P. Esling and C. Agon, “Multiobjective time series matching for audio classification and retrieval,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2057–2072, 2013.
[7] W. Pei, H. Dibeklioğlu, D. M. Tax, and L. van der Maaten, “Multivariate time-series classification using the hidden-unit logistic model,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 4, pp. 920–931, 2017.
[8] A. Gupta, H. P. Gupta, B. Biswas, and T. Dutta, “A fault-tolerant early classification approach for human activities using multivariate time series,” IEEE Transactions on Mobile Computing, pp. 1–1, 2020.
[9] S. Aminikhanghahi, T. Wang, and D. J. Cook, “Real-time change point detection with application to smart home time series data,” IEEE Transactions on Knowledge and Data Engineering, vol. 31, no. 5, pp. 1010–1023, 2019.
[10] A. Gupta, H. P. Gupta, B. Biswas, and T. Dutta, “An early classification approach for multivariate time series of on-vehicle sensors in transportation,” IEEE Transactions on Intelligent Transportation Systems, pp. 1–1, 2020.
[11] R. H. Shumway and D. S. Stoffer, Time series analysis and its applications: with R examples. Springer, 2017.
[12] P. Esling and C. Agon, “Time-series data mining,” ACM Computing Surveys, vol. 45, no. 1, pp. 1–34, 2012.
[13] G. Mahalakshmi, S. Sridevi, and S. Rajaram, “A survey on forecasting of time series data,” in Proc. ICCTIDE, 2016, pp. 1–8.
[14] S. Aghabozorgi, A. S. Shirkhorshidi, and T. Y. Wah, “Time-series clustering–a decade review,” Information Systems, vol. 53, pp. 16–38, 2015.
[15] A. Bagnall, J. Lines, A. Bostrom, J. Large, and E. Keogh, “The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances,” Data Mining and Knowledge Discovery, vol. 31, no. 3, pp. 606–660, 2017.
[16] A. Sharabiani, H. Darabi, A. Rezaei, S. Harford, H. Johnson, and F. Karim, “Efficient classification of long time series by 3-d dynamic time warping,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 47, no. 10, pp. 2688–2703, 2017.
[17] A. Bagnall, J. Lines, W. Vickers, and E. Keogh, “The UEA & UCR Time Series Classification Repository,” 2020. [Online]. Available: www.timeseriesclassification.com
[18] D. Dheeru and E. Karra Taniskidou, “UCI machine learning repository,” 2020. [Online]. Available: http://archive.ics.uci.edu/ml
[19] J. Lines and A. Bagnall, “Time series classification with ensembles of elastic distance measures,” Data Mining and Knowledge Discovery, vol. 29, no. 3, pp. 565–592, 2015.
[20] T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, J. Zakaria, and E. Keogh, “Addressing big data time series: Mining trillions of time series subsequences under dynamic time warping,” ACM Transactions on Knowledge Discovery from Data, vol. 7, no. 3, pp. 1–31, 2013.
[21] D. J. Berndt and J. Clifford, “Using dynamic time warping to find patterns in time series,” in Proc. KDD workshop, vol. 10, no. 16, 1994, pp. 359–370.
[22] H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P.-A. Muller, “Deep learning for time series classification: a review,” Data Mining and Knowledge Discovery, vol. 33, no. 4, pp. 917–963, 2019.
[23] Z. Wang, W. Yan, and T. Oates, “Time series classification from scratch with deep neural networks: A strong baseline,” in Proc. IJCNN, 2017, pp. 1578–1585.
[24] C.-L. Liu, W.-H. Hsaio, and Y.-C. Tu, “Time series classification with multivariate convolutional neural network,” IEEE Transactions on Industrial Electronics, vol. 66, no. 6, pp. 4788–4797, 2018.
[25] F. M. Bianchi, S. Scardapane, S. Løkse, and R. Jenssen, “Reservoir computing approaches for representation and classification of multivariate time series,” IEEE Transactions on Neural Networks and Learning Systems, pp. 1–11, 2020.
[26] F. Karim, S. Majumdar, H. Darabi, and S. Harford, “Multivariate lstm-fcns for time series classification,” Neural Networks, vol. 116, pp. 237–245, 2019.
[27] J. Yoon, D. Jarrett, and M. van der Schaar, “Time-series generative adversarial networks,” in Proc. NIPS, 2019, pp. 5508–5518.
[28] C. Esteban, S. L. Hyland, and G. Rätsch, “Real-valued (medical) time series generation with recurrent conditional gans,” arXiv preprint arXiv:1706.02633, 2017.
[29] N. Hatami and C. Chira, “Classifiers with a reject option for early time-series classification,” in Proc. CIEL, 2013, pp. 9–16.
[30] K. Fauvel, D. Balouek-Thomert, D. Melgar, P. Silva, A. Simonet, G. Antoniu, A. Costan, V. Masson, M. Parashar, I. Rodero, and A. Termier, “A distributed multi-sensor machine learning approach to earthquake early warning,” in Proc. AAAI, vol. 34, no. 01, 2020, pp. 403–411.
[31] A. Dachraoui, A. Bondu, and A. Cornuejols, “Early classification of individual electricity consumptions,” Real-World Challenges for Data Stream Mining, pp. 18–21, 2013.
[32] Z. Xing, J. Pei, G. Dong, and P. S. Yu, “Mining sequence classifiers for early prediction,” in Proc. SIAM, 2008, pp. 644–655.
[33] Z. Xing, J. Pei, and P. S. Yu, “Early prediction on time series: A nearest neighbor approach,” in Proc. IJCAI, 2009, pp. 1297–1302.
[34] Z. Xing, J. Pei, P. S. Yu, and K. Wang, “Extracting interpretable features for early classification on time series,” in Proc. SIAM, 2011, pp. 247–258.
[35] C. J. Alonso González and J. J. R. Diez, “Boosting interval-based literals: Variable length and early classification,” in Data mining in time series databases, 2004, pp. 149–171.
[36] T. Santos and R. Kern, “A literature survey of early time series classification and deep learning,” in Proc. Sami@iKnow, 2016.
[37] K. Li, S. Li, and Y. Fu, “Early classification of ongoing observation,” in Proc. ICDM, 2014, pp. 310–319.
[38] S. Li, K. Li, and Y. Fu, “Early recognition of 3d human actions,” ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 14, no. 1s, pp. 1–21, 2018.
[39] P. Schäfer and U. Leser, “Teaser: Early and accurate time series classification,” Data Mining and Knowledge Discovery, pp. 1–27, 2020.
[40] S. Ando and E. Suzuki, “Minimizing response time in time series classification,” Knowledge and Information Systems, vol. 46, no. 2, pp. 449–476, 2016.
[41] A. Gupta, H. P. Gupta, and T. Dutta, “Early classification approach for multivariate time series using sensors of different sampling rate,” in Proc. SECON, 2019, pp. 1–2.
[42] A. Gupta, H. P. Gupta, B. Biswas, and T. Dutta, “A divide-and-conquer–based early classification approach for multivariate time series with different sampling rate components in iot,” ACM Transactions on Internet of Things, vol. 1, no. 2, pp. 1–21, 2020.
[43] M. F. Ghalwash and Z. Obradovic, “Early classification of multivariate temporal observations by extraction of interpretable shapelets,” BMC Bioinformatics, vol. 13, no. 1, p. 195, 2012.
[44] M. F. Ghalwash, D. Ramljak, and Z. Obradović, “Early classification of multivariate time series using a hybrid hmm/svm model,” in Proc. BIBM, 2012, pp. 1–6.
[45] M. F. Ghalwash, V. Radosavljevic, and Z. Obradovic, “Extraction of interpretable multivariate patterns for early diagnostics,” in Proc. ICDM, 2013, pp. 201–210.
[46] A. Dachraoui, A. Bondu, and A. Cornuéjols, “Early classification of time series as a non myopic sequential decision making problem,” in Proc. ECML PKDD, 2015, pp. 433–447.
[47] G. He, Y. Duan, T. Qian, and X. Chen, “Early prediction on imbalanced multivariate time series,” in Proc. CIKM, 2013, pp. 1889–1892.
[48] G. He, Y. Duan, R. Peng, X. Jing, T. Qian, and L. Wang, “Early classification on multivariate time series,” Neurocomputing, vol. 149, pp. 777–787, 2015.
[49] Y.-F. Lin, H.-H. Chen, V. S. Tseng, and J. Pei, “Reliable early classification on multivariate time series with numerical and categorical attributes,” in Proc. PAKDD, 2015, pp. 199–211.
[50] C. Ma, X. Weng, and Z. Shan, “Early classification of multivariate time series based on piecewise aggregate approximation,” in Proc. HIS, 2017, pp. 81–88.
[51] L. Zhao, H. Liang, D. Yu, X. Wang, and G. Zhao, “Asynchronous multivariate time series early prediction for icu transfer,” in Proc. ICIMH, 2019, pp. 17–22.
[52] G. He, W. Zhao, and X. Xia, “Confidence-based early classification of multivariate time series with multiple interpretable rules,” Pattern Analysis and Applications, pp. 1–14, 2019.
[53] G. Ahn, H. Lee, J. Park, and S. Hur, “Development of indicator of data sufficiency for feature-based early time series classification with applications of bearing fault diagnosis,” Processes, vol. 8, no. 7, p. 790, 2020.
[54] R. Tavenard and S. Malinowski, “Cost-aware early classification of time series,” in Proc. ECML PKDD, 2016, pp. 632–647.
[55] U. Mori, A. Mendiburu, S. Dasgupta, and J. A. Lozano, “Early classification of time series by simultaneously optimizing the accuracy and earliness,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 10, pp. 4569–4578, 2017.
[56] J. Lv, X. Hu, L. Li, and P. Li, “An effective confidence-based early classification of time series,” IEEE Access, vol. 7, pp. 96 113–96 124, 2019.
[57] U. Mori, A. Mendiburu, S. Dasgupta, and J. Lozano, “Early classification of time series from a cost minimization point of view,” in Proc. NIPS Workshop, 2015.
[58] H.-S. Huang, C.-L. Liu, and V. S. Tseng, “Multivariate time series early classification using multi-domain deep neural network,” in Proc. DSAA, 2018, pp. 90–98.
[59] U. Mori, A. Mendiburu, E. Keogh, and J. A. Lozano, “Reliable early classification of time series based on discriminating the classes over time,” Data Mining and Knowledge Discovery, vol. 31, no. 1, pp. 233–263, 2017.
[60] M. F. Ghalwash, V. Radosavljevic, and Z. Obradovic, “Utilizing temporal patterns for estimating uncertainty in interpretable early decision making,” in Proc. SIGKDD, 2014, pp. 402–411.
[61] M. Rußwurm, S. Lefevre, N. Courty, R. Emonet, M. Körner, and R. Tavenard, “End-to-end learning for early classification of time series,” arXiv preprint arXiv:1901.10681, 2019.
[62] U. Mori, A. Mendiburu, I. M. Miranda, and J. A. Lozano, “Early classification of time series using multi-objective optimization techniques,” Information Sciences, vol. 492, pp. 204–218, 2019.
[63] H. S. Anderson, N. Parrish, and M. R. Gupta, “Early time series classification with reliability guarantee,” Sandia National Lab. (SNL-NM), Albuquerque, NM (United States), Tech. Rep., 2012.
[64] A. Gupta, R. Pal, R. Mishra, H. P. Gupta, T. Dutta, and P. Hirani, “Game theory based early classification of rivers using time series data,” in Proc. WF-IoT, 2019, pp. 686–691.
[65] A. Antonucci, M. Scanagatta, D. D. Mauá, and C. P. de Campos, “Early classification of time series by hidden markov models with set-valued parameters,” in Proc. NIPS Workshop, 2015.
[66] A. Albahri, R. A. Hamid et al., “Role of biological data mining and machine learning techniques in detecting and diagnosing the novel coronavirus (covid-19): A systematic review,” Journal of Medical Systems, vol. 44, no. 7, 2020.
[67] O. Albahri, A. Albahri, K. Mohammed, A. Zaidan, B. Zaidan, M. Hashim, and O. H. Salman, “Systematic review of real-time remote health monitoring system in triage and priority-based sensor technology: Taxonomy, open challenges, motivation and recommendations,” Journal of Medical Systems, vol. 42, no. 5, p. 80, 2018.
[68] L. Ye and E. Keogh, “Time series shapelets: a new primitive for data mining,” in Proc. SIGKDD, 2009, pp. 947–956.
[69] Z. Xing, J. Pei, and P. S. Yu, “Early classification on time series,” Knowledge and Information Systems, vol. 31, no. 1, pp. 105–127, 2012.
[70] H. S. Anderson, N. Parrish, K. Tsukida, and M. R. Gupta, “Reliable early classification of time series,” in Proc. ICASSP, 2012, pp. 2073–2076.
[71] W. Wang, C. Chen, W. Wang, P. Rai, and L. Carin, “Earliness-aware deep convolutional networks for early time series classification,” arXiv preprint arXiv:1611.04578, 2016.
[72] L. Yao, Y. Li, Y. Li, H. Zhang, M. Huai, J. Gao, and A. Zhang, “Dtec: Distance transformation based early time series classification,” in Proc. SIAM, 2019, pp. 486–494.
[73] W. Yan, G. Li, Z. Wu, S. Wang, and P. S. Yu, “Extracting diverse-shapelets for early classification on time series,” World Wide Web-Internet and Web Information Systems, 2020.
[74] A. Sharma and S. K. Singh, “Early classification of time series based on uncertainty measure,” in Proc. CICT, 2019, pp. 1–6.
[75] C. Martinez, G. Perrin, E. Ramasso, and M. Rombaut, “A deep reinforcement learning approach for early classification of time series,” in Proc. EUSIPCO, 2018, pp. 2030–2034.
[76] C. Martinez, E. Ramasso, G. Perrin, and M. Rombaut, “Adaptive early classification of temporal sequences using deep reinforcement learning,” Knowledge-Based Systems, vol. 190, p. 105290, 2020.
[77] M. F. Ghalwash, D. Ramljak, and Z. Obradović, “Patient-specific early classification of multivariate observations,” International Journal of Data Mining and Bioinformatics, vol. 11, no. 4, pp. 392–411, 2015.
[78] River Dataset, 2020. [Online]. Available: http://thoreau.uchicago.edu/water/thoreaumap index
[79] Wafer and ECG Datasets, 2020. [Online]. Available: https://www.cs.cmu.edu/~bobski/
[80] A. Shahroudy, J. Liu, T.-T. Ng, and G. Wang, “Ntu rgb+d: A large scale dataset for 3d human activity analysis,” in Proc. CVPR, 2016, pp. 1010–1019.