Know Abnormal Find Evil
Article:
Homayoun, S., Dehghantanha, A., Ahmadzadeh, M. et al. (2 more authors) (2020) Know
abnormal, find evil: frequent pattern mining for ransomware threat hunting and intelligence.
IEEE Transactions on Emerging Topics in Computing, 8 (2). pp. 341-351. ISSN 2168-6750
https://doi.org/10.1109/TETC.2017.2756908
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be
obtained for all other users, including reprinting/ republishing this material for advertising or
promotional purposes, creating new collective works for resale or redistribution to servers
or lists, or reuse of any copyrighted components of this work in other works. Reproduced
in accordance with the publisher's self-archiving policy.
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING 1
Abstract—The emergence of crypto-ransomware has significantly changed the cyber threat landscape. Crypto-ransomware removes a data custodian's access by encrypting valuable data on victims' computers and requests a ransom payment to reinstate custodian access by decrypting the data. Timely detection of ransomware depends heavily on how quickly and accurately system logs can be mined to hunt abnormalities and stop the evil. In this paper we first set up an environment to collect activity logs of 517 Locky ransomware samples, 535 Cerber ransomware samples and 572 samples of TeslaCrypt ransomware. We utilize Sequential Pattern Mining to find Maximal Frequent Patterns (MFP) of activities within different ransomware families as candidate features for classification using J48, Random Forest, Bagging and MLP algorithms. We achieved 99% accuracy in distinguishing ransomware instances from goodware samples and 96.5% accuracy in detecting the family of a given ransomware sample. Our results indicate the usefulness and practicality of applying pattern mining techniques to the detection of good features for ransomware hunting. Moreover, we showed the existence of distinctive frequent patterns within different ransomware families, which can be used to identify the family of a ransomware sample and so build intelligence about threat actors and the threat profile of a given target.

Index Terms—Malware, ransomware, crypto ransomware, ransomware detection, ransomware family detection.

S. Homayoun, M. Ahmadzadeh and Raouf Khayami are with the Department of IT and Computer Engineering, Shiraz University of Technology, Shiraz, Iran. e-mail: [email protected].
A. Dehghantanha is with the Department of Computer Science, School of Computing, Science and Engineering, University of Salford, Salford, U.K.
S. Hashemi is with the Department of Computer Engineering, Shiraz University, Shiraz, Iran.

I. INTRODUCTION
Cybercriminals pose a real and persistent threat to business, government and financial institutions all around the globe [1]. The volume, scope and cost of cybercrime all remain on an upward trend [2]. Malicious programs have always been an important tool in cybercriminals' portfolios, and new variants of malware programs are detected almost every day [3]. The development and wide adoption of e-currencies such as Bitcoin led to many changes in cybercriminal activities, including the development of a new type of malware called ransomware [4]. Ransomware is a type of malware that removes a custodian's access to her data and requests a ransom payment to reinstate data access [5]. There are two main types of ransomware, namely Locker and Crypto ransomware. The former locks a system and denies users access without making any changes to the data stored on the system, while the latter encrypts all or selected data, usually using a strong cryptographic algorithm such as AES or RSA [6].
Ransomware dominated the threat landscape in 2016 with an annual increase rate of 267% [7]. It is estimated that in 2014 alone, cybercriminals made more than $3 million profit using ransomware programs [8]. These days, ransomware programs are indiscriminately targeting all industries, ranging from healthcare to the banking sector and even power grids [2]. Crypto-ransomware programs are much more popular than Lockers, as security engineers can almost always find ways to unlock a system without paying the ransom, while the only viable solution for decrypting strongly encrypted data is to pay the ransom and receive the decryption key [9]. Therefore, the focus of this paper is only on crypto-ransomware, and in the rest of the paper the word "ransomware" refers to "crypto-ransomware" only. It has already been reported that cyber security training and employee awareness reduce the risk of ransomware attacks [10]. However, automated tools and techniques are required to detect ransomware applications before they are launched [11] or within a short period after their execution [12]. The growing danger of ransomware attacks requires new solutions for preventing, detecting and removing ransomware programs.
In this paper, we use a sequential pattern mining technique to detect the best features for distinguishing ransomware applications from benign apps as well as for identifying the family of a ransomware sample. We investigate the usefulness of our detected features by applying them in J48, Random Forest, Bagging and MLP classification algorithms against a dataset containing 517 Locky ransomware samples, 535 Cerber ransomware samples, 572 samples of TeslaCrypt ransomware and 220 standalone Windows Portable Executable (PE32) benign applications. We not only achieved 99% accuracy in the detection of ransomware samples and 96.5% in the detection of their families, but also reduced the detection time to less than 10 seconds after launching a ransomware application, a third of the time reported by earlier studies, e.g. [13]. Our results not only indicate the usefulness of pattern mining techniques in identifying the best features for hunting ransomware applications, but also show how patterns of different ransomware families can help in detecting a ransomware family, which assists in building intelligence about threats applicable to a given target. To the best of the authors' knowledge, this is the very first paper applying sequence pattern mining to detect frequent features of ransomware applications and to build vectored datasets of ransomware application logs. Our created datasets contain
logs of Dynamic Link Library (DLL) activities, file system activities and registry activities of 1624 ransomware samples from three different families and 220 benign applications.
The remainder of this paper is organized as follows. Section II reviews related research, while Section III explains our method for collecting and preprocessing data in a controlled environment. We describe feature extraction and vectorization in Section IV. Section V introduces our approach for ransomware detection, followed by Section VI, which describes our performance in detecting ransomware families. Finally, Section VII discusses the achievements of this paper and concludes it.

We use the widely accepted criteria True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN) to evaluate our model [14]–[16]. TP is the number of samples that are correctly identified. FP is the number of incorrectly identified samples. TN is the number of correctly rejected samples, while FN is the number of incorrectly rejected samples. The precision of a classification algorithm is a measure of the relevancy of its results and is calculated by dividing TP by the total of FP and TP predicted by a classifier, as shown in equation (1). Recall reflects the proportion of positives that are correctly identified by the classification technique and is calculated by dividing TP by the total of TP and FN, as shown in equation (2). F-measure shows the performance of a classification algorithm and is calculated as the harmonic mean of precision and recall, as shown in equation (3).

$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{F-measure} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$

We also report the Receiver Operating Characteristic (ROC), a potentially powerful metric for comparing different classifiers because it is invariant against skewness of classes in the dataset. In a ROC curve the true positive rate is plotted as a function of the false positive rate for different thresholds. In addition to ROC, the Area Under the Curve (AUC) is a measure of how well a parameter can be used to distinguish between two classes. AUC is a single value that summarizes the ROC by calculating the area of the convex shape below the ROC curve. AUC lies between 0 and 1, where a value of 1 corresponds to perfect prediction.
The Matthews Correlation Coefficient (MCC) [17] provides another measure of quality for comparing different classifiers [18]. The MCC value lies between −1 and +1, where +1 indicates perfect prediction, −1 indicates total disagreement between prediction and observation, and 0 indicates that the classifier works no better than random prediction. MCC is also a useful measure of classifier performance on imbalanced datasets. While Precision, Recall or F-measure values can be higher than 0.5 under random guessing, the MCC value would be around 0. Therefore, to make sure that our classifiers are far from random classifiers, we compute MCC values for each classifier. The values can be computed using equation (7), which is composed of equations (4), (5) and (6), where N is the total number of samples.

$$N = TP + FP + TN + FN \tag{4}$$

$$S = \frac{TP + FN}{N} \tag{5}$$

$$P = \frac{TP + FP}{N} \tag{6}$$

$$MCC = \frac{TP/N - S \times P}{\sqrt{P S (1 - S)(1 - P)}} \tag{7}$$

II. RELATED WORK
Ransomware programs are reportedly becoming a dominant tool for cybercriminals and a growing threat to our ICT infrastructure [4], [19], [20]. The possibility of using encryption techniques to encrypt users' data as part of a Denial of Service (DoS) attack has been known for a very long time [21]. However, the recent adoption of eCurrencies such as BitCoin provided many new opportunities for attackers, including receiving a ransom payment for decrypting users' data [21]. In spite of their simplicity and primitive utilization of cryptographic techniques [22], ransomware programs are becoming a major tool in the cybercriminals' toolset [23]. For any cyber threat, prevention is ideal but detection is a must, and ransomware is no exception [3], [24].
Situational cyber security awareness plays an important role in preventing cyber-attacks [25]. An educational framework tailored to ransomware threats [10] as well as a tool which mimicked ransomware attacks [26] proved to be useful in reducing ransomware infections. Moreover, technical countermeasures such as verifying an application's trustworthiness when it calls a crypto library [27] or minimizing the attack surface by limiting end-users' privileges proved effective in preventing ransomware attacks [9].
Most ransomware detection solutions rely on filesystem [28]–[30] and registry events [31] to identify malicious behaviors. An investigation of 1359 ransomware samples showed that the majority of ransomware samples use similar APIs and generate similar logs of filesystem activities [29]. For example, using 20 types of filesystem and registry events as features of a Bayesian Network model against 20 Windows ransomware samples resulted in accurate ransomware detection with an F-Measure of 0.93 [31]. UNVEIL [29], a ransomware classification system, utilized filesystem events to distinguish 13,637 ransomware samples from a dataset of 148,223 malware samples with an accuracy of 96.3%. CloudRPS [32] was a cloud-based ransomware detection system which relied on abnormal behaviors, such as the conversion of large quantities of files in a short interval, to detect ransomware samples. EldeRan [13] utilized associations between different operating system events to build a matrix of application activities and to detect ransomware samples within 30 seconds of their execution with an AUC of 0.995. Timely detection of ransomware upon its execution is crucial, and systems that fail to detect ransomware in less than 10 seconds are not considered effective [5]. Moreover, timely identification of a ransomware family
would assist in building intelligence about applicable threat actors and the threat profile for a given target.

III. DATA CREATION
We downloaded 1624 Windows Portable Executable (PE32) ransomware samples from virustotal.com which were active in the period of February 2016 to March 2017 as reported by RansomwareTracker.abuse.ch. The collected samples belong to three families of ransomware, namely 517 Locky samples, 535 Cerber samples and 572 samples of TeslaCrypt. The best type of goodware counterpart for malware applications is portable and standalone benign apps [25]. Therefore, we collected all 220 available portable Windows PE32 benign applications from portableapps.com (footnote 1) in April 2017 to serve as the goodware counterpart of our dataset.
We set up the environment shown in Fig. 2 to collect logs of the runtime activities of ransomware and goodware samples. The Controller application on the host machine randomly selects a ransomware or goodware sample and passes it through an FTP server to the Virtual Machine (VM). When the sample is successfully transferred, the Controller notifies the Launcher app to run the ProcessMonitor application and execute the given sample. Similar to previous research [5], the log of the first 10 seconds of runtime activities of ransomware and benign applications is collected and the created log file is uploaded to the Log repository on the host machine. Since the majority of benign applications require human interaction to run (e.g. clicking a button), we developed an application called PyWinMonkey which automates user interactions with an application. When the log file is successfully stored on the host machine, the Controller application reverts the VM back to its original copy and passes the next sample. PyWinMonkey is similar to the Monkey (footnote 2) Android app, which has been utilized in many previous Android malware research papers [33] for mimicking human interactions. We used Python 3.6.1 to develop the Controller, Launcher and PyWinMonkey apps (accessible at https://github.com/sajadhomayoun/PyWinMonkey) and ran ProcessMonitor V3.31 on Windows 10 build number 10240 on a computer with a Core i7 CPU with 8 cores at 4 GHz and 16 GB of RAM. For each and every process, ProcessMonitor records loaded Dynamic Linked Libraries (DLLs), file system activities and registry activities. Therefore, we have three sets of events, namely Registry Events Set, which includes all registry events, DLL Events Set, which includes all DLL events, and FileSystem Events Set, which contains all filesystem events, as listed in Table I. Moreover, EventType(E) is a procedure that returns the type of a given event (R for Registry events, F for Filesystem events, and D for DLL events), as shown in Fig. 1.
As we use a sequential pattern mining technique (MG-FSM) to detect candidate features for the classification task, we must convert our data into a sequential dataset, which is a collection of sequences $D = \{S_1, S_2, \ldots, S_n\}$ where $S_i$ represents a sequentially ordered set of events. We created a sequence of runtime events for each and every ransomware and benign application. $S_i$ represents a sequence of all events E caused by launching an application i, ordered by time as follows:
$S_i = \{E_{1,i}(arg_{E_1}), E_{2,i}(arg_{E_2}), \ldots, E_{n,i}(arg_{E_n})\}$, where $E_{x,y}(arg_{E_x})$ represents event x for an application y and $arg_{E_x}$ is the argument passed to the event $E_x$.
For example, {LoadImage(C:\system32\gdi32.dll), ReadFile(C:\Windows\SysWOW64\wininet.dll)} is a sequence of two events where the first event loads gdi32.dll into the memory of the calling process (hence C:\system32\gdi32.dll is the parameter for this event) and the second event reads the wininet.dll file located at C:\Windows\SysWOW64. The size of each sequence depends on the number of events that are called by an application and varies between different apps.
Once all sequences were created, we utilized the Outlier Factor [34] technique to remove any outlier sequence from our dataset, similar to [35]. The Outlier Factor technique first extracts all frequent patterns from a dataset and then detects

Footnote 1: https://portableapps.com/apps
Footnote 2: https://developer.android.com/studio/test/monkey.html

procedure EVENTTYPE(Event E)
    if E ∈ Registry Events Set return R
    if E ∈ Filesystem Events Set return F
    if E ∈ DLL Events Set return D
end procedure
Fig. 1. Determining the event type of a given event.

TABLE I
LIST OF ACTIVITIES THAT CAN BE CAPTURED BY PROCESS MONITOR

Activity Type: Registry
List: RegQueryKey, RegOpenKey, RegQueryValue, RegCloseKey, RegCreateKey, RegSetInfoKey, RegEnumKey, RegQueryKeySecurity, RegEnumValue, RegSetValue, RegDeleteValue, RegQueryMultipleValueKey, RegDeleteKey, RegLoadKey, RegFlushKey

Activity Type: File
List: QueryNameInformationFile, ReadFile, CreateFile, QueryBasicInformationFile, CloseFile, QueryStandardInformationFile, CreateFileMapping, QuerySizeInformationVolume, FileSystemControl, QueryDirectory, WriteFile, QueryNetworkOpenInformationFile, QueryRemoteProtocolInformation, QuerySecurityFile, LockFile, UnlockFileSingle, DeviceIoControl, SetEndOfFileInformationFile, FlushBuffersFile, SetAllocationInformationFile, SetBasicInformationFile, QueryAttributeTagFile, QueryFileInternalInformationFile, QueryInformationVolume, QueryAttributeInformationVolume, SetRenameInformationFile, QueryNormalizedNameInformationFile, NotifyChangeDirectory, QueryFullSizeInformationVolume, SetSecurityFile, QueryStreamInformationFile, SetDispositionInformationFile, QueryEaInformationFile, QueryAllInformationFile, QueryIdInformation, SetPositionInformationFile, QueryPositionInformationFile, SetValidDataLengthInformationFile

Activity Type: DLL
List: LoadImage
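The event typing of Fig. 1 and the time-ordered sequence definition from this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Event` record layout, the `build_sequence` helper, its `window` parameter and the sample registry path are hypothetical, and the three event sets hold only a small subset of the operations in Table I.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple

# Small subsets of the operation names in Table I (not the full lists).
REGISTRY_EVENTS = {"RegOpenKey", "RegQueryValue", "RegSetValue", "RegCloseKey"}
FILESYSTEM_EVENTS = {"CreateFile", "ReadFile", "WriteFile", "CloseFile"}
DLL_EVENTS = {"LoadImage"}


class Event(NamedTuple):
    """One logged event; this record layout is an assumption for illustration."""
    seconds_since_launch: float
    operation: str
    argument: str


def event_type(operation: str) -> Optional[str]:
    """Mirror of the EventType procedure: R, F or D, or None if unknown."""
    if operation in REGISTRY_EVENTS:
        return "R"
    if operation in FILESYSTEM_EVENTS:
        return "F"
    if operation in DLL_EVENTS:
        return "D"
    return None


def build_sequence(events: Iterable[Event],
                   window: float = 10.0) -> List[Tuple[str, str, str]]:
    """Order one application's events by time and keep those inside the window."""
    in_window = [e for e in events if e.seconds_since_launch <= window]
    in_window.sort(key=lambda e: e.seconds_since_launch)
    sequence = []
    for e in in_window:
        kind = event_type(e.operation)
        if kind is not None:  # skip operations outside the three event sets
            sequence.append((kind, e.operation, e.argument))
    return sequence


# Hypothetical log for one application, deliberately out of time order.
log = [
    Event(0.40, "ReadFile", r"C:\Windows\SysWOW64\wininet.dll"),
    Event(0.10, "LoadImage", r"C:\system32\gdi32.dll"),
    Event(12.0, "RegSetValue", r"HKCU\Software\Example"),  # outside the window
]
sequence = build_sequence(log)
type_string = "".join(kind for kind, _, _ in sequence)
print(sequence)
print(type_string)
```

Keeping the event type next to each operation makes it straightforward to project a sequence onto its R/F/D labels, which is the kind of view that the SR features discussed later (for example FD or DF) appear to build on.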
[Fig. 5. Histogram of the probability of SR values for ransomware and goodware. Recoverable panels: (b) SR_D value, (c) SR_FD value; y-axis: probability.]

[Fig. 6. ROC diagrams for classifiers.]

[Fig. 7. AUC of classifiers for detecting ransomwares. y-axis: AUC value (0.91 to 1); x-axis: J48, Random Forest, Bagging, MLP.]

[Figure with unrecovered caption: MCC value (0.91 to 1) for J48, Random Forest, Bagging and MLP.]

TABLE IV
CLASSIFIERS PERFORMANCE ON VD_{D_Total, MC_Ransomware}

Classifier       TPR     FPR     F-Measure
J48              0.994   0.040   0.994
Random Forest    0.993   0.040   0.993
Bagging          0.994   0.039   0.977
MLP              0.994   0.035   0.994
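The TPR (recall), FPR, F-measure and MCC columns reported in the result tables all derive from the four confusion-matrix counts. The sketch below shows one way to compute them following equations (1) to (7); the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
import math


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the evaluation metrics from raw confusion-matrix counts."""
    n = tp + fp + tn + fn                          # equation (4)
    s = (tp + fn) / n                              # equation (5)
    p = (tp + fp) / n                              # equation (6)
    precision = tp / (tp + fp) if tp + fp else 0.0  # equation (1)
    recall = tp / (tp + fn) if tp + fn else 0.0     # equation (2), also the TPR
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)  # equation (3)
    else:
        f_measure = 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    denom = math.sqrt(p * s * (1 - s) * (1 - p))
    # Equation (7); a zero denominator is reported as 0 here by convention.
    mcc = (tp / n - s * p) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "fpr": fpr,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
for name, value in confusion_metrics(tp=90, fp=5, tn=95, fn=10).items():
    print(f"{name}: {value:.3f}")
```

A perfectly balanced random guess (equal TP, FP, TN and FN) gives a precision and recall of 0.5 but an MCC of 0, which is the property the text relies on when using MCC to rule out near-random classifiers.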
TABLE V
RESULTS OF CLASSIFIERS ON VD_{D_OF, MC_Ransomware} FOR DETECTING RANSOMWARE

Classifier       Accuracy
J48              0.994
Random Forest    0.994
Bagging          0.994
MLP              0.994

VI. THREAT INTELLIGENCE: DETECTION OF A RANSOMWARE FAMILY
To investigate the performance of classifiers in detecting a ransomware family, we created the D_TotalFamily dataset, which contains all sequences from D_Locky, D_Cerber, D_TeslaCrypt and D_Goodware. We then generated the VD_{D_TotalFamily, MC_Locky}, VD_{D_TotalFamily, MC_Cerber} and VD_{D_TotalFamily, MC_TeslaCrypt} vectored datasets and fed them to CfsSubsetEval of Weka 3.8.1. In total, 13 candidate features were detected for classification of ransomware families, as shown in Fig. 9.
Fig. 9a depicts the distribution of SR_R, SR_RF, SR_FD and SR_DF values in VD_{D_TotalFamily, MC_Locky}. The interesting point in Fig. 9a is that Locky samples take higher values for all selected features in VD_{D_TotalFamily, MC_Locky} compared with Cerber, TeslaCrypt and goodware samples. This is because VD_{D_TotalFamily, MC_Locky} is vectored using MC_Locky, so it is evident that more MSPs are matched to Locky samples, which is reflected in higher SR values for the corresponding features. Fig. 9a also shows that the behaviours of Cerber and TeslaCrypt samples differ from Locky, which leads to smaller values for features R, RF, FD and DF.
Fig. 9b shows histograms of the distribution of SR values, namely SR_R, SR_D, SR_FR, SR_FD and SR_DF, in VD_{D_TotalFamily, MC_Cerber}. The histograms in Fig. 9b show higher SR values for Cerber samples compared with samples of other classes, which means Cerber samples matched more MSPs from MC_Cerber.
Fig. 9c illustrates the histogram of SR values for SR_R, SR_F, SR_RF and SR_FD for VD_{D_TotalFamily, MC_TeslaCrypt}. The diagrams in Fig. 9c present higher SR values for TeslaCrypt samples compared with Locky, Cerber and goodware samples. Fig. 9 also demonstrates that goodware samples always take smaller and well-separated SR values for each feature because goodware behaves differently from ransomware. In other words, goodware samples matched fewer MSPs from MC_Locky, MC_Cerber and MC_TeslaCrypt, and subsequently they take lower SR values.
As detection of ransomware families is a multi-class classification task with four class labels (Locky, Cerber, TeslaCrypt and Goodware), we trained J48, Random Forest, Bagging and MLP as multi-class classifiers using the SVD_{D_TotalFamily} dataset with the 13 selected features in Fig. 9.
Table VI presents the performance of all classifiers obtained from 10-fold cross validation. The obtained minimum weighted average [41] F-Measure of 0.983 with FPR ≤ 0.006 reflects the suitability of our features for detecting the families of ransomware samples. MCC values of more than 0.95 for all classifiers also indicate the quality of our features in enabling classifiers to provide an almost perfect prediction. Finally, as shown in Table VII, our features enabled classifiers to offer an accurate prediction (≥ 0.965) even on unforeseen samples (SVD_{D_OF}).

TABLE VI
THE CLASSIFIERS PERFORMANCE ON SVD_{D_TotalFamily}

Classifier       TPR     FPR     F-Measure   MCC
J48              0.981   0.006   0.981       0.974
Random Forest    0.983   0.006   0.983       0.978
Bagging          0.980   0.007   0.980       0.974
MLP              0.980   0.007   0.980       0.973

TABLE VII
RESULTS OF CLASSIFIERS ON DATASET SVD_{D_OF} FOR DETECTING RANSOMWARE FAMILY

Classifier       Accuracy
J48              0.947
Random Forest    0.965
Bagging          0.959
MLP              0.959

VII. CONCLUDING REMARKS
In this paper, by combining sequential pattern mining for feature identification with machine learning classification techniques, we could accurately distinguish between ransomware and goodware samples and identify given ransomware families within the first 10 seconds of a ransomware execution. We achieved a minimum F-measure of 0.994 with a minimum AUC value of 0.99 in distinguishing ransomware samples from goodware using Registry (R) events, DLL (D) events and Filesystem-to-DLL (FD) transitions as features for J48, Random Forest, Bagging and MLP classifiers. We achieved an F-Measure of more than 0.98 with an FPR of less than 0.007 in detecting a given ransomware family using the 13 selected features detected in this study. The reported features for differentiating ransomware and benign applications can be used for effective hunting of ransomware, while the features reported for ransomware family classification are well suited to building intelligence about threat profiles applicable to a given target. Applying other classification techniques such as fuzzy classification can be considered as future work. Moreover, utilization of Stream Data Mining techniques to reduce ransomware detection time is another interesting extension of this study.

ACKNOWLEDGMENT
The authors would like to thank virustotal.com for sharing malware samples and giving the capability of scanning files. We also thank ransomwaretracker.abuse.ch for their updated and precious information about different families of
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING 8
0.6 1 1 1
Probability
Probability
Probability
Probability
0.4
0.5 0.5 0.5
0.2
0 0 0 0
0 500 1000 0 50 100 0 50 100 0 50 100
SR R Value SR RF Value SR FD Value SR DF Value
0.6 1
Probability
Probability
0.4 Locky
0.5 Cerber
0.2 TeslaCrypt
Goodware
0 0
0 1000 2000 0 20 40
SR R Value SR D Value
1 1 1
Probability
Probability
Probability
0 0 0
0 100 200 0 50 100 0 20 40
SR FR Value SR FD Value SR DF Value
0.4 0.3 1
0.6
Probability
Probability
Probability
Probability
0.2
0.4
0.2 0.5
0.1 0.2
0 0 0 0
0 1000 2000 3000 0 500 0 50 100 150 0 50 100
SR R Value SR F Value SR RF Value SR FD Value
ransomwares. This work is partially supported by the Euro- [19] “Ransomware becomes most popular form of attack as payouts approach
pean Council 268 International Incoming Fellowship (FP7- $1bn a year,” Network Security, vol. 2017, no. 1, pp. 1–2, jan 2017.
[Online]. Available: https://doi.org/10.1016/s1353-4858(17)30001-6
PEOPLE-2013-IIF) grant. [20] “UK major target for ransomware,” Computer Fraud & Security,
vol. 2016, no. 1, p. 3, jan 2016. [Online]. Available: https:
//doi.org/10.1016/s1361-3723(16)30003-3
R EFERENCES [21] H. Orman, “Evil offspring - ransomware and crypto technology,” IEEE
Internet Computing, vol. 20, no. 5, pp. 89–94, sep 2016. [Online].
[1] M. Hopkins and A. Dehghantanha, “Exploit kits: The production line of Available: https://doi.org/10.1109/mic.2016.90
the cybercrime economy?” in 2015 Second International Conference on [22] C. Everett, “Ransomware: to pay or not to pay?” Computer Fraud &
Information Security and Cyber Forensics (InfoSec). IEEE, nov 2015. Security, vol. 2016, no. 4, pp. 8–12, apr 2016. [Online]. Available:
[Online]. Available: https://doi.org/10.1109%2Finfosec.2015.7435501 https://doi.org/10.1016/s1361-3723(16)30036-7
[2] EUROPOL. (2016) The internet organised crime [23] A. Gazet, “Comparative analysis of various ransomware virii,” Journal
threat assessment (iocta) 2016. [Online]. Avail- in Computer Virology, vol. 6, no. 1, pp. 77–90, jul 2008. [Online].
able: https://www.europol.europa.eu/activities-services/main-reports/ Available: https://doi.org/10.1007%2Fs11416-008-0092-2
internet-organised-crime-threat-assessment-iocta-2016 [24] R. Brewer, “Ransomware attacks: detection, prevention and cure,”
[3] N. Milosevic, A. Dehghantanha, and K.-K. R. Choo, “Machine Network Security, vol. 2016, no. 9, pp. 5–9, sep 2016. [Online].
learning aided android malware classification,” Computers & Electrical Available: https://doi.org/10.1016/s1353-4858(16)30086-1
Engineering, feb 2017. [Online]. Available: https://doi.org/10.1016% [25] M. Damshenas, A. Dehghantanha, and R. Mahmoud, “A survey on
2Fj.compeleceng.2017.02.013 malware propagation, analysis, and detection,” International Journal of
[4] K. Cabaj and W. Mazurczyk, “Using software-defined networking Cyber-Security and Digital Forensics (IJCSDF), vol. 2, no. 4, pp. 10–29,
for ransomware mitigation: The case of CryptoWall,” IEEE Network, 2013.
vol. 30, no. 6, pp. 14–20, nov 2016. [Online]. Available: https: [26] S. Al-Sharif, F. Iqbal, T. Baker, and A. Khattack, “White-hat
//doi.org/10.1109/mnet.2016.1600110nm hacking framework for promoting security awareness,” in 2016
[5] A. Azmoodeh, A. Dehghantanha, M. Conti, and R. Choo, “Detect- 8th IFIP International Conference on New Technologies, Mobility
ing crypto-ransomware in iot networks based on energy consumption and Security (NTMS). IEEE, nov 2016. [Online]. Available:
footprint,” Journal of Ambient Intelligence and Humanized Computing, https://doi.org/10.1109/ntms.2016.7792489
2017. [27] A. L. Young, “Cryptoviral extortion using microsoft crypto API,”
International Journal of Information Security, vol. 5, no. 2,
[6] H. L. Kevin Savage, Peter Coogan, The evolution of ransomware.
pp. 67–76, mar 2006. [Online]. Available: https://doi.org/10.1007%
Symantec, 2015.
2Fs10207-006-0082-7
[7] Symantec, “Internet security threat report,” Symantec, Tech. Rep., apr
[28] A. Kharraz, W. Robertson, D. Balzarotti, L. Bilge, and E. Kirda,
2016.
“Cutting the gordian knot: A look under the hood of ransomware
[8] K. T. D. Y. Huang, D. W. E. B. C. GrierD, T. J. Holt, C. Kruegel, attacks,” in Detection of Intrusions and Malware, and Vulnerability
D. McCoy, S. Savage, and G. Vigna, “Framing dependencies introduced Assessment. Springer Nature, 2015, pp. 3–24. [Online]. Available:
by underground commoditization,” in Workshop on the Economics of https://doi.org/10.1007%2F978-3-319-20550-2 1
Information Security, 2015. [29] A. Kharaz, S. Arshad, C. Mulliner, W. Robertson, and
[9] Monika, P. Zavarsky, and D. Lindskog, “Experimental analysis of E. Kirda, “Unveil: A large-scale, automated approach to
ransomware on windows and android platforms: Evolution and detecting ransomware,” in 25th USENIX Security Symposium
characterization,” Procedia Computer Science, vol. 94, pp. 465–472, (USENIX Security 16). Austin, TX: USENIX Association, 2016,
2016. [Online]. Available: https://doi.org/10.1016/j.procs.2016.08.072 pp. 757–772. [Online]. Available: https://www.usenix.org/conference/
[10] X. Luo and Q. Liao, “Awareness education as the key to ransomware usenixsecurity16/technical-sessions/presentation/kharaz
prevention,” Information Systems Security, vol. 16, no. 4, pp. [30] C. Moore, “Detecting ransomware with honeypot techniques,” in
195–202, sep 2007. [Online]. Available: https://doi.org/10.1080% 2016 Cybersecurity and Cyberforensics Conference (CCC). Institute
2F10658980701576412 of Electrical and Electronics Engineers (IEEE), aug 2016. [Online].
[11] M. Simmonds, “How businesses can navigate the growing tide of Available: https://doi.org/10.1109%2Fccc.2016.14
ransomware attacks,” Computer Fraud & Security, vol. 2017, no. 3, [31] M. M. Ahmadian and H. R. Shahriari, “2entfox: A framework for
pp. 9–12, mar 2017. [Online]. Available: https://doi.org/10.1016/ high survivable ransomwares detection,” in 2016 13th International
s1361-3723(17)30023-4 Iranian Society of Cryptology Conference on Information Security
[12] S. Grzonkowski, A. Mosquera, L. Aouad, and D. Morss, “Smartphone and Cryptology (ISCISC). Institute of Electrical and Electronics
security: An overview of emerging threats.” IEEE Consumer Electronics Engineers (IEEE), sep 2016. [Online]. Available: https://doi.org/10.
Magazine, vol. 3, no. 4, pp. 40–44, oct 2014. [Online]. Available: 1109%2Fiscisc.2016.7736455
https://doi.org/10.1109/mce.2014.2340211 [32] J. K. Lee, S. Y. Moon, and J. H. Park, “CloudRPS: a cloud
[13] D. Sgandurra, L. Muoz-Gonzlez, R. Mohsen, and E. C. Lupu, “Auto- analysis based enhanced ransomware prevention system,” The Journal
mated dynamic analysis of ransomware: Benefits, limitations and use of Supercomputing, jul 2016. [Online]. Available: https://doi.org/10.
for detection,” 2016. 1007/s11227-016-1825-5
[14] M. Sohrabi, M. M. Javidi, and S. Hashemi, “Detecting intrusion [33] M. Damshenas, A. Dehghantanha, K.-K. R. Choo, and R. Mahmud,
transactions in database systems:a novel approach,” Journal of “M0droid: An android behavioral-based malware detection model,”
Intelligent Information Systems, vol. 42, no. 3, pp. 619–644, dec 2013. Journal of Information Privacy and Security, vol. 11, no. 3, pp. 141–
[Online]. Available: https://doi.org/10.1007%2Fs10844-013-0286-z 157, jul 2015. [Online]. Available: https://doi.org/10.1080/15536548.
[15] M. Sun, X. Li, J. C. S. Lui, R. T. B. Ma, and Z. Liang, “Monet: 2015.1073510
A user-oriented behavior-based malware variants detection system for [34] Z. He, X. Xu, Z. Huang, and S. Deng, “FP-outlier: Frequent
android,” IEEE Transactions on Information Forensics and Security, pattern based outlier detection,” Computer Science and Information
vol. 12, no. 5, pp. 1103–1112, may 2017. [Online]. Available: Systems, vol. 2, no. 1, pp. 103–118, 2005. [Online]. Available:
https://doi.org/10.1109%2Ftifs.2016.2646641 https://doi.org/10.2298%2Fcsis0501103h
[16] M. R. Watson, N. ul-hassan Shirazi, A. K. Marnerides, A. Mauthe, and [35] J. Yu, G. X. Yao, and W. W. Zhang, “Intrusion detection
D. Hutchison, “Malware detection in cloud computing infrastructures,” method based on frequent pattern,” Advanced Materials Research,
IEEE Transactions on Dependable and Secure Computing, vol. 13, vol. 204-210, pp. 1751–1754, feb 2011. [Online]. Available: https:
no. 2, pp. 192–205, mar 2016. [Online]. Available: https://doi.org/10. //doi.org/10.4028/www.scientific.net/amr.204-210.1751
1109%2Ftdsc.2015.2457918 [36] R. Agrawal and R. Srikant, “Mining sequential patterns,” in Proceedings
[17] B. Matthews, “Comparison of the predicted and observed secondary of the Eleventh International Conference on Data Engineering, ser.
structure of t4 phage lysozyme,” Biochimica et Biophysica Acta (BBA) ICDE ’95. Washington, DC, USA: IEEE Computer Society, 1995,
- Protein Structure, vol. 405, no. 2, pp. 442–451, oct 1975. [Online]. pp. 3–14. [Online]. Available: http://dl.acm.org/citation.cfm?id=645480.
Available: https://doi.org/10.1016%2F0005-2795%2875%2990109-9 655281
[18] D. M. Powers, “Evaluation: from precision, recall and f-measure to roc, [37] W. Shen, J. Wang, and J. Han, “Sequential pattern mining,” in Frequent
informedness, markedness and correlation,” School of Informatics and Pattern Mining. Springer International Publishing, 2014, pp. 261–282.
Engineering-Flinders University, Tech. Rep., 2011. [Online]. Available: https://doi.org/10.1007/978-3-319-07821-2 11
[38] K. Damevski, D. C. Shepherd, J. Schneider, and L. Pollock, “Mining sequences of developer interactions in visual studio for usage smells,” IEEE Transactions on Software Engineering, vol. 43, no. 4, pp. 359–371, apr 2017. [Online]. Available: https://doi.org/10.1109%2Ftse.2016.2592905
[39] I. Miliaraki, K. Berberich, R. Gemulla, and S. Zoupanos, “Mind the gap,” in Proceedings of the 2013 international conference on Management of data - SIGMOD 13. Association for Computing Machinery (ACM), 2013. [Online]. Available: https://doi.org/10.1145%2F2463676.2465285
[40] M. A. Hall, “Correlation-based feature selection for machine learning,” Tech. Rep., 1998.
[41] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,” Information Processing & Management, vol. 45, no. 4, pp. 427–437, jul 2009. [Online]. Available: https://doi.org/10.1016/j.ipm.2009.03.002
Sattar Hashemi received the PhD degree in computer science from Iran University of Science and Technology in conjunction with Monash University, Australia, in 2008. Following academic appointments at Shiraz University, he is currently an associate professor at Electrical and Computer Engineering School, Shiraz University, Shiraz, Iran. His research interests include machine learning, data mining, social networks, data stream mining, game theory, and adversarial learning.