Digital Investigation 13 (2015) 22–37

A review on feature selection in mobile malware detection


Ali Feizollah*, Nor Badrul Anuar, Rosli Salleh, Ainuddin Wahid Abdul Wahab
Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, 50603,
Kuala Lumpur, Malaysia

Article info

Article history:
Received 11 October 2014
Received in revised form 2 February 2015
Accepted 4 February 2015
Available online 13 March 2015

Keywords:
Mobile malware
Android
Feature selection
Review paper
Mobile operating system

Abstract

The widespread use of mobile devices in comparison to personal computers has led to a new era of information exchange. Purchases of personal computers have started decreasing, whereas shipments of mobile devices are increasing. In addition, the increasing power of mobile devices, along with their portability, has attracted the attention of users. Not only are such devices popular among users, but they are also favorite targets of attackers. The number of mobile malware is rapidly on the rise, with malicious activities such as stealing users' data, sending premium messages and making phone calls to premium numbers without users' knowledge. Numerous studies have developed methods to thwart such attacks. In order to develop an effective detection system, we have to select a subset of features from hundreds of available features. In this paper, we studied 100 research works published between 2010 and 2014 from the perspective of feature selection in mobile malware detection. We categorize available features into four groups, namely, static features, dynamic features, hybrid features and applications metadata. Additionally, we discuss the datasets used in recent research studies and analyze the evaluation measures utilized.
© 2015 Elsevier Ltd. All rights reserved.

* Corresponding author.
E-mail addresses: [email protected] (A. Feizollah), [email protected] (N.B. Anuar), [email protected] (R. Salleh), [email protected] (A.W.A. Wahab).

http://dx.doi.org/10.1016/j.diin.2015.02.001
1742-2876/© 2015 Elsevier Ltd. All rights reserved.

Introduction

The ubiquity of mobile devices is undeniable because they have brought new possibilities to everyday life. Contemporary mobile devices are more powerful than the Personal Computers (PCs) of ten years ago. Unlike PCs, mobile devices are portable, which makes them attractive to users. In addition, their small size compared to personal computers plays an important role in increasing their popularity. Furthermore, users' interest is increasing towards Rich Mobile Applications (RMA), such as Google Maps, that deliver a rich user experience along with high interaction (Knoernschild, 2010). However, such popularity brings serious security and privacy threats and various other malicious activities. The malicious activities are hidden from the user and are committed in the background or at midnight when the user is asleep (Eslahi et al., 2012). Based on such characteristics, we assess the research works done to detect these malware.

The aim of this paper is to scrutinize the various features available in Android malware, since feature selection has considerable effects on the results of experiments. We discuss this effect in the following sections. Suarez-Tangil et al. (2013) discuss malware for smart devices in general. However, the paper discusses the various types of features very briefly and the authors did not cover all types of available features. Similarly, La Polla et al. (2013) investigate various types of mobile devices, available malware, their effect on the devices and different detection methods. Nevertheless, they did not mention what features they used in detection, even though features have a significant impact on detection. Mohite and Sonar (2014) survey different analysis techniques in mobile malware detection. The paper
mentions examples of detection methods along with their description. The paper does not include datasets and evaluation measures. In addition, it does not cover all the recent works comprehensively. Peng et al. (2014) examine the evolution of mobile malware, their damages and their propagation model. They included various operating systems in the paper, which makes it difficult to examine all available aspects thoroughly. In contrast, we focus on the Android operating system, so the results are more accurate and comprehensive. Additionally, to the best of our knowledge, surveying Android features is unprecedented in research works.

The rest of this paper is organized as follows. Section 2 gives background information needed for the rest of the paper. Section 3 examines four types of features in mobile malware detection: static, dynamic, hybrid and applications metadata. We comprehensively analyze each type of feature. Section 4 presents discussions regarding the datasets used in recent research works and their description. Additionally, we discuss the evaluation measures of malware detection in this section. Finally, Section 5 concludes the paper by highlighting important points.

Background

In this section, we present background information. First, we investigate the supremacy of mobile devices and see where mobile devices stand against PCs. Next, we examine how widespread mobile malware are; we then explain different types of malware, ranging from simple ones to the most dangerous and sophisticated ones. It is beneficial to know the popularity of mobile devices, as well as the seriousness of mobile malware. We scrutinize the importance of feature selection in malware detection in the next sub-section in order to establish the necessity of this research work. Finally, we take a closer look at Android files and their components, since we refer to various components throughout this work.

Supremacy of mobile devices

Popularity of mobile devices is on the rise (Chen and Bilton, 2014). Gartner, an American information technology research and advisory firm, reported that total shipments of mobile devices increased in 2013 by 5.9% compared to the previous year and reached 2.35 billion devices, and it is estimated that the growth continues to 2.5 billion devices in 2014 (Gartner, 2013). On the other hand, the shipment of PCs has declined to 305 million units in 2013 and it is expected to decrease below 300 million units in 2014 (Gartner, 2013). Table 1 shows the number of device shipments in 2012, 2013 and 2014.

The comparison between PCs and mobile devices (ultramobiles, tablets, and mobile phones) reveals that the number of PCs is decreasing while the shipment of mobile devices is increasing. In terms of usage of mobile devices, Walker Sands published a report indicating that Internet traffic pertaining to mobile devices increased. Based on the report, the Internet traffic of mobile devices increased by 67% in the third quarter of 2013 compared to the same period in 2012 (Sands, 2013).

The rise of android malware

There are numerous mobile operating systems in the market, namely, Android (Google, 2014a), iOS (Apple, 2014), Windows Phone (Microsoft, 2014), and BlackBerry (R. in Motion, 2014). Android has dominated the mobile devices industry. Based on a report, a total of 261.1 million devices were shipped in the third quarter of 2013 and 81.3% of the shipped devices were running the Android operating system (CNET, 2013). Fig. 1 depicts the dominance of Android among other mobile operating systems.

The number of attacks is steadily going up for Android. Based on the report from F-Secure, Android accounted for 79% of all malware in 2012 compared to 66.7% in 2011 and just 11.25% in 2010 (Techcrunch, 2013). Similarly, Symantec said that the number of Android malware increased almost four times between June 2012 and June 2013 (Symantec, 2013). In addition, the period of April 2013 to June 2013 witnessed a massive growth of almost 200% in Android malware. Fortinet (2014), a world leader in high performance network security, announced that within the period of January 1, 2013 until December 31, 2013, they discovered over 1800 new distinct families of malware, the majority of which were Android malware. Malware growth not only degrades performance of the devices, but also has posed serious concerns towards the privacy and security of data (Fortinet, 2014). In February 2014, Symantec stated that an average of 272 new malware and five new malware families targeting the Android operating system are discovered every month (Symantec, 2014).

The reason for such an enormous increase in Android malware lies in the fact that Android is an open source operating system (Teufl et al., 2013) and the application market of Android, known as Google Play, is not monitored meticulously in terms of security (Feizollah et al., 2013).

Table 1
Worldwide devices shipments (thousands of units) (Gartner, 2013).

                             2012         2013         2014
PC (Desktop and Notebook)    341,273      305,178      289,239
Ultramobile                  9,787        20,301       39,824
Tablet                       120,203      201,825      276,178
Mobile Phone                 1,746,177    1,821,193    1,901,188
Total                        2,217,440    2,348,497    2,506,429

Fig. 1. Mobile operating systems market share (% of global unit shipments) (Motley, 2014).
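
The shipment figures quoted from Gartner can be checked against Table 1 with a few lines of arithmetic. The following Python sketch recomputes the column totals and the 2012 to 2013 changes; the numbers are copied from the table, while the variable and function names are illustrative assumptions rather than anything defined in the paper.

```python
# Shipment figures copied from Table 1 (thousands of units): 2012, 2013, 2014.
shipments = {
    "PC (Desktop and Notebook)": (341_273, 305_178, 289_239),
    "Ultramobile": (9_787, 20_301, 39_824),
    "Tablet": (120_203, 201_825, 276_178),
    "Mobile Phone": (1_746_177, 1_821_193, 1_901_188),
}
# "Total" row as printed in Table 1.
reported_total = (2_217_440, 2_348_497, 2_506_429)


def column_totals(table):
    """Sum every device category for each of the three years."""
    return tuple(sum(row[year] for row in table.values()) for year in range(3))


def growth_percent(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


totals = column_totals(shipments)
total_growth_2013 = growth_percent(totals[0], totals[1])
pc_growth_2013 = growth_percent(*shipments["PC (Desktop and Notebook)"][:2])

print("Column totals match the Total row:", totals == reported_total)
print(f"All devices, 2012 to 2013: {total_growth_2013:+.1f}%")
print(f"PCs, 2012 to 2013: {pc_growth_2013:+.1f}%")
```

The overall growth rounds to the 5.9% reported in the text, while the PC row shrinks over the same period.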
Moreover, there are some unofficial Android markets, for example Aptoide (2014), in which security issues are not paid much attention. Furthermore, as mentioned earlier, the market share of Android among mobile operating systems is considerably high. Consequently, attackers target Android to gain more benefits as compared to other operating systems.

Types of android malware

There is a variety of attacks particular to Android, ranging from adware to the most sophisticated and dangerous ones. Adware merely advertises a product or a website and is harmless but annoying. The most dangerous and sophisticated malware are capable of accessing personal data on the device as well as hijacking the mobile device.

Android Dowgin is an example of adware that installs itself on an Android device as a bundle with another application. It then displays advertisements in the notification area of the device and is not easily removed. It is estimated that between 10,000 and 50,000 users are infected with this adware (AVG.ThreatLabs, 2013). It has been spreading since July 2013 and continues to proliferate (Eset, 2013). The alarming issue is that as of December 2013, some of the prominent anti-virus software such as Symantec, TrendMicro, and McAfee were not able to detect it (Virustotal, 2013).

Android attackers sometimes have financial incentives and have also turned out to be more aggressive recently (Symantec, 2014). Upon installation, some applications send expensive short message service (SMS) messages to premium numbers without users' knowledge, which is reflected in users' bills. Such applications have been on the rise for years. A report published in 2013 shows that some attackers earn up to USD 12,000 per month via such malware (The.Register, 2013). Fig. 2 shows the number of malware with financial motivations between 2006 and 2012.

Fig. 2. Number of mobile malware motivated by profit per year, 2006–2012 (Techcrunch, 2013).

Based on a report by Sophos, a security firm, a malicious version of the popular Angry Birds game secretly sends premium SMS messages that cost GBP 15. Each time the user starts the application, it sends a premium SMS. It is estimated that 1391 devices were infected with this malware and that the developers of this malicious application earned GBP 27,850 through sending SMS to premium numbers (Sophos, 2012).

Some malware have gone further by making phone calls in the background without users' knowledge. MouaBad is a malware that secretly makes phone calls: it waits until a while after the device's screen goes off and the screen becomes locked, and then starts calling premium numbers. As soon as the user interacts with the device, the malware ends the call. Fortunately, this malware does not alter call logs and the user is able to discover the existence of the malware by checking the call logs (Lookout, 2013).

A botnet is more dangerous than the aforementioned malicious applications. Upon infecting the device, the attacker gains access to the device and can perform malicious activities by controlling applications on the device. A botnet is the network of such infected devices. An attacker can access hundreds of infected devices in a single botnet (Vural et al., 1007). For example, security analysts discovered an infected version of Angry Birds Space in April 2012. It functions like a normal application without suspicious symptoms. However, it uses a software trick known as GingerBreak to acquire root access, which allows it to perform tasks outside its privilege. It secretly downloads malicious code from a server and opens a back door for attackers, upon which the device eventually joins the botnet (Sophos, 2013). The ZeroAccess botnet is adding approximately 100,000 new infections weekly by paying a considerable amount of money weekly to generate new associated infections. It had an 88.65% share of botnet dominance in 2013 (Fortinet, 2014).

Recently, attackers have adopted a new approach towards infecting mobile devices. Thus far, attackers depended on alluring users to download their malicious applications, after which the application performs malicious activity without users' knowledge. It has been observed that personal computers have been used as a conduit to Android devices, which is called a hybrid threat (Symantec, 2014). Trojan Droidpak uses hybrid threats to infect mobile devices. It first gains access to a personal computer, onto which a malicious Android application package (APK) file is then downloaded. When the user connects an Android device to the computer, the malicious file attempts to install itself on the device. After successful installation, it attempts to convince the user to download and install the infected version of a Korean banking application (Symantec, 2014).

The importance of feature selection

Numerous studies try to confront the danger of Android malware. One of the first approaches is signature-based detection, in which the detection system constructs a unique signature for a malware and detects malware by matching the signature with the collected data. However, a small modification in a malware leads to a new variant of the malware that is able to bypass the signature detection method (Garcia-Teodoro et al., 2009). DroidAnalytics (Zheng et al., 2013) was developed based on the signature-based method. It automatically collects, extracts, and analyzes the signature of an Android application file. It uses Java code and classes as a signature to detect known malware. Nevertheless, it is unable to detect unknown
malware. With the steady rise of new variants of Android malware, it is vital to detect them effectively.

Researchers turned to machine learning methods to overcome the limitations of the signature-based method. Machine learning algorithms are trained on collected data. They are capable of detecting anomalies, which is necessary to discover new malware. For example, Sahs and Khan (2012) performed an experiment in which they applied the support vector machine (SVM) algorithm to their dataset. The authors trained the algorithm and achieved 93% accuracy. Similarly, Garcia-Teodoro et al. (2009) presented a study in which they used the k-means algorithm for malware detection. The authors achieved 87.39% detection accuracy.

In machine learning methods, feature selection is one of the first and most crucial steps. Android applications consist of various elements such as permissions, Java code, certification, behavior of the application on the device and its behavior on the network. Selecting the most useful subset of features from the massive number of available features changes the result of the whole experiment (Guyon and Elisseeff, 2003). Some of the benefits of feature selection are as follows:

• Feature selection makes it possible to reduce the dimensionality of datasets because with less data, it is possible to easily visualize the trend in data (Crussell et al.).
• Analyzing datasets involves processing vast amounts of data and therefore, reducing them to a useful subset not only saves the time and cost of experiments, but also minimizes the time for real world implementation (Crussell et al.). Selecting a useful subset of the features considerably reduces the runtime of the machine learning algorithms in the training phase.
• Feature selection removes noisy and irrelevant data from datasets, leading to more accurate results of machine learning algorithms (Jensen and Shen, 2008).

We conducted two experiments to examine the effect of features on results. We collected network traffic of over 800 Android applications, including normal and malicious, from the MalGenome (Yajin and Xuxian, 2012) data sample. The dataset consists of ten network traffic features, out of which we selected five features for each experiment. The dataset comprises 504,148 records. A k-nearest neighbors classifier with three as the number of neighbors was used. Table 2 shows the results of the experiments.

Table 2
Results of the experiments.

                             Experiment 1           Experiment 2
Features                     frame.len              tcp.dstport
                             frame.number           tcp.window_size_value
                             frame.time_delta       tcp.seq
                             frame.time_relative    ip.src
                             tcp.srcport            ip.dst
True Positive Rate (TPR)     98.63%                 99.98%
False Positive Rate (FPR)    1.37%                  0.02%

As Table 2 illustrates, different features yield different results despite the fact that the data collection process and the classifier used are the same for both experiments. Thus, the effect of feature selection is conspicuous. In addition, selection of the most useful features is an important and challenging task.

In this paper, we scrutinize various types of features available in Android applications and their trend in the research works. Moreover, we provide some guidelines on how to choose the best features. Additionally, we discuss various utilized datasets and present a discussion on the various evaluation measures employed by recent works.

Structure of android application package (APK)

Throughout this study, we refer to various components of an Android installation file, known as an APK. It is beneficial to explain the different parts of such files. It is worth noting that an APK is an archive file type that software such as WinZip is able to open (Sanz et al., 2012). Components of an APK file are as follows:

• AndroidManifest.xml: An XML file holding meta information about an application, such as descriptions and security permissions. Prior to installation of an Android application, the application provides prospective users with a list of permissions that are available in the file.
• Classes.dex: It contains the code of an application, written in Java and compiled for Android into a special file format with the .dex extension.
• Resources: It entails all resources an application needs to run, such as pictures used in the application, layout of the application, its appearance to a user, use of a database, and data stored in the database.

Feature selection in mobile malware detection

An Android application consists of several parts that have the potential to be a feature in Android malware detection. Fig. 3 shows various types of features and sub-types of each category. In this work, we analyzed 100 papers, from highly credible sources such as IEEE and Springer, from the perspective of feature selection in Android malware detection. We selected publications between 2010 and March 2014. Based on our findings, researchers published 49 research papers regarding Android malware detection in 2013. This signifies that the interest in developing effective systems for Android malware detection is increasing, compared to 22 papers in 2012 and 7 papers in 2011. Moreover, the first quarter of 2014 has 15 publications.

We have divided mobile malware features into four groups. Table 3 shows these four groups of features with their descriptions.

Additionally, we have analyzed recent works based on the type of features they used. Fig. 4 shows the percentage of each group in the reviewed papers. Based on Fig. 4, researchers chose static features nearly as often as dynamic features. Although the hybrid features are more
comprehensive, they constitute only 10% of the literature. The apparent reason is the complicated process of using two types of features, since the hybrid features require collecting static and dynamic features separately. In the following sections, we discuss each type of feature in detail.

Fig. 3. Taxonomy of mobile malware features.

Static features

Static features include features available in the APK file, such as the AndroidManifest.xml file and the Java code file. Out of 100 papers reviewed, 45 papers used static features to conduct their experiments. Among the static features, researchers used Android permissions in 36% of the papers (Yerima et al., 2014; Aung and Zaw, 2013; Grace et al.; Peng et al., 2012; Grace et al., 2307; Sarma et al.; Samra et al., 2013; Yerima et al., 2013; Sahs and Khan, 2012; Sanz et al., 2013a; Zhou et al.; Sanz et al., Bringas; Seo et al., 2014; Wu et al., 2012; Sanz et al., 2013b; Arp et al., 2014), more than other static features. Selection of Java code comes second with 29% of papers (Desnos, 2012; Rastogi et al., 2014; Faruki et al.; Suarez-Tangil et al., 2014; Grace et al.; Lu et al., 2012; Crussell et al.; Deshotels et al.; Zhou et al.; Shabtai et al., 2010a; Zheng et al., 2013; Almohri et al.; Zheng et al.). The following sections discuss the static features in detail.

Android permission feature

The Android operating system has a Linux core and incorporates an important part of the Linux security architecture. Prior to installation, an application provides a list of requested permissions to the user. Once the user grants the permissions, the application installs itself on the device. There are 134 official Android permissions in Android 2.2 (i.e. API 8) (Felt et al.). Google categorized them into four groups, namely, normal, dangerous, signature, and signature or system (Google, 2014b). Researchers have taken different approaches in analyzing Android permissions. Authors of (Peng et al., 2012; Wang et al., 2013; Pandita et al.; Grace et al.) used permissions to evaluate applications and ranked them based on possible risks (using probabilistic generative models, quantitative security risk assessment). Numerous studies simply extracted permissions and utilized machine learning to detect malicious applications (Samra et al., 2013; Aung and Zaw, 2013; Yerima et al., 2014; Sanz et al., 2013a). Researchers in (Moonsamy et al., 2013; Huang et al.) argue that merely analyzing requested permissions is not sufficient for detecting malicious applications. They analyzed used permissions in addition to

Table 3
Categorization of mobile malware features.

Type of feature           Description
Static                    They are pertaining to the content of APK files.
                          There are various features in APK files, such as
                          permissions, Java code, intent filters, network
                          address, strings and hardware components.
Dynamic                   They represent post-installation behavior of
                          applications on mobile devices and include
                          behavior of the application in the operating
                          system or on the network.
Hybrid                    Hybrid features are combination of both static
                          and dynamic features. They are the most
                          comprehensive features because they analyze
                          applications from various aspects.
Applications' Metadata    The last group consists of metadata pertaining
                          to Android applications, such as their
                          information on Google Play.
Fig. 4. Categorizing literature based on type of features.
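
Several of the static features listed in Table 3 (permissions, intent filters and hardware components) are declared in AndroidManifest.xml. The following Python sketch shows one way such declarations could be turned into a binary feature vector for a classifier. It assumes the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary XML encoding); the sample manifest, the package name and the three-entry permission vocabulary are hypothetical and are not taken from any of the reviewed systems.

```python
import xml.etree.ElementTree as ET

# Namespace that prefixes android:* attributes in a decoded manifest.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical, already-decoded manifest used only for illustration.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
    <uses-permission android:name="android.permission.SEND_SMS"/>
    <uses-permission android:name="android.permission.INTERNET"/>
    <uses-feature android:name="android.hardware.location.gps"/>
    <application>
        <receiver android:name=".BootReceiver">
            <intent-filter>
                <action android:name="android.intent.action.BOOT_COMPLETED"/>
            </intent-filter>
        </receiver>
    </application>
</manifest>
"""


def extract_static_features(manifest_xml):
    """Collect permission, hardware and intent-action names from a manifest."""
    root = ET.fromstring(manifest_xml)

    def names(path):
        # Read the android:name attribute of every element matching the path.
        return sorted(
            element.get(ANDROID_NS + "name")
            for element in root.iterfind(path)
            if element.get(ANDROID_NS + "name")
        )

    return {
        "permissions": names("uses-permission"),
        "hardware": names("uses-feature"),
        "intent_actions": names(".//intent-filter/action"),
    }


def to_binary_vector(found, vocabulary):
    """Encode presence (1) or absence (0) of each vocabulary item."""
    present = set(found)
    return [1 if item in present else 0 for item in vocabulary]


# Hypothetical vocabulary; a real study would use a much larger fixed list.
PERMISSION_VOCAB = [
    "android.permission.CAMERA",
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
]

features = extract_static_features(SAMPLE_MANIFEST)
permission_vector = to_binary_vector(features["permissions"], PERMISSION_VOCAB)
print(features)
print(permission_vector)
```

A fixed vocabulary keeps the vector length identical across applications, which is what most machine learning algorithms expect as input.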
requested permissions in order to detect malware. AppGuard (Backes et al.) has gone one step further and has extended the Android permission system to alleviate current vulnerabilities. They claim that their system is a practical extension for the Android permission system, as it is possible to use it on devices without any modification or root access.

Why is Android permission the most used static feature? As mentioned earlier, the Android operating system has a Linux architecture. Permission is the first barrier to attackers. Even though the Java code contains malicious code, some of the API calls in the code need permission to be invoked (Wu et al., 2012). Permission-protected API calls are part of the security features of the Android operating system. For example, before sending a message or accessing the camera, Android checks if the application has permission to do so (Felt et al., 2046). Based on such a scenario, researchers focus on permissions more than other static features to detect malware based on demanded permissions.

Android Java code feature

Developers write Android applications in the Java programming language and subsequently compile them to a special format called Dalvik, which is proprietary to the Android operating system. Google introduced Android runtime (ART) in the 4.4 release. Although ART offers new features (i.e. ahead-of-time compilation, improved garbage collection, development and debugging improvements), Dalvik remains the default runtime in the Android operating system (Google, 2014c). Researchers have used various analysis approaches on the Java code. Some researchers use Application Programming Interface (API) calls to detect malware (Rastogi et al., 2014; Grace et al., 2307; Deshotels et al.; Yerima et al., 2013; Zheng et al., 2013). Every Android application needs to have API calls to interact with the device. As an example, there are API calls to the telephony manager of the operating system to retrieve the phone ID and subscriber ID. The API calls in a method are sequential. Researchers consider such a sequence as the application's signature that is unique to that application. Changing the sequence of API calls is a strategy used by attackers to bypass the detection process, called code obfuscation. Analyzing the control flow of the Java code is another approach adopted by researchers (Suarez-Tangil et al., 2014; Crussell et al.; Xu et al., 2013; Chin et al., 2000; Sahs and Khan, 2012). Although attackers can change the sequence of API calls or rename API calls to evade a detection system, the flow of the Java code does not change and researchers use it to develop stronger detection systems.

Other static features

Besides permissions and Java code, some researchers analyze several other static features, which are as follows.

• Intent Filter: The intent filter is one of the elements described in the manifest file. It is an abstract description of an operation, from which we infer the intentions of the application, for example, pick a contact, take a photo, dial a number, etc. Based on intent filters, Android takes appropriate actions. Researchers have been using intent filters for malware detection, since attackers command malware to send private data to them, which requires the presence of intentions in the intent filter part of the AndroidManifest.xml file.

In DroidMat (Wu et al., 2012), various features from an Android file, including intent filters, are extracted and analyzed. The authors utilized several machine learning algorithms such as k-means, k-nearest neighbors and naive Bayes to develop a malware detection system. Evaluation of DroidMat exhibited an improvement over similar systems at that time.

Zhang et al. (Luoshi et al.) published a system (named A3) that considers several features, including intent filters, in an Android installation file. It constructs a call graph that represents the flow of the Java code execution. It then uses the A* algorithm to determine the shortest path, which subsequently shows the behavior of the malware.

DREBIN (Arp et al., 2014) presents a broad static analysis. The approach collects static features of an Android installation file, including intent filters. They used support vector machine (SVM), a machine learning algorithm, for detection purposes. The results of the experiment showed that DREBIN detected 94% of malware with a low false alarm rate.

• Network Address: Attackers instruct malware to contact them and report their status or send users' personal data. To do so, attackers embed the address of the server, known as the command & control (C&C) server, in the malicious code of the malware. Researchers look for the network address or IP address of the C&C server in the code of Android installation files. Zhang et al. (Luoshi et al.) and Arp et al. (Arp et al., 2014) incorporated the network address as one of the static features in their systems.
• Strings: Sanz et al. stated that one of the widely used techniques in classic malware detection is analyzing strings available in the file. They applied the same technique to Android malware by extracting every printable string in an Android file, such as menus in the application or the server address with which the application connects. The authors used the Vector Space Model (VSM) (Baeza-Yates and Ribeiro-Neto, 1999) to represent the strings as vectors in multidimensional space. Afterward, the authors used distance measures, such as Manhattan distance, Euclidean distance and Cosine similarity, to calculate the anomaly of the data. The authors evaluated the results with 666 samples of Android applications. They achieved accuracy of 83.51% and TPR of 94% in the experiments.
• Hardware Components: DREBIN (Arp et al., 2014) used hardware components as a static feature. As part of the AndroidManifest.xml file, applications request combinations of hardware that they need in order to function, for example, the camera or GPS. Certain combinations of requested hardware can indicate harmfulness of the application; for example, 3G and GPS access together may imply a malware that reports the location of the user to the attacker.

Other than the mentioned static features, additional features have the potential to be used as Android static features. Shabtai et al. (2010a) used features like .apk size, number of zip entries, number of files for each file type,

count of XML elements and features for each element name.

Dynamic features

We define dynamic features as the behavior of the application in interaction with the operating system or network connectivity. There are two main types of dynamic features used in recent works: system calls and network traffic. Every application requests resources and services from the operating system by issuing system calls, such as read, write and open.

Network traffic is another dynamic feature used by researchers. Applications tend to connect to the network to send and receive data, receive updates, or maliciously leak personal data to attackers. Monitoring the network traffic of mobile devices is a way of catching the culprit in the act. Based on our analysis, 42 out of 100 papers used dynamic features. Twenty-two papers used system calls as their dynamic feature and 10 papers used network traffic. The remaining 10 papers selected other dynamic features, such as system components or user interaction.

Android system call feature

There are more than 250 system calls in a Linux kernel that are also available in Android (Burguera et al.). Analyzing system calls leads to anomaly detection in application behavior (Feizollah et al., 2013). Applications use system calls to perform specific tasks such as read, write and open, since they cannot directly interact with the Android operating system. Upon issuing a system call in user mode, the Android operating system switches to kernel mode to perform the required task. System calls are the most selected dynamic feature, used in more than half of the reviewed papers that rely on dynamic features (22 of 42). Works such as (Burguera et al.; Zhao et al.; Yan and Yin; Su et al., 2012; Khune and Thangakumar, 2012) captured and analyzed system calls to detect malicious applications.

Android network traffic feature

The majority of applications, normal or malicious, require network connectivity. In (Yajin and Xuxian, 2012), the authors stated that 93% of their collected Android malware samples need a network connection in order to connect to attackers. Additionally, Sarma et al. published a research work in 2012 in which they analyzed the permissions of Android files. They examined over 150,000 applications and found that 68.50% of normal applications require network access, while 93.38% of malicious applications require network access. Similarly, in (Yerima et al., 2014) the permissions of 2000 applications were analyzed. Over 93% of malicious applications requested network connectivity. It is evident that the majority of applications request network access, particularly the malicious ones. Therefore, it behooves researchers to focus on analyzing network traffic for effective Android malware detection.

Despite the effectiveness of the network traffic feature in mobile malware detection, it has not attracted researchers' attention as much as the other dynamic features. Utilizing network traffic imposes the challenge of dealing with a massive number of network records in the dataset, which could be as many as a million records. Furthermore, analyzing collected network traffic requires a profound understanding of network architecture.

Other dynamic features

In addition to system calls and network traffic, researchers have been using other dynamic features, which are discussed below.

• System Components: Mobile devices have components similar to those of personal computers, such as CPU and memory. Some researchers investigated the detection of Android malware using system components. In MADAM (Dini et al.), the authors analyzed CPU usage, free memory, and running processes of mobile devices, which are considered kernel-level features of the operating system. In addition, it examined user/application-level features, such as the Bluetooth and Wi-Fi status of the device. The collected data were used to train a k-nearest neighbors algorithm.

STREAM (Amos et al., 2013) was introduced in 2013 for the Android operating system. It collects data regarding system components like cpuUser, cpuIdle, cpuSystem, cpuOther, memActive, and memMapped. It subsequently uses machine learning algorithms to train the system in order to detect Android malware. Ham and Choi (Hyo-Sik and Mi-Jung, 2013) and Hoffmann et al. (Hoffmann et al., 2013) also used system components as dynamic features.

• User Interaction: Users are potential victims of malicious applications. Analyzing users' interaction with applications is one of the possible solutions in malware detection. PuppetDroid (Gianazza et al., 2014) captures users' interaction with the device (e.g. pushing a button, zooming and navigating through pages). The authors evaluated the system with 15 Android applications. The goal is that, after capturing user interactions related to a malware, the system looks for similar user interactions to detect malicious applications.

Dynodroid (Machiry et al.) is another system developed based on user interaction analysis. It collects users' activities, such as tapping the screen, long pressing and dragging. The evaluation of Dynodroid involved analyzing 50 Android applications. The results revealed bugs in Android applications.

As we mentioned in the static features section, intent filters are extracted from the AndroidManifest.xml file as a static feature. There is also potential to use intents as a dynamic feature. Feng et al. (Feng et al., 2013) used intents as a dynamic feature by monitoring them at run time.

Hybrid features

We define hybrid features as a group of static and dynamic features that are used together in detection systems. They are the most comprehensive features, since they involve vetting the Android application installation file as well as analyzing the behavior of the application at runtime. Blasing
et al. (Blasing et al., 2010) developed AASandbox, which analyzes static and dynamic features. It extracts permissions and Java code from the APK file and uses them as static features. It then installs the application, logs system calls, and uses them as dynamic features. Wei et al. (Wei et al.) published ProfileDroid, in which they examined AndroidManifest.xml and Java code as static features. They chose user interaction, system calls and network traffic as dynamic features. Similar works that chose hybrid features include (Zhou et al.; Spreitzenbarth et al.; Eder et al., 2013; Xu et al., 2013).

Android applications metadata

A few researchers opted to utilize Android applications metadata for malware detection. We define metadata as the information users see prior to the download and installation of an application, such as the application's description, its requested permissions, its rating and information regarding the developer. Applications metadata cannot be categorized as static or dynamic features, as they are not derived from the applications themselves.

In WHYPER (Pandita et al.), the authors accessed the permissions requested by applications in the market and used Natural Language Processing (NLP) to look for sentences that justify the need for the requested permissions. It achieved 82.8% precision for three permissions (address book, calendar and record audio) that protect sensitive and personal data.

Similarly, Teufl et al. (2013) used a sophisticated knowledge discovery process and lean statistical methods to analyze the metadata gathered from Google Play. The authors argue that metadata analysis should complement static or dynamic analysis. They collected the following data: the last time modified, category, price, description, permissions, rating, and number of downloads. The authors mentioned that the following data can also be used as metadata: creator ID, contact email, contact website, promotional video, number of screenshots, promo texts, recent changes, ID, package name, installation size, version, application type, ratings count and application title. The authors also used machine learning algorithms in their experiments. Definitions of some of the aforementioned metadata are as follows.

• Last Time Modified: Applications in Google Play go through changes and updates. The date of the last modification is a metadata item.
• Category: Google Play categorizes applications based on their types, such as games, applications and books. Games are further subcategorized into types such as action, adventure, arcade and board.
• Description: Developers provide a brief description of the main functionality of the application.
• Permissions: Upon opting to install an application, the user is prompted with the list of permissions that the application requires to function properly.
• Rating: Users rate every application based on their experience with it. The rating helps new users decide whether to download the application.
• Creator ID: Every developer has an ID in Google Play, which is used to publish applications. When a malicious application is detected, Google is able to identify the developer and terminate the developer's ID.

Discussions

In this paper, we looked back at the related works with respect to feature selection in Android malware detection. We categorized feature selection in Android malware detection into four groups (i.e. static, dynamic, hybrid and metadata). Such a categorization helps researchers decide which features to choose. Additionally, it shows them which features other researchers have selected. In this section, we discuss guidelines on feature selection. Moreover, we discuss the available datasets of Android malware. We also review the evaluation measures used in recent research works. Finally, we present current challenges and open research areas to conclude our discussion.

Feature selection guidelines

Choosing appropriate features is an important step in conducting an experiment, and it determines the effectiveness and results of a research work. Android applications have many features. We suggest the following two approaches to feature selection based on the reviewed papers.

• Selection based on rationalizing: As mentioned in Section 3.1.1, permissions are a static feature and the first line of defense in Android applications against the attacker, which is a plausible reason for choosing them as a feature. Among static features, the research community has been paying considerable attention to the permissions of Android applications. This signifies that authors recognized the effectiveness of this feature and chose it based on reasoning.

Java code is another static feature used widely in recent works. Java code is the source of the malicious activities of malware and is undoubtedly a focus of research works. However, analyzing Java code is more difficult than analyzing permissions, since attackers employ various techniques (e.g. obfuscation, encryption) to evade available detection methods (Petsas et al.). Thus, researchers have been developing complex methods to detect threats in Java code (Suarez-Tangil et al., 2014; Lu et al., 2012; Deshotels et al.). We believe that using Java code is more complex in malware detection than using permissions and requires developing aggressive methods.

Among dynamic features, Feizollah et al. (2013) used the network traffic feature to detect mobile malware. The authors selected three network features, namely, the TCP size, connection duration and number of GET/POST parameters. They provided a justification for using each of the parameters. Appropriate and justified selection of features led to a detection rate of 99.94%. Another related work is (Shabtai et al., 2014), in which the authors use network traffic for mobile malware detection.
However, researchers have used system calls much more frequently than other dynamic features. Although all activities are carried out through system calls and selecting them as a feature is appropriate, we argue that the network traffic of Android applications is a precious source for mobile malware analysis and that only a few contemporary research works have focused on such features (refer to Section 3.2.2).

• Selection based on feature ranking algorithms: We found that only 8 out of 100 reviewed papers employed feature ranking algorithms. Feature selection and ranking is the task of selecting a subset of the original features that provides the most useful and important features based on the developed algorithms (Jensen and Shen, 2008). The algorithm receives the collected dataset and uses mathematical calculations to rank the features in the dataset. The information gain algorithm has been widely used for feature selection; it ranks a feature by the difference in entropy between using the feature and not using it (Hyo-Sik and Mi-Jung, 2013). Shabtai and Elovici used feature ranking algorithms to select a subset of features from 88 collected features. They selected the top 10, 20 and 50 features from the original dataset. Comparably, Shabtai et al. (2014) analyzed the network traffic of Android applications. They used feature selection algorithms to select the most useful features among the massive number of features in network traffic data. Similarly, in (Shabtai et al., 2010a) the authors collected 2285 Android applications and extracted more than 9898 features. They then used feature selection algorithms to choose the top 50, 100, 200, 300, 500 and 800 features.

Table 4 compares the advantages and disadvantages of the various feature types, to help researchers make decisions.

Table 4
Comparison of different types of features.
Feature type | Advantages | Disadvantages
Static features | Easy to extract | Obfuscation
Dynamic features | More comprehensive than static features | Code coverage; difficult to extract; requires rooted device
Hybrid features | The most comprehensive collection of features | Extraction process is complex; have to extract static and dynamic features
Applications metadata | Easy to extract | Not widely used among researchers

Datasets

In this section, we emphasize the importance of the dataset in experiments. Every experiment requires a dataset based on which authors evaluate their proposed system. Android malware is a relatively new research area. The first Android malware was discovered in 2010 (Lookout, 2010). Initially, researchers did not have a solid and standard dataset of samples to work with. Instead, they tended to write their own malware and assessed their system on self-written malware. Other researchers tried to collect samples through websites that shared Android malware samples, such as Contagio (2014). The weakness was therefore the limited number of malware samples, which in turn made the evaluation of their systems unreliable. In 2012, the MalGenome (Yajin and Xuxian, 2012) data sample was released, which contains 1260 malware samples categorized into 49 different malware families. It is a collection of malware from August 2010 to October 2011. The availability of such a valuable data sample filled the gap for most researchers. Based on our study, 24 out of 100 papers used MalGenome as their data sample. Table 5 shows the distribution of MalGenome usage in the related works. It is worth noting that the authors released MalGenome in 2012 and four research works used it in the same year. In 2013, the number rose to more than triple that of 2012, which shows that researchers welcome a standard and solid data sample. At the time of reviewing related works, in March 2014, seven papers (Yerima et al., 2014; F-Secure, 2014; Gianazza et al., 2014; Ham and Lee, 2014; Ham et al., 2014; Deshotels et al.; Seo et al., 2014) had used MalGenome, which is about half of the 2013 total.

Table 5
Use of MalGenome data sample in the reviewed works.
Year | Number of papers
2010 | 0
2011 | 0
2012 | 4
2013 | 13
2014 (First Quarter) | 7

However, malware by nature changes its form and infection techniques to evade detection. Therefore, it behooves researchers to update the data samples in order to develop systems that are more effective. The introduction of DREBIN (Arp et al., 2014) in 2014 fulfilled this need. DREBIN is a collection of 5560 malware samples from 179 different families, which were collected between August 2010 and October 2012. In 2014, this data sample was used in 19 research works. This shows that the research community is keen on using standardized data samples.

Evaluation measures

Researchers assess the effectiveness of a proposed system by how accurately it can detect malware, using various evaluation measures. We analyzed the measures utilized in the literature. Table 6 shows some of the references along with their respective measures. It is vital to discuss the common evaluation measures in the research community. Below is the definition of each one.

Confusion Matrix: The results of an experiment can be represented in the form of a table known as a confusion matrix (Davis and Goadrich). It has four categories, as follows:

• True Positive (TP): It is the number of instances correctly classified as positive, that is, how successful a system is in detecting malware as malicious. As the true positive count increases, the result is better.
• False Positive (FP): It is the number of instances incorrectly classified as positive, that is, the normal data that the algorithm considers malicious. As the false positive count decreases, the system is more accurate.
• True Negative (TN): It is the number of instances correctly classified as negative.
• False Negative (FN): It is the number of instances incorrectly classified as negative.

Table 7 shows the confusion matrix as a table. It is worth noting that some researchers calculate only the true positive rate (detection rate) and the false positive rate (false alarm rate), since they consider them more important than the other measures.

Table 7
Confusion matrix.
 | Actual positive | Actual negative
Predicted positive | TP | FP
Predicted negative | FN | TN

• Accuracy: It shows how accurately the system classifies samples. As an example, an accuracy of 0.6 means that the system classifies 60 out of 100 samples correctly, whether they are malicious or benign.

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{1}$$

• Precision: It is the fraction of instances classified as class X that truly belong to class X. Simply put, precision addresses the following question: given a positive prediction, how likely is it that the prediction is true?

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

• Recall: It is equivalent to the true positive rate (TPR). It assumes that there is malware in the dataset and asks the following question: is the algorithm going to detect it? It therefore measures how much of the actual malware the algorithm detects.

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

• F-measure: It is the weighted harmonic mean of precision and recall. Researchers seldom used this measure in the reviewed works.

$$F\text{-measure} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{4}$$

• ROC: The Receiver Operating Characteristic (ROC) curve indicates how the detection rate changes as the internal threshold changes to generate more or fewer false alarms. It plots the detection rate (true positive rate) against the false positive rate. The Area Under the Curve (AUC) often accompanies the ROC curve. It is the area under the ROC curve, and its value varies between 0 and 1. As it approaches 1, the system has better performance.

Table 6
Evaluation measures in selected reviewed papers.
Evaluation measure | Type of features | Reference
Confusion matrix | Static Features | Wu et al. (2012)
Confusion matrix | Dynamic Features | Su et al. (2012)
Confusion matrix | Applications Metadata | Pandita et al.
True positive | Static Features | Grace et al.
True positive | Dynamic Features | Zhao et al.
True positive | Dynamic Features | Iland et al. (2011)
True positive | Hybrid Features | Kim et al. (2013)
Accuracy | Static Features | Wu et al. (2012)
Accuracy | Static Features | Sanz et al. (2013a)
Accuracy | Dynamic Features | Burguera et al.
Accuracy | Applications Metadata | Pandita et al.
Precision | Static Features | Wu et al. (2012)
Precision | Applications Metadata | Pandita et al.
True positive rate | Static Features | Peng et al. (2012)
True positive rate | Static Features | Sanz et al. (2013a)
True positive rate | Dynamic Features | Feizollah et al. (2013)
F-measure | Static Features | Wu et al. (2012)
F-measure | Applications Metadata | Pandita et al.
ROC | Static Features | Peng et al. (2012)
ROC | Static Features | Sanz et al. (2013a)
ROC | Dynamic Features | Feizollah et al. (2013)

Challenges and open research areas

Although there have been many research works in the mobile malware detection field, some challenges remain. Based on the reviewed papers, we explain them as follows. Additionally, we mention open research areas that have potential for further research.

Virtual environment

With the advent of mobile malware, researchers used virtual environments to evaluate malware behavior. As time passed, attackers became aware of the use of virtual environments. As a result, we are witnessing new types of mobile malware that are aware of virtual environments and are able to detect whether they are running in one. F-Secure published a report about Dendroid (F-Secure, 2014) in which it mentions this malware's awareness of virtual environments. It hides its malicious behavior when it is in a virtual environment. Similarly, Android.hehe, which appeared in 2014, is capable of detecting virtual environments and emulators and hides its malicious behavior (Dharmdasani, 2014). Research works published recently have addressed malware awareness of virtual environments (Petsas et al.; Vidas and Christin). There is a need to develop detection methods to discover such malware.
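As a minimal illustration of one way such environment-aware malware could be flagged, the sketch below scans decompiled code or extracted strings for a few indicators that are commonly associated with emulator checks. This is not a method taken from any of the reviewed papers: the function name `find_emulator_probes` and the short indicator list are assumptions chosen for illustration, and a practical detector would need a much larger, validated list.

```python
# Illustrative indicators only (assumption): strings that code probing for an
# Android emulator often references. The list is deliberately short.
EMULATOR_INDICATORS = (
    "ro.kernel.qemu",    # system property set on QEMU-based emulators
    "goldfish",          # hardware name used by the classic emulator
    "generic_x86",       # common emulator device/product name
    "000000000000000",   # all-zero device ID reported by default emulators
)


def find_emulator_probes(text):
    """Return the sorted indicators that occur in `text` (case-insensitive)."""
    lowered = text.lower()
    return sorted(ind for ind in EMULATOR_INDICATORS if ind.lower() in lowered)


# Hypothetical decompiled snippet used only to exercise the function.
sample = (
    'if (getProp("ro.kernel.qemu").equals("1") '
    '|| Build.HARDWARE.contains("goldfish")) { hide(); }'
)
print(find_emulator_probes(sample))
```

A non-empty result does not prove malicious intent, since benign applications also check for emulators; it only marks an application for closer analysis on a real device.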
Obfuscation

One of the techniques in mobile malware detection is using the Java code and generating unique signatures based on the classes, fields and methods in the code. Obfuscation is a method that attackers use in order to evade the detection process by renaming classes, fields and methods. As a result, the generated signature is different from the one that detection systems hold. Some research works have addressed such issues and proposed methods against obfuscation, such as (Christodorescu et al.; Fredrikson et al., 2010; Kolbitsch et al.). They used call graphs, which represent the flow of system calls as a graph. However, we cannot implement such solutions on available tools due to the low privileges of anti-malware tools on Android. Thus, there is a need to develop effective solutions for this existing challenge.

Code coverage

Researchers often argue that dynamic analysis of a malware sample exposes some of its malicious behavior, but not all of it. Code coverage means covering every possible execution path of an application's code. Sasnauskas and Regehr (2014) mention that producing highly structured inputs that achieve high code coverage is an open research challenge. AppsPlayground (Rastogi et al.) is a framework that analyzes the GUI and injects random events to trigger malicious behaviors. However, its code coverage is 33%. Other works such as (Bierma et al., 2014; Gilbert et al., 2014) mentioned code coverage as a limitation of their work or as future research.

Wearable devices

Wearable devices are a new generation of computing technology. It is estimated that the global market revenue of such devices will cross USD $8 billion in 2014 (marketsandmarkets, 2014). In addition, the wearable market is expected to grow by 350% in 2014, based on a report from CNBC (CNBC, 2014). The operating system of such devices is a modified version of Android, which raises security concerns. Google Glass is a famous wearable device developed by Google (Google, 2014d). Security analysts expressed their concern over Google Glass, since it is another Android platform and a source of valuable information. The main input is the camera. The attacker could see everything the victim sees. This could include banking login information and two-factor authentication codes, or the attacker could extort money from a victim by capturing embarrassing video (Security_Watch, 2013). Thus, it is a new research area in the security field that is worth researching.

Table 8
List of all reviewed papers.
No. | Reference | Type of feature | Year | No. of tested apps
1 | Zhemin and Min | Static | 2012 | 1750
2 | Arzt et al. (2014) | Static | 2014 | –
3 | Yerima et al. (2014) | Static | 2013 | 2000
4 | Desnos (2012) | Static | 2012 | –
5 | Apvrille and Apvrille (2013) | Static | 2013 | –
6 | Aung and Zaw (2013) | Static | 2013 | 500
7 | Grace et al. | Static | 2013 | –
8 | Feng et al. (2013) | Static | 2013 | –
9 | Rastogi et al. (2014) | Static | 2014 | –
10 | Faruki et al. | Static | 2013 | 6779
11 | Suarez-Tangil et al. (2014) | Static | 2014 | 1231
12 | Rosen et al. | Static | 2013 | 2782
13 | Peng et al. (2012) | Static | 2012 | 325,036
14 | Grace et al. | Static | 2012 | 118,318
15 | Lu et al. (2012) | Static | 2012 | 5486
16 | Crussell et al. | Static | 2012 | 9400
17 | Sarma et al. | Static | 2012 | 158,062
18 | Samra et al. (2013) | Static | 2013 | 18,174
19 | Arp et al. (2014) | Static | 2014 | 129,013
20 | Deshotels et al. | Static | 2014 | 1100
21 | Luoshi et al. | Static | 2013 | –
22 | Gascon et al. (2013) | Static | 2013 | 12,158
23 | Sanz et al. (2013b) | Static | 2013 | 3013
24 | Walenstein et al. (2012) | Static | 2012 | –
25 | Sanz et al. | Static | 2013 | 666
26 | Wu et al. (2012) | Static | 2012 | 1738
27 | Huang et al. | Static | 2012 | 125,249
28 | Zhou et al. | Static | 2012 | 91,093
29 | Aafer et al. | Static | 2013 | 20,000
30 | Lee and Jin (2013) | Static | 2013 | –
31 | Yerima et al. (2013) | Static | 2013 | 2000
32 | Shabtai et al. (2010a) | Static | 2010 | 2285
33 | Sahs and Khan (2012) | Static | 2012 | 2172
34 | Zheng et al. (2013) | Static | 2013 | 150,368
35 | Sanz et al. (2013a) | Static | 2013 | 2060
36 | Zhou et al. | Static | 2013 | 84,767
37 | Huang et al. | Static | 2014 | 182
38 | Almohri et al. | Static | 2014 | 405
39 | Zheng et al. | Static | 2014 | 24,009
40 | Sanz et al. | Static | 2014 | 2144
41 | Paturi et al. (2013) | Static | 2013 | –
42 | Seo et al. (2014) | Static | 2014 | 1257
43 | Rasthofer et al. (2014) | Static | 2014 | 11,000
44 | Liang et al. | Static | 2013 | 52
45 | Wu et al. | Static | 2014 | –
46 | Tchakount and Dayang (2013) | Dynamic | 2013 | –
47 | Hyo-Sik and Mi-Jung (2013) | Dynamic | 2013 | 14,794
48 | Yu et al. (2013) | Dynamic | 2013 | –
49 | Shabtai and Elovici | Dynamic | 2010 | 43
50 | Chekina et al. | Dynamic | 2012 | 10
51 | Backes et al. | Dynamic | 2012 | –
52 | Baliga et al. (2013) | Dynamic | 2013 | 9
53 | Rastogi et al. | Dynamic | 2013 | 3968
54 | Burguera et al. | Dynamic | 2011 | –
55 | Yan and Yin | Dynamic | 2012 | –
56 | Dini et al. | Dynamic | 2012 | 56
57 | Enck et al. (2014) | Dynamic | 2010 | 30
58 | Portokalidis et al. | Dynamic | 2010 | –
59 | Choi et al. | Dynamic | 2013 | –
60 | Gianazza et al. (2014) | Dynamic | 2014 | 15
61 | Ham and Lee (2014) | Dynamic | 2014 | 1257
62 | Ham et al. (2014) | Dynamic | 2014 | 1257
63 | Zhang et al. (2013) | Dynamic | 2013 | 1249
64 | Su et al. (2012) | Dynamic | 2012 | 120
65 | Maggi et al. | Dynamic | 2013 | 18,758
66 | Zhao et al. | Dynamic | 2011 | 200
67 | Shabtai et al. (2014) | Dynamic | 2014 | 500,000
68 | Xiaoming and Qiaoyan (2011) | Dynamic | 2011 | –

New standardized dataset

As discussed before, Android malware is changing rapidly. It has moved toward stealing data and hijacking mobile devices. Ransomware, which steals users' data or encrypts mobile devices and then demands a ransom, is on the rise. For instance, researchers discovered a Trojan-ransom in May 2014. It uses standard HTTP communication and asks for money. When it asks the user for money, it shows the user's image taken with the front camera of the device (Unuchek, 2014). Because of the aforementioned changes in malware, researchers need to
Table 8 (continued ) Acknowledgments


No. Reference Type of Year No. of
feature tested apps This work was supported in part by the Ministry of
69 Houmansadr et al. (2011) Dynamic 2011 e
Higher Education, Malaysia, under Grant FRGS FP034-
70 Iland et al. (2011) Dynamic 2011 18 2012A and the Ministry of Science, Technology and Inno-
71 Amos et al. (2013) Dynamic 2013 1738 vation, under Grant eScienceFund 01-01-03-SF0914.
72 Karami et al. (2013) Dynamic 2013 20
73 Damopoulos et al. (2012) Dynamic 2011 e
74 Reina et al. Dynamic 2013 1200
75 Khune and Thangakumar (2012) Dynamic 2012 e References
76 Zonouz et al.2013) Dynamic 2013 e
77 Isohara et al. (2011) Dynamic 2011 230 Aafer, Y., Du, W., Yin, H., Droidapiminer: Mining api-level features for
78 Feizollah et al. (2013) Dynamic 2013 1257 robust malware detection in android, In: 9th International Con-
79 Feizollah et al. (2014) Dynamic 2014 1000 ference on Security and Privacy in Communication Networks,
pp. 86e103. URL http://dx.doi.org/10.1007/978-3-319-04283-1_6.
80 Hoffmann et al. (2013) Dynamic 2013 e
Almohri, H. M., Yao, D., Kafura, D., Droidbarrier: know what is executing
81 Lu et al. (2014) Dynamic 2014 331
on your android, in: 4th ACM conference on Data and application
82 Lin et al. (2013) Dynamic 2013 100
security and privacy, pp. 257e264, (Daphne). http://dx.doi.org/10.
83 Shabtai et al. (2010b) Dynamic 2010 5 1145/2557547.2557571.
84 Veen (2013) Dynamic 2013 e Amos B, Turner H, White J. Applying machine learning classifiers to dy-
85 Bente (2013) Dynamic 2013 e namic android malware detection at scale. In: 9th International
86 Machiry et al. Dynamic 2013 50 Wireless Communications and Mobile Computing Conference
87 Jang et al. Dynamic 2014 709 (IWCMC); 2013. p. 1666e71. http://dx.doi.org/10.1109/
88 Spreitzenbarth et al. Hybrid 2013 36,000 IWCMC.2013.6583806.
89 Zhou et al. Hybrid 2012 204,040 Apple. Apple e ios 7. 2014. URL, www.apple.com/ios.
90 Moonsamy et al. (2013) Hybrid 2013 1227 Aptoide. Aptoide d android apps store. 2014. URL, http://www.aptoide.
91 Wei et al. Hybrid 2012 27 com/.
92 Eder et al. (2013) Hybrid 2013 1260 Apvrille L, Apvrille A. Pre-filtering mobile malware with heuristic tech-
93 Blasing et al. (2010) Hybrid 2010 e niques. Report. 2013. URL, http://www.fortiguard.com/paper/Pre-
filtering-Mobile-Malware-with-Heuristic-Techniques.
94 Kim et al. (2013) Hybrid 2013 1003
Arp D, Spreitzenbarth M, Hubner M, Gascon H, Rieck K. Drebin: effective
95 Xu et al. (2013) Hybrid 2013 100,000
and explainable detection of android malware in your pocket. In:
96 Zheng et al. Hybrid 2012 19 Network and Distributed System Security (NDSS) Symposium; 2014.
97 Shalaginov and Franke Hybrid 2014 604 Arzt S, Rasthofer S, Fritz C, Bodden E, Bartel A, Klein J, et al. Flowdroid:
98 Guido et al. (2013) Metadata 2013 e precise context, flow, field, object-sensitive and lifecycle-aware taint
99 Teufl et al. (2013) Metadata 2013 e analysis for android apps. ACM SIGPLAN Not 2014;49(6):259e69.
100 Pandita et al. Metadata 2013 e http://dx.doi.org/10.1145/2666356.2594299.
evaluate new systems on new types of malware, rather than old ones. It
would benefit the research community to publish Android malware datasets
similar to (Yajin and Xuxian, 2012; Arp et al., 2014), along with a
comprehensive analysis of the malware, so that researchers can develop
effective systems based on that analysis.

Conclusion

In this review paper, we analyzed 100 papers from the perspective of
feature selection in Android malware detection. We categorized the
features of Android malware into four groups. The first group comprises
static features, which pertain to the Android installation file itself,
prior to installation on the device. The second group comprises dynamic
features, which pertain to the behavior of the application after
installation. The third group comprises hybrid features, which combine
static and dynamic features. The last group is application metadata, that
is, any related data on Google Play. We examined each group in detail. We
also discussed the datasets used in the literature, along with the
evaluation measures used in recent works. Furthermore, we listed all the
reviewed papers in Table 8 to give researchers an at-a-glance view of
recent works. It is worth mentioning that some papers introduced novel
methods; however, owing to a lack of malware samples, their authors could
not test their systems thoroughly (e.g., Shabtai et al., 2010b).

Aung Z, Zaw W. Permission-based android malware detection. Int J Sci Technol Res 2013;2(3):228–34.
AVG.ThreatLabs. Android/dowgin. 2013. URL, http://www.avgthreatlabs.com/virus-and-malware-information/info/android-dowgin/.
Backes, M., Gerling, S., Hammer, C., Maffei, M., Styp-Rekowsky, P. v., Appguard: enforcing user requirements on android apps, in: 19th international conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 543–548. http://dx.doi.org/10.1007/978-3-642-36742-7_39.
Baeza-Yates RA, Ribeiro-Neto. Modern information retrieval. Boston: Addison-Wesley Longman Publishing Co.; 1999.
Baliga A, Bickford J, Daswani N. Titan: a carrier-based approach for detecting and mitigating mobile malware. Report, AT&T. 2013. URL, http://www.research.att.com/techdocs/TD_101245.pdf.
Bente I. Towards a network-based approach for smartphone security. Thesis. 2013. URL, https://137.193.200.7/doc/90756/90756.pdf.
Bierma M, Gustafson E, Erickson J, Fritz D, Choe YR. Andlantis: large-scale android dynamic analysis. Report. 2014. URL, http://mostconf.org/2014/papers/s3p2.pdf.
Blasing T, Batyuk L, Schmidt A-D, Camtepe SA, Albayrak S. An android application sandbox system for suspicious software detection. In: 5th International Conference on Malicious and Unwanted Software (MALWARE); 2010. p. 55–62. http://dx.doi.org/10.1109/MALWARE.2010.5665792.
Burguera, I., Zurutuza, U., Nadjm-Tehrani, S., Crowdroid: behavior-based malware detection system for android, in: 1st ACM workshop on security and privacy in smartphones and mobile devices, pp. 15–26. http://dx.doi.org/10.1145/2046614.2046619.
Chekina, L., Mimran, D., Rokach, L., Elovici, Y., Shapira, B., Detection of deviations in mobile applications network behavior. URL, http://arxiv.org/abs/1208.0564.
Chen BX, Bilton N. Building a better battery. 2014. URL, http://www.nytimes.com/2014/02/03/technology/building-a-better-battery.html?_r=0.
Chin, E., Felt, A. P., Greenwood, K., Wagner, D., Analyzing inter-application communication in android, In: 9th international conference on Mobile systems, applications, and services, pp. 239–252. http://dx.doi.org/10.1145/1999995.2000018.
Choi, S., Sun, K., Eom, H., Android malware detection using library api call tracing and semantic-preserving signal processing techniques, Report.

Christodorescu, M., Jha, S., Kruegel, C., Mining specifications of malicious behavior, In: 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering, 1287628, pp. 5–14. http://dx.doi.org/10.1145/1287624.1287628.
CNBC. Wearable smart bands set for 350% growth in 2014. 2014. URL, http://www.cnbc.com/id/101410507.
CNET. Android dominates 81 percent of world smartphone market. 2013. URL, http://news.cnet.com/8301-1035_3-57612057-94/android-dominates-81-percent-of-world-smartphone-market/.
Contagio. Contagio. 2014. URL, http://contagiodump.blogspot.com/.
Crussell, J., Gibler, C., Chen, H., Attack of the clones: Detecting cloned applications on android markets, In: 17th European Symposium on Research in Computer Security, Lecture Notes in Computer Science, pp. 37–54. URL http://dx.doi.org/10.1007/978-3-642-33167-1_3.
Damopoulos D, Menesidou SA, Kambourakis G, Papadaki M, Clarke N, Gritzalis S. Evaluation of anomaly-based ids for mobile devices using machine learning classifiers. Secur Commun Netw 2012;5(1):3–14. URL, http://dx.doi.org/10.1002/sec.341.
Davis, J., Goadrich, M., The relationship between precision-recall and roc curves, In: 23rd international conference on Machine learning, pp. 233–240. http://dx.doi.org/10.1145/1143844.1143874.
Deshotels, L., Notani, V., Lakhotia, A., Droidlegacy: Automated familial classification of android malware, In: ACM SIGPLAN on Program Protection and Reverse Engineering Workshop, pp. 1–12. http://dx.doi.org/10.1145/2556464.2556467.
Desnos A. Android: static analysis using similarity distance. In: 45th Hawaii International Conference on System Science (HICSS); 2012. p. 5394–403. http://dx.doi.org/10.1109/HICSS.2012.114.
Dharmdasani H. Android.hehe: malware now disconnects phone calls. 2014. URL, https://www.fireeye.com/blog/threat-research/2014/01/android-hehe-malware-now-disconnects-phone-calls.html.
Dini, G., Martinelli, F., Saracino, A., Sgandurra, D., Madam: A multi-level anomaly detector for android malware, in: 6th International Conference on Mathematical Methods, Models and Architectures for Computer Network Security, Lecture Notes in Computer Science, pp. 240–253. URL http://dx.doi.org/10.1007/978-3-642-33704-8_21.
Eder T, Rodler M, Vymazal D, Zeilinger M. Ananas – a framework for analyzing android applications. In: Eighth International Conference on Availability, Reliability and Security (ARES); 2013. p. 711–9. http://dx.doi.org/10.1109/ARES.2013.93.
Enck W, Gilbert P, Chun B-G, Cox LP, Jung J, McDaniel P, et al. Taintdroid: an information flow tracking system for real-time privacy monitoring on smartphones. Commun ACM 2014;57(3):99–106. http://dx.doi.org/10.1145/2494522.
Eset. Eset virusradar. 2013. URL, http://www.virusradar.com/en/Android_Adware.Dowgin/chart/history.
M. Eslahi, R. Salleh, N. B. Anuar, Mobots: a new generation of botnets on mobile devices and networks, in: 2012 IEEE Symposium on Computer Applications and Industrial Electronics (ISCAIE), IEEE, pp. 262–266. doi:http://dx.doi.org/10.1109/ISCAIE.2012.6482109.
F-Secure. Backdoor:android/dendroid.a. 2014. URL, http://www.f-secure.com/v-descs/backdoor_android_dendroid_a.shtml.
Faruki, P., Ganmoor, V., Laxmi, V., Gaur, M. S., Bharmal, A., Androsimilar: robust statistical feature signature for android malware detection, In: 6th International Conference on Security of Information and Networks, pp. 152–159. http://dx.doi.org/10.1145/2523514.2523539.
Feizollah A, Anuar NB, Salleh R, Amalina F, Maarof RR, Shamshirband S. A study of machine learning classifiers for anomaly-based mobile botnet detection. Malays J Comput Sci 2013;26(4):251–65.
Feizollah A, Anuar NB, Salleh R, Amalina F. Comparative study of k-means and mini batch kmeans clustering algorithms in android malware detection using network traffic analysis. In: International Symposium on Biometrics and Security Technologies (ISBAST); 2014. p. 198–202.
Felt, A. P., Chin, E., Hanna, S., Song, D., Wagner, D., Android permissions demystified, in: 18th ACM conference on Computer and communications security, pp. 627–638. http://dx.doi.org/10.1145/2046707.2046779.
Feng Y, Anand S, Dillig I, Aiken A. Apposcopy: semantics-based detection of android malware. Report. 2013. URL, www.cs.utexas.edu/~isil/fse14.pdf.
Fortinet. Fortinet's fortiguard labs reports 96.5% of all mobile malware tracked is android-based. 2014. URL, http://www.fortinet.com/resource_center/whitepapers/threat-landscape-report-2014.html.
Fredrikson M, Jha S, Christodorescu M, Sailer R, Yan X. Synthesizing near-optimal malware specifications from suspicious behaviors. In: IEEE Symposium on Security and Privacy; 2010. p. 45–60. http://dx.doi.org/10.1109/sp.2010.11.
Garcia-Teodoro P, Diaz-Verdejo J, Maciá-Fernández G, Vázquez E. Anomaly-based network intrusion detection: techniques, systems and challenges. Comput Secur 2009;28(1):18–28. URL, http://dx.doi.org/10.1016/j.cose.2008.08.003.
Gartner. Gartner says worldwide pc, tablet and mobile phone shipments to grow 5.9 percent in 2013 as anytime-anywhere-computing drives buyer behavior. 2013. URL, http://www.gartner.com/newsroom/id/2525515.
Gascon H, Yamaguchi F, Arp D, Rieck K. Structural detection of android malware using embedded call graphs. In: ACM workshop on Artificial intelligence and security; 2013. p. 45–54. http://dx.doi.org/10.1145/2517312.2517315.
Gianazza A, Maggi F, Fattori A, Cavallaro L, Zanero S. Puppetdroid: a user-centric ui exerciser for automatic dynamic analysis of similar android applications. 1st April 2014. 2014. URL, http://arxiv.org/abs/1402.4826.
Gilbert C, Cronkite-Ratcliff B, Franklin J. Malbehave: classifying malware by observed behavior. Report. 2014. URL, https://www.connorgilbert.com/.
Google. Android. 2014. URL, www.android.com.
Google. permission. 2014. URL, http://developer.android.com/guide/topics/manifest/permission-element.html.
Google. Introducing art. 2014. URL, https://source.android.com/devices/tech/dalvik/art.html.
Google. Google glass. 2014. URL, http://www.google.com/glass/start/.
M. Grace, Y. Zhou, Q. Zhang, S. Zou, X. Jiang, Riskranker: scalable and accurate zero-day android malware detection, in: 10th international conference on Mobile systems, applications, and services, pp. 281–294. doi:10.1145/2307636.2307663.
Grace, M., Zhou, Y., Wang, Z., Jiang, X., Systematic detection of capability leaks in stock android smartphones, In: 19th Network and Distributed System Security Symposium. URL, http://www4.ncsu.edu/~zwang15/files/NDSS12_Woodpecker.pdf.
Guido M, Ondricek J, Grover J, Wilburn D, Nguyen T, Hunt A. Automated identification of installed malicious android applications. Digit Investig 2013;10(Suppl. (0)):S96–104. URL, http://www.sciencedirect.com/science/article/pii/S1742287613000571.
Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res 2003;3:1157–82.
Ham YJ, Lee H-W. Detection of malicious android mobile applications based on aggregated system call events. Int J Comput Commun Eng 2014;3(2):149–54.
Ham YJ, Moon D, Lee H-W, Lim JD, Kim JN. Android mobile application system call event pattern analysis for determination of malicious attack. Int J Secur Appl 2014;8(1). 241–236.
Hoffmann, J., Neumann, S., Holz, T., Mobile malware detection based on energy fingerprints a dead end?, In: 16th International Symposium, RAID 2013, Lecture Notes in Computer Science, pp. 348–368. URL http://dx.doi.org/10.1007/978-3-642-41284-4_18.
Houmansadr, A., Zonouz, S. A., Berthier, R., A cloud-based intrusion detection and response system for mobile phones, In: 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks Workshops (DSN-W), pp. 31–32. http://dx.doi.org/10.1109/DSNW.2011.5958860.
Huang, C.-Y., Tsai, Y.-T., Hsu, C.-H., Performance evaluation on permission-based detection for android malware, In: International Computer Symposium ICS, Smart Innovation, Systems and Technologies, pp. 111–120. URL http://dx.doi.org/10.1007/978-3-642-35473-1_12.
Huang, J., Zhang, X., Tan, L., Wang, P., Liang, B., Asdroid: Detecting stealthy behaviors in android applications by user interface and program behavior contradiction, In: 36th International Conference on Software Engineering, 1036–1046. URL, http://doi.acm.org/10.1145/2568225.2568301.
Hyo-Sik H, Mi-Jung C. Analysis of android malware detection performance using machine learning classifiers. In: International Conference on ICT Convergence (ICTC); 2013. p. 490–5. http://dx.doi.org/10.1109/ICTC.2013.6675404.
Iland D, Pucher A, Schauble T. Detecting android malware on network level. Report. UC Santa Barbara; 2011. URL, http://cs.ucsb.edu/~iland/AndroidMalwareDetection.pdf.
Isohara, T., Takemori, K., Kubota, A., Kernel-based behavior analysis for android malware detection, in: 2011 Seventh International Conference on Computational Intelligence and Security (CIS), pp. 1011–1015. http://dx.doi.org/10.1109/CIS.2011.226.
Jang, J.-W., Yun, J., Woo, J., Kim, H. K., Andro-profiler: anti-malware system based on behavior profiling of mobile malware, In: companion publication of the 23rd international conference on World wide web companion, pp. 737–738. http://dx.doi.org/10.1145/2567948.2579366.
Jensen R, Shen Q. Computational intelligence and feature selection: rough and fuzzy approaches. New Jersey, USA: John Wiley & Sons; 2008.
Karami M, Elsabagh M, Najafiborazjani P, Stavrou A. Behavioral analysis of android applications using automated instrumentation. In: IEEE 7th International Conference on Software Security and Reliability-Companion (SERE-C); 2013. p. 182–7. http://dx.doi.org/10.1109/SERE-C.2013.35.
Khune RS, Thangakumar J. A cloud-based intrusion detection system for android smartphones. In: International Conference on Radar, Communication and Computing (ICRCC); 2012. p. 180–4. http://dx.doi.org/10.1109/ICRCC.2012.6450572.
Kim, D.-U., Kim, J., Kim, S., A malicious application detection framework using automatic feature extraction tool on android market, in: 3rd International Conference on Computer Science and Information Technology (ICCSIT2013), pp. 4–5.
Knoernschild K. Rich mobile application platforms for the smartphone 2010. Report. Burton Group; 2010. URL, http://blogs.stlawu.edu/mobile/files/2010/07/rma2010.pdf.
Kolbitsch, C., Comparetti, P. M., Kruegel, C., Kirda, E., Zhou, X., Wang, X., Effective and efficient malware detection at the end host, In: 18th conference on USENIX security symposium, pp. 351–366.
La Polla M, Martinelli F, Sgandurra D. A survey on security for mobile devices. IEEE Commun Surv Tutor 2013;15(1):446–71. http://dx.doi.org/10.1109/SURV.2012.013012.00028.
Lee S-H, Jin S-H. Warning system for detecting malicious applications on android system. Int J Comput Commun Eng 2013;2(3):324–7.
Liang, S., Keep, A. W., Might, M., Lyde, S., Gilray, T., Aldous, P., Horn, D. V., Sound and precise malware analysis for android via pushdown reachability and entry-point saturation, In: Third ACM workshop on Security and privacy in smartphones mobile devices, pp. 21–32. http://dx.doi.org/10.1145/2516760.2516769.
Lin Y-D, Lai Y-C, Chen C-H, Tsai H-C. Identifying android malicious repackaged applications by thread-grained system call sequences. Comput Secur November 2013;39(Part B):340–50. URL, http://www.sciencedirect.com/science/article/pii/S0167404813001272.
Lookout. Security alert: Geinimi, sophisticated new android trojan found in wild. 2010. URL, https://blog.lookout.com/blog/2010/12/29/geinimi_trojan/.
Lookout. Mouabad.p: Pocket dialing for profit. 2013. URL, https://blog.lookout.com/blog/2013/12/09/mouabad-p-pocket-dialing-for-profit/.
Lu L, Li Z, Wu Z, Lee W, Jiang G. Chex: statically vetting android apps for component hijacking vulnerabilities. In: ACM conference on Computer and communications security; 2012. p. 229–40. http://dx.doi.org/10.1145/2382196.2382223.
Lu H, Zhao B, Su J, Xie P. Generating lightweight behavioral signature for malware detection in people-centric sensing. Wirel Personal Commun 2014;75(3):1591–609. URL, http://dx.doi.org/10.1007/s11277-013-1400-9.
Luoshi, Z., Yan, N., Xiao, W., Zhaoguo, W., Yibo, X., A3: Automatic analysis of android malware, In: 1st International Workshop on Cloud Computing and Information Security, pp. 89–93. http://www.atlantis-press.com/php/paper-details.php?id=9880.
Machiry, A., Tahiliani, R., Naik, M., Dynodroid: an input generation system for android apps, in: 9th Joint Meeting on Foundations of Software Engineering, pp. 224–234. http://dx.doi.org/10.1145/2491411.2491450.
Maggi, F., Valdi, A., Zanero, S., Andrototal: a flexible, scalable toolbox and service for testing mobile malware detectors, In: Third ACM workshop on Security and privacy in smartphones & mobile devices, pp. 49–54. http://dx.doi.org/10.1145/2516760.2516768.
marketsandmarkets. Wearable electronics market worth $8.36 billion by 2018. 2014. URL, http://www.marketsandmarkets.com/PressReleases/wearable-electronics.asp.
Microsoft. The smartphone reinvented around you. 2014. URL, http://www.windowsphone.com/en-us.
Mohite S, Sonar RS. A survey on mobile malware: war without end. Int J Comput Sci Bus Inf 2014;9(1):23–35.
Moonsamy V, Rong J, Liu S. Mining permission patterns for contrasting clean and malicious android applications. 2013. URL, http://www.sciencedirect.com/science/article/pii/S0167739X13001933.
Motley. Android dominates, blackberry nears the end. 2014. URL, http://www.fool.com/investing/general/2014/02/19/android-dominates-blackberry-nears-the-end.aspx.
Pandita, R., Xiao, X., Yang, W., Enck, W., Xie, T., Whyper: towards automating risk assessment of mobile applications, In: 22nd USENIX Security Symposium, pp. 527–542. https://www.usenix.org/conference/usenixsecurity13/technical-sessions/presentation/pandita.
Paturi A, Cherukuri M, Donahue J, Mukkamala S. Mobile malware visual analytics and similarities of attack toolkits (malware gene analysis). In: International Conference on Collaboration Technologies and Systems (CTS); 2013. p. 149–54. http://dx.doi.org/10.1109/CTS.2013.6567221.
Peng H, Gates C, Sarma B, Li N, Qi Y, Potharaju R, et al. Using probabilistic generative models for ranking risks of android apps. In: ACM conference on Computer and communications security, ACM; 2012. p. 241–52. http://dx.doi.org/10.1145/2382196.2382224.
Peng S, Yu S, Yang A. Smartphone malware and its propagation modeling: a survey. IEEE Commun Surv Tutor 2014;16(2):925–41. http://dx.doi.org/10.1109/SURV.2013.070813.00214.
Petsas, T., Voyatzis, G., Athanasopoulos, E., Polychronakis, M., Ioannidis, S., Rage against the virtual machine: hindering dynamic analysis of android malware, In: Seventh European Workshop on System Security, pp. 1–6. http://dx.doi.org/10.1145/2592791.2592796.
Portokalidis, G., Homburg, P., Anagnostakis, K., Bos, H., Paranoid android: versatile protection for smartphones, in: 26th Annual Computer Security Applications Conference, pp. 347–356. http://dx.doi.org/10.1145/1920261.1920313.
R. in Motion, Blackberry smartphones. 2014. URL, http://us.blackberry.com/.
Rasthofer S, Arzt S, Bodden E. A machine-learning approach for classifying and categorizing android sources and sinks. In: Network and Distributed System Security (NDSS) Symposium; 2014.
Rastogi V, Yan C, Xuxian J. Catch me if you can: evaluating android anti-malware against transformation attacks. IEEE Trans Inf Forensics Secur 2014;9(1):99–108. http://dx.doi.org/10.1109/TIFS.2013.2290431.
Rastogi, V., Chen, Y., Enck, W., Appsplayground: automatic security analysis of smartphone applications, in: third ACM conference on Data and application security and privacy, pp. 209–220. http://dx.doi.org/10.1145/2435349.2435379.
Reina, A., Fattori, A., Cavallaro, L., A system call-centric analysis and stimulation technique to automatically reconstruct android malware behaviors, In: 6th European Workshop on Systems Security. http://security.di.unimi.it/~joystick/pubs/eurosec13.pdf.
Rosen, S., Qian, Z., Mao, Z. M., Appprofiler: a flexible method of exposing privacy-related behavior in android applications to end users, In: third ACM conference on Data and application security and privacy, pp. 221–232. http://dx.doi.org/10.1145/2435349.2435380.
Sahs J, Khan L. A machine learning approach to android malware detection. In: European Intelligence and Security Informatics Conference (EISIC). IEEE; 2012. p. 141–7. URL, http://dx.doi.org/10.1109/EISIC.2012.34.
Samra AAA, Yim K, Ghanem OA. Analysis of clustering technique in android malware detection. In: Seventh International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS); 2013. p. 729–33.
Sands W. Walker sands mobile traffic report q3 2013. 2013. URL, http://www.walkersandsdigital.com/Walker-Sands-Mobile-Traffic-Report-Q3-2013.
Sanz, B., Santos, I., Ugarte-Pedrero, X., Laorden, C., Nieves, J., Bringas, P., Anomaly detection using string analysis for android malware detection, In: International Joint Conference SOCO13-CISIS13-ICEUTE13, Advances in Intelligent Systems and Computing, pp. 469–478. URL http://dx.doi.org/10.1007/978-3-319-01854-6_48.
B. Sanz, I. Santos, C. Laorden, X. Ugarte-Pedrero, P. G. Bringas, On the automatic categorisation of android applications, in: 2012 IEEE Consumer Communications and Networking Conference (CCNC), pp. 149–153. http://dx.doi.org/10.1109/CCNC.2012.6181075.
Sanz B, Santos I, Laorden C, Ugarte-Pedrero X, Bringas P, Álvarez G. PUMA: permission usage to detect malware in android. book section 30. Advances in intelligent systems and computing, Vol. 189. Berlin Heidelberg: Springer; 2013. p. 289–98. URL, http://dx.doi.org/10.1007/978-3-642-33018-6_30.
Sanz B, Santos I, Laorden C, Ugarte-Pedrero X, Nieves J, Bringas PG, et al. Mama: manifest analysis for malware detection in android. Cybern Syst 2013;44(6–7):469–88. URL, http://dx.doi.org/10.1080/01969722.2013.803889.
Sanz, B., Santos, I., Ugarte-Pedrero, X., Laorden, C., Nieves, J., Bringas, P. G., Instance-based anomaly method for android malware detection, In: 10th International Conference on Security and Cryptography, pp. 387–394. http://paginaspersonales.deusto.es/isantos/publications/2013/sanz_2013_Instance.pdf.
Sarma, B. P., Li, N., Gates, C., Potharaju, R., Nita-Rotaru, C., Molloy, I., Android permissions: a perspective combining risks and benefits, In: 17th ACM symposium on Access Control Models and Technologies, pp. 13–22. http://dx.doi.org/10.1145/2295136.2295141.
Sasnauskas R, Regehr J. Intent fuzzer: crafting intents of death. 2014. http://dx.doi.org/10.1145/2632168.2632169.
Security_Watch. Google glass malware: It's coming. 2013. URL, http://securitywatch.pcmag.com/mobile-security/313703-google-glass-malware-it-s-coming.
Seo S-H, Gupta A, Mohamed Sallam A, Bertino E, Yim K. Detecting mobile malware threats to homeland security through static analysis. J Netw Comput Appl 2014;38:43–53. URL, http://www.sciencedirect.com/science/article/pii/S1084804513001227.
Shabtai, A., Elovici, Y., Applying behavioral detection on android-based devices, In: Mobile Wireless Middleware, Operating Systems, and Applications, Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, pp. 235–249. URL http://dx.doi.org/10.1007/978-3-642-17758-3_17.
Shabtai A, Fledel Y, Elovici Y. Automated static code analysis for classifying android applications using machine learning. In: International Conference on Computational Intelligence and Security (CIS); 2010. p. 329–33. http://dx.doi.org/10.1109/CIS.2010.77.
Shabtai A, Kanonov U, Elovici Y. Intrusion detection for mobile devices using the knowledge-based, temporal abstraction method. J Syst Softw 2010;83(8):1524–37. URL, http://www.sciencedirect.com/science/article/pii/S0164121210000762.
Shabtai A, Tenenboim-Chekina L, Mimran D, Rokach L, Shapira B, Elovici Y. Mobile malware detection through analysis of deviations in application network behavior. Comput Secur June 2014;43:1–18. URL, http://www.sciencedirect.com/science/article/pii/S0167404814000285.
Shalaginov A, Franke K. Automatic rule-mining for malware detection employing neuro-fuzzy approach. Report. Norwegian Information Security Laboratory; 2014. URL, http://tapironline.no/fil/vis/1421.
Sophos. Angry birds malware – firm fined £50,000 for profiting from fake android apps. 2012. URL, http://nakedsecurity.sophos.com/2012/05/24/angry-birds-malware-fine/.
Sophos. Security threat report 2013. Report. 2013. URL, http://www.sophos.com/en-us/medialibrary/PDFs/other/sophossecuritythreatreport2013.pdf.
Spreitzenbarth, M., Freiling, F., Echtler, F., Schreck, T., Hoffmann, J., Mobile-sandbox: having a deeper look into android applications, in: 28th Annual ACM Symposium on Applied Computing, pp. 1808–1815. http://dx.doi.org/10.1145/2480362.2480701.
Su, X., Chuah, M., and Tan, G., Smartphone dual defense protection framework: Detecting malicious applications in android markets, in: 2012 Eighth International Conference on Mobile Ad-hoc and Sensor Networks (MSN), pp. 153–160. http://dx.doi.org/10.1109/MSN.2012.43.
Suarez-Tangil G, Tapiador J, Peris-Lopez P, Ribagorda A. Evolution, detection and analysis of malware for smart devices. IEEE Commun Surv Tutor 2013;PP(99):1–27. http://dx.doi.org/10.1109/SURV.2013.101613.00077.
Suarez-Tangil G, Tapiador JE, Peris-Lopez P, Blasco J. Dendroid: a text mining approach to analyzing and classifying code structures in android malware families. Expert Syst Appl 2014;41(4, Part 1):1104–17. URL, http://www.sciencedirect.com/science/article/pii/S0957417413006088.
Symantec. Android madware and malware trends. 2013. URL, http://www.symantec.com/connect/blogs/android-madware-and-malware-trends.
Symantec. The future of mobile malware. 2014. URL, http://www.symantec.com/connect/blogs/future-mobile-malware.
Tchakounté F, Dayang P. System calls analysis of malwares on android. Int J Sci Technol 2013;2(9):669–74.
Techcrunch. Android Accounted For 79% Of All Mobile Malware In 2012, 96% In Q4 Alone. 2013. URL, http://techcrunch.com/2013/03/07/f-secure-android-accounted-for-79-of-all-mobile-malware-in-2012-96-in-q4-alone/.
Teufl P, Ferk M, Fitzek A, Hein D, Kraxberger S, Orthacker C. Malware detection by applying knowledge discovery processes to application metadata on the android market (Google play). 1st April 2014, URL, http://dx.doi.org/10.1002/sec.675.
The.Register. Earn £8,000 a month with bogus apps from Russian malware factories. 2013. URL, http://www.theregister.co.uk/2013/08/05/mobile_malware_lookout/.
Unuchek R. The first mobile encryptor trojan. 2014. URL, http://securelist.com/blog/mobile/63767/the-first-mobile-encryptor-trojan/.
Veen V v d. Dynamic analysis of android malware. Thesis. 2013. URL, http://tracedroid.few.vu.nl/thesis.pdf.
Vidas, T., Christin, N., Evading android runtime analysis via sandbox detection, In: 9th ACM symposium on Information, computer and communications security, pp. 447–458. http://dx.doi.org/10.1145/2590296.2590325.
Virustotal. Antivirus. 2013. URL, https://www.virustotal.com/en/file/3684a199b0dd9504fe331015f7f23a24414773c43c89e6112f9dd2f2f00bc053/analysis/.
Vural, I., Venter, H., Mobile botnet detection using network forensics, In: Third Future Internet Symposium, vol. 6369 of Lecture notes in computer science, Springer Berlin Heidelberg, pp. 57–67. URL http://dx.doi.org/10.1007/978-3-642-15877-3_7.
Walenstein, A., Deshotels, L., Lakhotia, A., Program structure-based feature selection for android malware analysis, In: 4th International Conference, MobiSec 2012, Vol. 107 of Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, pp. 51–52. URL http://dx.doi.org/10.1007/978-3-642-33392-7_5.
Wang, Y., Zheng, J., Sun, C., Mukkamala, S., Quantitative security risk assessment of android permissions and applications, In: 27th Annual IFIP WG 11.3 Conference, DBSec 2013, Vol. 7964 of Lecture Notes in Computer Science, pp. 226–241. URL http://dx.doi.org/10.1007/978-3-642-39256-6_15.
Wei, X., Gomez, L., Neamtiu, I., Faloutsos, M., Profiledroid: multi-layer profiling of android applications, In: 18th annual international conference on Mobile computing and networking, pp. 137–148. http://dx.doi.org/10.1145/2348543.2348563.
Wu D-J, Mao C-H, Wei T-E, Lee H-M, Wu K-P. Droidmat: android malware detection through manifest and api calls tracing. In: Seventh Asia Joint Conference on Information Security (Asia JCIS). IEEE; 2012. p. 62–9. URL, http://dx.doi.org/10.1109/AsiaJCIS.2012.18.
Wu, C., Zhou, Y., Patel, K., Liang, Z., Jiang, X., Airbag: Boosting smartphone resistance to malware infection, in: 21st Annual Network and Distributed System Security Symposium (NDSS'14). URL, www.yajin.org/papers/ndss14_airbag.pdf.
Xiaoming, K., Qiaoyan, W., Intrusion detection model based on android, In: 2011 4th IEEE International Conference on Broadband Network and Multimedia Technology (IC-BNMT), pp. 624–628. http://dx.doi.org/10.1109/ICBNMT.2011.6156010.
Xu J, Yu Y, Chen Z, Cao B, Dong W, Guo Y, et al. Mobsafe: cloud computing based forensic analysis for massive mobile applications using data mining. Tsinghua Sci Technol 2013;18(4):418–27. http://dx.doi.org/10.1109/TST.2013.6574680.
Yajin Z, Xuxian J. Dissecting android malware: characterization and evolution. In: IEEE Symposium on Security and Privacy (SP); 2012. p. 95–109. http://dx.doi.org/10.1109/SP.2012.16.
Yan, L. K., Yin, H., Droidscope: seamlessly reconstructing the os and dalvik semantic views for dynamic android malware analysis, In: 21st USENIX conference on Security symposium, pp. 29–29.
Yerima SY, Sezer S, McWilliams G, Muttik I. A new android malware detection approach using bayesian classification. In: IEEE 27th International Conference on Advanced Information Networking and Applications (AINA); 2013. p. 121–8. http://dx.doi.org/10.1109/AINA.2013.88.
Yerima SY, Sezer S, McWilliams G. Analysis of bayesian classification-based approaches for android malware detection. IET Inf Secur 2014;8(1):25–36.
Yu W, Chen Z, Xu G, Wei S, Ekedebe N. A threat monitoring system for smart mobiles in enterprise networks. 2013. http://dx.doi.org/10.1145/2513228.2513266.
Zhang Y, Yang M, Xu B, Yang Z, Gu G, Ning P, et al. Vetting undesirable behaviors in android apps with permission use analysis. In: ACM SIGSAC conference on Computer & communications security; 2013. p. 611–22. http://dx.doi.org/10.1145/2508859.2516689.
Zhao, M., Ge, F., Zhang, T., Yuan, Z., Antimaldroid: An efficient svm-based malware detection framework for android, In: Second International Conference ICICA, Communications in Computer and Information Science, pp. 158–166. URL http://dx.doi.org/10.1007/978-3-642-27503-6_22.
Zhemin, Y., Min, Y., Leakminer: Detect information leakage on android with static taint analysis, in: Third World Congress on Software Engineering (WCSE), pp. 101–104. http://dx.doi.org/10.1109/WCSE.2012.26.
Zheng M, Sun M, Lui J. Droidanalytics: a signature based analytic system to collect, extract, analyze and associate android malware. 2013. URL, http://arxiv.org/abs/1302.7212.
Zheng, C., Zhu, S., Dai, S., Gu, G., Gong, X., Han, X., Zou, W., Smartdroid: an automatic system for revealing ui-based trigger conditions in android applications, In: second ACM workshop on Security and privacy in smartphones and mobile devices, pp. 93–104. http://dx.doi.org/10.1145/2381934.2381950.
Zheng, M., Sun, M., Lui, J. C., Droidray: a security evaluation system for customized android firmwares, in: 9th ACM symposium on Information, computer and communications security, pp. 471–482. http://dx.doi.org/10.1145/2590296.2590313.
Zhou, W., Zhou, Y., Jiang, X., Ning, P., Detecting repackaged smartphone applications in third-party android marketplaces, In: second ACM conference on Data and Application Security and Privacy, pp. 317–326. http://dx.doi.org/10.1145/2133601.2133640.
Zhou, W., Zhou, Y., Grace, M., Jiang, X., Zou, S., Fast, scalable detection of “piggybacked” mobile applications, In: Third ACM conference on Data and application security and privacy, pp. 185–196. http://dx.doi.org/10.1145/2435349.2435377.
Zhou, Y., Wang, Z., Zhou, W., Jiang, X., Hey, you, get off of my market: Detecting malicious apps in official and alternative android markets, In: 19th Annual Network and Distributed System Security Symposium (NDSS), pp. 5–8.
Zonouz S, Houmansadr A, Berthier R, Borisov N, Sanders W. Secloud: a cloud-based comprehensive and lightweight security solution for smartphones. Comput Secur September 2013;37:215–27. URL, http://www.sciencedirect.com/science/article/pii/S016740481300031X.
