
International Journal of Trendy Research in Engineering and Technology

Volume 4 Issue 7 December 2020


ISSN NO 2582-0958
____________________________________________________________________________

DETECTION OF MALICIOUS ANDROID APPS USING MACHINE LEARNING TECHNIQUES
Cline Walter Colaco, Mandar Deepak Bagwe, Siddharth Amit Bose
Final Year UG Students, Dept. of Computer Engineering, Xavier Institute of Engineering, Mumbai - 400016

ABSTRACT

Lately, the problem of dangerous malware on mobile devices has been spreading rapidly, particularly repackaged Android malware. Although analyzing Android malware with dynamic analysis can give a comprehensive view, it still incurs a high cost in environment deployment and manual investigation effort. Android is the most popular openly available smartphone OS, and its permission-declaration access control mechanism cannot detect the behavior of malware. Detecting such malware presents unique challenges, owing to the limited resources available and the limited privileges granted to the user, but it also presents unique opportunities in the required metadata attached to every application. In this article, we present a machine learning-based system for detecting malware on Android devices. We propose a code-behavior, signature-based malware detection framework using an SVM algorithm, which can detect malicious code and its variants effectively at runtime and extend the malware characteristics database dynamically. Experimental results show that the approach has a high detection rate and low false positive and false negative rates, and that its power and performance impact on the original system is negligible. Our system extracts a number of features and trains a Support Vector Machine offline (off-device) in order to leverage the greater computing power of a server or cluster of servers.

Keywords: Malware analysis, Machine Learning, Neural Networks, Support Vector Machine.

INTRODUCTION

Malware, short for malicious software, is a general term used to refer to a variety of forms of hostile or intrusive software such as viruses, worms, spyware, Trojan horses, rootkits, and backdoors. A typical feature of malware is that it is specifically designed to damage, disrupt, steal, or in general inflict harmful or illegitimate actions. Malware can infect virtually any computing system running user programs (or applications), and the propagation and prevention of malware have been well studied for personal computers. For smartphones in particular, current solutions for detecting malware on mobile platforms lag far behind the growing popularity of mobile applications. A recent report has shown that there are more than 87 million mobile applications currently available on the market. The popularity of the Android system has led to an enormous increase in the spread of Android malware. This malware is mainly distributed in markets operated by third parties, but even the Google Android Market cannot guarantee that all of its listed applications are threat free. Threats for Android include Phishing, Banking Trojans, Spyware, Bots, Root Exploits, SMS Fraud, Premium Dialers, and Fake Installers. There have also been reports of Download-Trojan apps that download their malicious code after installation, which means that these apps cannot be easily detected by Google's technology during publication in the Google Android Market.

This paper focuses on an SVM-based active learning framework for smartphone malware detection and validates the effectiveness of the method on the Android system. Tests show that the proposed method has good applicability and scalability, can be applied to monitoring a range of popular malware, and can detect unknown malware. Its impact on system performance is small enough that the effect on the original system's capability can be ignored. In summary, malware applications generally use the following three types of penetration techniques for installation, activation, and running on an Android device.

Repackaging is one of the most common techniques malware developers use to install malicious applications on an Android platform. These approaches typically start from popular legitimate apps and misuse them as malware. The developers download popular apps, disassemble them, add their own malicious code, and then re-assemble and upload the new app to official or alternative markets.

The second technique changes how an application is built and maintained, which makes malware detection harder. Developers may still use repackaging, but instead of enclosing the malicious code within the app, they include an update component that downloads malicious code at runtime.

Downloading is the traditional attack technique: malware developers need to attract users into downloading interesting and enticing apps.

PROBLEM DEFINITION

To create an efficient system that curbs the threat of Android malware by correctly detecting and mitigating malicious APKs, combining permissions and API calls as features to characterize malware, and using machine learning techniques to automatically extract patterns that differentiate benign and malicious apps.

SCOPE

Our software will effectively identify, detect, and categorize apps and safeguard Android mobile devices from malicious apps through an easy-to-use interface, thus avoiding any theft or misuse of the user's data.

REVIEW OF LITERATURE

Initial studies on smartphone malware mainly focused on understanding the threats and behaviors of emerging malware. There has been significant work on the problem of detecting malware on mobile devices. Many approaches monitor the power usage of applications and report abnormal consumption. Others monitor system calls and attempt to detect unusual system call patterns. Other approaches use more traditional comparison with known malware or other heuristics. Signature-based methods, introduced in the mid-90s, are commonly used in malware detection. The main weakness of this type of approach is its inability to detect metamorphic and unseen malware. Rather than using predefined signatures for malware detection, data mining and machine learning techniques provide an effective way to dynamically extract malware patterns. For smartphone-based mobile computing platforms, recent years have witnessed an increasing number of more sophisticated malware attacks such as repackaging. Recent research systematically characterizes existing Android malware from various aspects, including installation methods, activation mechanisms, and the nature of the carried malicious payloads. Based on an evaluation of four representative mobile security software packages over 1200 collected malware samples, its experiments show the weakness of current malware detection solutions and the need to develop next-generation anti-mobile-malware solutions. One existing work used data mining with features generated from Windows executable API calls and achieved good results on a very large dataset of about 35,000 portable executable files. Another behavioral footprinting method provides a dynamic approach to detecting self-propagating malware. All these existing methods have substantially advanced Android malware detection; however, misuse detection does not adapt to novel Android malware and always requires frequent updating of the signatures. Here lies the research gap.

In comparison, our work is motivated by some of the above techniques and approaches, but with a focus on developing simple and effective malware detection approaches, without relying on complex dynamic runtime analysis or any static predefined malware signatures. Our objective is to combine permissions and API calls as features to characterize malware and to use machine learning techniques to automatically extract patterns that differentiate benign and malicious apps. This study is a static analysis that uses features that can be extracted from the source code of the app's .apk files.
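The objective stated above, combining permissions and API calls as features, can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the implementation used in this work: the app names, the feature strings, and the helper names `build_vocabulary` and `encode` are hypothetical, and the feature sets are written by hand rather than extracted from real .apk files.

```python
from typing import Dict, Iterable, List, Set


def build_vocabulary(apps: Iterable[Set[str]]) -> List[str]:
    """Return a sorted list of every feature seen in any app."""
    vocabulary: Set[str] = set()
    for features in apps:
        vocabulary |= features
    return sorted(vocabulary)


def encode(features: Set[str], vocabulary: List[str]) -> List[int]:
    """Encode one app as a binary vector: 1 if the feature is present, else 0."""
    return [1 if name in features else 0 for name in vocabulary]


# Hypothetical, hand-written feature sets (permissions plus API calls per app).
apps: Dict[str, Set[str]] = {
    "flashlight.apk": {"android.permission.CAMERA"},
    "sms_sender.apk": {
        "android.permission.SEND_SMS",
        "android.permission.INTERNET",
        "SmsManager.sendTextMessage",
    },
}

vocabulary = build_vocabulary(apps.values())
vectors = {name: encode(feats, vocabulary) for name, feats in apps.items()}

for name, vector in vectors.items():
    print(name, vector)
```

Every app is mapped onto the same ordered vocabulary, so all vectors have the same length and can be fed to a classifier as fixed-width rows.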

DESCRIPTION

ANALYSIS

Fig 4.1: Use Case Diagram

Fig 4.2: Sequence Diagram

The user selects the APK file to be tested, and it is sent to cloud storage. There, the applications are stored in sequence and wait to be processed. The APK is then sent to the Cloud Engine. The features of the APK, such as permissions and API calls, are extracted and sent to the machine learning model. The features of the APK are thus analyzed, and based on the findings a report is generated and sent back to the user.

Fig 4.3: Flow of the Project

DESIGN

First, we collect a number of malicious and benign Android applications by retrieving their APKs online. Next, we implement feature extraction by enumerating the API calls, permissions, and activities of the APKs to get a fair idea of the behavior of each app. Then we create a dataset from the extracted features by compiling them into CSV files for future use and reference. After that, we select specific features that look like an anomaly or a red flag, collect and store them in new CSV files, and train our machine learning model on them. Finally, our last phase is to test our model with the data and evaluate our findings according to the results.

IMPLEMENTATION METHODOLOGY

(Data Collection and Feature Filtering)
1. Collect all applications into separate folders for benign and suspicious applications respectively.
2. Using the "glob" module in Python, create an array of files for further processing.
3. Analyze each application in the array using the "pyaxmlparser" and "androguard" frameworks.
4. Extract the following in the analysis phase:
a. Permissions
b. Activities
c. Intents
d. API calls
5. Taking these four attributes into consideration, a program maps all attributes to a CSV file and records a class for each application.
6. Once the CSV files are generated, check them for redundancy and, if any is found, eliminate the entire row.
7. Another program extracts the full set of permissions from these APK files. These permissions serve as attributes in the dataset CSV file (a permission is marked 1 if present and 0 otherwise).
8. Each line of the CSV file is read as an N-bit vector; these vectors are the input to the machine learning algorithm.

(Machine Learning)
(SVM)
1. Import the data as a CSV file using the pandas framework.
2. Using a train/test split, divide the entire dataset in a 3:1 ratio (75% of the data for training and 25% for testing).
3. Design an SVM model with the inputs in mind.
4. Select the 'rbf' kernel, as the input/output is binary.
5. Analyze accuracy using a confusion matrix.
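Two of the SVM steps above, the 75%/25% split and the RBF kernel, can be sketched in plain Python. This is only an illustration of the ideas: the helper names `split_75_25` and `rbf_kernel`, the seed, the `gamma` value, and the toy rows are assumptions. In practice a library would do both; scikit-learn, for example, provides `train_test_split` and `SVC(kernel="rbf")`, but the text does not name the library used, so that is also an assumption.

```python
import math
import random
from typing import List, Sequence, Tuple


def split_75_25(rows: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle rows reproducibly and return (train, test) holding 75% / 25%."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 3) // 4
    return shuffled[:cut], shuffled[cut:]


def rbf_kernel(x: Sequence[int], y: Sequence[int], gamma: float = 0.5) -> float:
    """RBF kernel exp(-gamma * ||x - y||^2).

    For 0/1 vectors the squared distance equals the number of differing bits.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Toy dataset: (binary feature vector, class label) pairs, purely illustrative.
rows = [([i % 2, (i // 2) % 2, 1], i % 2) for i in range(8)]
train, test = split_75_25(rows)

print(len(train), len(test))
print(rbf_kernel([1, 0, 1], [1, 0, 1]), rbf_kernel([1, 0, 1], [0, 0, 1]))
```

The kernel value is 1 for identical vectors and decays with every differing bit, which is how an RBF SVM measures similarity between two apps' feature vectors.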
(KNN)
1. Import the data as a CSV file using the pandas framework.
2. Using a train/test split, divide the entire dataset in a 3:1 ratio (75% of the data for training and 25% for testing).
3. Design a KNN model with the input in mind.
4. Use this model along with a decision tree.
5. Use the decision tree for further bifurcation of malware families.
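The KNN step can be sketched as a majority vote among the k nearest training vectors. The sketch below is a minimal version under stated assumptions: it uses Hamming distance (a natural fit for 0/1 vectors, but not stated in the text), k = 3, and hand-made toy data; the decision tree for malware families is not covered.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[int], str]


def hamming(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of positions at which two binary vectors differ."""
    return sum(a != b for a, b in zip(x, y))


def knn_predict(train: List[Sample], query: Sequence[int], k: int = 3) -> str:
    """Label the query by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda sample: hamming(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set (illustrative only): binary feature vectors with labels.
train = [
    ([1, 1, 0, 0], "malware"),
    ([1, 1, 1, 0], "malware"),
    ([1, 0, 1, 0], "malware"),
    ([0, 0, 0, 1], "benign"),
    ([0, 0, 1, 1], "benign"),
    ([0, 1, 0, 1], "benign"),
]

print(knn_predict(train, [1, 1, 1, 1]))
print(knn_predict(train, [0, 1, 1, 1]))
```

An odd k avoids tied votes when there are only two classes.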

Fig 4.5: Confusion Matrix
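A confusion matrix like the one in Fig 4.5 yields the detection rate and the false positive and false negative rates mentioned in the abstract. The sketch below shows the arithmetic on made-up labels (1 = malware, 0 = benign); the numbers are illustrative and are not the results of this work.

```python
from typing import Dict, Sequence


def confusion_matrix(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(actual, predicted):
        if guess == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def rates(c: Dict[str, int]) -> Dict[str, float]:
    """Derive summary rates; assumes both classes occur in the actual labels."""
    total = sum(c.values())
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "detection_rate": c["tp"] / (c["tp"] + c["fn"]),
        "false_positive_rate": c["fp"] / (c["fp"] + c["tn"]),
        "false_negative_rate": c["fn"] / (c["tp"] + c["fn"]),
    }


# Made-up labels for eight apps.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]

matrix = confusion_matrix(actual, predicted)
summary = rates(matrix)
print(matrix)
print(summary)
```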

DETAILS OF HARDWARE AND SOFTWARE


1) Hardware Details
• Cloud Engine
• Cloud Storage
• Android Device

2) Software Details
• Python 3.x
• TensorFlow 2.2.x
• Android Studio 5

Fig 4.4: Model
These are the statistics of the model that we have implemented.

FUTURE SCOPE

Once model training is complete, the next step is to deploy the entire solution on a cloud/server. After that, we plan to build an Android app that makes a backup of the application to be tested and uploads it to the cloud. At the cloud end, the uploaded app is kept in blob storage, and the apps are then processed one by one according to their position in the database. After the analysis is complete, a report is generated stating whether the app is safe or malicious, and it is sent to the user's mobile device. Furthermore, if the app is found to be malicious, the report will pinpoint why by displaying the specific anomalies found in the permissions, API calls, intents, etc.
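The planned flow, apps processed in upload order with a report that lists anomalies for malicious apps, might look roughly like the sketch below. Everything in it is hypothetical: the `Report` class, the `process_queue` helper, the red-flag set, and the stand-in classifier are placeholders for the trained model and cloud storage described above, which are not specified in enough detail to reproduce.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Set, Tuple


@dataclass
class Report:
    app_name: str
    malicious: bool
    anomalies: List[str] = field(default_factory=list)


def process_queue(
    queue: Deque[Tuple[str, Set[str]]],
    classify: Callable[[Set[str]], bool],
    red_flags: Set[str],
) -> List[Report]:
    """Process uploaded apps oldest first and build one report per app."""
    reports = []
    while queue:
        app_name, features = queue.popleft()
        malicious = classify(features)
        # Only malicious apps get a list of the suspicious features found.
        anomalies = sorted(features & red_flags) if malicious else []
        reports.append(Report(app_name, malicious, anomalies))
    return reports


# Hypothetical red-flag features and a stand-in for the trained model.
red_flags = {"android.permission.SEND_SMS", "SmsManager.sendTextMessage"}


def stand_in_model(features: Set[str]) -> bool:
    return "android.permission.SEND_SMS" in features


uploads = deque([
    ("notes.apk", {"android.permission.INTERNET"}),
    ("sms_sender.apk", {"android.permission.SEND_SMS", "android.permission.INTERNET"}),
])

reports = process_queue(uploads, stand_in_model, red_flags)
for report in reports:
    print(report)
```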

CONCLUSION

We have proposed using the permissions and API calls of Android applications to detect malware and malicious code on Android-based mobile platforms. Ours is a novel approach to distinguishing and detecting Android malware with different intentions. It is effective in that it can distinguish variants of Android malware by their distinct purposes. The proposed framework extracts permissions from Android applications and combines them with API calls to characterize each application as a high-dimensional feature vector. By applying learning methods to the collected datasets, we can derive classification models that classify apps as benign or malware. Experiments on real-world data demonstrate the good performance of the framework for malware detection.

APPENDIX

SVM - Support Vector Machine
UI - User Interface
API - Application Programming Interface
KNN - K Nearest Neighbors
CSV - Comma Separated Values
RBF - Radial Basis Function
APK - Android Application Package

www.trendytechjournals.com