Fake App Detection Using Sentiment Analysis
Venkatesh S
Prasanth S
2023 Intelligent Computing and Control for Engineering and Business Systems (ICCEBS) | 979-8-3503-9458-0/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICCEBS58601.2023.10448551
Abstract: In today's environment, fraudulent activity in mobile applications has become a major concern. Because mobile apps continue to grow rapidly across multiple sectors, it is now critical to identify and stop fraudulent activity within them. This paper presents a comprehensive approach to fraud app detection leveraging sentiment analysis techniques. Our goal is to identify fraudulent mobile applications within app stores by analyzing user-generated reviews and ratings. We first compile a dataset of app reviews and prepare the text data for analysis. Sentiment analysis is employed to assign a sentiment score to each review, providing insights into user opinions. We present a machine learning model trained on labeled data to identify whether an app is fraudulent based on the sentiment scores and other relevant criteria. Standard measures such as F1-score, ROC-AUC, accuracy, precision, and recall are used to evaluate performance. Visualizations such as ROC curves and precision-recall curves demonstrate the model's ability to discriminate between fraudulent and non-fraudulent apps. Our results shed light on the effectiveness of our fraud detection system, enabling informed decision-making and potential improvements. This project emphasizes the importance of data-driven approaches in addressing the ongoing challenge of app fraud detection.

Keywords: Fraud Detection, Mobile Applications, Sentiment Analysis, Natural Language Processing, Machine Learning.

I. INTRODUCTION
Fraudulent apps have become a major concern for both app stores and users. These apps can compromise user data, steal money, and harm the reputation of legitimate apps. Traditional methods of detecting fraudulent apps, such as manual reviews and automated scans, are often ineffective and time-consuming. However, with the rise of sentiment analysis, a new approach to detecting fraudulent apps has emerged.

Sentiment analysis is the application of machine learning and natural language processing methods to determine the sentiment expressed in user reviews and ratings. By analyzing this sentiment, it is possible to detect patterns that may indicate fraudulent activity. For example, fraudulent apps may have a high number of negative reviews that are suspiciously similar or that come from fake accounts. Sentiment analysis can help identify these patterns and flag potentially fraudulent apps for further investigation.

Overall, fraud app detection using sentiment analysis is an innovative approach to detecting fraudulent apps that offers several advantages over traditional methods. It is faster, more accurate, and can detect patterns that may be missed by manual reviews and automated scans. As the app market continues to grow, sentiment analysis will likely become an increasingly important tool for app stores and users alike.

1.1. OBJECTIVE
The objective of this study is to compare the performance of sentiment analysis with traditional fraud detection techniques in identifying fraudulent apps. Its main aim is to estimate how likely fraud is by analyzing app ratings and reviews using machine learning algorithms. To improve the precision of fraud identification, the research will also investigate the use of additional data sources, such as user demographics and app usage trends.

1.2. SCOPE
The primary goal of this project is to analyze user reviews and comments from popular mobile applications across a range of categories, including social media, productivity, and gaming. Sentiment analysis will be used to identify potential instances of fraudulent behavior, such as fake reviews or ratings, and these results will be compared to known instances of fraud. The accuracy of the findings and the capacity to spot new fraud cases will be used to gauge how well sentiment analysis works in identifying fraudulent activity.
Authorized licensed use limited to: BMS College of Engineering. Downloaded on October 18,2024 at 17:45:32 UTC from IEEE Xplore. Restrictions apply.
1.3. SYSTEM ANALYSIS
Input: The system collects user reviews and app metadata, preprocesses text data, applies sentiment analysis, and extracts relevant features for each app.
Output: It identifies potentially fraudulent apps, sends alerts to administrators, generates reports with insights on fraudulent trends, provides a user-friendly dashboard for monitoring, and incorporates user feedback for ongoing improvement.

II. LITERATURE SURVEY

Several studies have explored the use of sentiment analysis for detecting fraudulent apps.

In [1], "Discovery of Ranking Fraud for Mobile Apps," H. Zhu, H. Xiong, Y. Ge and E. Chen present a platform to detect ranking fraud in mobile applications. They examine three kinds of evidence (rating-based, ranking-based, and review-based), study an app's ranking, review, and rating behavior through statistical tests, and also propose an optimization-based aggregation method. In [2], "Neural network based approach for Ethereum fraud detection," M. Dahiya, N. Mishra, R. Singh and Pavitra present a neural network approach to Ethereum fraud detection. They prefer neural networks over other algorithms such as logistic regression, SVM, and Gaussian Bayes, and report an accuracy of 97.09%.

In [3], "App Assessment with three phase Evidence using sentiment analysis," Ramnath et al. observe that, with the rapid growth of the mobile industry, many mobile applications have appeared whose genuineness users cannot trust. To address this they derive three parameters; if any one of the parameters is passed, the remaining parameters are not evaluated. They evaluate the results with ranked evidence and ratings. In [4], "Towards a Machine Learning Approach for Detecting Click Fraud in Mobile Advertising," R. Mouawi et al. present a click fraud detection model that uses KNN, ANN, and SVM to classify fraudulent clicks from a set of acquired features. The classifiers are tested using CFC, and the experimental results show accuracy higher than 93%.

In [5], "AdSherlock: Efficient and Deployable Click Fraud Detection for Mobile Applications," C. Cao et al. present AdSherlock, which divides click request identification into online and offline procedures, making it a useful client-side click fraud detection model for mobile apps. They also implement an AdSherlock prototype to evaluate the application performance. In the paper "Fraud application detection using summary risk score," S. Jundhare, P. Gajare, P. Gadekar, A. Aher and S. Wankhade aim to safeguard users through risk communication mechanisms. Protection against malware applications relies to a large degree on decisions made by the user. Finally, they calculate an aggregate fraud rating that helps the user decide whether or not to choose the application.

In [7], "Rank Fraud and Malware Detection in Google Play Using Fairplay," G. Yugeshwaran, S. Kumar M, D. A. Benitta, S. Eliyas and S. Rajan R propose a FairPlay approach that aims to recognize scam and ransomware applications. FairPlay correlates review activity and combines the observed review relations with behavioral and linguistic signals drawn from app store data. FairPlay attains 93% accuracy. In [8], "Fraud app Detection of Google Play store using decision Tree," K. Joshi, S. Kumar, J. Rawat, A. Kumari, A. Gupta and N. Sharma propose a system that compares three classifiers: naive Bayes, logistic regression, and decision tree. The models' recall, precision, accuracy, and F1 score are compared, and the decision tree classifier attains a result of 85%. Sheikh et al. [9], in "An Approach For Detection of fraud application using sentiment analysis," gather reviews and ratings of various applications, store and preprocess the data with machine learning algorithms, and classify the result using sentiment analysis. In [10], "Search Rank Fraud and Malware Detection in Google Play," M. Rahman, M. Rahman, B. Carbunar and D. H. Chau propose the FairPlay system, which discovers and makes use of the traces left behind by fraudsters. FairPlay analyzes review activity, combines detected review relations from the Play Store in a unique way, and predicts with 95% accuracy.

III. EXISTING AND PROPOSED SYSTEM

A. Existing System:
Existing fraud detection techniques typically rely on transactional data, such as credit card transactions, to identify potential instances of fraud. While these techniques can be effective, they are limited in their ability to detect fraud in mobile applications, where transactions may be less frequent or occur outside of the traditional payment system.

Our proposed system for fraud app detection using sentiment analysis offers a unique approach to detecting fraud in mobile applications. By analyzing user reviews, the system can identify potential instances of fraud that may not be captured by traditional transactional data analysis. Additionally, the system can provide insights into the user experience of the application, allowing developers to make improvements and address issues that may be contributing to fraudulent activity.

B. PROPOSED SYSTEM:
The proposed methodology for fraud app detection using sentiment analysis would involve the following steps:

Data collection: Gathering information from multiple sources, including social media posts, email exchanges, and customer reviews, would be the first step. The sentiment analysis model would be trained on this data in order to detect fraud.

Data pre-processing: Before the data can be analyzed, it must be pre-processed to remove unnecessary information, such as stop words and special characters. Additionally, the data must be transformed into a format that the sentiment analysis model can understand.
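The pre-processing step can be illustrated with a minimal sketch. It assumes English reviews, uses a deliberately tiny stop-word list, and takes a bag-of-words count as the "format the model can understand"; the function names `preprocess` and `to_bag_of_words` are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "that", "and", "or",
              "to", "of", "in", "for", "on", "i", "my"}


def preprocess(review: str) -> list[str]:
    """Lowercase the text, replace special characters with spaces, drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", review.lower())
    return [token for token in text.split() if token not in STOP_WORDS]


def to_bag_of_words(tokens: list[str]) -> Counter:
    """Turn a token list into word counts, one simple model-ready format."""
    return Counter(tokens)


tokens = preprocess("This app is GREAT!!! Best app ever :)")
print(tokens)
print(to_bag_of_words(tokens))
```

In practice the word counts would usually be replaced by a vectorizer fitted on the whole review corpus, but the cleaning steps stay the same.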
C. MODULES
learning model in order to perform sentiment analysis.
Here's an example of how to perform sentiment analysis
using the TextBlob library in Python:
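The paper's original listing did not survive extraction, so what follows is only a minimal sketch of review scoring with TextBlob's `sentiment.polarity`. The sample reviews, the `label` helper, and its 0.1 threshold are assumptions made for illustration. If TextBlob is not installed, the sketch falls back to a tiny hand-made lexicon so that it still runs; that fallback is far cruder than TextBlob and is not part of the described system.

```python
import re

try:
    from textblob import TextBlob  # third-party: pip install textblob

    def polarity(review: str) -> float:
        """Polarity in [-1.0, 1.0] as reported by TextBlob."""
        return TextBlob(review).sentiment.polarity

except ImportError:
    # Fallback lexicon (hypothetical) so the sketch runs without TextBlob.
    POSITIVE = {"great", "love", "good", "excellent", "useful"}
    NEGATIVE = {"terrible", "scam", "bad", "fake", "useless"}

    def polarity(review: str) -> float:
        """Crude polarity: (positive - negative) / matched words."""
        words = re.findall(r"[a-z']+", review.lower())
        pos = sum(w in POSITIVE for w in words)
        neg = sum(w in NEGATIVE for w in words)
        return (pos - neg) / max(1, pos + neg)


def label(score: float, threshold: float = 0.1) -> str:
    """Map a polarity score to a coarse label; the threshold is an assumption."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


reviews = [
    "This app is great, I love it",
    "Terrible app, it is a scam",
]
for review in reviews:
    score = polarity(review)
    print(f"{score:+.2f}  {label(score)}  {review}")
```

The per-review scores can then be aggregated per app (for example, as a mean polarity) and passed to the fraud detection module as features.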
Fraud detection module:
This module trains a machine learning model to classify reviews as either fraudulent or legitimate, for example a logistic regression model built with scikit-learn in Python.

D. RESULT:
App 1 Output:
Fig-4: Analyzing The App Reviews
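As a minimal sketch of how the fraud detection module could be trained and scored with scikit-learn, the example below fits a logistic regression model on synthetic per-app features and computes the standard measures named in the abstract. The two features (mean review polarity and share of near-duplicate reviews), the synthetic data, and the split ratio are all assumptions made for illustration; they are not the paper's dataset or reported configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200  # apps per class in this synthetic example

# Hypothetical features per app: [mean review polarity, share of near-duplicate reviews].
legit = np.column_stack([rng.normal(0.2, 0.2, n), rng.normal(0.05, 0.03, n)])
fraud = np.column_stack([rng.normal(-0.3, 0.2, n), rng.normal(0.40, 0.10, n)])
X = np.vstack([legit, fraud])
y = np.array([0] * n + [1] * n)  # 0 = legitimate, 1 = fraudulent

# Hold out a stratified test set so both classes appear in it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]  # estimated probability of fraud

# Confusion-matrix counts: true negatives, false positives, false negatives, true positives.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the synthetic classes are well separated, the scores printed here say nothing about real-world performance; the sketch only shows how the pieces fit together.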
as F1-score, recall, accuracy, and precision. Our dataset was divided into training, validation, and test sets. After training the model and tuning hyperparameters on the validation data, we assessed our model on the test set, achieving promising results. The confusion matrix offered a clear picture of true positives, true negatives, false positives, and false negatives. Visualizations such as ROC curves made it easier to understand the model's performance across different thresholds. We also considered cross-validation for robustness and benchmarked our model against industry standards. As a result, we obtained valuable insights, enabling us to make informed decisions and iterate on our system for improved fraud detection capabilities.

[2] M. Dahiya, N. Mishra, R. Singh and Pavitra, "Neural network based approach for Ethereum fraud detection," 2023 4th International Conference on Intelligent Engineering and Management (ICIEM), London, United Kingdom, 2023, pp. 1-4, doi: 10.1109/ICIEM59379.2023.10166745.

[3] M. Ramnath and C. Y. Rubavathi, "App assessment with three phase evidence system using sentiment analysis," 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), 2021, pp. 1180-1183.

[4] R. Mouawi, M. Awad, A. Chehab, I. H. E. Hajj and A. Kayssi, "Towards a Machine Learning Approach for Detecting Click Fraud in Mobile Advertizing," 2018 International Conference on Innovations in Information Technology (IIT), Al Ain, United Arab Emirates, 2018, pp. 88-92, doi: 10.1109/INNOVATIONS.2018.8605973.