Fake Job Listing Detection Using Machine Learning Approach
https://doi.org/10.22214/ijraset.2023.48865
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue I Jan 2023- Available at www.ijraset.com
Abstract: This paper proposes an automated tool that uses machine learning-based classification techniques to detect fraudulent
job posts on the internet. Several ML models are trained to flag fraudulent posts on the web, and their results are compared to
identify the best employment scam detection model. The tool helps detect fake job posts among an enormous number of posts.
I. INTRODUCTION
There are many job advertisements on the internet, even on reputed job advertising sites, that do not appear fake. But after the
selection, the so-called recruiters start asking for money and bank details. Many candidates fall into this trap and lose a lot
of money, and sometimes their current job. It is therefore important to identify whether a job advertisement posted on a site is
real or fake. Identifying this manually is very difficult and almost impossible. Machine learning can be applied to train a model
for fake job classification.
Such a model can be trained on previous real and fake job advertisements and can then identify a fake job accurately [7].
1) According to the Federal Trade Commission, Americans were scammed out of $68 million through fake business and job
opportunities in the first quarter of 2022.
2) Unemployment drives many job scams. Many websites connect recruiters to suitable candidates, and fake recruiters sometimes
post jobs on these portals with the motive of getting money. This problem occurs on many job portals. People then shift to a
new portal in search of a real job, but the fake recruiters join that portal as well. Hence it is important today to
distinguish real jobs from fake ones.
3) PwC’s Global Economic Crime and Fraud Survey 2022 shows good news: the proportion of organizations
experiencing fraud has remained relatively steady since 2018 [3].
4) However, the survey of 1,296 executives across 53 countries and regions found a rising threat from external perpetrators—bad
actors that are quickly growing in strength and effectiveness [3].
5) Nearly 70% of organizations experiencing fraud reported that the most disruptive incident came via an external attack or
collusion between external and internal sources.
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 1541
B. E-mail Spam
Useless and unwanted mail from unknown or known sources often arrives in the user’s mailbox as e-mail spam. The result is
unnecessary consumption of the user’s storage space. To resolve this issue, e-mail providers such as Gmail and Outlook provide
spam filtering services that use neural network methods.
V. MODEL CALCULATION
The models will be evaluated based on two metrics:
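1) Accuracy: This metric is defined by the formula
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP and TN are the numbers of correctly classified fake (positive) and real (negative) jobs, FP is the number of real jobs
classified as fake, and FN is the number of fake jobs classified as real. As the formula suggests, this metric is the ratio of
all correctly categorized data points to all data points. This is particularly useful since we are trying to identify both real
and fake jobs, unlike a scenario where only one category is important. There is, however, one drawback to this metric. Machine
learning algorithms tend to favor dominant classes. Since our classes are highly imbalanced, a high accuracy would mainly reflect
how well the model categorizes the negative class (real jobs).
2) F1-Score: The F1 score combines precision and recall into a single measure of a model’s performance on a dataset. The formula
for this metric is
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \quad \text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN}$$
F1-score is used because in this scenario both false negatives and false positives are crucial. The model needs to identify both
categories with the highest possible score since both have high costs associated with them.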
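The drawback of accuracy on imbalanced classes can be illustrated with a short sketch. It assumes the usual definitions of the two metrics, with fake jobs as the positive class, and evaluates a hypothetical classifier that labels every posting as real on a test set of 3265 real and 231 fake jobs. The function name `classification_metrics` is an illustrative choice, not part of the paper's code.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    Positive class = fake job, negative class = real job.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical classifier that predicts "real" for every posting:
# all 3265 real jobs are true negatives, all 231 fake jobs are false negatives.
always_real = classification_metrics(tp=0, tn=3265, fp=0, fn=231)
print(f"accuracy = {always_real['accuracy']:.4f}, F1 = {always_real['f1']:.4f}")
# prints: accuracy = 0.9339, F1 = 0.0000
```

A model that never detects a single fake job still reaches roughly 93% accuracy here, while its F1 score is zero, which is why both metrics are reported.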
VI. RESULTS
A. Model Evaluation and Validation
The final model used for this analysis is SGD (stochastic gradient descent). This choice is based on its metrics compared with
those of the baseline model. The outcomes of the baseline model and SGD are presented in the table below:
Based on these metrics, SGD performs slightly better than the baseline model. For this reason, SGD is chosen as the final
model.
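To make the idea of an SGD-trained classifier concrete, the following is a minimal sketch, not the paper's pipeline: a bag-of-words logistic regression fitted with plain stochastic gradient descent in pure Python. The toy postings, the learning rate, the number of epochs, and the names `train_sgd` and `predict` are all assumptions made for illustration. A real experiment would use the full dataset and a library implementation.

```python
import math
import random

# Toy, made-up postings for illustration only; label 1 = fake, 0 = real.
POSTS = [
    ("earn money fast pay registration fee send bank details", 1),
    ("no interview needed pay deposit to confirm offer", 1),
    ("work from home earn money send bank details today", 1),
    ("software engineer python experience required join our team", 0),
    ("accountant position bachelor degree experience required", 0),
    ("marketing manager join our team competitive salary", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def train_sgd(posts, epochs=200, lr=0.5, seed=0):
    """Fit a bag-of-words logistic regression with plain SGD."""
    rng = random.Random(seed)
    weights = {}  # one weight per word, created on first use
    bias = 0.0
    data = [(set(tokenize(text)), label) for text, label in posts]
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in random order each epoch
        for words, label in data:
            score = bias + sum(weights.get(w, 0.0) for w in words)
            prob = 1.0 / (1.0 + math.exp(-score))  # predicted P(fake)
            error = label - prob  # negative gradient of the log-loss w.r.t. score
            bias += lr * error
            for w in words:
                weights[w] = weights.get(w, 0.0) + lr * error
    return weights, bias


def predict(text, weights, bias):
    """Return 1 (fake) if the linear score is positive, else 0 (real)."""
    score = bias + sum(weights.get(w, 0.0) for w in set(tokenize(text)))
    return 1 if score > 0 else 0


weights, bias = train_sgd(POSTS)
train_predictions = [predict(text, weights, bias) for text, _ in POSTS]
print(train_predictions)
print(predict("pay fee and send bank details", weights, bias))
```

Each update uses a single posting, which is what distinguishes SGD from batch gradient descent. Words that occur only in fake postings accumulate positive weights, and words that occur only in real postings accumulate negative ones.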
VII. CONCLUSION
A. Free-Form Visualization
A confusion matrix can be used to evaluate the quality of the project. The project aims to identify real and fake jobs.
The confusion matrix above displays the following values: the categorized label, the number of data points categorized under the
label, and the percentage of the data represented in each category. The test set has a total of 3265 real jobs and 231 fake jobs.
The confusion matrix shows that the model identifies real jobs 99.01% of the time, whereas fraudulent jobs are identified only
73.5% of the time. Only 2% of the time does the model fail to identify the class correctly. This shortcoming was discussed
earlier: machine learning algorithms tend to favor the dominant class.
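As a rough illustration of how such a confusion matrix is tabulated, the sketch below counts (true label, predicted label) pairs and prints each cell with its share of the test set. The per-cell counts are assumptions: they are obtained by rounding the reported identification rates (99.01% and 73.5%) against the reported class sizes, so they approximate the original figure rather than reproduce it.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


# Class sizes reported for the test set.
n_real, n_fake = 3265, 231
# Assumption: correct counts reconstructed by rounding the reported rates.
real_ok = round(0.9901 * n_real)
fake_ok = round(0.735 * n_fake)

# Synthetic label lists that reproduce those approximate counts.
y_true = ["real"] * n_real + ["fake"] * n_fake
y_pred = (["real"] * real_ok + ["fake"] * (n_real - real_ok)
          + ["fake"] * fake_ok + ["real"] * (n_fake - fake_ok))

counts = confusion_counts(y_true, y_pred)
total = len(y_true)
for (true, pred), n in sorted(counts.items()):
    print(f"true={true:4} predicted={pred:4} count={n:5d} share={n / total:.2%}")

# Per-class identification rate: correct predictions divided by class size.
real_rate = counts[("real", "real")] / n_real
fake_rate = counts[("fake", "fake")] / n_fake
print(f"real identified: {real_rate:.2%}, fake identified: {fake_rate:.2%}")
```

Reading the rates per class rather than over the whole test set makes the weaker performance on the minority (fake) class visible.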
REFERENCES
[1] S. Dutta and S. K. Bandyopadhyay, “Fake Job Recruitment Detection Using Machine Learning Approach,” International Journal of
Engineering Trends and Technology (IJETT), vol. 68, no. 4, Apr. 2020.
[2] A. Srivastava, “Fake Job Posting Prediction using machine learning,” GitHub.
[3] PwC, “PwC’s Global Economic Crime and Fraud Survey,” https://www.pwc.com/gx/en/services/forensics/economic-crime-survey.html
[4] Kaggle Datasets, https://www.kaggle.com/shivamb/real-or-fake-fake-jobposting-prediction
[5] D. E. Walters, “Bayes’s Theorem and the Analysis of Binomial Random Variables,” Biometrical J., vol. 30, no. 7, pp. 817–825, 1988, doi:
10.1002/bimj.4710300710
[6] Edureka, “Fake job listing detection using ML and NLP.”
[7] Analytics India Mag, https://analyticsindiamag.com/classifying-fake-and-real-job-advertisements-using-machine-learning/