Analyzing The Performance of Novel Logistic Regression Over Linear Regression Algorithms
employment scam from a large volume of data. The author presented various algorithms, such as KNN, RF [9], Logistic Regression (LR), and Bi-LSTM, to identify bogus jobs among the fake posts available on the internet [10]. This approach has attracted researchers and prompted numerous studies on employment scams. [11] presented an approach to distinguish employment scams from legitimate advertisements. For this analysis, the EMSCAD dataset was chosen, which contains multiple fake job posts among organization advertisements. The ML algorithms used in this study are NB, RF, DT and LR. The research done by [12] addressed the identification of vulnerabilities in text form, using NB for feature selection and classification.

The rising unemployment rate is exploited by fraudsters, who target vulnerable job seekers by posing as an organization and advertising vacancies and salary packages. This in turn attracts job seekers and deceives them. The purpose of the study is to identify and detect fraudsters using two different ML algorithms with an improved accuracy rate. A limitation of this study is that job boards need to balance the need to detect fake jobs with the legal requirements of privacy and data protection laws, which can limit the amount of data that can be used to train detection models.

II. MATERIALS AND METHODS

The study compares a novel Logistic Regression classifier with Linear Regression, using a sample size of 20. The Python compiler was used for detecting counterfeit job postings. IBM SPSS software, version 26, was used for the statistical analysis in this investigation. G*Power software was used with 80% statistical power, a 0.05 significance level, and a 95% confidence level to determine the required sample size for each group.

Jupyter Notebook was used for organizing and executing the designated tasks. The platform supports multiple operations and was used here primarily for deep learning on Windows 11. The hardware consisted of a machine with 5 processors and 8 GB of RAM, and the system used a 64-bit architecture. The programming was done in Python. The dataset from Kaggle is processed in the background to run the code and produce precise outcomes.

A. Logistic Regression (LR)

Logistic Regression is a simple and efficient method for binary and linear classification problems [13]. The classifier is primarily used for binary classification tasks and for estimating class probabilities, owing to its association with the logistic distribution (Hoss Belyadi 2021). It is simple to implement and works well with linearly separable classes. The purpose of the classifier is to find decision boundaries among classes [14]. It also supports multiclass classification tasks. A linear combination of the features is computed and passed to the non-linear sigmoid function.

Algorithm for Logistic Regression

Step 1: Import a dataset with a variety of job advertisements.
Step 2: Perform pre-processing.
Step 3: Gather features from the input.
Step 4: Perform classification based on the selected features.
Step 5: Compute the accuracy value for ten samples.

B. Linear Regression

Linear Regression is a supervised learning technique used to perform regression tasks; this statistical method models the relationship between two variables [15]. The performance of this regression model is high compared to other statistical learning methods, even though it is complicated in nature. It uses a training dataset to estimate the linear influence of different attributes on a target value; these linear influences are represented as coefficients of each attribute, resulting in a regression model [16]. Through this model, unknown target values are predicted from the chosen attributes together with the calculated coefficients.

Algorithm for Linear Regression

Step 1: Import and read the dataset.
Step 2: Remove unwanted content from the dataset using several techniques, including noise removal.
Step 3: Group the dataset's classes.
Step 4: Choose a hyperplane.
Step 5: Perform classification.

III. STATISTICAL ANALYSIS

Statistical analysis was performed using SPSS (IBM 2021). The proposed model was analyzed statistically, with the Python Google Colab tool used to assess the algorithm's efficiency. Impostors pretending to be well-known companies are promoting fake job openings. The accuracy values are the dependent variables. The results of the Novel Logistic Regression and Linear Regression methods were compared using an independent t-test (Elliott and Woodward, 2019).

IV. RESULTS AND DISCUSSION

Table 1 compares the accuracy of Logistic Regression and Linear Regression. Based on the values in Table 2, the Linear Regression technique achieved an average accuracy of 92.94% with a standard deviation of 3.04, whereas the Logistic Regression method achieved an average accuracy of 95.79%. Compared to logistic regression, linear regression showed a significantly lower standard error of 0.96178. Table 3 shows a significant difference in accuracy between the two suggested phases and the typical single stage, as indicated by the independent sample test. The average detection accuracy is typically within one standard deviation. The t-test yielded a significant value of p = 0.002, indicating statistical significance at a level below 0.05.

Table 1: Accuracy comparison between Conventional and Proposed methods

Table 2 displays the group's mean and standard deviation, as well as the accuracy of the Logistic Regression and Linear Regression algorithms, which were determined to be 95.79% and 92.94%, respectively. Linear regression exhibited a notably decreased standard error of 0.39776 in comparison to logistic regression.

In Figure 1, we can see how the Logistic Regression and Linear Regression methods compare in terms of accuracy. The accuracy is represented by the Y-axis, while the X-axis shows the various Logistic Regression and Linear Regression techniques.
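The statistical comparison described in this section can be sketched in a few lines of Python. This is a minimal illustration and not the authors' SPSS procedure: it computes a standard error as SD / sqrt(n) and a pooled-variance independent-samples t statistic using only the standard library. The group size n = 10 is an assumption inferred from the "ten samples" in the algorithm steps and the total sample size of 20, the helper names `standard_error` and `independent_t` are hypothetical, and the two accuracy lists are made-up values for illustration, not the paper's data. Under that assumption, the reported standard deviation of 3.04 gives a standard error of about 0.961, close to the reported 0.96178.

```python
import math
import statistics


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)


def independent_t(a: list[float], b: list[float]) -> tuple[float, int]:
    """Student's independent-samples t statistic (equal variances assumed).

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(a), len(b)
    # statistics.variance is the sample variance (n - 1 denominator).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, df


# Reported SD for Linear Regression, with an ASSUMED n = 10 per group.
se_linear = standard_error(3.04, 10)
print(f"SE from SD=3.04, n=10: {se_linear:.4f}")

# HYPOTHETICAL accuracy samples (percent), for illustration only.
logistic_acc = [95.2, 96.1, 94.8, 96.5, 95.9, 95.0, 96.8, 95.4, 96.2, 95.7]
linear_acc = [90.1, 95.3, 88.9, 96.0, 93.2, 91.5, 97.1, 89.8, 94.6, 92.4]

t_stat, df = independent_t(logistic_acc, linear_acc)
print(f"t = {t_stat:.3f}, df = {df}")
```

The sketch stops at the t statistic because the p-value requires the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) could be used for that step.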
[17] used 13 different parameters to identify fraudsters with an accuracy of around 98%. The study considered the salary range, type of organization, company profile, educational qualification and, finally, the role of job seekers. [18] used the EMSCAD dataset to identify fake jobs; a combination of ML algorithms such as Support Vector Machine (SVM) and RF achieved an accuracy of over 97%. [19] classified text to identify fake news, achieving an accuracy of 96%. [20] proposed an SVM algorithm to build a website that predicts fake job posts with an average accuracy of 97%. With an accuracy of 96%, [21] classified and identified email spam using logistic regression. For spam mail detection, [22] suggested a hybrid classifier approach using the DNN and Bi-LSTM classifier technique and achieved an accuracy of 98.67%.

Nonlinear problems are solved least effectively with a linear decision surface. In future, an online fake job post dataset could be studied with a lower energy consumption method to achieve higher performance. Developing accurate fake job detection models requires significant resources, including data labeling, model development and training, and ongoing monitoring and updates to stay ahead of evolving fraud tactics. This helps upcoming researchers understand the computational analysis of fraudster detection and creates a path for further research on employment scam identification. Deep learning techniques [9], [23] can be applied to analyze large datasets of job postings and applicant resumes to

V. CONCLUSION

significance.

[1] R. Wang, N. Cao, Y. Guo, S. Ji, and S. Kumar, “A Comparative Analysis of Fraudulent Recruitment Advertisement Detection Methods in the IoT Environment,” Journal of Sensors, vol. 2022. pp. 1–11, 2022. doi: 10.1155/2022/4583512.
[2] S. Lal, R. Jiaswal, N. Sardana, A. Verma, A. Kaur, and R. Mourya, “ORFDetector: Ensemble Learning Based Online Recruitment Fraud Detection,” 2019 Twelfth International Conference on Contemporary Computing (IC3). 2019. doi: 10.1109/ic3.2019.8844879.
[3] S. Vidros, C. Kolias, and G. Kambourakis, “Online recruitment services: another playground for fraudsters,” Computer Fraud & Security, vol. 2016, no. 3. pp. 8–13, 2016. doi: 10.1016/s1361-3723(16)30025-2.
[4] D. V. G. Krishnan, S. Hemamalini, P. Cheraku, K. H. Priya, S. Ganesan, and D. R. Balamanigandan, “Attack Detection using DL based Feature Selection with Improved Convolutional Neural Network,” International Journal of Electrical and Electronics Research, vol. 11, no. 2, pp. 308–314, May 2023.
[5] P. Johri, J. K. Verma, and S. Paul, Applications of Machine Learning. Springer Nature, 2020.
[6] Latha, M. Vasavi, C. K. Kumar, Balamanigandan, J. B. Guttikonda, and R. Kumar, “Machine Learning based precision agriculture using ensemble classification with TPE model,” Journal of Machine and Computing, pp. 261–268, Jan. 2024.
[7] S. Dutta and S. K. Bandyopadhyay, “Fake Job Recruitment Detection Using Machine Learning Approach,” International Journal of Engineering Trends and Technology, vol. 68, no. 4. pp. 48–53, 2020. doi: 10.14445/22315381/ijett-v68i4p209s.
[8] P. A Anto Sagaya and B. R, “A novel method for lung nodule segmentation and lung cancer severity categorization using deep learning models,” in 2023 International Conference on Sustainable Communication Networks and Application (ICSCNA), IEEE, Nov. 2023. doi: 10.1109/icscna58489.2023.10370539.
[9] R. Balamanigandan, P. K. Poonguzhali, A. Udhayakumar, A. Vijayalakshmi, Mahaveerakannan, and S. Govindaraju, “Detection of disk filtration attacks using random forest (RF) algorithm comparing with ML algorithms for improving accuracy,” in 2023 International Conference on Sustainable Communication Networks and Application (ICSCNA), IEEE, Nov. 2023. doi: 10.1109/icscna58489.2023.10370732.
[10] C. S. Anita, P. Nagarajan, G. A. Sairam, P. Ganesh, and G. Deepakkumar, “Fake job detection and analysis using machine learning and deep learning algorithms,” Revista Geintec-Gestão Inovação e Tecnologias, vol. 11, no. 2, pp. 642–650, 2021.
[11] J. Chen, H. Huang, S. Tian, and Y. Qu, “Feature selection for text classification with Naïve Bayes,” Expert Systems with Applications, vol. 36, no. 3. pp. 5432–5435, 2009. doi: 10.1016/j.eswa.2008.06.054.
[12] S. Vidros, C. Kolias, G. Kambourakis, and L. Akoglu, “Automatic Detection of Online Recruitment Frauds: Characteristics, Methods, and a Public Dataset,” Future Internet, vol. 9, no. 1, p. 6, Mar. 2017.
[13] S. Ashenden, “The Era of Artificial Intelligence, Machine Learning, and Data Science in the Pharmaceutical Industry.” 2021. doi: 10.1016/c2019-0-01262-9.
[14] “Handbook of Statistics,” Handbook of Statistics - Machine Learning: Theory and Applications. pp. i–iii, 2013. doi: 10.1016/b978-0-444-53859-8.00022-9.
[15] S. C. Babu and S. N. Gajanan, “Effects of individual, household, and community indicators on child’s nutritional status—application of simple linear regression,” Food Security, Poverty and Nutrition Policy Analysis. pp. 295–334, 2022. doi: 10.1016/b978-0-12-820477-1.00026-7.
[16] H. S. A. A. S. Guido Dartmann, “Big Data Analytics for Cyber-Physical Systems.” 2019. doi: 10.1016/c2018-0-00208-x.
[17] A. Mehboob and M. S. I. Malik, “Smart Fraud Detection Framework for Job Recruitments,” Arabian Journal for Science and Engineering, vol. 46, no. 4. pp. 3067–3078, 2021. doi: 10.1007/s13369-020-04998-2.
[18] B. Alghamdi and F. Alharby, “An Intelligent Model for Online Recruitment Fraud Detection,” Journal of Information Security, vol. 10, no. 03. pp. 155–176, 2019. doi: 10.4236/jis.2019.103009.
[19] S. Aphiwongsophon and P. Chongstitvatana, “Detecting Fake News with Machine Learning Method,” 2018 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON). 2018. doi: 10.1109/ecticon.2018.8620051.
[20] A. Raza, S. Ubaid, F. Younas, and F. Akhtar, “Fake E Job Posting Prediction Based on Advance Machine Learning Approachs,” International Journal of Research Publication and Reviews. pp. 689–695, 2022. doi: 10.55248/gengpi.2022.3.2.7.
[21] C. Abhinav, “Spam Mail Detection using Machine Learning,” International Journal for Research in Applied Science and Engineering Technology, vol. 10, no. 6. pp. 2327–2329, 2022. doi: 10.22214/ijraset.2022.44315.
[22] I. AbdulNabi and Q. Yaseen, “Spam Email Detection Using Deep Learning Techniques,” Procedia Computer Science, vol. 184. pp. 853–858, 2021. doi: 10.1016/j.procs.2021.03.107.
[23] B. Ramachandran and K. Subramaniam, “Secure and efficient data forwarding in untrusted cloud environment,” Cluster Comput., vol. 22, no. S2, pp. 3727–3735, Mar. 2019.