Cyber Hacking Breaches Prediction and Detection
2023 2nd International Conference on Vision Towards Emerging Trends in Communication and Networking Technologies (ViTECoN) | 979-8-3503-4798-2/23/$31.00 ©2023 IEEE | DOI: 10.1109/VITECON58111.2023.10157462
D. Implementation

Figure 6 After applying k-means clustering

In Figure 5 the training examples are shown as dots; before k-means clustering is applied, the different categories are mixed together. In Figure 6 the centroids are shown as stars; after k-means clustering is applied, the categories are divided into separate clusters. Examples are grouped into clusters according to their similar properties.

Multi-layer perceptron model:
A multi-layer perceptron (MLP) contains multiple dense layers that map an input of any dimension to the desired output dimension. It is a neural network in which neurons are connected so that the output of some neurons is the input of others. This model has three layers: an input layer, a hidden layer, and an output layer. The input layer takes the input and forwards it for further processing, the hidden layer processes it in the same way and forwards it to the output layer, and the output layer gives the result of the predictive model.

Input Layer:
The input layer, also known as the visible layer because it is the exposed part of the network, takes the dataset as input. The network is drawn according to its unit neurons. This layer accepts the data and passes it to the rest of the network.

Hidden Layer:
Hidden layers come after the input layer. They are called hidden layers because they are not directly exposed to the network's input or output. A hidden layer applies weights to its inputs and transfers them to the output by

IV. RESULTS

We test various predictive models using the prediction methods covered. The two models that perform best are K-Means and MLP.

Table 2 Comparison of Results

No.  Algorithm      Accuracy (%)
1    Decision Tree  90.86
2    Random Forest  91.80
3    K-Means        94.19
4    MLP            98.72

We compare the Decision Tree, Random Forest, K-Means, and MLP classifiers and present the findings in Table 2. The Decision Tree gives an accuracy of 90.86 and Random Forest gives 91.80, whereas K-Means gives 94.19, which is better, and MLP gives the best accuracy of the four models at 98.72.
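The three-layer structure described for the multi-layer perceptron (input layer, hidden layer, output layer) can be illustrated with a minimal forward-pass sketch. This is not the implementation used in the paper: the layer sizes, the ReLU and sigmoid activations, the random untrained weights, and the sample feature vector are all assumptions chosen only to show how values flow from one layer to the next.

```python
import math
import random


def dense(inputs, weights, biases):
    """One dense layer: each neuron outputs a weighted sum of all inputs plus its bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Hidden-layer activation (an assumption; the paper does not name one)."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Output activation that squashes a score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def init_layer(n_in, n_out, rng):
    """Random, untrained weights and zero biases for a layer of n_out neurons."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(features, hidden_layer, output_layer):
    """Input layer -> hidden layer -> output layer."""
    hidden = relu(dense(features, *hidden_layer))   # hidden layer processes the input
    (score,) = dense(hidden, *output_layer)         # single output neuron
    return sigmoid(score)                           # predicted probability


rng = random.Random(0)
hidden_layer = init_layer(4, 3, rng)   # 4 input features, 3 hidden neurons (hypothetical sizes)
output_layer = init_layer(3, 1, rng)   # 3 hidden neurons, 1 output neuron
probability = forward([0.2, 0.7, 0.1, 0.9], hidden_layer, output_layer)
print(f"predicted probability: {probability:.3f}")
```

Because the weights are random, the printed probability carries no meaning; training would adjust the weights and biases so that the output matches the labels in the dataset.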
Authorized licensed use limited to: Florida Institute of Technology. Downloaded on August 05,2023 at 14:29:52 UTC from IEEE Xplore. Restrictions apply.
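The clustering step described under D. Implementation, in which mixed training examples are divided into clusters around centroids, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in the paper: the two-dimensional sample points are hypothetical, and the first k points are used as the initial centroids purely to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, k, iterations=20):
    # Assumption: take the first k points as the initial centroids.
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


# Hypothetical sample data: two visibly separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = k_means(points, k=2)
print("centroids:", centroids)
print("labels:", labels)
```

Each entry of `labels` gives the cluster index of the corresponding point, and `centroids` holds the cluster centres that Figure 6 draws as stars.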
Figure 8 Comparison of results

VI. CONCLUSION

This work presents a method for assessing the risk of hacker intrusions, addressing the issue of unreported intrusions, and estimating the exposure of enterprises. Since machine learning and deep learning techniques are increasingly employed for a variety of purposes, including cyber security, it is imperative to determine whether, and which category of, algorithms can deliver adequate results. Spam detection, malware analysis, and intrusion detection are three important aspects of cyber security explored with these methodologies. According to our research, current machine learning algorithms still have a number of problems that reduce their value for cyber security. The Random Forest, Decision Tree, MLP, and K-Means models are trained on a dataset containing 300 instances. With this method, MLP and K-Means achieved higher accuracy and yielded better output than the other models. The proposed system can be applied efficiently to detect and predict breaches.