Machine learning methods for secure internet of things against cyber threats synopsis (1)
1. Group Id
GROUP 1
2. Group Members
Kunal Rajendra Patil
Vedant Santosh Kshirsagar
Aniket Maruti Palle
Shreeraj Anil Pawar
3. Project Title
Machine learning methods for securing internet of things from cyber threats
4. Project Option
5. Internal Guide
Prof. Saisudha Dorabala
7. Technical Keywords
SVM (Support Vector Machine)
NB (Naive Bayes)
DT (Decision Tree)
SYNOPSIS
Project Title :- Machine learning methods for securing internet of things from
cyber threats
Introduction –
Cyberspace holds an enormous quantity of information that is valuable for gathering threat
intelligence, which cybersecurity experts can use to prevent cyberattacks and protect an
organization's network systems. The use of social media cannot replace the need for security
experts to perform in-depth analysis of certain types of attacks, such as detecting anomalies in
network traffic, worms, and port scans. However, analyzing social media data can provide
meaningful insights for detecting new patterns of cyberattacks and security threats such as data
breaches, carding, and hijacking. Twitter is a source of real-time information that researchers
have used to gain meaningful insights during emergencies, including terrorist attacks, natural
hazards, and one-off events. Hence, Twitter is an important source of information relevant to
cybersecurity. The motivation of this paper is to analyze the impact of data breaches on
organizations, visualize the geographical spread of cyberattacks in the United States, and
identify the major types of cyberattacks. Finally, we trained a model to classify texts related to
cyber threats, leveraging socio-personal and technical indicators from the deep web and surface
web to provide cybersecurity experts and law enforcement agencies with artifacts valuable for
prevention and prosecution strategies.
Problem Statement –
The use of social media cannot replace the need for security experts to conduct in-depth
analyses of specific kinds of attacks, such as detecting anomalies in network traffic, worms, and
port scans. Analysing social media data, on the other hand, can help discover new patterns of
cyberattacks and security threats, including data theft, carding, and hijacking.
Abstract –
One of the most challenging cyberthreats, insider threats frequently result in considerable
financial damage for enterprises. Although the issue of insider threat detection has been studied
for a long time in the security and data mining communities, conventional machine learning-
based detection approaches, which heavily rely on feature engineering, struggle to accurately
capture the difference in behaviour between insiders and regular users due to a number of
difficulties related to the underlying data characteristics, such as high-dimensionality,
complexity, heterogeneity, sparsity, and lack of lagging indicators. Advanced deep learning
algorithms offer a new paradigm for learning end-to-end models from complex data. In this
succinct survey, we first introduce a dataset that is frequently used for insider threat
identification and then cover the most recent research on deep learning for this purpose.
Goals/Objectives –
1. To detect cyber threats using machine learning techniques.
2. To train and classify the dataset using different machine learning algorithms.
3. To analyse social media data for meaningful insights into new patterns of cyberattacks
and security threats such as data breaches, carding, and hijacking.
System Architecture –
Explanation –
1. Input Dataset – First, the cyber threat dataset is loaded as input.
2. Data Pre-processing – Data pre-processing is an essential step that cleans the raw data,
handles missing values, and prepares the data for a machine learning model, which
improves the model's accuracy and efficiency.
3. Feature Extraction – Using a feature extraction technique, we can create new features
that are linear combinations of the existing features. The new set of features will have
different values from the original feature values. The main goal is to use fewer
features while retaining as much of the original information as possible.
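The pre-processing and feature extraction steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual pipeline. The toy rows and their three columns (packets per second, failed logins, bytes sent) are hypothetical, and the helper names `preprocess`, `first_component`, and `project` are assumptions made for this example. Feature extraction is illustrated with the first principal component, one common way to build a new feature as a linear combination of existing ones.

```python
import math

def preprocess(rows):
    """Drop rows with missing values or exact duplicates, then min-max scale each column."""
    seen, clean = set(), []
    for row in rows:
        if any(v is None for v in row):
            continue  # discard records with missing values
        key = tuple(row)
        if key in seen:
            continue  # discard exact duplicate records
        seen.add(key)
        clean.append([float(v) for v in row])
    if not clean:
        return []
    lows = [min(col) for col in zip(*clean)]
    highs = [max(col) for col in zip(*clean)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in clean
    ]

def first_component(rows, iterations=200):
    """Estimate the first principal component by power iteration (needs at least two rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)] for i in range(d)]
    w = [1.0 / math.sqrt(d)] * d  # starting guess; must not be orthogonal to the answer
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # no variance in the data; keep the current direction
        w = [x / norm for x in nxt]
    return means, w

def project(rows, means, w):
    """Replace each row by one new feature: a linear combination of its centred values."""
    return [sum((v - m) * wj for v, m, wj in zip(row, means, w)) for row in rows]

# Hypothetical toy records: [packets_per_sec, failed_logins, bytes_sent]
raw = [
    [10, 0, 200],
    [12, 1, 240],
    [None, 3, 900],   # missing value, dropped during pre-processing
    [10, 0, 200],     # exact duplicate, dropped during pre-processing
    [50, 8, 1000],
    [48, 7, 950],
]

scaled = preprocess(raw)
means, weights = first_component(scaled)
reduced = project(scaled, means, weights)
print(reduced)  # one extracted feature per remaining record
```

Here three original features are reduced to a single extracted feature per record. In practice, more components would be kept, and a library implementation would normally be preferred over hand-written power iteration.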
Algorithm –
SVM (Support Vector Machine):
Support Vector Machine (SVM) is a supervised machine learning approach that is suitable for
both classification and regression problems, although it is used mostly for classification.
In the SVM algorithm, each data item is plotted as a point in n-dimensional space (where n is
the number of features), with the value of each feature being the value of a specific coordinate.
Classification is then carried out by finding the hyperplane that best separates the two classes.
Support vectors are the coordinates of the individual observations that lie closest to this
boundary, and the SVM classifier is the boundary (a hyperplane, or a line in two dimensions)
that separates the two classes.
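As a rough illustration of the idea described above, the following sketch trains a linear SVM by subgradient descent on the regularised hinge loss. It is a simplified stand-in and not the project's implementation. The function names `train_linear_svm` and `predict`, the hyperparameters `lam`, `lr`, and `epochs`, and the six toy points are all assumptions chosen for this example. Labels are encoded as -1 (benign) and +1 (threat).

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on (lam/2)*||w||^2 + mean hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularisation term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1

# Hypothetical scaled features for six records (two features each)
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],   # benign
     [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]   # threat
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
print(predict(w, b, [0.05, 0.1]), predict(w, b, [0.95, 0.9]))
```

The learned pair `(w, b)` defines the separating hyperplane `w·x + b = 0`. The training points whose margin ends up close to 1 play the role of the support vectors.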
Conclusion –
In the proposed system, we present an approach that collects data from the Kaggle website to
analyse information about cyber threats and provide an early warning and detection system.
It considers the use of Twitter data alone for predicting cyber threats, as well as sentiment
analysis on hacker forums, and applies machine learning algorithms to detect cyber threats.