Keerthika and Divyatharshini 3rd Cs Report
Submitted to Alagappa University in partial fulfillment of the requirements for the award of the
degree in
Submitted by
KEERTHIKA R,
&
DIVYATHARSHINI
2024-2025
PURATCHI THALAIVAR DR.MGR ARTS AND SCIENCE COLLEGE FOR
WOMEN
(Affiliated to Alagappa University, Karaikudi)
Uchipuli-623534
BONAFIDE CERTIFICATE
This is to certify that the project work entitled “EVIDENCE PRODUCTION BLOCKCHAIN
(REG NO: 0000000) & DIVYATHARSHINI M (REG NO: 000000) is a bonafide work done
DECLARATION
(REG NO: 0000000) & DIVYATHARSHINI M (REG NO: 0000000) under the guidance of
Mrs.SHANTHI Department of Computer Science for the partial fulfillment of the degree of
Computer Science. We hereby declare that this project work has not been copied from any other
project work and has not been submitted for any similar degree.
First of all, we would like to thank the ALMIGHTY God for giving us the strength and wisdom for
Next, we thank our beloved parents for their love, encouragement and continuous support.
We wish to express our deep sense of gratitude to our principal MRS. CHITHRA for giving us
We express our sincere thanks to Mrs. SHANTHI, Head of the Department of Information
Technology, who encouraged us with her constant support and guidance throughout the course of
study.
We take this opportunity to express our grateful thanks to our internal guide Mr. YESUDOSS
B.Tech., M.E., for his encouragement and valuable guidance in completing this project timely and
successfully.
We have great pleasure in offering our words of thanks to all other staff members who have
We convey our great pleasure in thanking all the rest of teaching and non-teaching staff
members for their kind co-operation. We thank our friend who helped us in every situation and gave
Last but not least, we thank our parents for supporting us in completing our degree.
Blockchain technology is an innovation that has drawn a lot of interest in recent years because of
its potential to change established methods of data storage and exchange. It is a distributed,
decentralized digital ledger that records transactions securely and openly. Blockchain technology,
which was first developed for Bitcoin, the first decentralized cryptocurrency, is now being used in a
number of sectors, including banking, healthcare, and supply chain management.

A blockchain is essentially a database made up of a sequence of blocks. Each block comprises a
series of transactions. Once a block is included in the chain, it cannot practically be changed or
removed. The chain is maintained by a network of nodes that collaborate to validate transactions and
add them to the blockchain. Because this network of nodes is decentralized, no central authority or
middleman controls the blockchain.

Security is one of the biggest benefits of blockchain technology. Blockchain data is spread
throughout the network, which makes it very difficult to alter or hack. Since numerous network nodes
verify each transaction, modifying the data is practically infeasible. In addition, cryptographic
techniques protect the integrity and security of the data.

The transparency of blockchain technology is another benefit. Every transaction is documented on the
blockchain, making it accessible to everyone using the network. As a result, there is no requirement
for a central authority or middleman to validate transactions. Instead, the network itself validates
transactions, which can improve the efficiency of the process.

The potential uses for blockchain technology are numerous and diverse. In the banking sector it is
used to develop virtual currencies such as Bitcoin and Ethereum, and to simplify payment procedures
and lower transaction fees. In the healthcare industry it is used to securely store and exchange
medical data.
CHAPTER 2
SYSTEM PROPOSAL
2.2 PROPOSED SYSTEM:
In the proposed system, an industrial network attack detection dataset is taken as
input from a dataset repository. To provide a secure and trusted storage mechanism,
the industrial IoT data is stored on a blockchain, and encryption and decryption
techniques are applied for additional security. After the validation process, the
data pre-processing step is carried out. In this step, missing values are handled
to avoid wrong predictions, and the labels of the input data are encoded. The
dataset is then split into training and test sets according to a fixed ratio: the
training set holds most of the data and the test set holds a smaller portion. The
training portion is used to fit the model and the testing portion is used to
evaluate its predictions. A machine learning algorithm is used to detect industrial
network attacks. Finally, the experimental results are reported using performance
metrics such as accuracy, together with the prediction status.
2.2.1 ADVANTAGES
Methodology
This paper provides a state-of-the-art literature review on economic analysis and
pricing models for data collection and wireless communication in Internet of
Things (IoT). Wireless Sensor Networks (WSNs) are the main component of IoT
which collect data from the environment and transmit the data to the sink nodes.
For long service time and low maintenance cost, WSNs require adaptive and robust
designs to address many issues, e.g., data collection, topology formation, packet
forwarding, resource and power optimization, coverage optimization, efficient task
allocation, and security. For these issues, sensors have to make optimal decisions
from current capabilities and available strategies to achieve desirable goals. This
paper reviews numerous applications of the economic and pricing models, known
as intelligent rational decision-making methods, to develop adaptive algorithms
and protocols for WSNs. Besides, we survey a variety of pricing strategies in
providing incentives for phone users in crowdsensing applications to contribute
their sensing data. Furthermore, we consider the use of some pricing models in
Machine-to-Machine (M2M) communication. Finally, we highlight some
important open research issues as well as future research directions of applying
economic and pricing models to IoT.
Advantages:
If it works, it can conceivably catch any possible attack.
If it works, it can conceivably catch attacks that we have not seen before.
Disadvantages:
Too Many False Negatives.
Run to failure prediction is low.
2.3.2 The Effect of IoT New Features on Security and Privacy: New
Threats, Existing Solutions, and Challenges Yet to Be Solved,2016
Author: Wei Zhou, Yan Jia, Anni Peng, Yuqing Zhang, and Peng Liu
Methodology
Internet of Things (IoT) is an increasingly popular technology that enables physical
devices, vehicles, home appliances, etc. to communicate and even inter-operate
with one another. It has been widely used in industrial production and social
applications including smart home, healthcare, and industrial automation. While
bringing unprecedented convenience, accessibility, and efficiency, IoT has caused
acute security and privacy threats in recent years. There are increasing research
works to ease these threats, but many problems remain open. To better understand
the essential reasons of new IoT threats and the challenges in current research, this
survey first proposes the concept of “IoT features”. Then, we discuss the security
and privacy effects of eight IoT features including the threats they cause, existing
solutions to threats and research challenges yet to be solved. To help researchers
follow the up-to-date works in this field, this paper finally illustrates the
developing trend of IoT security research and reveals how IoT features affect
existing security research by investigating most existing research works related to
IoT security from 2013 to 2017.
Advantages:
Monitor any data source, including user logs, devices, networks, and servers.
Disadvantages:
It can be intimidating.
2.3.3 Personal Data Trading Scheme for Data Brokers in IoT Data
Marketplaces
Methodology
With the widespread use of the Internet of Things, data-driven services take the
lead of both online and off-line businesses. Especially, personal data draw heavy
attention of service providers because of the usefulness in value-added services.
With the emerging big-data technology, a data broker appears, which exploits and
sells personal data about individuals to other third parties. Due to little
transparency between providers and brokers/consumers, people think that the
current ecosystem is not trustworthy, and new regulations with strengthening the
rights of individuals were introduced. Therefore, people have an interest in their
privacy valuation. In this sense, the willingness-to-sell (WTS) of providers
becomes one of the important aspects for data brokers; however, conventional
studies have mainly focused on the willingness-to-buy (WTB) of consumers.
Therefore, this paper proposes an optimized trading model for data brokers who
buy personal data with proper incentives based on the WTS, and they sell valuable
information from the refined dataset by considering the WTB and the dataset
quality. This paper shows that the proposed model has a global optimal point by
the convex optimization technique and proposes a gradient ascent-based algorithm.
Consequently, it shows that the proposed model is feasible even if the data brokers
spend cost to gather personal data.
Advantages:
Chance of detecting unknown attacks.
Anomaly detection is more efficient than signature detection if the signature
detection file is large.
Disadvantages:
Anomaly detection cannot be used alone; it must be combined with signature detection.
Reliability is unclear.
Advantages:
Rate of missing report is low.
Simple and Effective method.
Disadvantages:
Needs to be trained, and the model must be trained carefully; otherwise it tends to
produce false positives.
Low accuracy rate.
Advantages:
Flexibility, fault tolerance, high sensing fidelity, low cost and rapid deployment.
Disadvantages:
Sensor nodes are prone to failures.
2.3.6 Machine Learning for Anomaly Detection: A
Systematic Review,2017
Anomaly detection has been used for decades to identify and extract anomalous
components from data. Many techniques have been used to detect anomalies. One
of the increasingly significant techniques is Machine Learning (ML), which plays
an important role in this area. In this research paper, we conduct a Systematic
Literature Review (SLR) which analyzes ML models that detect anomalies in their
application. Our review analyzes the models from four perspectives; the
applications of anomaly detection, ML techniques, performance metrics for ML
models, and the classification of anomaly detection. In our review, we have
identified 290 research articles, written from 2000-2020, that discuss ML
techniques for anomaly detection. After analyzing the selected research articles, we
present 43 different applications of anomaly detection found in the selected
research articles. Moreover, we identify 29 distinct ML models used in the
identification of anomalies.
Advantages:
Simplest and easiest data mining approach.
Disadvantages:
Handling of anomaly detection is difficult.
Author: Sheharbano Khattak, Naurin Rasheed Ramay, Kamran Riaz Khan, Affan A
Methodology
A number of detection and defense mechanisms have emerged in the last decade to
tackle the botnet phenomenon. It is important to organize this knowledge to better
understand the botnet problem and its solution space. In this paper, we structure
existing botnet literature into three comprehensive taxonomies of botnet behavioral
features, detection and defenses. This elevated view highlights opportunities for
network defense by revealing shortcomings in existing approaches. We introduce
the notion of a dimension to denote different criteria which can be used to classify
botnet detection techniques. We demonstrate that classification by dimensions is
particularly useful for evaluating botnet detection mechanisms through various
metrics of interest. We also show how botnet behavioral features from the first
taxonomy affect the accuracy of the detection approaches in the second taxonomy.
Advantages:
Chance of detecting unknown attacks.
May be more efficient.
Disadvantages:
Must be used with signature detection.
An anomaly only implies unusual activity.
CHAPTER 3
SYSTEM DIAGRAMS
[Diagram: Preprocessing (remove unwanted columns, handle missing values) -> Data Splitting (training data, testing data) -> Classification (DT) -> Result Generation -> Performance (accuracy, precision, recall, F1 score)]
3.2 Flow Diagram
[Diagram: Data Splitting -> Classification (DT) -> Prediction -> Performance Analysis -> Report Generation]
3.3 Use Case Diagram
Dataset Selection
Load Dataset
Block Chain
Encryption Fernet
Preprocessing
Data Splitting
Classification
Prediction
Result Generation
3.4 Activity Diagram
Input Data
Block chain
Encryption Fernet
Preprocessing
Data splitting
Classification
Prediction
Performance metrics
3.5 Sequence Diagram
[Diagram labels: Select Dataset; Unwanted Column Removal; Missing Data; Secure; Train; Test; Algorithm Implementation]
3.6. Class Diagram
[Diagram labels: Classification (DT); Prediction; Result Generation (Accuracy, Precision, F1 Score, Recall)]
3.7 ER DIAGRAM
[Diagram labels: Data Selection (select, load, import); Preprocessing (missing values, label encode); Train; Test; Prediction; F1 Score]
CHAPTER-4
IMPLEMENTATION
4.1 MODULES:
Data selection
Block chain
Encryption – Decryption
Data preprocessing
Data Splitting
Classification
Performance metrics
Blockchain is a system of recording information in a way that
makes it difficult or impossible to change, hack, or cheat the
system.
A blockchain is essentially a digital ledger of transactions that is
duplicated and distributed across the entire network of computer
systems on the blockchain.
Each block in the chain contains a number of transactions, and
every time a new transaction occurs on the blockchain, a record of
that transaction is added to every participant’s ledger.
The decentralised database managed by multiple participants is
known as Distributed Ledger Technology (DLT).
Blockchain is a type of DLT in which transactions are recorded with
an immutable cryptographic signature called a hash.
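The hash-based linking described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function names (block_hash, build_chain, is_valid), the block layout and the sample sensor strings are assumptions made for illustration and are not taken from the project code. It shows only why a change to an earlier block becomes detectable, and leaves out the distribution of the ledger across network nodes.

```python
import hashlib
import json


def block_hash(index, transactions, previous_hash):
    """Return the SHA-256 hash of a block's contents."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Link each batch of transactions to the hash of the block before it."""
    chain = []
    previous_hash = "0" * 64  # placeholder hash for the first block
    for index, transactions in enumerate(batches):
        current_hash = block_hash(index, transactions, previous_hash)
        chain.append(
            {
                "index": index,
                "transactions": transactions,
                "previous_hash": previous_hash,
                "hash": current_hash,
            }
        )
        previous_hash = current_hash
    return chain


def is_valid(chain):
    """Recompute every hash and check that the links between blocks still hold."""
    previous_hash = "0" * 64
    for block in chain:
        expected = block_hash(block["index"], block["transactions"], block["previous_hash"])
        if block["previous_hash"] != previous_hash or block["hash"] != expected:
            return False
        previous_hash = block["hash"]
    return True


# Hypothetical sample data: three blocks with one transaction each.
chain = build_chain([["sensor A: 21.5"], ["sensor B: 19.0"], ["sensor A: 22.1"]])
valid_before = is_valid(chain)                 # untouched chain
chain[1]["transactions"] = ["sensor B: 99.9"]  # tamper with an earlier block
valid_after = is_valid(chain)                  # stored hash no longer matches
```

Because each block stores the hash of the previous block, changing one record invalidates the stored hash of that block, which is how the tampering is detected in this sketch.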
Cryptography deals with the encryption of plaintext into cipher text and
decryption of cipher text into plaintext.
Python supports a cryptography package that helps us encrypt and decrypt
data. The fernet module of the cryptography package has inbuilt functions
for the generation of the key, encryption of plaintext into cipher text, and
decryption of cipher text into plaintext using the encrypt and decrypt
methods respectively.
The fernet module guarantees that data encrypted using it cannot be
further manipulated or read without the key.
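The key generation, encrypt and decrypt steps described above can be sketched as follows. This assumes that the third-party cryptography package is installed; the message content and variable names are hypothetical and only serve as an illustration.

```python
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()      # secret key; it must be stored safely
fernet = Fernet(key)

message = b"sensor reading: 42"  # hypothetical plaintext
token = fernet.encrypt(message)  # cipher text
recovered = fernet.decrypt(token)

# A token cannot be decrypted with a different key.
other = Fernet(Fernet.generate_key())
try:
    other.decrypt(token)
    wrong_key_rejected = False
except InvalidToken:
    wrong_key_rejected = True
```

Decryption with the original key returns the plaintext, while an attempt with another key raises InvalidToken, which matches the guarantee stated above.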
4.2.4 DATA PREPROCESSING:
Data pre-processing is the process of removing unwanted data from the
dataset.
Missing data handling: In this process, null values such as missing values
and NaN values are replaced by 0.
Duplicate records are removed and the data is cleaned of any
abnormalities.
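A minimal sketch of these pre-processing steps is given below, written in plain Python so that it runs without extra libraries. The helper names (is_missing, preprocess, encode_labels) and the sample rows are assumptions for illustration; the label encoding shown is the step named in the proposed system, and the actual project may use a library such as pandas or scikit-learn instead.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def preprocess(rows):
    """Replace missing values by 0 and drop duplicate rows (first occurrence is kept)."""
    cleaned, seen = [], set()
    for row in rows:
        filled = tuple(0 if is_missing(value) else value for value in row)
        if filled not in seen:
            seen.add(filled)
            cleaned.append(list(filled))
    return cleaned


def encode_labels(labels):
    """Map each distinct label to an integer code, in sorted order."""
    mapping = {label: code for code, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


# Hypothetical records: two feature columns followed by a class label.
rows = [
    [1.0, None, "normal"],
    [1.0, None, "normal"],          # duplicate record
    [float("nan"), 3.5, "attack"],
]
cleaned = preprocess(rows)
codes, mapping = encode_labels([row[-1] for row in cleaned])
```

Here the duplicate record is dropped, the missing entries become 0, and the text labels are turned into integer codes that a classifier can consume.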
During the machine learning process, data are needed so that learning can
take place.
In addition to the data required for training, test data are needed to evaluate
the performance of the algorithm. Here the training and testing datasets are kept
separate.
In our process, the data is divided into training and testing sets as x_train,
y_train, x_test, y_test.
Data splitting is the act of partitioning available data into two portions,
usually for cross-validation purposes.
One Portion of the data is used to develop a predictive model and the other
to evaluate the model's performance.
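The split into x_train, y_train, x_test and y_test can be sketched as below. The 80/20 ratio, the fixed seed, the function name split_dataset and the sample data are assumptions for illustration; the report does not state the ratio that was actually used.

```python
import random


def split_dataset(features, labels, test_ratio=0.2, seed=42):
    """Shuffle the row indices and split features and labels by the given ratio."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Hypothetical data: ten rows with two features each and a binary label.
features = [[i, i * 2] for i in range(10)]
labels = [i % 2 for i in range(10)]
x_train, x_test, y_train, y_test = split_dataset(features, labels, test_ratio=0.2)
```

Shuffling before splitting avoids a test set that contains only the last rows of the file, and the fixed seed keeps the split reproducible between runs.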
4.2.6. CLASSIFICATION
Decision Tree
Accuracy
Precision
Precision explains how many of the cases predicted as positive actually turned out to
be positive. Precision is useful in cases where False Positives are a higher concern
than False Negatives.
F1 Score
It gives a combined idea about the Precision and Recall metrics, as their harmonic
mean. It is high only when both Precision and Recall are high.
Recall
Recall explains how many of the actual positive cases we were able to predict
correctly with our model.
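The four metrics can be computed from the counts of true positives, false positives, true negatives and false negatives. The sketch below is a plain-Python illustration; the function names and the sample label lists are hypothetical and are not the results of the project.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 score as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 marks an attack, 0 marks normal traffic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

For these sample lists there are 3 true positives, 1 false positive, 3 true negatives and 1 false negative, so all four metrics evaluate to 0.75.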
CHAPTER 5
SYSTEM REQUIREMENTS
5.1 HARDWARE REQUIREMENTS:
O/S : Windows 7.
Language : Python
Front End : Anaconda Navigator – Spyder
5.3.1 Python
Python is one of those rare languages which can claim to be both simple and
powerful. You will find yourself pleasantly surprised to see how easy it is to
concentrate on the solution to the problem rather than the syntax and structure of
the language you are programming in. The official introduction to Python is:
“Python is an easy to learn, powerful programming language. It has efficient high-level
data structures and a simple but effective approach to object-oriented
programming. Python's elegant syntax and dynamic typing, together with its
interpreted nature, make it an ideal language for scripting and rapid application
development in many areas on most platforms.” Most of these features are discussed
in more detail below.
Easy to Learn
As you will see, Python is extremely easy to get started with. Python has an
extraordinarily simple syntax, as already mentioned.
High-level Language
When you write programs in Python, you never need to bother about the
low-level details such as managing the memory used by your program, etc.
Portable
Due to its open-source nature, Python has been ported to (i.e. changed to
make it work on) many platforms. All your Python programs can work on any of
these platforms without requiring any changes at all if you are careful enough to
avoid any system-dependent features.
You can even use a platform like Kivy to create games for your computer
and for iPhone, iPad, and Android.
Interpreted
Python does not need a separate compilation to binary. You just run
the program directly from the source code. Internally, Python converts the source
code into an intermediate form called bytecode, which the interpreter then
executes. All this makes using Python much easier, since you don't have to worry
about compiling the program, making sure that the proper libraries are linked and
loaded, etc. This also makes your Python programs much more portable, since you can
just copy your Python program onto another computer and it just works!
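The intermediate bytecode form can be inspected with the dis module of the standard library. The short sketch below assumes the CPython interpreter; the function add is a hypothetical example, and the exact instruction names differ between Python versions.

```python
import dis


def add(a, b):
    """Hypothetical function whose bytecode is inspected."""
    return a + b


# Raw bytecode produced by the interpreter for the function body.
raw_bytecode = add.__code__.co_code

# Human-readable instruction names for the same bytecode.
instructions = [instruction.opname for instruction in dis.get_instructions(add)]
```

Calling dis.dis(add) prints the same instructions as a table, which is a convenient way to see what the interpreter executes.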
Object Oriented
Extensible
If you need a critical piece of code to run very fast or want to have some
piece of algorithm not to be open, you can code that part of your program in C or
C++ and then use it from your Python program.
Embeddable
You can embed Python within your C/C++ programs to give scripting
capabilities for your program's users.
Extensive Libraries
The Python Standard Library is huge indeed. It can help you do various
things involving regular expressions, documentation generation, unit testing,
threading, databases, web browsers, CGI, FTP, email, XML, XML-RPC, HTML,
WAV files, cryptography, GUI (graphical user interfaces), and other
system-dependent stuff. Remember, all this is always available wherever Python is
installed. This is called the Batteries Included philosophy of Python.
Besides the standard library, there are various other high-quality libraries
which you can find at the Python Package Index.
software design in the module. This is also known as ‘module testing’. The
modules of the system are tested separately. This testing is carried out during the
programming itself. In this testing step, each module is found to be working
satisfactorily with regard to the expected output from the module. There are some
validation checks for the fields. For example, a validation check verifies both the
format and the validity of the data entered by the user. This makes it easy to find
errors and debug the system.
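A module-level test of such a validation check can be sketched with the unittest module of the standard library. The function validate_reading, its accepted range of 0 to 100 and the test names are hypothetical; they only illustrate checking both the format and the validity of a value entered by the user.

```python
import re
import unittest


def validate_reading(text):
    """Check the format (a decimal number) and the validity (assumed range 0 to 100)."""
    stripped = text.strip()
    if not re.fullmatch(r"-?\d+(\.\d+)?", stripped):
        return False
    return 0.0 <= float(stripped) <= 100.0


class ValidateReadingTest(unittest.TestCase):
    def test_accepts_well_formed_value_in_range(self):
        self.assertTrue(validate_reading("42.5"))

    def test_rejects_bad_format(self):
        self.assertFalse(validate_reading("forty-two"))

    def test_rejects_out_of_range_value(self):
        self.assertFalse(validate_reading("150"))


# Run the test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidateReadingTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test method exercises the module in isolation, so a failure points directly at the validation logic rather than at the rest of the system.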
2. Interface error
4. Performance errors.
White Box testing is a test case design method that uses the control structure of
the procedural design to derive test cases. Using the white box testing methods, we
derived test cases that guarantee that all independent paths within a module have
been exercised at least once.
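As a small illustration of path coverage, the sketch below defines a hypothetical function with three independent paths and one test case per path, plus a boundary case. The function name, the threshold value and the expected labels are assumptions and do not come from the project code.

```python
def classify_packet_rate(rate, threshold=100):
    """Hypothetical module with three independent paths through its control flow."""
    if rate < 0:
        return "invalid"      # path 1: rejected input
    if rate > threshold:
        return "suspicious"   # path 2: above the threshold
    return "normal"           # path 3: everything else


# One test case per independent path, plus the boundary value of path 3.
path_cases = [(-1, "invalid"), (250, "suspicious"), (40, "normal"), (100, "normal")]
outcomes = [classify_packet_rate(rate) == expected for rate, expected in path_cases]
```

The test cases are chosen by reading the control structure of the function, which is what distinguishes white box testing from testing against the specification alone.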
5.4.4 SOFTWARE TESTING STRATEGIES
VALIDATION TESTING:
User acceptance of the system is the key factor for the success of the
system. The system under consideration is tested for user acceptance by
constantly keeping in touch with prospective system users at the time of
development and making changes whenever required.
OUTPUT TESTING:
After performing the validation testing, the next step is output testing of the
proposed system, since no system could be useful if it does not produce the
required output in the specified format. The users are asked about the format
they require for the output displayed or generated by the system under
consideration. Here the output format is considered in two ways: one is the
screen format and the other is the printed format. The output format on the
screen is found to be correct, as the format was designed in the system phase
according to the user needs. The hard copy output also meets the requirements
specified by the user. Hence the output testing does not result in any
correction in the system.
CHAPTER 6
CONCLUSION
This project presents a secure framework based on trust management and blockchain to
deal with the issues caused by MDs at various levels in IIoT networks. Encryption and
decryption are used to provide a high level of security. Our machine learning
algorithm gives high-performance results: the accuracy, precision, recall and F1
score reach high values and the prediction status is accurate.
CHAPTER 7
FUTURE ENHANCEMENT
CHAPTER 8
SAMPLE
CODING
Install Required Libraries:
blockchain = Blockchain()
import numpy as np
# Make predictions (you would use your actual evidence data here)
predictions = model.predict(X)
Blockchain Integration with Machine Learning: Integrate the blockchain with
machine learning to store the evidence data and model predictions:
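A minimal sketch of this integration is given below. It is not the project's actual code: the Blockchain class is a simple in-memory stand-in, ThresholdModel replaces the trained classifier, and the feature values in X are hypothetical. It only shows how each evidence record and its prediction could be stored together in a hash-linked block.

```python
import hashlib
import json


class Blockchain:
    """Minimal in-memory chain; each block stores one record and its prediction."""

    def __init__(self):
        self.chain = []
        self.add_block({"note": "genesis"})  # first block of the chain

    @staticmethod
    def _digest(index, data, previous_hash):
        payload = json.dumps(
            {"index": index, "data": data, "previous_hash": previous_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def add_block(self, data):
        previous_hash = self.chain[-1]["hash"] if self.chain else "0" * 64
        index = len(self.chain)
        block = {
            "index": index,
            "data": data,
            "previous_hash": previous_hash,
            "hash": self._digest(index, data, previous_hash),
        }
        self.chain.append(block)
        return block


class ThresholdModel:
    """Stand-in for a trained classifier: flags a row whose first feature exceeds 0.5."""

    def predict(self, X):
        return [1 if row[0] > 0.5 else 0 for row in X]


X = [[0.9, 0.1], [0.2, 0.7], [0.6, 0.4]]  # hypothetical evidence features
model = ThresholdModel()
predictions = model.predict(X)

blockchain = Blockchain()
for features, prediction in zip(X, predictions):
    blockchain.add_block({"evidence": features, "prediction": prediction})
```

Storing the prediction in the same block as the evidence means that neither can later be changed without breaking the hash link to the following block.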
CHAPTER 9
SAMPLE SCREENSHOTS
Dataset
Block chain
Preprocessing
Label Encoding
Performance
Prediction
Confusion Matrix
CHAPTER 10
REFERENCES
[1] A. Karpatne, G. Atluri, J. H. Faghmous, M. Steinbach, A. Banerjee, A.
Ganguly, S. Shekhar, N. Samatova, and V. Kumar, “Theory-guided data science: A
new paradigm for scientific discovery from data,” IEEE Transactions on
Knowledge and Data Engineering, vol. 29, no. 10, pp. 2318–2331, 2017.
doi:10.1109/TKDE.2017.2720168.
[2] Y. Mehmood, F. Ahmad, I. Yaqoob, A. Adnane, M. Imran, and S. Guizani,
“Internet-of-Things Based Smart Cities: Recent Advances and Challenges,” IEEE
Communications Magazine, vol. 55, no. 9, pp. 16–24, 2017. doi:
10.1109/MCOM.2017.1600514.
[3] H. Oh, S. Park, G. M. Lee, H. Heo, and J. K. Choi, “Personal data trading
scheme for data brokers in iot data marketplaces,” IEEE Access, vol. 7, pp. 40120–
40132, 2019. doi:10.1109/ACCESS.2019.2904248.
[4] P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L.
Jackson, N. Imam, C. Guo, Y. Nakamura, et al., “A million spiking-neuron
integrated circuit with a scalable communication network and interface,” Science,
vol. 345, no. 6197, pp. 668–673, 2014. doi:10.1126/science.1254642.
[5] C.-X. Wang, F. Haider, X. Gao, X.-H. You, Y. Yang, D. Yuan, H. M.
Aggoune, H. Haas, S. Fletcher, and E. Hepsaydir, “Cellular architecture and key
technologies for 5G wireless communication networks,” IEEE Communications
Magazine, vol. 52, no. 2, pp. 122–130, 2014. doi:10.1109/MCOM.2014.6736752.
[6] E. Bertino and N. Islam, “Botnets and internet of things security,” Computer,
no. 2, pp. 76–79, 2017. doi:10.1109/MC.2017.62.
[7] L. Zhou, D. Wu, J. Chen, and Z. Dong, “When computation hugs intelligence:
Content-aware data processing for industrial iot,” IEEE Internet of Things Journal,
vol. 5, no. 3, pp. 1657–1666, 2017. doi:10.1109/JIOT.2017.2785624.
[8] J. Huang, L. Kong, G. Chen, M.-Y. Wu, X. Liu, and P. Zeng, “Towards secure
industrial iot: Block chain system with credit-based consensus mechanism,” IEEE
Transactions on Industrial Informatics, 2019. doi:10.1109/TII.2019.2903342.
[9] F. Al-Turjman and S. Alturjman, “Context-sensitive access in industrial
internet of things (iiot) healthcare applications,” IEEE Transactions on Industrial
Informatics, vol. 14, no. 6, pp. 2736–2744, 2018. doi:10.1109/TII.2018.2808190.
[10] J. Wan, J. Li, M. Imran, and D. Li, “A blockchain-based solution for
enhancing security and privacy in smart factory,” IEEE Transactions on Industrial
Informatics, vol. 15, pp. 3652–3660, June 2019. doi:10.1109/TII.2019.2894573.
[11] J. A. Shamsi and M. A. Khojaye, “Understanding privacy violations in big
data systems,” IT Professional, vol. 20, no. 3, pp. 73–81, 2018.
doi:10.1109/MITP.2018.032501750.
[12] X. Li, Q. Wang, X. Lan, X. Chen, N. Zhang, and D. Chen, “Enhancing cloud-
based iot security through trustworthy cloud service: An integration of security and
reputation approach,” IEEE Access, vol. 7, pp. 9368–9383, 2019.
doi:10.1109/ACCESS.2018.2890432.
[13] H. Moosavi and F. M. Bui, “Delay-aware optimization of physical layer
security in multi-hop wireless body area networks,” IEEE Transactions on
Information Forensics and Security, vol. 11, no. 9, pp. 1928–1939, 2016.
doi:10.1109/TIFS.2016.2566446.
[14] Z. Chen, W. Dong, H. Li, P. Zhang, X. Chen, and J. Cao, “Collaborative
network security in multi-tenant data center for cloud computing,” Tsinghua
Science and Technology, vol. 19, no. 1, pp. 82–94, 2014.
doi:10.1109/TST.2014.6733211.
[15] P. Danzi, A. E. Kalør, Č. Stefanović, and P. Popovski, “Delay and
communication tradeoffs for blockchain systems with lightweight iot clients,”
IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2354–2365, 2019.
doi:10.1109/JIOT.2019.2906615.