University Institute of Technology, Bhopal
Narsinghgarh Bypass Road, Near Gandhi Nagar, Bhopal (M.P.)
SESSION: 2019-20
PROJECT REPORT ON
DATA LEAKAGE DETECTION
In the course of doing business, sometimes sensitive data must be handed over to
supposedly trusted third parties. For example, a hospital may give patient records to
researchers who will devise new treatments. Similarly, a company may have partnerships
with other companies that require sharing customer data. Another enterprise may outsource
its data processing, so data must be given to various other companies. Our goal is to detect
when the distributor’s sensitive data has been leaked by agents, and if possible to identify
the agent that leaked the data.
EXISTING SYSTEM:
Perturbation is a very useful technique where the data is modified and made "less
sensitive" before being handed to agents. For example, one can add random noise to
certain attributes, or replace exact values with ranges. However, in some cases it is
important not to alter the original distributor's data.
Traditionally, leakage detection is handled by watermarking, e.g., a unique code is
embedded in each distributed copy. If that copy is later discovered in the hands of an
unauthorized party, the leaker can be identified.
[Figure: creating a watermark]
Watermarks can be very useful in some cases, but again, involve some modification of the
original data. Furthermore, watermarks can sometimes be destroyed if the data recipient is
malicious. For example, a hospital may give patient records to researchers who will devise new
treatments. Similarly, a company may have partnerships with other companies that require
sharing customer data. Another enterprise may outsource its data processing, so data must be
given to various other companies. We call the owner of the data the distributor and the
supposedly trusted third parties the agents.
PROPOSED SYSTEM:
Our goal is to detect when the distributor's sensitive data has been leaked by agents, and if
possible to identify the agent that leaked the data.
Rather than relying on perturbation or watermarks, we develop unobtrusive techniques for
detecting leakage of a set of objects or records.
Unobtrusive Techniques: Unobtrusive techniques are data-collection methods that do not
involve direct elicitation of data from the research subjects. The unobtrusive approach
often seeks unusual data sources.
We develop a model for assessing the "guilt" of agents.
We also present algorithms for distributing objects to agents, in a way that improves our
chances of identifying a leaker.
Finally, we also consider the option of adding "fake" objects to the distributed set. Such
objects do not correspond to real entities but appear realistic to the agents.
In a sense, the fake objects act as a type of watermark for the entire set, without modifying
any individual members. If it turns out an agent was given one or more fake objects that
were leaked, then the distributor can be more confident that the agent was guilty.
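As an illustration, the following C# sketch shows how a distributor might append fake records to an agent's allocation while remembering which fakes went to which agent. All type and member names here are our own illustrative assumptions (written in modern C# for brevity), not code from this project.

using System;
using System.Collections.Generic;

// Illustrative record type; the project's real data objects may differ.
class DataObject
{
    public int Id;
    public string Payload;
    public bool IsFake;   // known only to the distributor, never revealed to agents
}

static class FakeObjectInjector
{
    // Appends `count` realistic-looking fake records to an agent's
    // allocation and logs them, so that a later leak containing one of
    // these records can be traced back to this agent.
    public static List<DataObject> Inject(
        List<DataObject> allocation,
        int count,
        Func<DataObject> makeFake,
        IDictionary<string, List<int>> fakesPerAgent,
        string agentId)
    {
        var result = new List<DataObject>(allocation);
        if (!fakesPerAgent.ContainsKey(agentId))
            fakesPerAgent[agentId] = new List<int>();

        for (int i = 0; i < count; i++)
        {
            DataObject fake = makeFake();        // must look like a real record
            fake.IsFake = true;
            result.Add(fake);
            fakesPerAgent[agentId].Add(fake.Id); // remember what this agent received
        }
        return result;
    }
}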
[Figure: typical block diagram showing the process of data loss in blocking spam]
Problem Setup and Notation:
A distributor owns a set T = {t1, …, tm} of valuable data objects. The distributor wants to share
some of the objects with a set of agents U1, U2, …, Un, but does not wish the objects to be leaked
to other third parties. The objects in T could be of any type and size, e.g., they could be
tuples in a relation, or relations in a database. An agent Ui receives a subset Ri of the objects,
determined either by a sample request or an explicit request (both are sketched in the C# fragment below):
1. Sample request Ri = SAMPLE(T, mi): any subset of mi records from T may be given to Ui.
2. Explicit request Ri = EXPLICIT(T, condi): Ui receives all objects in T that satisfy the condition condi.
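A minimal C# sketch of the two request types, assuming data objects are identified by integer ids; the method bodies are our own simplification of the notation above, not the project's actual code.

using System;
using System.Collections.Generic;
using System.Linq;

static class Requests
{
    static readonly Random Rng = new Random();

    // Explicit request: Ui receives every object in T satisfying condi.
    public static List<int> Explicit(IEnumerable<int> T, Func<int, bool> cond)
    {
        return T.Where(cond).ToList();
    }

    // Sample request: Ui receives some mi objects from T; the distributor
    // is free to choose which ones (here: a uniform random subset).
    public static List<int> Sample(IList<int> T, int m)
    {
        return T.OrderBy(_ => Rng.Next()).Take(m).ToList();
    }
}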
To see how our model parameters interact and to check whether the interactions match our
intuition, in this section we study two simple scenarios: the impact of the probability p,
and the impact of the overlap between Ri and S. In each scenario we have a target that has
obtained all the distributor's objects, i.e., T = S.
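To make these scenarios concrete, the C# sketch below computes an agent's guilt probability under one common form of the agent-guilt model from the data-leakage-detection literature: p is the probability that a leaked object could have been obtained by other means (guessing), and |Vt| is the number of agents that received object t. The formula, names, and parameter shapes are our assumptions, not taken verbatim from this report.

using System.Collections.Generic;

static class GuiltModel
{
    // Pr{Gi | S} = 1 - product over t in (S intersect Ri) of
    //              (1 - (1 - p) / |Vt|)
    // leaked:      ids of leaked objects that agent Ui also holds (S ∩ Ri)
    // holders[t]:  number of agents that were given object t (|Vt|)
    // p:           probability a leaked object was guessed rather than leaked
    public static double Probability(
        IEnumerable<int> leaked,
        IDictionary<int, int> holders,
        double p)
    {
        double notGuilty = 1.0;
        foreach (int t in leaked)
        {
            // (1 - p) / |Vt| is the chance this particular leaked copy
            // originated from agent Ui rather than being guessed or
            // coming from another holder of t.
            notGuilty *= 1.0 - (1.0 - p) / holders[t];
        }
        return 1.0 - notGuilty;   // more overlap or lower p => higher guilt
    }
}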
Algorithms:
The goal of these experiments was, first, to see whether fake objects in the distributed
data sets yield a significant improvement in our chances of detecting a guilty agent, and
second, to evaluate our e-optimal algorithm relative to a random allocation.
With sample data requests, agents are not interested in particular objects. Hence, object
sharing is not explicitly defined by their requests. The distributor is "forced" to allocate
certain objects to multiple agents only if the total number of requested objects exceeds the
number of objects in set T. The more data objects the agents request in total, the more
recipients an object has on average; and the more objects are shared among different agents,
the more difficult it is to detect a guilty agent. A greedy version of such an
overlap-minimizing allocation is sketched below.
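The following C# sketch answers a sample request by handing out the objects that currently have the fewest recipients, which keeps agents' sets dissimilar. This is our own simplification for illustration, not the report's exact allocation algorithm.

using System.Collections.Generic;
using System.Linq;

static class SampleAllocator
{
    // recipients[t] counts how many agents already hold object t.
    public static List<int> Allocate(
        IList<int> T, int m, IDictionary<int, int> recipients)
    {
        // Prefer the least-shared objects so overlap stays small.
        List<int> chosen = T
            .OrderBy(t => recipients.TryGetValue(t, out int c) ? c : 0)
            .Take(m)
            .ToList();

        foreach (int t in chosen)
        {
            recipients.TryGetValue(t, out int c);
            recipients[t] = c + 1;   // object t now has one more recipient
        }
        return chosen;
    }
}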
Hardware Requirements
SYSTEM : Pentium IV, 2.4 GHz
HARD DISK : 40 GB
FLOPPY DRIVE : 1.44 MB
MONITOR : 15" VGA colour
MOUSE : Logitech
RAM : 256 MB
KEYBOARD : 110-key enhanced
Software Requirements
Operating system : Windows XP Professional
Front End : Microsoft Visual Studio .NET 2005
Coding Language : C#
Database : SQL Server 2000
MODULE DESCRIPTION:
1) Login / Registration:
This module authenticates a user and grants the authority to access the other modules of the
project. Here a user obtains this access authority only after registration.
2) DATA TRANSFER:
This module is mainly designed to transfer data from the distributor to agents. The same
module can also be used for illegal data transfer from authorized agents to other agents.
3) GUILT MODEL:
This module is designed using the agent-guilt model. Here a count value (realized as fake
objects) is incremented for every data-transfer occurrence, i.e., whenever an agent
transfers data. Fake objects are stored in the database.
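For illustration, a minimal sketch of how this module might record a fake object in SQL Server when an agent transfers data; the table name FakeObjects and its columns are our assumptions, not the project's actual schema.

using System.Data.SqlClient;

static class GuiltModelStore
{
    // Inserts one fake-object row per observed transfer, so the
    // agent-guilt module can later query which fakes each agent received.
    public static void RecordFakeObject(
        string connectionString, int agentId, int objectId)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "INSERT INTO FakeObjects (AgentId, ObjectId, TransferredAt) " +
            "VALUES (@agent, @object, GETDATE())", conn))
        {
            cmd.Parameters.AddWithValue("@agent", agentId);
            cmd.Parameters.AddWithValue("@object", objectId);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}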
4) AGENT-GUILT MODEL:
This module is mainly designed for determining guilty agents. It uses the fake objects
(which are stored in the database by the guilt-model module) and determines the guilty agent
along with the corresponding probability. A graph is used to plot the probability
distribution of the data leaked by the guilty agents.
ACTIVITY DIAGRAM:
[Activity diagram: Login → Find guilty agents → Find probability of data leakage]
ARCHITECTURE DIAGRAM:
[Architecture diagram]
CONCLUSION
The likelihood that an agent is responsible for a leak is assessed based on the overlap
of its data with the leaked data and with the data of other agents, and on the
probability that objects can be "guessed" by other means. The algorithms we have
presented implement a variety of data distribution strategies that can improve the
distributor's chances of identifying a leaker. We have shown that distributing objects
judiciously can make a significant difference in identifying guilty agents, especially in
cases where there is large overlap in the data that agents must receive.