04 - Chapter 3 - Privacy

This document discusses information security and privacy concerns related to big data and data mining. It addresses the four key users in the data mining process: data providers, data collectors, data miners, and decision makers. For data providers, major concerns are controlling data sensitivity and privacy, while data collectors aim to collect useful data while preserving privacy. Techniques like limiting access, trading privacy for benefits, and providing false data can help address these issues. Privacy-preserving data mining also aims to safeguard sensitive information during data mining.

Uploaded by

Taif Alkaabi

INFORMATION SECURITY IN BIG DATA: PRIVACY AND DATA MINING

CONTENT
1. Introduction
2. Data Provider
3. Data Collector
4. Data Miner
5. Decision Maker
6. Future Research Areas
7. Conclusion
8. References

INTRODUCTION
 Data mining is the process of discovering interesting patterns
and knowledge from large amounts of data.
 Data mining has been successfully applied to many domains,
such as business intelligence, Web search, scientific discovery,
and digital libraries.
 The term "data mining" is often treated as a synonym for
"knowledge discovery from data" (KDD), a term that
highlights the goal of the mining process.

2. Big data and privacy concerns

https://www.youtube.com/watch?v=uaaC57tcci0

Social Dilemma
 Communications technologies and Big Data analysis have
facilitated intrusions on privacy by devising and strengthening
audio-visual surveillance and "dataveillance".
 Governments have used these technologies for continuous and
massive collection and collation of data from our private spaces.
 Big Data phenomena are a constellation of data storage and
processing extensions to modern communications technologies.
They have given rise to new modes of privacy intrusion
that were not anticipated when the existing privacy laws were
written in response to much more primitive communications and
eavesdropping technologies.

Big data defined

What exactly is big data?

The definition of big data is data that contains greater variety, arriving in
increasing volumes and with more velocity. This is also known as the three Vs.
Put simply, big data is larger, more complex data sets, especially from new data
sources. These data sets are so voluminous that traditional data processing
software just can’t manage them. But these massive volumes of data can be used
to address business problems you wouldn’t have been able to tackle before.
(Oracle)
The three Vs of big data

Volume
The amount of data matters. With big data, you’ll have to process high volumes of low-density, unstructured data. This
can be data of unknown value, such as Twitter data feeds, clickstreams on a web page or a mobile app, or sensor-
enabled equipment. For some organizations, this might be tens of terabytes of data. For others, it may be hundreds of
petabytes.

Velocity
Velocity is the fast rate at which data is received and (perhaps) acted on. Normally, the highest velocity of data streams
directly into memory versus being written to disk. Some internet-enabled smart products operate in real time or near
real time and will require real-time evaluation and action.

Variety
Variety refers to the many types of data that are available. Traditional data types were structured and fit neatly in a
relational database. With the rise of big data, data comes in new unstructured data types. Unstructured and
semistructured data types, such as text, audio, and video, require additional preprocessing to derive meaning and
support metadata.

https://www.oracle.com/ae/big-data/what-is-big-data/
https://www.marketsandmarkets.com/
Big data can be used:
- to identify more general trends and correlations
- it can also be processed in order to directly affect individuals.

It is not the volume, velocity, variety or veracity that
worries me, but the uses of the information.
- The uses of the data are not determined before collection.

Risks

Big Data may also pose significant risks for the protection of
personal data and the right to privacy:
a) the sheer scale of data collection, tracking and profiling;
b) the security of data;
c) the transparency, which implies sufficient information given to
individuals;
d) inaccuracy, discrimination, exclusion and economic
imbalance;
e) increased possibilities of government surveillance

THE CHALLENGE OF BIG DATA FOR DATA
PROTECTION

It is no exaggeration to say that we are nothing more than a
collection of data to most of the institutions, and many of the
people, with whom we deal.
Big data poses enormous challenges for data protection, for both
processors and regulators. It simultaneously changes the
context and raises the stakes for data protection.

Big data also shows the importance of
harmonization, or even standardization, of data
protection standards. As personal data are universally
collected and shared across sectoral and national
boundaries, inconsistent data protection laws pose
increasing threats to individuals, institutions, and
society.

Perhaps the greatest impact of big data is the pressure it
brings for new, thoughtful, informed, multinational
debate about the key principles that should
undergird data protection. Most data protection
laws continue to rely on the 1980 OECD Guidelines.

Data Mining and Society

How does data mining impact society? What steps can data mining take to preserve
the privacy of individuals? Do we use data mining in our daily lives without even knowing
that we do? These questions raise the following issues:

Social impacts of data mining: With data mining penetrating our everyday lives, it is
important to study the impact of data mining on society. How can we use data mining
technology to benefit society? How can we guard against its misuse? The improper
disclosure or use of data and the potential violation of individual privacy and data
protection rights are areas of concern that need to be addressed.

Privacy-preserving data mining: Data mining will help scientific discovery, business
management, economy recovery, and security protection (e.g., the real-time discovery
of intruders and cyberattacks). However, it poses the risk of disclosing an
individual’s personal information. Studies on privacy-preserving data publishing and
data mining are ongoing. The philosophy is to observe data sensitivity and preserve
people’s privacy while performing successful data mining.
Invisible data mining: We cannot expect everyone in society to learn and master
data mining techniques. More and more systems should have data mining functions
built within so that people can perform data mining or use data mining results
simply by mouse clicking, without any knowledge of data mining algorithms.
Intelligent search engines and Internet-based stores perform such invisible data
mining by incorporating data mining into their components to improve their
functionality and performance. This is done often unbeknownst to the user. For
example, when purchasing items online, users may be unaware that the store is
likely collecting data on the buying patterns of its customers, which may be used to
recommend other items for purchase in the future.
 An individual's privacy may be violated through
unauthorized access to personal data.
 To deal with the privacy issues in data mining, a sub-field
of data mining has emerged, referred to as privacy-preserving data
mining (PPDM).
 The aim of PPDM is to safeguard sensitive information
from unsanctioned disclosure while preserving the utility of
the data.

The four types of users in the data mining process:
 Data Provider: the user who owns some data that are desired
by the data mining task.
 Data Collector: the user who collects data from data
providers and then publishes the data to the data miner.
 Data Miner: the user who performs data mining tasks on the
data.
 Decision Maker: the user who makes decisions based on
the data mining results in order to achieve certain goals.

DATA PROVIDER
 CONCERN
 The major concern of a data provider is whether he can control
the sensitivity of the data he provides to others.
 On one hand, the provider should be able to make his very
private data inaccessible to the data collector.
 On the other hand, if the provider has to provide some data to
the data collector, he wants to hide his sensitive information as
much as possible and get enough compensation for the
possible loss in privacy.

APPROACHES TO PRIVACY PROTECTION

1. LIMIT THE ACCESS

 Security tools developed for the Internet
environment can protect data:
 Anti-tracking extensions: popular examples
include Disconnect, Do Not Track Me, and Ghostery.
 Advertisement and script blockers: example tools
include AdBlock Plus, NoScript, and FlashBlock.
 Encryption tools: MailCloak and TorChat.

2. TRADE PRIVACY FOR BENEFIT

 The data provider may be willing to hand over some of his
private data in exchange for certain benefits, such as better
services or monetary rewards.
 The data provider needs to know how to negotiate with the
data collector, so that he will get enough compensation for any
possible loss in privacy.

3. PROVIDE FALSE DATA

 Using "sockpuppets" to hide one's true activities
 Using a fake identity to create phony information
 Using security tools to mask one's identity

DATA COLLECTOR
 CONCERN

 The major concern of the data collector is to guarantee
that the modified data contain no sensitive
information but still preserve high utility.

APPROACHES

1. BASICS OF PRIVACY PRESERVING DATA PUBLISHING (PPDP)
 PPDP mainly studies anonymization approaches for publishing
useful data while preserving privacy. Each record consists of the
following 4 types of attributes:
 Identifier (ID): Attributes that can directly and uniquely identify
an individual, such as name, ID number and mobile no.
 Quasi-identifier (QID): Attributes that can be linked with external
data to re-identify individual records, such as gender, age and zip
code.
 Sensitive Attribute (SA): Attributes that an individual wants to
conceal, such as disease and salary.
 Non-sensitive Attribute (NSA): Attributes other than ID, QID and
SA.
[Figure: Original Table and 2-Anonymous Table]
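
As a rough illustration of how the four attribute types are treated when a table is published, the sketch below drops the identifier, generalizes the quasi-identifiers, and measures the smallest group of records sharing the same QID values. The records, the generalization rules (10-year age bands, 3-digit zip prefixes) and the function names are hypothetical and are not taken from the slide's tables.

```python
from collections import Counter

# Hypothetical records: (name [ID], gender, age, zip [QIDs], disease [SA])
records = [
    ("Alice", "F", 23, "11001", "flu"),
    ("Beth",  "F", 27, "11005", "asthma"),
    ("Carl",  "M", 34, "12010", "flu"),
    ("Dan",   "M", 38, "12019", "diabetes"),
]

def generalize(record):
    """Drop the identifier and coarsen the quasi-identifiers."""
    _name, gender, age, zip_code, disease = record
    low = (age // 10) * 10
    age_band = f"{low}-{low + 9}"        # e.g. 23 -> "20-29"
    zip_prefix = zip_code[:3] + "**"     # e.g. "11001" -> "110**"
    return (gender, age_band, zip_prefix, disease)

def anonymity_level(rows, qid_count=3):
    """Size of the smallest group of rows sharing the same QID values.

    Assumes the first `qid_count` fields of each row are the QIDs.
    """
    groups = Counter(row[:qid_count] for row in rows)
    return min(groups.values())

published = [generalize(r) for r in records]
k = anonymity_level(published)                     # generalized table
raw_k = anonymity_level([r[1:] for r in records])  # identifier removed only
print(k, raw_k)
```

Removing the name alone leaves every record unique on its quasi-identifiers, while the generalized table places each record in a group of two, which is what a 2-anonymous table requires.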
2. PRIVACY PRESERVING PUBLISHING OF SOCIAL NETWORK DATA

 PPDP in the context of social networks mainly deals with
anonymizing graph data, which is much more challenging than
anonymizing relational table data.

3. ATTACK MODEL

PRIVACY MODELS

 If a network satisfies k-NMF anonymity, then for each edge e
there will be at least k - 1 other edges with the same number of
mutual friends as e. It can be guaranteed that the probability of
an edge being identified is not greater than 1/k.

[Figure: a network with four nodes (1 to 4) and six edges (a to f). Each of the edges a, b, c, d, e and f has 2 mutual friends, so the network is 6-NMF anonymous.]
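
The k-NMF condition can be checked mechanically. The sketch below counts the mutual friends of each edge and returns the largest k for which the definition holds. The example graph is an assumption: a complete graph on four nodes, which is consistent with a four-node network whose six edges each have two mutual friends. The function name is hypothetical.

```python
from collections import Counter
from itertools import combinations

def nmf_anonymity(edges):
    """Return (k, mutual), where mutual maps each edge to its number of
    mutual friends and k is the largest value for which the graph is
    k-NMF anonymous (the size of the smallest group of edges sharing
    the same number of mutual friends)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    # Mutual friends of an edge (u, v): nodes adjacent to both endpoints.
    mutual = {(u, v): len(adjacency[u] & adjacency[v]) for u, v in edges}
    group_sizes = Counter(mutual.values())
    k = min(group_sizes[count] for count in mutual.values())
    return k, mutual

# Assumed example: every pair of the four nodes is connected (six edges).
edges = list(combinations([1, 2, 3, 4], 2))
k, mutual = nmf_anonymity(edges)
print(k, mutual)
```

With this graph every edge falls into the same group, so an attacker who knows only an edge's number of mutual friends cannot pick it out with probability better than 1/6.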
DATA MINER

 CONCERN

 The primary concern of the data miner is how to prevent sensitive
information from appearing in the mining results.
 To perform privacy-preserving data mining, the data miner
usually needs to modify the data he got from the data collector.

APPROACHES

1. PRIVACY PRESERVING ASSOCIATION RULE MINING

 Various kinds of approaches have been proposed to perform
association rule hiding:
 Heuristic distortion approaches
 Heuristic blocking approaches
 Probabilistic distortion approaches
 Exact database distortion approaches
 Reconstruction-based approaches
2. PRIVACY PRESERVING CLASSIFICATION

 Classification is a form of data analysis that extracts models
describing important data classes.
 Approaches to privacy-preserving decision tree mining include:
 Dowd et al. proposed a data perturbation technique based on
random substitutions.
 Brickell and Shmatikov presented a cryptographically secure
protocol for privacy-preserving construction of decision trees.

DECISION MAKER

 CONCERN

• The privacy concerns of the decision maker are the following:
 how to prevent unwanted disclosure of sensitive mining
results
 how to evaluate the credibility of the received mining results

APPROACHES

 Legal measures.
For example, making a contract with the data miner to forbid
the miner from disclosing the mining results to a third party.
 The decision maker can utilize methodologies from data
provenance, credibility analysis of web information, or other
related research fields.

DATA PROVENANCE
 The information that helps determine the derivation history of
the data, starting from the original source
 Two kinds of information
 the ancestral data from which current data evolved
 the transformations applied to ancestral data that helped to
produce current data.
 With such information, people can better understand the data
and judge the credibility of the data.
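
One minimal way to picture these two kinds of provenance information is a record that stores its ancestors together with the transformation that produced it. The class, field names and example data sets below are hypothetical; real provenance systems record far more detail.

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceRecord:
    """A data item plus the two kinds of provenance information:
    its ancestral data and the transformation applied to them."""
    name: str
    transformation: str = "original source"
    ancestors: list = field(default_factory=list)

    def history(self):
        """Derivation steps, starting from the original source(s)."""
        steps = []
        for parent in self.ancestors:
            steps.extend(parent.history())
        steps.append(f"{self.name} <- {self.transformation}")
        return steps

# Hypothetical derivation chain ending in a mining result.
raw = ProvenanceRecord("raw_survey")
cleaned = ProvenanceRecord("cleaned_survey", "removed incomplete rows", [raw])
report = ProvenanceRecord("mining_report", "association rule mining", [cleaned])

for step in report.history():
    print(step)
```

A decision maker who receives `mining_report` can walk this history back to the original source and judge whether each step is credible.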

WEB INFORMATION CREDIBILITY

 Five criteria Internet users can apply to differentiate false information from the
truth:
1. Authority: the real author of false information is usually unclear.
2. Accuracy: false information does not contain accurate data.
3. Objectivity: false information is often prejudicial.
4. Currency: for false information, the data about its source, time and place
of its origin is incomplete, out of date, or missing.
5. Coverage: false information usually contains no effective links to other
information online.

Privacy and Security Constraints

 Individual Privacy
 Nobody should know more about any entity after the data
mining than they did before
 Approaches: Data Obfuscation, Value swapping
 Organization Privacy
 Protect knowledge about a collection of entities
 Individual entity values may be known to all parties
 Which entities are at which site may be secret
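
Value swapping, one of the approaches named above for individual privacy, can be sketched as a random permutation of one sensitive column across records. This is a simplified assumption about how swapping is done (practical schemes usually constrain which records may exchange values), and the table and function name are hypothetical.

```python
import random

def swap_values(rows, column, seed=0):
    """Return a copy of rows with the values of `column` randomly
    permuted across records; all other columns are left unchanged."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]

# Hypothetical table with one sensitive column.
table = [
    {"age": 25, "zip": "11001", "salary": 30000},
    {"age": 31, "zip": "11005", "salary": 45000},
    {"age": 47, "zip": "12010", "salary": 52000},
    {"age": 52, "zip": "12019", "salary": 80000},
]
swapped = swap_values(table, "salary")

original_salaries = sorted(row["salary"] for row in table)
swapped_salaries = sorted(row["salary"] for row in swapped)
print(original_salaries == swapped_salaries)
```

The set of salaries, and therefore any summary computed from that column alone, is unchanged, but a salary can no longer be attributed with certainty to the record it appears in. A random permutation may leave some values in place, and correlations between the swapped column and the others are weakened.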

Privacy constraints don’t prevent data mining

 Goal of data mining is summary results

 Association rules
 Classifiers
 Clusters
 The results alone need not violate privacy
 Contain no individually identifiable values
 Reflect overall results, not individual organizations
The problem is computing the results without access to
the data!

Example:
Association Rules

 Assume data is horizontally partitioned

 Each site has complete information on a set of entities
 Same attributes at each site
 If goal is to avoid disclosing entities, problem is easy
 Basic idea: Two-Phase Algorithm
 First phase: Compute candidate rules
 Frequent globally ⇒ frequent at some site
 Second phase: Compute frequency of candidates
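
The two phases can be sketched for frequent itemsets, the step that precedes rule generation. This sketch is not privacy-preserving: the sites' candidates and counts are combined in the clear, whereas a secure protocol would replace those steps with cryptographic ones. The site data, the support threshold and the function names are hypothetical.

```python
from itertools import combinations

def local_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items at one site."""
    counts = {}
    for transaction in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(transaction), size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return counts

def two_phase(sites, min_support, max_size=2):
    per_site = [local_counts(site, max_size) for site in sites]

    # Phase 1: candidates are the itemsets frequent at some site.
    # A globally frequent itemset must be frequent at one site or more,
    # so no globally frequent itemset is missed.
    candidates = set()
    for counts, site in zip(per_site, sites):
        for itemset, count in counts.items():
            if count >= min_support * len(site):
                candidates.add(itemset)

    # Phase 2: compute the global frequency of each candidate.
    total = sum(len(site) for site in sites)
    frequent = {}
    for itemset in candidates:
        global_count = sum(counts.get(itemset, 0) for counts in per_site)
        if global_count >= min_support * total:
            frequent[itemset] = global_count
    return frequent

# Horizontally partitioned data: same attributes, different entities.
site_a = [{"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"eggs"}]
site_b = [{"bread", "milk"}, {"milk"}, {"eggs", "milk"}, {"bread", "eggs"}]
frequent = two_phase([site_a, site_b], min_support=0.5)
print(frequent)
```

In this example the pair (bread, milk) is frequent at site A and so becomes a candidate, but its global count of 3 out of 8 transactions falls below the 50% threshold and it is dropped in the second phase.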

Privacy-Preserving Data Mining: Who?

 Government / public agencies. Example:

 The Centers for Disease Control want to identify disease outbreaks
 Insurance companies have data on disease incidents, seriousness, patient
background, etc.
 But can/should they release this information?
 Industry Collaborations / Trade Groups. Example:
 An industry trade group may want to identify best practices to help
members
 But some practices are trade secrets
 How do we provide “commodity” results to all (Manufacturing using
chemical supplies from supplier X have high failure rates), while still
preserving secrets (manufacturing process Y gives low failure rates)?

Privacy-Preserving Data Mining: Who?

 Multinational Corporations
 A company would like to mine its data for globally valid
results
 But national laws may prevent transborder data sharing
 Public use of private data
 Data mining enables research studies of large populations
 But these populations are reluctant to release personal
information

Outline

 Privacy and Security Constraints

 Types: Individual, collection, result limitation
 Sources: Regulatory, Contractual, Secrecy
 Classes of solutions
 Data obfuscation
 Summarization
 Data separation
 When do we address these issues?

Technical Solutions

 Data Obfuscation based techniques

 Reconstructing distributions for developing classifiers
 Association rules from modified data
 Data Separation based techniques
 Overview of Secure Multiparty Computation
 Secure decision tree construction
 Secure association rules
 Secure clustering
 What if the secrets are in the results?

Individual Privacy:
Protect the “record”

 Individual item in database must not be disclosed

 Not necessarily a person
 Information about a corporation
 Transaction record
 Disclosure of parts of record may be allowed
 Individually identifiable information

Individually Identifiable Information

 Data that can’t be traced to an individual is not viewed
as private
 Remove “identifiers”
 But can we ensure it can’t be traced?
 Candidate Key in non-identifier information
 Unique values for some individuals
Data Mining enables such tracing!
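
To see how removing identifiers can fail, the sketch below links a released table without names to an external list that does carry names, using the non-identifier attributes as a candidate key. All records, attribute names and the `link` function are hypothetical.

```python
def link(released, external, qids):
    """Map each external person to a released record when exactly one
    released record shares that person's quasi-identifier values."""
    matches = {}
    for person in external:
        key = tuple(person[q] for q in qids)
        hits = [row for row in released
                if tuple(row[q] for q in qids) == key]
        if len(hits) == 1:          # unique combination: re-identified
            matches[person["name"]] = hits[0]
    return matches

# Released data: identifiers removed, sensitive attribute kept.
released = [
    {"gender": "F", "age": 34, "zip": "12010", "disease": "asthma"},
    {"gender": "M", "age": 34, "zip": "12010", "disease": "flu"},
    {"gender": "M", "age": 51, "zip": "12044", "disease": "diabetes"},
    {"gender": "M", "age": 51, "zip": "12044", "disease": "flu"},
]
# External data an attacker might hold, for example a public register.
external = [
    {"name": "Carl", "gender": "M", "age": 34, "zip": "12010"},
    {"name": "Dan",  "gender": "M", "age": 51, "zip": "12044"},
]

reidentified = link(released, external, ["gender", "age", "zip"])
print(reidentified)
```

Carl's combination of gender, age and zip code is unique in the released table, so his disease is exposed even though no name was published. Dan matches two records and cannot be singled out this way.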

Collection Privacy

 Disclosure of individual data may be okay

 Telephone book
 De-identified records
 Releasing the whole collection may cause problems
 Trade secrets – corporate plans
 Rules that reveal knowledge about the holder of data

Collection Privacy Example:
Corporate Phone Book
 Telephone Directory discloses
how to contact an individual
 Intended use
 Data Mining can find more
 Relative sizes of departments
 Use to predict corporate plans?
 Possible Solution: Obfuscation
 Fake entries in phone book
 Doesn’t prevent intended use
 Key: Define Intended Use
 Not always easy!
[Figure: Data Mining reveals "Unexpectedly High Number of Energy Traders"]

Sources of Constraints

 Regulatory requirements
 Contractual constraints
 Posted privacy policy
 Corporate agreements
 Secrecy concerns
 Secrets whose release could jeopardize plans
 Public Relations – “bad press”

European Union Data Protection Directives

 Directive 95/46/EC
 Passed European Parliament 24 October 1995
 Goal is to ensure free flow of information
 Must preserve privacy needs of member states
 Effective October 1998
 GDPR - General Data Protection Regulation
 regulates the use and disclosure of the personal data of all individuals within the EU
member states (28 at the time of adoption). It was passed into law in May 2016 and became
enforceable on May 25, 2018.
 Unlike most privacy regulations in the U.S., the EU defines the term “personal data” broadly—
it includes “any information relating to an identified or identifiable natural person (the ‘data
subject’).”
 This means that even the most basic contact information, such as business card details or simply a
name and email address, falls under the GDPR’s protections. Public sources of information, such
as a residential phone listing, are not exempted from the GDPR’s restrictions.

Technology Threats to Data Privacy

• The growing popularity and development of data mining technologies
pose a serious threat to the security of individuals' sensitive information.
• An emerging research topic in data mining, known as privacy-preserving
data mining (PPDM), has been extensively studied in recent years.
• The basic idea of PPDM is to modify the data in such a way that
data mining algorithms can be performed effectively without compromising the
security of sensitive information contained in the data.
REFERENCES

 Lei Xu, Chunxiao Jiang, Jian Wang, Jian Yuan and Yong
Ren, "Information Security in Big Data: Privacy and Data Mining,"
IEEE Access, vol. 2, 2014.
 J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and
Techniques. San Mateo, CA, USA: Morgan Kaufmann, 2006.

Thank you
