
Anonymization of Multidimensional Data

The document discusses anonymization techniques for multidimensional data. It covers classifying privacy preserving methods and data in multidimensional datasets. It also discusses techniques like k-anonymity, l-diversity, and t-closeness for anonymizing quasi-identifiers to protect sensitive data while maintaining data utility. The document outlines challenges in anonymizing quasi-identifiers due to issues like high dimensionality, background knowledge, and correlations with sensitive data.

Anonymization of Multidimensional Data

• Introduction
• Classification of Privacy Preserving Methods
• Classification of Data in Multi-dimensional Dataset
• Group Based Anonymization Technique
– k-Anonymity
– l-Diversity
– t-Closeness

Program Name: Program Code:


Introduction

• Static Anonymization
• Multi-Dimensional or Relational Data
• Easy way to represent Data
• Easy target for Privacy Attacks


Classification of Privacy Preserving Methods


Classification of Data in Multi-dimensional Dataset

• Protecting Explicit Identifier
• Protecting Quasi-Identifier
• Protecting Sensitive Data


Protecting Explicit Identifier
• A multidimensional dataset consists of attributes of different data
types, such as numeric values, strings, and so on.
• Data anonymization methods should focus on the semantics of the data,
not the syntax.
• Understand the semantics of the data in the context of the application,
so that an appropriate anonymization technique can be applied to the
data.


Protecting Explicit Identifier
• Explicit identifiers (EI) identify the record owner explicitly and
should therefore be masked completely.
• While masking these fields, it is critical to take care of two aspects:
– Referential integrity
– Consistent masking across databases
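
The two aspects above can be illustrated with a minimal Python sketch. It assumes a keyed hash (HMAC-SHA256) as the masking function, which is one possible choice and is not stated in the slides. The table names `patients` and `billing`, the `amount` value, the key, and the helper `mask_identifier` are hypothetical and for illustration only.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored outside the code.
SECRET_KEY = b"demo-key-not-for-production"


def mask_identifier(value: str, key: bytes = SECRET_KEY) -> str:
    """Deterministically mask an explicit identifier with a keyed hash.

    The same input and key always give the same token, so the masked value
    can still be used as a join key across tables and databases.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]  # shortened token, for readability only


# Two illustrative tables that share the same explicit identifier.
patients = [{"patient_id": "12432", "disease": "Cancer"}]
billing = [{"patient_id": "12432", "amount": 250}]

masked_patients = [
    {**row, "patient_id": mask_identifier(row["patient_id"])} for row in patients
]
masked_billing = [
    {**row, "patient_id": mask_identifier(row["patient_id"])} for row in billing
]

# Referential integrity: the masked keys still match across both tables.
print(masked_patients[0]["patient_id"] == masked_billing[0]["patient_id"])
```

Because the token depends only on the input and the key, applying the same function in every database keeps the masking consistent. A random substitution would need a shared lookup table to achieve the same effect.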


Protecting Explicit Identifier

Substitution Mechanism


Protecting Quasi-Identifier
• Masking EI alone is not sufficient, as an adversary
can still use QI to re-identify the record owner.
• This is called record linkage: a record from the
database is linked with a record in an external
data source.
• Hospital Record
ID     F_Name  L_Name  Gender  Address  DOB         ZIP    Disease
12432                  M       MA       21/02/1946  01880  Cancer


Protecting Quasi-Identifier
• Voters List
Voter ID  F_Name  L_Name  Gender  Address  DOB         ZIP
893423            Weld    M       MA       21/02/1946  01880
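
The linkage between the hospital record and the voters list can be sketched in Python as a join on the shared quasi-identifiers. The field names, the choice of (Gender, DOB, ZIP) as the QI set, the placement of "Weld" in the last-name field, and the helper `link_records` are assumptions made for illustration.

```python
# Released hospital record: explicit identifiers (names) are already masked.
hospital = [
    {"id": "12432", "gender": "M", "address": "MA",
     "dob": "21/02/1946", "zip": "01880", "disease": "Cancer"},
]

# External data source: a voters list that still carries a name.
voters = [
    {"voter_id": "893423", "l_name": "Weld", "gender": "M", "address": "MA",
     "dob": "21/02/1946", "zip": "01880"},
]

# Assumed quasi-identifier set shared by both sources.
QI = ("gender", "dob", "zip")


def link_records(released, external, qi=QI):
    """Return (released, external) pairs whose quasi-identifier values match."""
    index = {}
    for ext in external:
        index.setdefault(tuple(ext[a] for a in qi), []).append(ext)
    matches = []
    for rec in released:
        for ext in index.get(tuple(rec[a] for a in qi), []):
            matches.append((rec, ext))
    return matches


links = link_records(hospital, voters)
for rec, ext in links:
    # The adversary learns the disease of a named voter.
    print(ext["l_name"], rec["disease"])
```

When a QI combination matches exactly one external record, the masked hospital record is re-identified even though its name fields are empty.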

• Two aspects need to be considered while anonymizing QI:
– The analytical utility of the QI needs to be preserved.
– The correlation of QI attributes with sensitive data needs to be maintained to
support the utility of the anonymized data.


Protecting Quasi-Identifier
• Challenges in Protecting QI:
– Identifying the boundary between QI and SD and
anonymizing the QI is probably the toughest problem to
solve in Privacy Preservation.
– The main challenges in anonymizing QI attributes are:
• High Dimensionality
• Background Knowledge of the adversary
• Availability of External Knowledge
• Correlation with SD to ensure Utility
• Maintaining Analytical Utility


Protecting Quasi-Identifier
• To protect QI, we need to use group-based
anonymization techniques:
– k-Anonymity
– l-Diversity
– t-Closeness
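
As a rough illustration of the first two techniques, the Python sketch below measures the k-anonymity and the distinct l-diversity of a small table. It assumes the QI values have already been generalized (birth year ranges, truncated ZIP codes). The rows, the field names, and the helper functions are hypothetical, and t-closeness is left out for brevity.

```python
from collections import defaultdict


def equivalence_classes(rows, qi):
    """Group rows that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in qi)].append(row)
    return groups


def k_anonymity(rows, qi):
    """Size of the smallest equivalence class: the table is k-anonymous."""
    return min(len(g) for g in equivalence_classes(rows, qi).values())


def l_diversity(rows, qi, sensitive):
    """Smallest number of distinct sensitive values in any class (distinct l-diversity)."""
    return min(
        len({r[sensitive] for r in g})
        for g in equivalence_classes(rows, qi).values()
    )


# Illustrative, already generalized records (not taken from the slides).
rows = [
    {"gender": "M", "birth_year": "1940-1949", "zip": "018**", "disease": "Cancer"},
    {"gender": "M", "birth_year": "1940-1949", "zip": "018**", "disease": "Flu"},
    {"gender": "F", "birth_year": "1950-1959", "zip": "018**", "disease": "Flu"},
    {"gender": "F", "birth_year": "1950-1959", "zip": "018**", "disease": "Flu"},
]
qi = ("gender", "birth_year", "zip")

print(k_anonymity(rows, qi))             # 2
print(l_diversity(rows, qi, "disease"))  # 1
```

The sample table is 2-anonymous but only 1-diverse: every record in the second group has the same disease, so an adversary who places a person in that group learns the sensitive value without identifying the exact record.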


Protecting Sensitive Data
• If sensitive data is left in its original form, it provides a
channel for re-identification.
