Four main categories of data
Main Category
1. Provided
2. Observed
3. Derived
4. Inferred
Provided data are data which originate from direct actions taken by an individual, whereby he or she is fully aware of the
actions that lead to data origination. Examples include: data disclosed by individuals in the context of a loan application
(“initiated data”), data created when buying a product with a credit card (“transactional data”), or data shared (actively) via
an online social network (“posted data”). While the individuals concerned may be unaware of the implications of providing
these data, the fact that these data are being created should be obvious – or at least intuitive.
Observed data are data which have been observed by others and recorded in a digital format. These data can either be recorded
at the moment of their creation or be transferred to a digital carrier after observation. Examples include: data
originating from online cookies, data generated by sensors, and passively created observational data (e.g., data captured
by CCTV cameras combined with facial recognition). While individuals may be made aware of the creation of observed
data (e.g., due to active engagement), much of the creation of observed data may go unnoticed.
Derived data are data generated from other data, after which they become new data elements related to a particular
individual. Derived data are said to be created in a fairly “mechanical” fashion using simple reasoning and basic
mathematics to detect patterns within a data set and create classifications. While these classifications may later be used
for predictive purposes, they are not themselves based on probabilistic reasoning. Examples include: computational data
(e.g., a calculation of customer profitability based on the ratio between the number of visits and the number of items bought) and
notational data (e.g., the detection of common attributes among “profitable” customers which are then used to classify
potential customers).
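To illustrate the "mechanical" character of derived data, here is a minimal Python sketch. It is not taken from the source: the `Customer` record, its field names, the direction of the ratio (items bought per visit) and the threshold of 2.0 are all assumptions made for the example. It computes one computational value (a ratio) and one notational value (a rule-based label) without any probabilistic reasoning.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical record; the field names are illustrative only.
    name: str
    visits: int        # number of store visits
    items_bought: int  # number of items purchased


def items_per_visit(c: Customer) -> float:
    """Computational derived data: a plain ratio, no probability involved."""
    if c.visits == 0:
        return 0.0  # avoid division by zero for customers with no visits
    return c.items_bought / c.visits


def classify(c: Customer, threshold: float = 2.0) -> str:
    """Notational derived data: a rule-based label built on the ratio.

    The threshold is an arbitrary assumption for this example.
    """
    return "profitable" if items_per_visit(c) >= threshold else "not profitable"


customers = [
    Customer("A", visits=4, items_bought=12),
    Customer("B", visits=10, items_bought=5),
    Customer("C", visits=0, items_bought=0),
]

# Each label is a new data element attached to an individual customer.
labels = {c.name: classify(c) for c in customers}
```

Both outputs are fully determined by the input values and a fixed rule, which is what separates them from the probability-based inferred data described next.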
Inferred data are the product of probability-based analytic processes. They are a result of the detection of correlations
which are used to create predictions of behaviour. These predictions are then used to categorise individuals. Examples
include: statistical data (e.g., credit risk scores, life expectancy scores) and advanced analytical data (e.g., likelihood of
future health outcomes based on an analysis of large and diverse medical data sets). Typically, the individuals to whom
these data relate are not involved in their creation and may remain unaware of any inferences made.
Based on: “The Origins of Personal Data and its Implications for Governance” by Martin Abrams, The Information Accountability Foundation, 21 March 2014
A taxonomy of personal data by origin
Richard Claassens
Sub Category, Example and Level of Individual Awareness per Main Category

1. Provided
1.1. Initiated (Level of Individual Awareness: High)
o Applications
o Registrations
o Public records such as licenses
o Credit card purchases
o Medical history as provided by individual
1.2. Transactional (Level of Individual Awareness: High)
o Bills paid
o Inquiries responded to
o Blood pressure or weight as recorded in clinical care setting
o Public records such as court proceedings
1.3. Posted (Level of Individual Awareness: High)
o Speeches in public settings
o Social network postings
o Photo services
o Video sites

2. Observed
2.1. Engaged (Level of Individual Awareness: Medium)
o Cookies on a website
o Loyalty card
o Enabled location sensors on personal devices
o Fitness tracking using wearable device
2.2. Not Anticipated (Level of Individual Awareness: Low)
o Data from sensor technology on my car
o Time paused over a pixel on the screen of a tablet
2.3. Passive (Level of Individual Awareness: Low)
o Facial images from CCTV
o Obscured web technologies
o Wi-Fi readers in buildings that establish location

3. Derived
3.1. Computational (Level of Individual Awareness: Medium to Low)
o Credit ratios
o Average purchase per visit
o Risk of developing a disease based on a single genetic variation
3.2. Notational (Level of Individual Awareness: Medium to Low)
o Classification based on common attributes of buyers
o Medical condition based on diagnostic tests

4. Inferred
4.1. Statistical (Level of Individual Awareness: Low)
o Credit score
o Response score
o Fraud scores
o Life expectancy
4.2. Advanced Analytical (Level of Individual Awareness: Low)
o Risk of developing a disease based on multi-factor analysis
o College success score based on multi-variable Big Data analysis at age 9
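One possible way to make the table usable in software, for instance to tag fields in a data inventory, is a small lookup structure. The Python sketch below is an illustration only: the names `Awareness`, `TAXONOMY`, `awareness_of` and `sub_categories_with` are hypothetical and do not appear in the source; the category names and awareness levels are transcribed from the table.

```python
from enum import Enum


class Awareness(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    MEDIUM_TO_LOW = "Medium to Low"
    LOW = "Low"


# Sub-category -> (main category, level of individual awareness),
# transcribed from the table; the structure itself is an assumption.
TAXONOMY = {
    "Initiated": ("Provided", Awareness.HIGH),
    "Transactional": ("Provided", Awareness.HIGH),
    "Posted": ("Provided", Awareness.HIGH),
    "Engaged": ("Observed", Awareness.MEDIUM),
    "Not Anticipated": ("Observed", Awareness.LOW),
    "Passive": ("Observed", Awareness.LOW),
    "Computational": ("Derived", Awareness.MEDIUM_TO_LOW),
    "Notational": ("Derived", Awareness.MEDIUM_TO_LOW),
    "Statistical": ("Inferred", Awareness.LOW),
    "Advanced Analytical": ("Inferred", Awareness.LOW),
}


def awareness_of(sub_category: str) -> Awareness:
    """Return the awareness level recorded for a sub-category."""
    return TAXONOMY[sub_category][1]


def sub_categories_with(level: Awareness) -> list[str]:
    """List all sub-categories that share a given awareness level."""
    return [name for name, (_, lvl) in TAXONOMY.items() if lvl is level]


low_awareness = sub_categories_with(Awareness.LOW)
```

Here `low_awareness` collects the sub-categories where individuals are least likely to know the data exist, which is presumably where governance and transparency measures deserve the most attention.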

