Real-Time De-Identification of Healthcare Data Using Ephemeral Pseudonyms

Volume & Issue No.: Volume 7, Issue 2, March - April 2018, pages 021-025. URL: http://www.ijettcs.org/Volume7Issue2/IJETTCS-2018-02-26-2.pdf

International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)

Web Site: www.ijettcs.org Email: [email protected], [email protected]

Volume 7, Issue 2, March - April 2018 ISSN 2278-6856

Real-time De-identification of Healthcare Data Using Ephemeral Pseudonyms

[1]Ashish Shukla, [2]Mohit Kumar Sahni, [3]Sourav Aggarwal, [4]Bipin Kumar Rai
[1][2][3]Student, Computer Science & Engineering Department, ABES IT, Ghaziabad
[4]Research Scholar, Banasthali University & Associate Professor, Information Technology Department, ABES IT, Ghaziabad

Abstract: Information explosion is radically changing our perception of our surroundings, and healthcare data is at the core of it. Because healthcare data is extremely sensitive, storing or exporting it without proper security measures threatens the privacy of individuals. De-identification involves pseudonymization or anonymization of data, which are methods to disassociate an individual's identity temporarily or permanently, respectively. These methods can be used to keep a user's healthcare data private. A commonly overlooked weakness of pseudonymization is its exposure to inference attacks. This paper discusses an approach to de-identify Electronic Healthcare Records (EHR) using chained hashing to generate short-lived pseudonyms that minimize the effect of inference attacks, and also outlines a re-identification mechanism focused on informational self-determination.

Keywords: De-identification, Electronic Healthcare Records, Pseudonymization, Inference Attack.

1. INTRODUCTION
Electronic Health Records (EHRs) provide many advantages, such as better communication between healthcare services and patients, no need to carry previous reports, and reduced treatment costs; they also serve as a repository of data for research. Healthcare data is inherently sensitive. Its leakage can result in social as well as economic losses to the individual, so securing EHRs is extremely important. Securing data follows two approaches, namely encryption and de-identification. Although encryption is the conventional and most reliable way of assuring data security, it has significant drawbacks, such as the overhead of decrypting data for any analysis or real-life usage. An alternative approach is de-identification, which is essentially the disassociation of personal identifiers from data. It should be noted that de-identification is not a technique for securing the data itself; instead, it is a technique for protecting an individual's privacy. De-identification follows two approaches: anonymization and pseudonymization.

1.1. Anonymization - Anonymization is a de-identification technique that disassociates all identifiers from the data. For example, creating a teaching file of radiological images illustrating a specific condition requires anonymization of the data.[1] The important point here is that there is no requirement to identify the patient later, so all traces of the patient should be removed. The data is made fully anonymous by manually reviewing the files and their fields to determine which fields are required for instructional purposes and which required fields could be used to re-identify the patient. In practice, such fields are rewritten to retain useful meaning while not disclosing any private information.[1]

Anonymization has the following three principles. Let there be a relation T(a_1, a_2, ..., a_d) for which Q_T is the set of quasi-identifiers, where a_i ∈ Q_T for i = 1, ..., m. Then:

1.1.1. k-anonymity[2] - Q_ti for t_i ∈ T should be indistinguishable from that of at least k-1 other tuples t_j ∈ T, j ≠ i. The process of enforcing k-anonymity is called k-anonymization, in which T is partitioned into groups g_j, j ∈ (1, ..., h), such that |g_j| ≥ k, where |x| denotes the size of x. During k-anonymization, the tuples in g_j are made identical with respect to Q_T.

1.1.2. l-diversity[2] - Providing only k-anonymity may still allow inference of an individual's values for the sensitive attributes (SA); this is called value disclosure. To prevent value disclosure, each anonymized group must contain at least l well-represented values. Taking "well-represented" to mean distinct leads to the principle called distinct l-diversity, which requires each anonymized group to contain at least l distinct SA values.

1.1.3. Recursive (c, l)-diversity[2] - Given parameters c and l, which are specified by data publishers, a group g_j is (c, l)-diverse when r_1 < c × (r_l + r_{l+1} + ... + r_n), where r_i, i ∈ {1, ..., n}, is the number of times the i-th most frequent SA value appears in g_j, and n is the domain size of g_j. T is (c, l)-diverse when every g_j, j = 1, ..., h, is (c, l)-diverse.

1.2. Pseudonymization - Pseudonymization is a de-identification technique in which a pseudonym is introduced in place of the attributes that directly or indirectly identify an individual. IHE defines it as a technique that uses controlled replacements to allow longitudinal linking and authorized re-identification.[1] Let there be a relation T(a_1, a_2, ..., a_d) for which Q_T is the set of quasi-identifiers, where a_i ∈ Q_T for i = 1, ..., m. Pseudonymization then essentially replaces Q_T with P_T, where P_T = (P_1, P_2, ..., P_m), while keeping another relation P_T ⟶ Q_T for re-identification.

By definition, anonymized data cannot be re-associated with any individual, so anonymization is not suitable for all purposes in EHR. This is why pseudonymization is often the recommended process for providing privacy to users. Pseudonymization is also advised by the EU General Data Protection Regulation (GDPR), which will be enforced from May 25, 2018.

A few significant pseudonymization approaches are the following:

1.2.1. Peterson's approach[6] - Robert L. Peterson suggested a key-based approach to provide access control and encryption of medical information. The patient holds a Personal Key (PEK). This approach also assigns a static pseudonym to each individual. A Global Key (GK) uniquely identifies the patient in the pseudonymized records when used jointly with the PEK. The records are secured by encrypting the database with the PEK, so the entire security of the information revolves around this encryption. If the PEK is stolen, the approach is rendered ineffective against attackers.

1.2.2. Slamanig and Stingl's approach[7] - This approach suggests storing user information and medical data in different databases. The two are mapped with the help of some central components. Like Peterson's approach, it also suggests storing data in encrypted form and giving the encryption key to the patient. It focuses on access control as well, but does not ensure the security of data that is to be shared with a third party (e.g. for research purposes).

Similar approaches were suggested by Pommerening and Thielscher as well.[8] All of these approaches seem to be greatly affected by inference attacks, because the pseudonyms used are persistent and eventually start to work as unique identifiers as the patient's information grows larger. Thus a need arises for variable or ephemeral pseudonyms to weaken inference attacks.

1.3. Pseudonym Generation Techniques - Primarily, two pseudonym generation techniques are used: hashing and tokenization. Hashing is computationally more expensive and leaves no trace of the information it was generated from, whereas tokenization creates a pseudonym that retains the data it originated from and requires much less computation. Although tokenization and hashing both have their respective use cases, tokenization generally increases the possibility of inference attacks.

1.4. Real-time de-identification - Real-time de-identification refers to de-identification of data as it streams. This is a basic requirement when dealing with data that needs to be de-identified as it is generated, and EHR falls into this category. It is hard to create a secure mechanism for such cases, as it involves dealing with relations that have varying attributes. To resolve this, there must exist a standard API or protocol that carries values in a predefined format.

1.5. Inference Attacks and Pseudonymization - Pseudonymized data is prone to inference attacks, the biggest loophole being persistent pseudonym usage. Inference attacks relate to data mining techniques. If an adversary can infer the identity associated with some pseudonymized data with high confidence, the data is said to be leaked. As pseudonymization is not an encryption technique and instead relies on hiding the identity of individuals, it is highly liable to this attack. Statistical frequency analysis is a very basic example of an inference attack. Dataset aggregation techniques are also used heavily by attackers to derive inferences from existing datasets.

Suppose there is a relation T(a_1, a_2, ..., a_d) for which Q_T is the set of quasi-identifiers, and there exists another relation D(d_1, d_2, ..., d_d) which contains identification information about the individuals belonging to T. If a_i ∈ Q_T and a_i ∈ D for i = 1, ..., m, then we can associate an identity based on the other attributes in the same tuple of D.

One such example for EHR arises when D is a voter list. If pseudonymization is done on the basis of YOB, ZIP and sex, then for a particular state the total number of possible pseudonyms can be in the tens of thousands,[3] which is low enough that the actual identity can be derived using further inferences. This particular inference attack was exploited heavily and was a driving factor behind HIPAA (Health Insurance Portability and Accountability Act of 1996). Nevertheless, inference attacks are still prevalent: although the process of forming pseudonyms has changed significantly, the underlying loopholes remain the same, and persistent pseudonyms pose a wide threat to user privacy.

Based on these facts, intuitive pseudonymization methods are almost certain to fail to provide privacy. Successful pseudonymization requires a deep knowledge of the data.[4] Models must be designed keeping in mind that other datasets may be used in association with the existing records to derive identities.

2. PROPOSED SOLUTION
The solution assumes that there exists an authorized body that regulates identification information and provides a unique identifier for each resident. Let the identifier be represented by U_i. The patient is represented by t_i ∈ T, where T is the set of all patients' identification records. The system consists of three nodes: Accession Node, Key Node and Data Node. The Accession Node enrolls the user in the healthcare system only once. It extracts Q_ti = (q_1i, q_2i, ..., q_ni) (the quasi specifiers for t_i) from t_i and transmits it to the Key Node. The Key Node applies 'Ephemeral Pseudonym Generation Algorithm - Initialize' (EPGA-Init) on Q_ti, which produces g_i (the i-th group) and gu_i (the unique ID within g_i) for t_i, and initializes a report_schema in the Data Node for inserting records, in the form of record IDs, corresponding to H_i, which is a hash of gu_i and g_i. Another relation is maintained at the Key Node for retrieving gu_i through a mapping of patients' biometrics. All communication on the network is protected using ECDH (Elliptic Curve Diffie-Hellman). There should exist a mapping of E_i (Ephemeral IDs) to each H_i; E_i is used by healthcare services to insert and retrieve data for a patient. E_i is generated for de-identification purposes in EPGA-D. Each E_i is usable only once, which gives strong protection against caching of pseudonyms and makes it hard to infer the identity of an individual from records. To re-associate records with the identity U_i, the user must give consent by providing gu_i.

2.1. Ephemeral Pseudonym Generation Algorithm - EPGA is divided into three parts: Initialize (EPGA-Init), De-identification (EPGA-D) and Re-identification (EPGA-R).

2.1.1. EPGA-Init - The EPGA-Init algorithm generates a global pseudonym H_i against which the report_schema is stored; the report_schema contains the record IDs of the reports and other de-identified documents. In EPGA-Init, the generalize_or_suppress function returns the generalized form of an identifier, or a null string if the identifier should be suppressed. H_m is a highly collision-resistant hashing algorithm (e.g. SHA256). K_gi stands for the i-th group's key. The getLast function takes a group ID as its argument, de-serializes the object associated with that group ID, and returns the next available count; it returns 0 if the group ID does not exist in the Key Node's database.

● EPGA-Init(Q_ti):
1. gQ_ti ← generalize_or_suppress(q_i : q_i ∈ Q_ti)
2. K_gi ← '\0'
3. K_gi ← K_gi || q_i : q_i ∈ gQ_ti
4. g_i ← H_m(K_gi)
5. count ← getLast(g_i)
6. gu_i ← randomize_count(count)
7. H_i ← HMAC(g_i || gu_i, key = K_gi)
8. return H_i, gu_i

To define the getLast function, we assume that there exists a cQueue associated with each group ID in the Key Node's database, which stores the counts of revoked gu_i corresponding to g_i to avoid overflow in the group unique ID counts. randomize_count takes the count as a seed and maps it to another number within the range of a defined prime number. It only introduces randomness in the generated group unique IDs.

● getLast(g_i):
1. retrieve g_i row from database.
2. if g_i doesn't exist in database:
2.a. g_i.count = 0
2.b. return 0
3. cQueue ← deserialize(g_i.cQueue)
4. if cQueue is null:
4.a. return g_i.count + 1
5. else:
5.a. count ← dequeue(cQueue)
5.b. serialize(cQueue)
5.c. update g_i.cQueue in database
5.d. return count

2.1.2. EPGA-D - The EPGA-D algorithm de-identifies streaming data in real time. If the data is produced by a producer on a stream processing platform (e.g. Kafka) in a predefined format (e.g. FHIR, Fast Healthcare Interoperability Resources), then EPGA-D can be applied at the producer end if the producer is reliable, or otherwise at the consumer end on the Data Node. The de-identification of a patient report is partially influenced by the safe harbor method,[5] which suggests suppression of 18 identifiers such as names, locations, dates directly relating to an individual, telephone numbers, fax numbers, etc. The key difference is that EPGA-D assigns a short-lived pseudonym, called the Ephemeral ID (E_i), as the report's ID, along with suppressing the identifiers suggested in the safe harbor method. The Ephemeral ID is generated with the user's consent at the report producer's end after the user provides gu_i. Upon receiving the pseudonymized data with E_i, the Data Node generates a random identifier RH_i and replaces E_i with RH_i. RH_i is then recorded in the report_schema corresponding to the H_i of the patient who generated E_i.

To generate E_i, the patient sends a request to the Key Node through an authenticated medium, providing gu_i and U_i.

● createEi(gu_i, U_i):
1. retrieve Q_ti from identification body through U_i.
2. gQ_ti ← generalize_or_suppress(q_i : q_i ∈ Q_ti)
3. g_i ← H_m(concat(q_i) : q_i ∈ gQ_ti)
4. create a random identifier E_i and associate it with H_i.
5. return E_i

We further subdivide the EPGA-D algorithm into two parts, @Producer and @Consumer, where Producer is the segment used at the end of the stream that produces the de-identified report, and Consumer is the end of the stream that receives the de-identified report, i.e. the Data Node.

@Producer
● EPGA-D(Report):
1. Request patient to generate E_i.
2. E_i ← createEi(gu_i, U_i)
3. gReport ← generalize_or_suppress(Report)
4. gReport.id ← E_i
5. Stream gReport on data pipeline.
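The Key Node and producer-side steps of EPGA-Init, getLast, createEi and EPGA-D @Producer can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: in-memory dictionaries stand in for the Key Node database and the identification body, SHA-256 plays the role of H_m, the generalization rules and the list of suppressed report fields are toy examples, and randomize_count is taken to be an affine map modulo an assumed prime. Names such as `REGISTRY`, `ENROLLED`, `EPHEMERAL`, `SUPPRESSED_FIELDS` and `PRIME` are hypothetical. The sketch also derives g_i the same way in EPGA-Init and createEi (including the '\0' prefix) so that both arrive at the same H_i, and it rejects a gu_i that does not match an enrolled H_i.

```python
import hashlib
import hmac
import secrets

PRIME = 1_000_003   # assumed prime bounding the range of group-unique IDs
GROUPS = {}         # g_i -> {"count": int, "cqueue": [revoked counts]}
ENROLLED = set()    # every H_i issued by EPGA-Init
EPHEMERAL = {}      # E_i -> H_i, meant to be consumed once by the Data Node
REGISTRY = {}       # U_i -> quasi-identifiers (stand-in for the identification body)
SUPPRESSED_FIELDS = {"name", "address", "phone", "fax"}  # illustrative subset only


def generalize_or_suppress(quasi):
    """Toy rules: truncate ZIP to 3 digits, keep yob/sex, suppress everything else."""
    out = []
    for name in sorted(quasi):
        value = str(quasi[name])
        if name == "zip":
            out.append(value[:3])
        elif name in ("yob", "sex"):
            out.append(value)
        else:
            out.append("")  # a suppressed identifier becomes a null string
    return out


def group_id(quasi):
    """EPGA-Init steps 1-4: build K_gi by concatenation, then g_i = H_m(K_gi)."""
    k_gi = b"\0" + "".join(generalize_or_suppress(quasi)).encode()
    return k_gi, hashlib.sha256(k_gi).hexdigest()


def get_last(g_i):
    """Return the next count for group g_i, reusing revoked counts first."""
    row = GROUPS.get(g_i)
    if row is None:
        GROUPS[g_i] = {"count": 0, "cqueue": []}
        return 0
    if row["cqueue"]:
        return row["cqueue"].pop(0)
    row["count"] += 1  # assumption: the new high-water mark is persisted
    return row["count"]


def randomize_count(count):
    """Affine map modulo PRIME; distinct counts below PRIME give distinct IDs."""
    return (count * 48_271 + 12_345) % PRIME


def global_pseudonym(k_gi, g_i, gu_i):
    """EPGA-Init step 7: H_i = HMAC(g_i || gu_i, key = K_gi)."""
    return hmac.new(k_gi, f"{g_i}{gu_i}".encode(), hashlib.sha256).hexdigest()


def epga_init(quasi):
    k_gi, g_i = group_id(quasi)
    gu_i = randomize_count(get_last(g_i))
    h_i = global_pseudonym(k_gi, g_i, gu_i)
    ENROLLED.add(h_i)
    return h_i, gu_i


def create_ei(gu_i, u_i):
    """Issue a random one-time E_i mapped to H_i; gu_i acts as proof of consent."""
    k_gi, g_i = group_id(REGISTRY[u_i])
    h_i = global_pseudonym(k_gi, g_i, gu_i)
    if h_i not in ENROLLED:
        raise PermissionError("gu_i does not match an enrolled patient")
    e_i = secrets.token_hex(16)
    EPHEMERAL[e_i] = h_i
    return e_i


def epga_d_producer(report, gu_i, u_i):
    """Suppress direct identifiers and label the report with a fresh E_i."""
    g_report = {k: v for k, v in report.items() if k not in SUPPRESSED_FIELDS}
    g_report["id"] = create_ei(gu_i, u_i)
    return g_report  # a real producer would now stream this on the pipeline


# Example run with made-up data.
REGISTRY["U-001"] = {"name": "A. Patient", "sex": "M", "yob": 1990, "zip": "201009"}
h_i, gu_i = epga_init(REGISTRY["U-001"])
streamed = epga_d_producer(
    {"name": "A. Patient", "phone": "000", "diagnosis": "J45"}, gu_i, "U-001"
)
print(streamed)
```

In this sketch the streamed report carries only the diagnosis and a random E_i, so neither H_i nor gu_i leaves the Key Node. Plain concatenation of the generalized values follows the pseudocode; a production design would likely want a delimiter or length prefix to avoid ambiguous concatenations.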
@Consumer
● EPGA-D(Report):
1. create random identifier RH_i.
2. E_i ← Report.id
3. Request H_i corresponding to E_i from Key Node.
4. On receiving the request for H_i, Key Node deletes E_i from the map and returns the corresponding H_i.
5. Report.id ← RH_i
6. Save RH_i in report_schema corresponding to H_i

2.1.3. EPGA-R - The EPGA-R algorithm re-associates the identity of an individual with a report, with the explicit consent of the patient. The patient generates a short-lived, one-time usable Ephemeral Group Unique ID (Egu_i) by providing gu_i, U_i and the lifetime of Egu_i. If the patient does not provide a lifetime for Egu_i, a default timeout must be set to prevent misuse of Egu_i through malevolent attempts.

To generate Egu_i, the patient sends a request to the Key Node through an authenticated medium, providing gu_i, U_i and optionally the time to live (ttl) for Egu_i.

● createEgui(gu_i, U_i, ttl = default_time):
1. retrieve Q_ti from identification body through U_i.
2. gQ_ti ← generalize_or_suppress(q_i : q_i ∈ Q_ti)
3. g_i ← H_m(concat(q_i) : q_i ∈ gQ_ti)
4. create a random identifier Egu_i and associate it with H_i.
5. Set ttl of Egu_i.
6. return Egu_i

Let us assume there exists a 'Service' which wants to re-identify the patient.

● EPGA-R(gu_i, U_i):
1. Service requests patient to generate Egu_i.
2. Egu_i ← createEgui(gu_i, U_i, optional_ttl)
3. Service requests patient to provide U_i.
4. Service sends U_i and Egu_i to Data Node.
5. Data Node requests Key Node to return H_i corresponding to Egu_i and U_i.
6. Key Node returns H_i to Data Node and deletes Egu_i.
7. Data Node returns requested data associated with H_i to the Service.

3. CONCLUSION
EPGA can be used to implement real-time de-identification of healthcare data. It gives the patient informational self-determination, as EPGA-D and EPGA-R both revolve around the group unique ID gu_i, which is known exclusively to the user. gu_i works as a proof of consent for the algorithm. EPGA-D creates a fairly complex relation between the report ID and H_i, which makes it hard to find a direct relation between reports and patient pseudonyms and thus makes inference attacks less effective. To reduce the effect of inference attacks even further, report_schemas can be split across distributed resources, so that inference would be required from multiple sources, making it even harder to identify a patient through inference attacks.

The stored pseudonyms are never shared with any third-party service anywhere in the mechanism; instead, a short-lived pseudonym is shared, which makes caching of the pseudonym corresponding to U_i ineffective.

4. FUTURE WORK
Based on the algorithm, an architecture for scalable EHR can be created using appropriate messaging queues and stream processing platforms. Although the proposed solution provides a robust mechanism for de-identification of data, it does not address safe storage of the data. An adversary's malevolent attempt can be aimed at destroying the integrity of the data, which would render the de-identified data useless for the patient. Perhaps blockchain-based immutable storage can address this problem, but the proposed solution lacks it.

APPENDIX
T - Relation containing all patients.
D - Relation containing identification information of all patients.
P - Relation containing pseudonyms for all patients.
t_i - i-th patient belonging to relation T.
U_i - Basic identity information of t_i.
g_i - Group ID of t_i.
gu_i - Unique ID in group for t_i.
Q_ti - List of quasi specifiers for t_i.
gQ_ti - Generalized or suppressed list of quasi specifiers for t_i.
Egu_i - Ephemeral unique ID in group for t_i.
H_i - Globally unique ID for t_i to map report IDs.
RH_i - Unique global ID for i-th report.
H_m - Highly collision-resistant hashing algorithm.
|| - Concatenation symbol.
E_i - Ephemeral ID for i-th report.
HMAC - Hash-based Message Authentication Code function.
K_gi - Key for creating H_i through HMAC for i-th patient.

REFERENCES
[1] IHE IT Infrastructure Technical Committee, Integrating the Healthcare Enterprise (IHE IT Infrastructure Book), June 6, 2014, pp. 170.
[2] Aris Gkoulalas-Divanis, Grigorios Loukides, Overview of Patient Data Anonymization, September 13, 2012, pp. 9-11.
[3] Latanya Sweeney, Only You, Your Doctor, and Many Others May Know, Sept. 29, 2015.
[4] Phil Factor, Pseudonymization and the Inference Attack (Redgate Hub), August 01, 2017.
[5] Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, September 4, 2012.
[6] R. L. Peterson, Encryption system for allowing immediate universal access to medical records while maintaining complete patient control over privacy, US Patent Application Publication No. US 2003/0074564 A1, 2003.
[7] Daniel Slamanig, Christian Stingl, 'Privacy aspect of e-health', The 3rd International Conference on Availability, Reliability and Security, IEEE Computer Society, 2008.
[8] Bipin Kumar Rai, Dr. A.K. Srivastava, Pseudonymization Techniques for Providing Privacy and Security in EHR, IJETTCS, July 22, 2017.

AUTHORS
Ashish Shukla is an undergraduate Computer Science & Engineering student pursuing B.Tech at ABES IT, Ghaziabad. His primary areas of interest are Information Security and Data Sciences. ([email protected])

Mohit Kumar Sahni is an undergraduate Computer Science & Engineering student pursuing B.Tech at ABES IT, Ghaziabad. His primary areas of interest are Big Data and Data Analytics. ([email protected])

Sourav Aggarwal is an undergraduate Computer Science & Engineering student pursuing B.Tech at ABES IT, Ghaziabad. His primary areas of interest are Deep Learning and Data Science. ([email protected])

Bipin Kumar Rai received the B.Tech (CSE) from UPTU (BIT Muzaffarnagar), Lucknow, UP and the M.Tech (CSE) from RGPV Bhopal (SSSIST, Sehore), MP in 2004 and 2009, respectively. During 2004-2006 and 2008-2014 he taught in different engineering colleges. He is now with ABES IT as Associate Professor. His primary area of interest is Information Security. ([email protected])

6 pages
The Mexican Innovation System: A System's Dynamics Perspective
No ratings yet
The Mexican Innovation System: A System's Dynamics Perspective
12 pages
Design and Detection of Fruits and Vegetable Spoiled Detetction System
No ratings yet
Design and Detection of Fruits and Vegetable Spoiled Detetction System
8 pages
A Deep Learning Based Assistant For The Visually Impaired
No ratings yet
A Deep Learning Based Assistant For The Visually Impaired
11 pages
An Importance and Advancement of QSAR Parameters in Modern Drug Design: A Review
No ratings yet
An Importance and Advancement of QSAR Parameters in Modern Drug Design: A Review
9 pages
The Impact of Effective Communication To Enhance Management Skills
No ratings yet
The Impact of Effective Communication To Enhance Management Skills
6 pages
Staycation As A Marketing Tool For Survival Post Covid-19 in Five Star Hotels in Pune City
No ratings yet
Staycation As A Marketing Tool For Survival Post Covid-19 in Five Star Hotels in Pune City
10 pages
Anchoring of Inflation Expectations and Monetary Policy Transparency in India
No ratings yet
Anchoring of Inflation Expectations and Monetary Policy Transparency in India
9 pages
Impact of Covid-19 On Employment Opportunities For Fresh Graduates in Hospitality &tourism Industry
No ratings yet
Impact of Covid-19 On Employment Opportunities For Fresh Graduates in Hospitality &tourism Industry
8 pages
Performance of Short Transmission Line Using Mathematical Method
No ratings yet
Performance of Short Transmission Line Using Mathematical Method
8 pages
A Comparative Analysis of Two Biggest Upi Paymentapps: Bhim and Google Pay (Tez)
No ratings yet
A Comparative Analysis of Two Biggest Upi Paymentapps: Bhim and Google Pay (Tez)
10 pages
Ijaiem 2021 01 28 6
No ratings yet
Ijaiem 2021 01 28 6
9 pages
The Effect of Work Involvement and Work Stress On Employee Performance: A Case Study of Forged Wheel Plant, India
No ratings yet
The Effect of Work Involvement and Work Stress On Employee Performance: A Case Study of Forged Wheel Plant, India
5 pages
Swot Analysis of Backwater Tourism With Special Reference To Alappuzha District
No ratings yet
Swot Analysis of Backwater Tourism With Special Reference To Alappuzha District
5 pages
Analysis of RCC Beam Using GFRP Wrapped With Cellular Stirrups
No ratings yet
Analysis of RCC Beam Using GFRP Wrapped With Cellular Stirrups
11 pages
Design and Manufacturing of 6V 120ah Battery Container Mould For Train Lighting Application
No ratings yet
Design and Manufacturing of 6V 120ah Battery Container Mould For Train Lighting Application
13 pages
Marco Economic Sustainability in India: Partisan Theory Approach
No ratings yet
Marco Economic Sustainability in India: Partisan Theory Approach
7 pages
Application of Mersey Silt As Fine Aggregate in Concrete
No ratings yet
Application of Mersey Silt As Fine Aggregate in Concrete
9 pages
Sheet Three
No ratings yet
Sheet Three
7 pages
QB 105722
No ratings yet
QB 105722
7 pages
Applications of DC Generators
No ratings yet
Applications of DC Generators
11 pages
Special Section: Karst: - Fort Worth Basin
No ratings yet
Special Section: Karst: - Fort Worth Basin
20 pages
Moe Tutor
No ratings yet
Moe Tutor
66 pages
Shane Math
No ratings yet
Shane Math
3 pages
TSP15M Series: DC Signal Line Surge Protector
No ratings yet
TSP15M Series: DC Signal Line Surge Protector
1 page
Math Lesson Plan Elapsed Time
No ratings yet
Math Lesson Plan Elapsed Time
11 pages
REFERENCE Datasheet-Tarun Jackated Pump (Ref. JH1-A-MOC-049) PDF
No ratings yet
REFERENCE Datasheet-Tarun Jackated Pump (Ref. JH1-A-MOC-049) PDF
2 pages
3-3-2 Create A Service Order in The CIC.: (C) Sap Ag IUT221 3-61
No ratings yet
3-3-2 Create A Service Order in The CIC.: (C) Sap Ag IUT221 3-61
2 pages
Design Calculations of Lightning Protection Systems - Part Eight
No ratings yet
Design Calculations of Lightning Protection Systems - Part Eight
16 pages
Resume
No ratings yet
Resume
3 pages
Literature Review of Acetic Acid
75% (4)
Literature Review of Acetic Acid
6 pages
Half Subtractor and Full Subtractor VHDL Simulation Code
No ratings yet
Half Subtractor and Full Subtractor VHDL Simulation Code
7 pages
MX2 Training Program 10F Acoustic Wedge Verification
100% (1)
MX2 Training Program 10F Acoustic Wedge Verification
21 pages
2x16 AWG TC 600V Overall Shielded Control Cable - 8KDP102101 - V - 1 - R - 3
No ratings yet
2x16 AWG TC 600V Overall Shielded Control Cable - 8KDP102101 - V - 1 - R - 3
2 pages
Krishna Murthy Iit Academy: SOLUTIONS - IIT-JEE - UNIT - 1 PAPER - 1 - 18-01-2011
No ratings yet
Krishna Murthy Iit Academy: SOLUTIONS - IIT-JEE - UNIT - 1 PAPER - 1 - 18-01-2011
9 pages
Working With The Big Ideas in Number and The Australian Curriculum: Mathematics
No ratings yet
Working With The Big Ideas in Number and The Australian Curriculum: Mathematics
15 pages
R. C. Hibbeler Statics and Mechanics of Materials Global Ed. Pearson P222pdf
No ratings yet
R. C. Hibbeler Statics and Mechanics of Materials Global Ed. Pearson P222pdf
2 pages
Casinghardware Saga Trade Product R
No ratings yet
Casinghardware Saga Trade Product R
37 pages
CBH-22-296 DIN 931 HB HT M10 X 70 SELF 8.8
No ratings yet
CBH-22-296 DIN 931 HB HT M10 X 70 SELF 8.8
2 pages
Chapter Ii - Fluid Statics
No ratings yet
Chapter Ii - Fluid Statics
72 pages
Airmassandfrontsawebquest
60% (5)
Airmassandfrontsawebquest
5 pages
Formulário - 2017 - 2018 PDF
No ratings yet
Formulário - 2017 - 2018 PDF
2 pages
R20-BE (CSE) - V-VIII-Semesters-Syllabus-without Matrix-Final-23-06-22
No ratings yet
R20-BE (CSE) - V-VIII-Semesters-Syllabus-without Matrix-Final-23-06-22
163 pages
Kinematics Worksheet
100% (2)
Kinematics Worksheet
4 pages
Unity 06 Motorgate
No ratings yet
Unity 06 Motorgate
13 pages
Beginning C#
No ratings yet
Beginning C#
2 pages
Treatment of Posterior Crossbite Comparing 2 Appliances: A Community-Based Trial
No ratings yet
Treatment of Posterior Crossbite Comparing 2 Appliances: A Community-Based Trial
8 pages

Real-Time De-Identification of Healthcare Data Using Ephemeral Pseudonyms

International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)

Web Site: www.ijettcs.org

Real-time De-identification of Healthcare Data

later so all traces of the patient should be removed and the

and initializes a report_schema for insertion of records in

4. if cQueue is null:

● EPGA-Init(Qti):

● createEi(gui, Ui):

@Consumer distributed resources which will require inference from

[6] Peterson, R.L., Encryption system for allowing

Mohit Kumar Sahni is an undergraduate

Sourav Aggarwal is an undergraduate

Bipin Kumar Rai, received the B.Tech(CSE)

Volume 7, Issue 2, March – April 2018 Page 25
