what are the roles in data collection and data publishing
Data Recipient
Data Publisher
Record Owners
an example of data recipient
Medical Center
(data mining)
an example of data Publisher
Hospital
(data anonymization)
an example of record owners
patients
what are the data attributes
Explicit Identifier
Quasi Identifier (QID)
Sensitive Attributes
Non-Sensitive Attributes
what are explicit identifiers
Data attributes that explicitly identify record owners, e.g., name, identity card number, mobile phone number.
what are Quasi Identifiers (QID)
Data attributes that could potentially identify record owners, e.g., postal code, age, gender.
what are sensitive attributes
Data attributes that are sensitive person-specific information, e.g., salary, disease, disability status
what are non sensitive attributes
Data attributes that do not fall into any of the other categories.
what are the roles responsible for data collection
data publisher
record owners
what are the roles responsible for data publishing
data recipient
what are the privacy attacks
record linkage
attribute linkage
table linkage
probabilistic attack
what is the record linkage model
example of record linkage model
Example: Hospital wants to publish the patient records in Table 1 to a research center
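A minimal sketch of how record linkage could work, assuming the adversary joins the published table with an external table on the quasi identifiers. Table 1 is not reproduced in these notes, so the rows, the names Bob and Doug, and the function name link_records are all hypothetical and only illustrate the idea.

```python
# Hypothetical toy data: these rows are NOT the source's Table 1.
published = [
    {"job": "Engineer", "sex": "Male", "age": 35, "disease": "Hepatitis"},
    {"job": "Engineer", "sex": "Male", "age": 38, "disease": "Hepatitis"},
    {"job": "Lawyer", "sex": "Male", "age": 38, "disease": "HIV"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "Flu"},
]

# Hypothetical external data the adversary already holds (names + QID).
external = [
    {"name": "Alice", "job": "Writer", "sex": "Female", "age": 30},
    {"name": "Bob", "job": "Engineer", "sex": "Male", "age": 35},
    {"name": "Doug", "job": "Lawyer", "sex": "Male", "age": 38},
]

QID = ("job", "sex", "age")


def link_records(published, external, qid):
    """Map each external name to the published rows sharing its QID values."""
    matches = {}
    for person in external:
        key = tuple(person[attr] for attr in qid)
        matches[person["name"]] = [
            row for row in published
            if tuple(row[attr] for attr in qid) == key
        ]
    return matches


linked = link_records(published, external, QID)

# A person is re-identified when exactly one published row matches.
reidentified = {
    name: rows[0]["disease"]
    for name, rows in linked.items()
    if len(rows) == 1
}
print(reidentified)
```

In this toy data every QID combination held by the adversary is unique in the published table, so each named person is linked to a single record even though no explicit identifier was published.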
what is the attribute linkage model
example of attribute linkage model
Example: Hospital anonymizes the data by generalizing Job/Age into ranges, to reduce record linkage
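A rough sketch of why generalizing into ranges may still leave attribute linkage open: records that share the same generalized quasi identifiers form a group, and if one sensitive value dominates that group, the adversary can infer it with high confidence without pinpointing a single record. The rows, the 5-year range width, and the function names are assumptions made for illustration, not data from the notes.

```python
from collections import Counter, defaultdict


def generalize_age(age, width=5):
    """Replace an exact age with a half-open range such as [30-35)."""
    low = (age // width) * width
    return f"[{low}-{low + width})"


# Hypothetical rows: (job, sex, exact age, sensitive disease).
rows = [
    ("Artist", "Female", 30, "HIV"),
    ("Artist", "Female", 31, "HIV"),
    ("Artist", "Female", 33, "Flu"),
    ("Artist", "Female", 34, "HIV"),
    ("Professional", "Male", 36, "Hepatitis"),
    ("Professional", "Male", 38, "Flu"),
]


def sensitive_confidence(rows):
    """For each generalized QID group, return the share of each sensitive value."""
    groups = defaultdict(Counter)
    for job, sex, age, disease in rows:
        groups[(job, sex, generalize_age(age))][disease] += 1
    return {
        qid: {value: count / sum(counts.values()) for value, count in counts.items()}
        for qid, counts in groups.items()
    }


conf = sensitive_confidence(rows)
for qid, shares in conf.items():
    print(qid, shares)
```

Here no single record in the Artist/Female group can be tied to one person, yet three of its four records carry the same disease, so the adversary's inference about anyone in that group is correct with 75% confidence.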
what is the table linkage model
example of table linkage model
Example: Hospital publishes patient data in Table 3, and an adversary mounts a table linkage attack on a target victim, Alice
* Adversary is presumed to also have access to external public data in Table 4
* 4 records in Table 3 and 5 records in Table 4 contain (Artist, Female, [30−35])
* So there is a 4/5 (80%) probability that Alice has HIV
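The arithmetic in this example can be sketched as a ratio of matching record counts. This assumes every record in the published table also appears in the external table; the function name presence_probability is hypothetical.

```python
from fractions import Fraction


def presence_probability(published_count, external_count):
    """Ratio of matching records in the published table to those in the external table.

    Assumes the published table is a subset of the external table, so the
    published count can never exceed the external count.
    """
    if external_count <= 0 or published_count > external_count:
        raise ValueError("expected 0 < external_count and published_count <= external_count")
    return Fraction(published_count, external_count)


# 4 matching records in Table 3, 5 matching records in Table 4.
p = presence_probability(4, 5)
print(p, float(p))  # 4/5 0.8
```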
what is the probabilistic model
is the adversary’s knowledge limited to quasi identifiers?
No,