Privacy
Ability to control or shield personal information from misuse.
Confidentiality
Ensuring information is not disclosed to unauthorized parties.
Anonymity
Individual’s identity cannot be linked to an action.
Unlinkability
Adversary cannot link multiple actions to the same user.
Unobservability
Adversary cannot detect that an action occurred.
PII
Personally Identifiable Information: data that can identify a specific individual (e.g., name, SSN, address).
Threat Model
Assumed adversary capabilities and goals.
Context-dependence
Privacy varies by social, cultural, or situational setting.
Data Anonymization
Removing or altering personal identifiers to protect privacy.
Quasi-identifier
Combination of innocuous attributes that can re-identify individuals.
Naive Anonymization
Removing names but leaving quasi-identifiers that allow re-identification.
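A toy illustration of why naive anonymization fails: the names are stripped from the medical table, but joining with a public dataset on the remaining quasi-identifiers re-identifies the record. All data here is made up for illustration:

```python
# Hypothetical linkage attack: the "de-identified" medical table still carries
# quasi-identifiers (zip, birth_year, sex) that also appear in a public voter roll.
medical = [{"zip": "53715", "birth_year": 1990, "sex": "F", "diagnosis": "flu"}]
voters = [{"name": "Alice", "zip": "53715", "birth_year": 1990, "sex": "F"}]

# Join the two tables on the shared quasi-identifiers.
reidentified = [
    (v["name"], m["diagnosis"])
    for m in medical
    for v in voters
    if all(m[q] == v[q] for q in ("zip", "birth_year", "sex"))
]
print(reidentified)  # [('Alice', 'flu')]
```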
Privacy Model
Framework of definitions/assumptions for privacy guarantees.
Utility Trade-off
Balance between data privacy and usefulness.
k-anonymity
Each record in the dataset is indistinguishable from at least k-1 others with respect to its quasi-identifiers.
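A minimal sketch of checking k-anonymity: group records by their quasi-identifier combination and require every group (equivalence class) to contain at least k records. Column names and values are illustrative assumptions:

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    # Count how many records share each quasi-identifier combination.
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    # k-anonymity holds iff every equivalence class has size >= k.
    return all(count >= k for count in groups.values())

data = [
    {"zip": "537**", "age": "20-29", "disease": "flu"},
    {"zip": "537**", "age": "20-29", "disease": "cold"},
    {"zip": "537**", "age": "30-39", "disease": "flu"},
    {"zip": "537**", "age": "30-39", "disease": "asthma"},
]
print(is_k_anonymous(data, ["zip", "age"], 2))  # True: each class has 2 records
print(is_k_anonymous(data, ["zip", "age"], 3))  # False
```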
Generalization
Replacing specific values with broader categories to protect privacy.
Suppression
Removing values from dataset to prevent re-identification.
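The two techniques above can be sketched as record transformations; the ZIP truncation and decade age buckets are assumed formats, not a prescribed scheme:

```python
def generalize(record):
    # Generalization: keep only the first 3 ZIP digits, bucket age into decades.
    out = dict(record)
    out["zip"] = record["zip"][:3] + "**"
    decade = (record["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"
    return out

def suppress(record, fields):
    # Suppression: drop the listed values entirely, replacing them with a placeholder.
    return {k: ("*" if k in fields else v) for k, v in record.items()}

r = {"zip": "53715", "age": 27, "disease": "flu"}
print(generalize(r))         # zip -> '537**', age -> '20-29'
print(suppress(r, {"zip"}))  # zip -> '*'
```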
Homogeneity Attack
Sensitive values in a group are identical, allowing inference.
Background Knowledge Attack
External knowledge enables adversary to defeat anonymity.
l-diversity
Each equivalence class (group sharing quasi-identifier values) must contain at least l distinct sensitive values.
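A sketch of checking distinct l-diversity, the simplest variant, over the same kind of assumed columns as above; note how a homogeneous class fails the check even though it may satisfy k-anonymity:

```python
from collections import defaultdict

def is_l_diverse(records, quasi_ids, sensitive, l):
    # Group records into equivalence classes by quasi-identifier combination,
    # collecting the distinct sensitive values seen in each class.
    classes = defaultdict(set)
    for r in records:
        classes[tuple(r[q] for q in quasi_ids)].add(r[sensitive])
    # Distinct l-diversity: every class holds at least l different sensitive values.
    return all(len(vals) >= l for vals in classes.values())

data = [
    {"zip": "537**", "age": "20-29", "disease": "flu"},
    {"zip": "537**", "age": "20-29", "disease": "flu"},  # homogeneous class
    {"zip": "537**", "age": "30-39", "disease": "flu"},
    {"zip": "537**", "age": "30-39", "disease": "asthma"},
]
print(is_l_diverse(data, ["zip", "age"], "disease", 2))  # False: first class has one value
```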
Differential Privacy (DP)
Guarantee that presence/absence of one record has little effect on output.
Neighboring datasets
Datasets differing by only one record.
ε (epsilon)
Privacy parameter; smaller values = stronger privacy.
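The guarantee above is usually stated formally: a mechanism M is ε-differentially private if, for all neighboring datasets D, D' and every set of outputs S,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S]
```

As ε approaches 0, e^ε approaches 1, so the output distributions with and without any one record become nearly indistinguishable.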
Laplace Mechanism
Adds Laplace-distributed noise with scale sensitivity/ε to the true query answer.
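A minimal sketch of the Laplace mechanism using inverse-CDF sampling; the count query and parameter values in the usage line are illustrative assumptions:

```python
import math
import random

def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    # Laplace mechanism: add noise from Laplace(0, b) with scale b = sensitivity / epsilon.
    b = sensitivity / epsilon
    # Inverse-CDF sampling: u uniform in [-0.5, 0.5) maps to a Laplace draw.
    u = rng.random() - 0.5
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_answer + noise

rng = random.Random(0)
# A count query has sensitivity 1; a smaller epsilon would mean larger noise.
noisy_count = laplace_mechanism(42, sensitivity=1, epsilon=0.5, rng=rng)
```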
Global Sensitivity
Maximum change in query output when one record is changed.
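Two standard global-sensitivity facts, sketched; the clipping bounds are illustrative assumptions:

```python
def count_sensitivity():
    # Changing (or adding/removing) one record alters a count by at most 1.
    return 1

def clipped_sum_sensitivity(lower, upper):
    # If each record's contribution is clipped to [lower, upper], one record
    # can shift the sum by at most max(|lower|, |upper|).
    return max(abs(lower), abs(upper))

print(count_sensitivity())               # 1
print(clipped_sum_sensitivity(0, 100))   # 100
```

These values feed directly into the Laplace mechanism's noise scale sensitivity/ε.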