What is Attribute Inference (Model Inversion)?
Attack that infers sensitive input features from a model's outputs; works best with overfitting and high-fidelity outputs (probabilities/logits). Example (the Fredrikson et al. warfarin study): a warfarin-dosing model can be inverted to infer a patient's genotype.
What conditions make model inversion effective?
Overfitting, smooth decision boundaries, high-resolution outputs (probabilities/logits), strong correlation between features and predictions.
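A minimal sketch of the inversion idea, using a made-up linear "dosing" model and an invented 0/1/2 genotype encoding: the attacker knows the non-sensitive features and the model's output, and simply searches over candidate values of the sensitive feature for the one that best reproduces the output.

```python
import numpy as np

# Toy "dose" model: output depends on a sensitive feature (genotype,
# encoded 0/1/2) plus known demographics. All weights are made up.
w = np.array([1.5, 0.3, -0.2])  # [genotype, age, weight] coefficients

def model(x):
    return float(w @ x)  # scalar prediction (e.g. a warfarin dose)

def invert_genotype(known_features, observed_output):
    # Enumerate candidate values of the sensitive feature and keep the
    # one whose prediction best matches the observed output.
    candidates = [0, 1, 2]
    errors = [abs(model(np.array([g, *known_features])) - observed_output)
              for g in candidates]
    return candidates[int(np.argmin(errors))]

true_x = np.array([2, 0.5, 0.8])       # true genotype = 2
y = model(true_x)                      # the attacker observes this output
print(invert_genotype([0.5, 0.8], y))  # recovers 2
```

High-fidelity outputs matter here: with only a coarse class label instead of the exact score, several candidates would tie and the search would fail.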
What is Property Inference?
Attack that learns global properties of the training set (e.g., most users wear glasses, dataset contains celebrities). Does NOT recover individuals.
How does Property Inference work?
Train shadow models on datasets with/without property P; meta-classifier predicts whether target model’s training data had P.
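The shadow-model pipeline can be sketched with synthetic data. Everything below is invented for illustration: the "shadow models" are least-squares linear fits, the property P is "feature 1 correlates with the label", and the meta-classifier is just a threshold on one learned weight.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_shadow(has_property):
    # Property P: feature 1 correlates with the label in the training data.
    n = 200
    y = rng.integers(0, 2, n).astype(float)
    x0 = y + rng.normal(0, 1, n)                 # always predictive
    x1 = y + rng.normal(0, 1, n) if has_property else rng.normal(0, 1, n)
    X = np.column_stack([x0, x1])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)    # shadow "model" = its weights
    return w

# Meta-training set: shadow-model weights labeled by whether P held.
labels = np.array([True, False] * 50)
W = np.array([train_shadow(p) for p in labels])

# Trivial meta-classifier: P held iff the weight on feature 1 is large.
threshold = (W[labels, 1].mean() + W[~labels, 1].mean()) / 2
target_w = train_shadow(True)        # pretend these are the target's weights
print(target_w[1] > threshold)       # meta-classifier says P is present
```

Note the attack's output is a *global* claim about the training set, not about any individual record.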
What is Model Extraction?
The attacker reconstructs a surrogate model f'(x) approximating the target f(x), typically via repeated queries, confidence scores, or decision-boundary exploration.
How is linear model extraction done?
Query n+1 affinely independent points in n-dimensional space, then solve the resulting linear system for the weights w and bias b.
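The n+1-query extraction is exact for a linear model. A sketch with an invented 3-dimensional secret model, querying the origin plus the standard basis vectors:

```python
import numpy as np

# Secret linear model f(x) = w·x + b that the attacker can only query.
w_secret = np.array([2.0, -1.0, 0.5])
b_secret = 3.0
f = lambda x: w_secret @ x + b_secret

# Query n+1 = 4 points in 3-D space and solve for [w, b].
n = 3
X = np.vstack([np.zeros(n), np.eye(n)])    # origin + basis vectors
y = np.array([f(x) for x in X])            # the n+1 query responses
A = np.column_stack([X, np.ones(n + 1)])   # augment with a bias column
w_b = np.linalg.solve(A, y)                # invertible 4x4 system
print(w_b)  # → [ 2.  -1.   0.5  3. ]
```

The origin query reveals b directly, and each basis-vector query then reveals one weight; any affinely independent point set works the same way.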
What helps model extraction succeed?
Access to confidence scores/logits; smooth or simple model structure; deterministic output.
What is Membership Inference?
Attack determining whether a specific record x was in the training set. Relies on overfitting and confidence differences.
What is the Membership Inference pipeline?
Shadow models → attack model trained on their outputs → classify target model’s output as member/non-member.
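In its simplest form the attack model reduces to a confidence threshold. A sketch with simulated confidences (the beta distributions below are invented; in the real pipeline the threshold would be calibrated on shadow-model outputs):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate target-model confidences: an overfit model is more confident
# on training members than on unseen points (the gap the attack exploits).
member_conf = rng.beta(8, 2, 1000)       # skews toward 1.0
nonmember_conf = rng.beta(4, 4, 1000)    # centered near 0.5

threshold = 0.7                          # would come from shadow models

def infer_member(confidence):
    return confidence > threshold

tpr = infer_member(member_conf).mean()     # members flagged correctly
fpr = infer_member(nonmember_conf).mean()  # non-members wrongly flagged
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")     # attack beats random guessing
```

If the model did not overfit, the two confidence distributions would coincide and TPR ≈ FPR, i.e. the attack degrades to coin-flipping.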
What is Federated Learning?
Server sends model → clients train locally → send updates → server aggregates. Raw data stays local.
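The aggregation step is typically FedAvg: a weighted average of client parameters, weighted by local dataset size. A minimal sketch (update values are made up):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    # Weighted average of client model parameters, proportional to local
    # dataset size (FedAvg). Raw data never leaves the clients.
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

# One round: server model -> local updates -> aggregation.
updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
print(fedavg(updates, [10, 30]))  # → [2.5 3.5]
```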
What are attack surfaces in FL?
A malicious server reading individual client gradients; gradient-leakage attacks reconstructing training inputs from updates; malicious clients poisoning the global model.
What is DSSGD (Selective Gradient Descent)?
Clients send only top-K gradients. Reduces leakage of small gradients but selection pattern still leaks info.
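The top-K selection itself is simple to sketch (gradient values invented):

```python
import numpy as np

def select_top_k(gradient, k):
    # DSSGD-style selective sharing: transmit only the k largest-magnitude
    # gradient entries; everything else stays on-device.
    idx = np.argsort(np.abs(gradient))[-k:]
    shared = np.zeros_like(gradient)
    shared[idx] = gradient[idx]
    return shared

g = np.array([0.05, -0.9, 0.3, 0.02, 0.7])
print(select_top_k(g, 2))  # → [ 0.  -0.9  0.   0.   0.7]
```

Note that both the transmitted values *and* which indices are nonzero are visible to the server, which is why the selection pattern itself can leak information.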
When does DSSGD fail?
When sensitive attributes heavily influence the largest gradients that are still transmitted.
What is Secure Aggregation?
Mechanism where users add pairwise noise (“antiparticles” +x/-x); noise cancels during aggregation so server sees only sum.
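A toy demonstration of pairwise masking with three clients (update values invented; a real protocol derives the masks from shared keys rather than a common RNG): each pair (i, j) with i < j shares a mask that i adds and j subtracts, so every individual upload looks random but the masks cancel in the sum.

```python
import numpy as np

rng = np.random.default_rng(2)

updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
n = len(updates)
# One shared mask per client pair (i, j), i < j.
masks = {(i, j): rng.normal(size=2) for i in range(n) for j in range(i + 1, n)}

masked = []
for u in range(n):
    m = updates[u].copy()
    for (i, j), mask in masks.items():
        if u == i:
            m += mask      # "+x" antiparticle
        elif u == j:
            m -= mask      # "-x" antiparticle
    masked.append(m)       # what the server actually receives

print(sum(masked))  # masks cancel, leaving only the true aggregate ≈ [9, 12]
```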
Pros of Secure Aggregation?
Strong privacy against malicious server; zero utility loss (noise cancels).
Cons of Secure Aggregation?
Protocol complexity; requires handling dropouts; involves peer-to-peer coordination.
What is Differential Privacy in FL?
Adds noise to protect individual updates. Only local DP (on-device) protects against malicious server; server-side DP does not.
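A sketch of the local-DP step on the client side: clip the update to bound its sensitivity, then add Gaussian noise *before* transmission, so even a malicious server only ever sees the noised vector. (The clip norm and sigma below are illustrative; in practice sigma is derived from the target epsilon/delta.)

```python
import numpy as np

rng = np.random.default_rng(3)

def local_dp_update(update, clip_norm, sigma):
    # Clip to bound per-client sensitivity, then add Gaussian noise
    # on-device. The noise is irreversible: this is where utility is lost.
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / norm)
    return clipped + rng.normal(0, sigma * clip_norm, size=update.shape)

u = np.array([3.0, 4.0])                     # true local update, norm 5
noisy = local_dp_update(u, clip_norm=1.0, sigma=0.5)
print(noisy)                                 # what the server receives
```

Server-side DP would apply the same noise only *after* the server has already seen the raw updates, which is why it offers no protection against a malicious server.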
DP vs Secure Aggregation difference?
DP adds irreversible noise, reducing utility; secure aggregation preserves utility but requires a more complex protocol.
What is Fairness Through Blindness?
Removing protected attributes (race/gender) from inputs. Fails because proxy variables still encode them.
What is Statistical Parity?
Positive outcome probability should be equal across groups: P(positive|S) ≈ P(positive|S^c).
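The parity condition can be measured directly as a gap between group-wise positive rates (the predictions and group labels below are invented):

```python
import numpy as np

def statistical_parity_gap(y_pred, sensitive):
    # |P(positive | S) - P(positive | S^c)|; zero means exact parity.
    s = np.asarray(sensitive, dtype=bool)
    return abs(y_pred[s].mean() - y_pred[~s].mean())

y_pred = np.array([1, 1, 0, 1, 0, 0, 1, 0])   # model decisions
group = np.array([1, 1, 1, 1, 0, 0, 0, 0])    # protected-group membership
print(statistical_parity_gap(y_pred, group))  # → 0.5 (3/4 vs 1/4)
```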
Limitation of Statistical Parity?
Ignores correctness of predictions; can hide discriminatory error rates.
What is QII (Quantitative Input Influence)?
Causal transparency method: replace a feature with a random value drawn from its distribution in the population and measure the resulting change in the model's output.
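A toy sketch of the intervention: estimate a feature's influence as the fraction of random resamplings that flip the decision. The classifier and data below are invented; the toy model ignores feature 1 entirely, so its QII should be zero.

```python
import numpy as np

rng = np.random.default_rng(4)

def qii(model, X_population, x, feature):
    # QII for one feature: resample that feature from the population
    # (a causal intervention) and count how often the output changes.
    baseline = model(x)
    flips = 0
    for _ in range(500):
        x_int = x.copy()
        x_int[feature] = rng.choice(X_population[:, feature])
        flips += model(x_int) != baseline
    return flips / 500

# Toy classifier that depends only on feature 0.
model = lambda x: int(x[0] > 0)
X_pop = rng.normal(size=(1000, 2))
x = np.array([2.0, -1.0])
print(qii(model, X_pop, x, 0), qii(model, X_pop, x, 1))
```

Feature 0 flips the decision roughly half the time under resampling, while feature 1 never does, matching the model's actual (causal) use of its inputs.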
What questions does QII answer?
“Did gender change the decision?” or “Which feature mattered most for this prediction?”
What is memorization in GenAI?
LLMs can store rare or duplicated training sequences verbatim (k-eidetic memorization) and reveal this private data through prompting.