Data Security Flashcards

This deck teaches data security as concrete system controls: how data is identified and classified, how access control decisions are enforced, and how encryption in transit and at rest differ in what they stop. It covers key management patterns including envelope encryption, the mechanical difference between tokenization and encryption, and integrity checks using hashes, Message Authentication Codes (MAC), and digital signatures. It also covers retention and deletion mechanics, backups and resto (30 cards)

1
Q

What is data classification and how is it used by systems?

A

Data classification labels data by sensitivity so systems can apply different controls.
- Systems tag assets and datasets (public, internal, confidential, regulated).
- Policies use labels to decide access, encryption requirements, logging, and retention.
- Classification only matters if enforcement points read the label and apply rules.
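
Example (a minimal sketch, not a definitive implementation): one way an enforcement point might read a label and look up the controls to apply. The four labels come from the card above; the `Controls` fields, the `POLICY` table, and the fail-closed default are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controls:
    encrypt_at_rest: bool
    log_access: bool
    allow_export: bool


# Hypothetical policy table: classification label -> required controls.
POLICY = {
    "public": Controls(encrypt_at_rest=False, log_access=False, allow_export=True),
    "internal": Controls(encrypt_at_rest=True, log_access=False, allow_export=True),
    "confidential": Controls(encrypt_at_rest=True, log_access=True, allow_export=False),
    "regulated": Controls(encrypt_at_rest=True, log_access=True, allow_export=False),
}


def controls_for(label):
    """Return the controls for a label.

    Assumption: unknown or missing labels fail closed to the strictest tier.
    """
    return POLICY.get(label, POLICY["regulated"])
```

Treating an unlabeled asset as the strictest tier is one possible design; it keeps a missing label from silently weakening enforcement.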

2
Q

How does asset identification work for data security?

A

Asset identification is mapping where sensitive data lives and how it flows.
- Systems inventory stores (databases, object stores, logs, analytics) and the datasets inside them.
- Systems map data producers/consumers and transfer paths (APIs, ETL, exports).
- Controls are applied per asset; unknown assets are typically uncontrolled assets.

3
Q

What breaks classification and asset identification in practice?

A

Cause → system behavior → security impact.
- Cause: data stores or pipelines are not inventoried or labeled.
- Behavior: policies do not apply because enforcement cannot target unknown or unlabeled assets.
- Impact: sensitive data is stored or exported without required controls.

4
Q

What are data access control decisions (who/what can read)?

A

Access control decisions are checks that determine whether a principal can read or modify data.
- Input: principal identity (user/service), requested action, target dataset/object, context.
- Decision: allow or deny based on policy rules.
- Enforcement: the storage or service must perform the check before returning data.

5
Q

How do systems enforce data access control mechanically?

A

Step 1: Principal authenticates and receives an identity.
Step 2: Principal requests an action on a data resource.
Step 3: Enforcement evaluates policy rules (role/attribute/context) against the request.
Step 4: If allowed, data is returned; if denied, data is not returned.
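
Example (a minimal sketch under stated assumptions): the four steps as a role-based check performed before any data is returned. The principals, roles, dataset name, and `DATA` store are all hypothetical; authentication (Step 1) is assumed to have already produced `principal`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    principal: str  # authenticated identity from Step 1 (assumed already verified)
    action: str     # requested action, e.g. "read" or "write" (Step 2)
    resource: str   # target dataset (Step 2)


# Hypothetical policy: role -> set of (action, resource) pairs the role may perform.
ROLE_PERMISSIONS = {
    "analyst": {("read", "sales_reports")},
    "admin": {("read", "sales_reports"), ("write", "sales_reports")},
}
PRINCIPAL_ROLES = {"alice": "analyst", "bob": "admin"}
DATA = {"sales_reports": "q1 figures"}


def is_allowed(req):
    """Step 3: evaluate policy; unknown principals and roles are denied."""
    role = PRINCIPAL_ROLES.get(req.principal)
    return (req.action, req.resource) in ROLE_PERMISSIONS.get(role, set())


def read(req):
    """Step 4: return data only after the check passes."""
    if req.action != "read" or not is_allowed(req):
        raise PermissionError(f"{req.principal} may not {req.action} {req.resource}")
    return DATA[req.resource]
```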

6
Q

What breaks data access control in practice?

A

Cause → system behavior → security impact.
- Cause: over-permissioned roles, shared credentials, or missing authorization checks in a data API.
- Behavior: data is returned to principals that should be denied.
- Impact: unauthorized disclosure or modification of sensitive data.

7
Q

What is encryption in transit vs at rest (applied)?

A

They protect data in different states and against different attacker positions.
- In transit: protects data while moving between endpoints by encrypting network traffic (eavesdroppers cannot read).
- At rest: protects stored data by encrypting it on disk/object storage (disk snapshots or stolen media cannot be read without keys).
Neither control replaces access control; both reduce exposure when a specific layer is compromised.

8
Q

What breaks encryption in transit in practice?

A

Cause → system behavior → security impact.
- Cause: TLS (Transport Layer Security) not used, weak validation, or trust misconfiguration.
- Behavior: attacker on the network can read or modify traffic because encryption/validation checks do not protect the channel.
- Impact: credential theft, data interception, and request tampering.
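
Example (a sketch using Python's standard `ssl` module): a client context that keeps certificate and hostname validation on. The function name `strict_client_context` and the TLS 1.2 floor are illustrative assumptions, not requirements stated by the card.

```python
import ssl


def strict_client_context():
    """Build a client-side TLS context with validation left enabled."""
    # create_default_context() loads the system trust store and turns on
    # certificate verification and hostname checking for client use.
    ctx = ssl.create_default_context()
    # Assumed floor for this sketch: refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

Setting `check_hostname = False` or `verify_mode = ssl.CERT_NONE` on such a context is an instance of the "weak validation" cause above: the traffic is still encrypted, but the client no longer knows who it is talking to.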

9
Q

What breaks encryption at rest in practice?

A

Cause → system behavior → security impact.
- Cause: keys are accessible to the attacker in the same environment as the encrypted data.
- Behavior: attacker reads ciphertext and also obtains keys, so decryption succeeds.
- Impact: at-rest encryption fails to reduce disclosure in that compromise scenario.

10
Q

What are key management patterns for data protection?

A

Key management patterns define how keys are generated, stored, used, and rotated.
- Separate data encryption keys from key encryption keys to limit blast radius.
- Restrict who/what can request decryption operations.
- Rotate keys so a compromised key cannot decrypt data encrypted after the rotation; data still encrypted under the old key stays exposed until it is re-encrypted.
The core mechanism is controlling which principals can cause decryption to happen.

11
Q

How do systems enforce key usage controls mechanically?

A

Step 1: Principal requests an encrypt/decrypt operation using a key identifier.
Step 2: Key management service validates the principal’s authorization for that key and operation.
Step 3: If allowed, the service performs the cryptographic operation and returns the result.
Step 4: All key usage is logged so misuse can be detected and investigated.
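
Example (a toy sketch, not a real key management service): the authorize, operate, and log sequence from the steps above. `ToyKeyService` and its methods are hypothetical. Because the Python standard library has no cipher, the operation shown is an HMAC "sign" standing in for encrypt/decrypt; the control flow is the point.

```python
import hashlib
import hmac
import secrets


class ToyKeyService:
    """Hypothetical in-memory key service; key bytes never leave the object."""

    def __init__(self):
        self._keys = {}       # key_id -> secret bytes
        self._grants = set()  # (principal, key_id, operation) tuples
        self.audit_log = []   # (principal, key_id, operation, allowed)

    def create_key(self, key_id):
        self._keys[key_id] = secrets.token_bytes(32)

    def grant(self, principal, key_id, operation):
        self._grants.add((principal, key_id, operation))

    def sign(self, principal, key_id, message):
        # Step 2: validate the principal's authorization for this key and operation.
        allowed = (principal, key_id, "sign") in self._grants and key_id in self._keys
        # Step 4: log every attempt, including denied ones.
        self.audit_log.append((principal, key_id, "sign", allowed))
        if not allowed:
            raise PermissionError(f"{principal} may not sign with {key_id}")
        # Step 3: perform the operation inside the service and return only the result.
        return hmac.new(self._keys[key_id], message, hashlib.sha256).digest()
```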

12
Q

What is envelope encryption (applied) and what does it achieve?

A

Envelope encryption encrypts data with a Data Encryption Key (DEK) and protects that DEK with a Key Encryption Key (KEK).
- Data is encrypted with a per-object/per-record DEK.
- DEK is encrypted (“wrapped”) using a KEK controlled by a key management system.
- To decrypt, system must unwrap DEK via authorized access to the KEK.
This limits blast radius and simplifies key rotation strategies.

13
Q

How does envelope encryption work mechanically?

A

Step 1: Generate a DEK for the data item.
Step 2: Encrypt data using the DEK, producing ciphertext.
Step 3: Encrypt (wrap) the DEK using the KEK, producing wrapped key material.
Step 4: Store ciphertext + wrapped DEK; decrypt requires authorized KEK unwrap then data decrypt.
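
Example (structure only, NOT secure): the four steps above with a toy XOR keystream standing in for a real authenticated cipher such as AES-GCM, which the Python standard library does not provide. All function names are hypothetical. In a real deployment the KEK would normally stay inside a key management system, and wrap/unwrap would be authorized calls to it rather than local functions.

```python
import hashlib
import secrets


def _keystream_xor(key, nonce, data):
    """Toy stream cipher: XOR data with SHA-256(key || nonce || offset). Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def toy_encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, body)


def envelope_encrypt(kek, plaintext):
    dek = secrets.token_bytes(32)             # Step 1: fresh DEK for this item
    ciphertext = toy_encrypt(dek, plaintext)  # Step 2: encrypt data with the DEK
    wrapped_dek = toy_encrypt(kek, dek)       # Step 3: wrap the DEK with the KEK
    return {"ciphertext": ciphertext, "wrapped_dek": wrapped_dek}  # Step 4: store both


def envelope_decrypt(kek, record):
    dek = toy_decrypt(kek, record["wrapped_dek"])  # unwrap requires the KEK
    return toy_decrypt(dek, record["ciphertext"])
```

The plaintext DEK is never stored; only someone who can use the KEK can recover it, which is why KEK authorization decides who can decrypt.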

14
Q

What breaks envelope encryption in practice?

A

Cause → system behavior → security impact.
- Cause: KEK access is too broad or decryption service is exposed.
- Behavior: attacker can unwrap DEKs and decrypt data because authorization checks allow it.
- Impact: envelope structure exists but does not limit who can decrypt.

15
Q

What is tokenization vs encryption (mechanical difference)?

A

They protect data using different mechanisms.
- Encryption: transforms plaintext into ciphertext using a key; decryption reverses it with the key.
- Tokenization: replaces plaintext with a token; a separate mapping system returns plaintext when authorized.
- Tokenization security depends on controlling access to the token vault/mapping lookup, not on cryptographic secrecy alone.

16
Q

How do systems enforce tokenization mechanically?

A

Step 1: Service sends sensitive value to tokenization system.
Step 2: System stores mapping (token ↔ value) and returns token.
Step 3: Downstream systems store/process token instead of real value.
Step 4: Detokenization requires an authorized lookup; unauthorized principals cannot get the original value.
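
Example (a minimal in-memory sketch): the four steps above as a token vault. `ToyTokenVault`, the `tok_` prefix, and the principal names are hypothetical. The token is random, so it carries no information about the value; protection rests on who may call `detokenize`.

```python
import secrets


class ToyTokenVault:
    """Hypothetical vault: holds the token -> value mapping and gates lookups."""

    def __init__(self, authorized_principals):
        self._token_to_value = {}
        self._authorized = set(authorized_principals)

    def tokenize(self, value):
        # Steps 1-2: store the mapping and hand back a random token.
        token = "tok_" + secrets.token_urlsafe(16)
        self._token_to_value[token] = value
        return token

    def detokenize(self, principal, token):
        # Step 4: only authorized principals can recover the original value.
        if principal not in self._authorized:
            raise PermissionError(f"{principal} may not detokenize")
        return self._token_to_value[token]
```

Usage, for Step 3: downstream systems would store the return value of `tokenize(...)` in place of the real value and never see the vault's mapping.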

17
Q

What breaks tokenization in practice?

A

Cause → system behavior → security impact.
- Cause: token vault access is broad or detokenization endpoints are reachable by too many services.
- Behavior: attacker detokenizes tokens at scale because authorization checks allow it.
- Impact: tokens become thin wrappers and do not reduce disclosure.

18
Q

What are data integrity checks (hash/MAC/signature usage)?

A

Integrity checks verify data has not been altered, with different trust properties.
- Hash: detects change only if you have a trusted reference hash; hash alone does not prove who changed it.
- MAC (Message Authentication Code): proves integrity and authenticity to parties that share a secret key.
- Digital signature: proves integrity and authenticity to anyone who trusts the signer’s public key.

19
Q

How do systems enforce integrity checks mechanically?

A

Step 1: When data is created, compute integrity proof (hash/MAC/signature) over exact bytes.
Step 2: Store or transmit data with its integrity proof.
Step 3: On read/use, recompute proof and compare or verify with key/public key.
Step 4: If verification fails, system rejects the data as modified or untrusted.
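
Example (a sketch using Python's `hashlib` and `hmac`): computing and verifying a hash and a MAC over exact bytes. The helper names are hypothetical. Digital signatures are left out because the standard library has no public-key signing; they would follow the same compute, store, verify, reject shape.

```python
import hashlib
import hmac


def sha256_hex(data):
    """Hash: only useful when compared against a trusted reference value."""
    return hashlib.sha256(data).hexdigest()


def make_mac(key, data):
    """MAC: HMAC-SHA256 over the exact bytes, keyed with a shared secret."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_mac(key, data, tag):
    """Recompute and compare in constant time; False means reject the data."""
    return hmac.compare_digest(make_mac(key, data), tag)
```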

20
Q

What breaks integrity checks in practice?

A

Cause → system behavior → security impact.
- Cause: integrity proof is not verified at use time, or keys used for MAC/signing are compromised.
- Behavior: altered data is accepted because verification is skipped or attacker can forge valid proofs.
- Impact: data tampering becomes silent and can affect correctness, safety, and security decisions.

21
Q

What are data retention and deletion mechanics?

A

Retention defines how long data exists; deletion defines how it is removed or made inaccessible.
- Systems enforce retention via lifecycle rules and expiration policies.
- Deletion requires removing references and ensuring copies (backups/replicas) are handled per policy.
“Deleted” is only meaningful if systems prevent future reads of the data.
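
Example (a minimal sketch, assuming each store is a mapping of record ID to creation time): an expiration rule applied to every store, copies included. The store names and the `purge` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expired_ids(records, retention, now):
    """Return IDs of records older than the retention period."""
    return [rid for rid, created in records.items() if now - created > retention]


def purge(stores, retention, now):
    """Apply the same expiration rule to every store, not just the primary."""
    removed = {}
    for name, records in stores.items():
        ids = expired_ids(records, retention, now)
        for rid in ids:
            del records[rid]
        removed[name] = ids
    return removed
```

A copy that is missing from `stores` is never visited, which is how unmanaged exports and caches escape retention.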

22
Q

What breaks retention and deletion in practice?

A

Cause → system behavior → security impact.
- Cause: unmanaged copies (exports, snapshots, caches) are outside retention enforcement.
- Behavior: data persists in secondary locations and remains readable.
- Impact: long-term exposure continues even after primary deletion.

23
Q

What are backups and restore trust (what is verified)?

A

Backups preserve data; restore trust is proving restored data and systems are safe to use.
- Backups can also preserve compromised or tampered state.
- Restore trust requires verifying integrity, provenance, and that restored credentials and configs are not attacker-controlled.
Restore is not just copying bytes back; it is re-establishing a trustworthy state.

24
Q

How do you verify backups before and after restore?

A

Step 1: Verify backup integrity (hashes/signatures, immutable storage properties if used).
Step 2: Verify backup content version/time matches the intended recovery point.
Step 3: Restore into an isolated environment and validate expected behavior and access controls.
Step 4: Rotate credentials/keys as needed so old compromised material cannot be reused after restore.
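
Example (a sketch of Steps 1 and 2 only): checking a backup blob against a manifest before restoring it. The manifest fields `sha256` and `recovery_point` are hypothetical, and the manifest is assumed to come from a trusted location separate from the backup itself. Steps 3 and 4 (isolated restore, credential rotation) are operational and not shown.

```python
import hashlib
import hmac


def check_backup(blob, manifest, intended_point):
    """Return a list of problems; an empty list means Steps 1 and 2 passed."""
    problems = []
    # Step 1: integrity. Recompute the hash and compare to the trusted reference.
    actual = hashlib.sha256(blob).hexdigest()
    if not hmac.compare_digest(actual, manifest["sha256"]):
        problems.append("integrity: hash mismatch")
    # Step 2: confirm this backup is the intended recovery point.
    if manifest["recovery_point"] != intended_point:
        problems.append("recovery point: not the intended one")
    return problems
```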

25
Q

What breaks backup restore trust in practice?

A

Cause → system behavior → security impact.
- Cause: restoring compromised configs/keys or restoring from an unverified backup source.
- Behavior: attacker access persists after “recovery” because proofs and credentials still work.
- Impact: incident repeats or continues after restore.

26
Q

What are common data leakage paths (logs, exports, analytics, snapshots)?

A

Leakage paths are places data leaves its intended boundary.
- Logs: sensitive fields get recorded and then copied widely.
- Exports: bulk data extraction to files or external systems.
- Analytics: replicated datasets in warehouses with broader access.
- Snapshots/backups: point-in-time copies accessible outside normal access paths.
These paths often bypass primary database access controls.

27
Q

What breaks leakage prevention in practice?

A

Cause → system behavior → security impact.
- Cause: sensitive data is included in logs/exports without redaction and with broad access.
- Behavior: many systems and people can read sensitive data through secondary stores.
- Impact: disclosure occurs even if the primary system has strong access control.
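
Example (a minimal sketch): redacting an event before it is written to a log. The key names in `SENSITIVE_KEYS` and the card-like digit pattern are hypothetical; real redaction rules depend on the data actually handled.

```python
import re

# Hypothetical field names treated as always sensitive.
SENSITIVE_KEYS = {"password", "ssn", "card_number"}
# Hypothetical pattern: a standalone run of 13 to 16 digits, as in a card number.
CARD_LIKE = re.compile(r"\b\d{13,16}\b")


def redact_event(event):
    """Return a copy of the event that is safe to log."""
    clean = {}
    for key, value in event.items():
        if key in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, str):
            clean[key] = CARD_LIKE.sub("[REDACTED]", value)
        else:
            clean[key] = value
    return clean
```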

28
Q

What are multi-tenant data isolation concepts?

A

Multi-tenant isolation ensures one tenant cannot access another tenant’s data.
- Logical isolation: enforcement checks tenant ID on every query and result set.
- Physical isolation: separate databases/storage per tenant reduces shared blast radius.
- Strong isolation requires that tenant identity is derived from authenticated context, not from user-supplied input.

29
Q

What breaks multi-tenant isolation in practice?

A

Cause → system behavior → security impact.
- Cause: tenant ID is taken from request parameters without binding to authenticated identity, or missing filter in a query path.
- Behavior: queries return data from other tenants because enforcement checks are wrong or skipped.
- Impact: cross-tenant data disclosure and regulatory breach.
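
Example (a sketch using Python's `sqlite3`): a query path that takes the tenant ID from the authenticated session and ignores any tenant ID in the request. The `Session` type, the `invoices` table, and `list_invoices` are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    user: str
    tenant_id: str  # assumed to be set at authentication time, never from request input


def list_invoices(db, session, params):
    """Return invoices for the session's tenant only.

    Any "tenant_id" inside params is deliberately ignored.
    """
    rows = db.execute(
        "SELECT id, amount FROM invoices WHERE tenant_id = ? ORDER BY id",
        (session.tenant_id,),
    )
    return rows.fetchall()
```

The broken variant would filter on `params["tenant_id"]` instead, letting a caller pick whose data to read.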
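
30
Q

How do you reason about incident impact for data compromise?

A

Impact reasoning ties attacker access paths to data exposure outcomes.
- Determine which identities/keys/tokens the attacker had and what data those can access.
- Determine whether encryption controls reduce exposure for that attacker position (keys accessible or not).
- Identify all leakage paths that could have been used (exports, snapshots, logs).
- Conclude which datasets were readable, modifiable, or exfiltrated based on evidence and access capability.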
Impact reasoning ties attacker access paths to data exposure outcomes. - Determine which identities/keys/tokens the attacker had and what data those can access. - Determine whether encryption controls reduce exposure for that attacker position (keys accessible or not). - Identify all leakage paths that could have been used (exports, snapshots, logs). - Conclude which datasets were readable, modifiable, or exfiltrated based on evidence and access capability.