What is a Primary Key (PK)?
A column (or set of columns) that uniquely identifies each row in a table. It must be unique and not null. PKs are commonly indexed automatically.
What is a Foreign Key (FK)?
A constraint that enforces referential integrity: values in the child table must reference existing values in the parent table (or be null, depending on the FK definition).
PK vs UNIQUE constraint — what’s the difference?
Both enforce uniqueness. A table can have only one PK but multiple UNIQUE constraints. PK columns are not nullable; UNIQUE columns may allow nulls (behavior varies by DB, but generally null is allowed).
What is a CHECK constraint?
A rule enforced by the database that restricts allowed values (e.g., status IN (‘NEW’,’PAID’)).
Why rely on DB constraints instead of only application validation?
Constraints protect data integrity regardless of which app/client writes to the DB, preventing invalid data from ever being stored.
What does ACID stand for?
Atomicity, Consistency, Isolation, Durability — key guarantees of database transactions.
What is a transaction?
A group of operations executed as a single unit of work: either all changes commit or none (rollback).
Atomicity — definition
All operations in a transaction succeed together, or the transaction is rolled back so no partial changes remain.
Consistency — definition
A transaction moves the DB from one valid state to another, preserving constraints and invariants.
Isolation — definition
Concurrent transactions should not interfere in a way that causes incorrect results (depends on isolation level).
Durability — definition
Once a transaction commits, its changes persist even after crashes (assuming proper DB configuration).
What is the most common isolation level in OLTP systems?
Read Committed — good balance between correctness and concurrency for many workloads.
What is a dirty read?
Reading data written by another transaction that has not committed yet. Read Committed prevents this.
What is a non-repeatable read?
You read the same row twice in one transaction and get different results because another committed transaction changed it.
What is a phantom read?
A repeated query returns a different set of rows because another transaction inserted/deleted rows that match the query condition.
High isolation levels vs throughput — tradeoff?
Higher isolation reduces anomalies but can increase blocking/conflicts and reduce throughput.
What is a lock (in basic terms)?
A mechanism that controls concurrent access to rows/tables to maintain correctness; writes typically require locks.
What is a deadlock?
Two (or more) transactions each hold locks the other needs, causing circular waiting. The DB detects this and aborts one transaction.
How do you reduce deadlocks?
Keep transactions short, acquire locks in a consistent order, avoid holding locks while doing slow work, and ensure proper indexing to avoid wide locking.
Why should transactions be kept short?
Long transactions hold locks longer, increasing contention, deadlock risk, and latency for other operations.
What is an index?
A data structure (often a B-tree) that speeds up lookups and range queries by allowing the DB to locate rows without scanning the whole table.
Main tradeoff of adding indexes?
Faster reads, but slower writes (INSERT/UPDATE/DELETE must maintain indexes) and more storage usage.
When does an index typically help most?
High-selectivity filters (WHERE), join keys, range scans, and ordered queries (ORDER BY) when the index matches the access pattern.
When might an index not help?
Low-selectivity columns (e.g., boolean), very small tables, or queries that must read most rows anyway.