07 Query Optimization Flashcards

(23 cards)

1
Q

span_0“What is the primary bottleneck query optimization tries to reduce?”

A

I/O Cost (Input/Output). Disk access does not improve as fast as CPU speed (Moore’s Law), so minimizing block reads is the main goal.span_0

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are the three main resources that affect query response time?

A
  1. I/O (Disk access - most prominent)
  2. CPU usage (Complex calculations)
  3. span_1Memory usage (Cache effect)span_1
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is a ‘Trivial Plan’ in SQL Server?

A

A plan for a simple query where there is clearly only one way to execute it. span_2The optimizer skips the complex costing process.span_2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What happens if no Trivial Plan is found? (List the 3 Phases)

A

Phase 0: Search for simple/Hash plans (Execute if Cost < X).
Phase 1: Complex optimizations (Execute if Cost < Y).
span_3Phase 2: Parallel execution.span_3

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Difference between Logical and Physical Execution Plans?

A

Logical Plan: What the query wants to do (Relational Algebra tree).
span_4span_5Physical Plan: How the database will actually do it (Specific algorithms like Hash Join, Index Seek).span_4span_5

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is the ‘Push Selection Down’ rule in refactoring?

A

Move the filtering (WHERE clause) as far down the tree as possible. span_6Filter the data before joining to reduce the number of rows processed early.span_6

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the Relational Algebra rule for splitting a selection?

A

Sigma(F AND G)(R) = Sigma(F)(Sigma(G)(R)). span_7This allows you to re-order filters.span_7

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Nested Loop Join: How does it work and what is the I/O Cost?

A

Concept: Two embedded loops (For every row in A, scan all of B).
Cost: O(Num_Block_1 * Num_Block_2).
span_8Use Case: Good for small tables.span_8

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Hash Join: How does it work and what is the I/O Cost?

A

Concept: Build a hash table in memory for the smaller table, then read larger table and search for matching records in hash table.
Cost: O(Num_Block_1 + Num_Block_2).
span_9Use Case: Good for large, unsorted tables.span_9

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Sort Merge Join: How does it work and what is the I/O Cost?

A

Concept: read both tables into memory , sort based on join columns with index
Cost: O(Num_Block_1 + Num_Block_2).
Use Case: Excellent if tables are already sorted or small.span_10

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a Table Scan?

A

Reading every page of the table sequentially. span_11Used when no suitable index exists or when fetching most rows.span_11

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is a Clustered Index?

A

The data blocks are ordered by the index , preferred over table scan

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is a Non-Clustered Index?

A

A separate structure (like a book index) that points to the data rows. span_14Reading data often requires a ‘Double Read’ (Index + Table Lookup).span_14

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a ‘Cover Index’ (Index with Included Columns)?

A

An index that contains all the data required by the query in its leaf nodes. span_15The database never has to look up the actual table row.span_15

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is an ‘Index Seek’?

A

Using the B-Tree structure to jump directly to a specific record (or range start). span_16Much faster than a scan.span_16

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

span_17“Why should you avoid functions in a WHERE clause (e.g.

A

WHERE YEAR(date) = 2020)?”

17
Q

span_18“Why is ‘SELECT *’ considered bad practice?”

A

It retrieves unnecessary columns, increasing I/O and preventing the use of Cover Indexes.span_18

18
Q

How does MongoDB optimization differ from SQL Server?

A

MongoDB does not use statistics to estimate cost. span_19It runs candidate plans in parallel.span_19

19
Q

How does MongoDB choose a winning plan?

A

It runs a race between plans. span_20The first plan to return 101 results wins and is cached for future use.span_20

20
Q

What is a Compound Index in MongoDB?

A

An index on multiple fields (e.g., {score: 1, date: -1}). span_21The order of fields matters for sorting and filtering.span_21

21
Q

What does ‘SARGable’ mean?

A

Search ARGument ABLE. A query written in a way that can take advantage of indexes (e.g., avoiding leading wildcards like ‘%John’).

22
Q

In Relational Algebra, when can you distribute a Selection over a Join?

A

Sigma_F(R Join S) = Sigma_F(R) Join S.
span_22You can do this only if the filter F applies only to attributes in table R.span_22

23
Q

span_23“What is the ‘Explain’ command in MongoDB?”

A

A command (query.explain()) that shows the ‘winningPlan’ and ‘rejectedPlans’ from the optimization race.span_23