Evaluation Plan
An Evaluation Plan
defines what algorithm is used for each operation,
and how execution of these operations is coordinated.
Cost-Based
Query Optimization
Steps (3)
Query Costs:
Factors
There are multiple statistics on a DBMS:
Equivalence of
Relational Algebra Statements
Two RA statements are said to be Equivalent IF:
They produce the same tuples on every database instance.
Relational Algebra Equivalence Rules:
Set of Rules
Relational Algebra Equivalence Rules:
3 Most Useful Join Rules
Cost Estimation:
Basics
Goal:
Compare an execution plan’s cost,
not exactly measure computation time.
*More of an art, not exact.
Variables and Symbols:
General
Query Optimization
Rules
Cost Estimation:
Selection:
Cost of Scanning Disk with Index,
If indexed over an attribute, use a function F(R,a) to determine how many Input/Output operations required.
The Value of F(R,a) depends on the condition in the selection
Costs:
If Data is clustered:
Cost = F(R,a) * B(R)
(Reading Blocks at a time)
Unclustered:
Cost = F(R,a) * T(R)
(Reading Tuples at a time)
B(R) = Blocks in the relation
T(R) = Tuples in the relation
Cost Estimation:
Selection:
Cost of Scanning Disk without Index
Cost = B(R) + Seek Time
B(R) = Blocks in the relation
Cost Estimation:
Selection Cost Function, F(R,a):
Functions for common predicates
The Cost Function used depends on the Condition(Predicate) in the Selection Operation
Equals : σa=v(R)
F(R,a) = 1 / V(R,a)
V(R, a) = number of distinct values in R
Less than : σa<v>(R)</v>
F(R,a) = (v - min(a) ) /
( max(a)-min(a) )
Range : σx<a><span>(R)</span></a>
F(R,a) = (v - x ) /
( max(a)-min(a) )
Cost Estimation:
Selection Cost Function, F(R,a):
Equals Predicate :
σa=v(R)
F(R,a) = 1 / V(R,a)
V(R, a) = number of distinct values in R
Cost Estimation:
Selection Cost Function, F(R,a):
Less than predicate:
σa<v>(R)</v>
σa<v>(R)</v>
F(R,a) = (v - min(a) ) /
( max(a)-min(a) )
Cost Estimation:
Selection Cost Function, F(R,a):
Range Predicate:
σx<a><span>(R)</span></a>
σx<a><span>(R)</span></a>
F(R,a) =
(v - x ) / ( max(a)-min(a) )
Cost Estimation:
Joins
Optimizers:
Enumerating Equivalences
Optimizers use Equivalence Rules to systematically generate Equivalent Expressions
Process
This process is extremely expensive, optimized with:
Equivalence Rules:
Conjunctive Selections
Conjunctive Selections are
Selection Operations with Multiple Conditions:
σθ1 ^ θ2 (R)
Is equivalent to a
Sequence of Individual Selections
σθ1 ( σθ2 (R) )
Equivalence Rules:
Commutative Selection Operations
Selection Operations are Commutative:
The Selections can be performed in any order to get the same result:
σθ1 ( σθ2 (R) )
=
σθ2 ( σθ1 (R) )
Equivalence Rules:
Series of Projections
Within a series of Projections,
only the LAST Projection is needed.
All earlier Projections can be omitted:
πθn ( πθn-1 (…( πθ1 (R)…))
=
πθ1 (R)
Equivalence Rules:
Combining Selection with
Cartesian Product
Performing a Selection after a Cartesian Product is equivalent to a Theta Join with the same condition:
σθ (R1 X R2)
=
R1 ⋈θ R2
Equivalence Rules:
Join Commutative Property
Both Theta Joins and Natural Joins are Commutative:
The joins can be performed in any order
R1 ⋈θ R2
=
R2 ⋈θ R1
Equivalence Rules:
Natural Join Associativity
Natural Joins are Associative:
Can be optimized by performing the SMALLER join first
R1 ⋈ (R2 ⋈ R3)
=
(R1 ⋈ R2 ) ⋈ R3
Equivalence Rules:
Theta Join Associativity
-Special Case
Theta Joins are associative in the following case:
(R1 ⋈θ1 R2 ) ⋈θ2^θ3 R3
Where the second condition, θ2 , ONLY involves attributes from R2 and R3
=
R1⋈θ1^θ3 (R2 ⋈θ2 R3 )
Equivalence Rules:
Set Union
Set Intersection
Both operations are Commutative:
E1 ∪ E2 = E2 ∪ E1
E1 ∩ E2 = E2 ∩ E1
Both operations are Associative:
(E1∪E2)∪E3 = E2∪(E1 ∪ E3)
(E1∩E2)∩E3 = E2∩(E1 ∩ E3)