Explain the unary relational operations: SELECT and PROJECT.
SELECT:
PROJECT:
Given two union-compatible relations R and S, define the three operations UNION, INTERSECTION, and SET DIFFERENCE on them
UNION:
* R UNION S is the set of tuples that are in R or are in S or are in R and in S
INTERSECTION:
* R INTERSECTION S is the set of tuples that are in R and in S
SET DIFFERENCE:
* R SET DIFFERENCE S is the set of tuples that are in R but not in S
Explain the CARTESIAN PRODUCT operation. What is the degree and cardinality of the resultant relation? The CARTESTIAN PRODUCT operation by itself is generally meaningless; explain how it can be made meaningful.
CARTESTIAN PRODUCT
AKA CROSS PRODUCT
AKA CROSS JOIN:
* R X S
* a cartesian product gives you all possible combinations of tuples from both relations
* degree of resulting relation is n+m where n is the number of attributes in R and m is the number of attributes in S
* cardinality of resulting relation is n_R*n_S where n is the number of tuples from R and n_S is the number of tuples in S
example of CARTESIAN PRODUCT: USER: email,firstname,favoritecolor hmiller40,hope,blue dlalich90,dan,blue morleyt,Morley,grey
FOOTBALLTEAM: city,mascot,conference philadelphia,eagles,NFC cleveland,browns,AFC kansas city,chiefs,AFC green bay, packers,NFC
USER X FOOTBALLTEAM
email,firstname,favoritecolor,city,mascot,conference
hmiller40,hope,blue,philadelphia,eagles,NFC
hmiller40,hope,blue,cleveland,browns,AFC
hmiller40,hope,blue,kansas city,chiefs,AFC
hmiller40,hope,blue,green bay,packers,NFC
dlalich90,dan,blue,philadelphia,eagles,NFC
dlalich90,dan,blue,cleveland,browns,AFC
dlalich90,dan,blue,kansas city,chiefs,AFC
dlalich90,dan,blue,green bay,packers,NFC
morleyt,Morley,gray,philadelphia,eagles,NFC
morleyt,Morley,gray,cleveland,browns,AFC
morleyt,Morley,gray,kansas city,chiefs,AFC
morleyt,Morley,gray,green bay,packers,NFC
degree = n + m = 3 + 3 = 6 cardinality = n_R * n_S = 3 * 4 = 12
Discuss the various types of inner join operations. Why is theta join required?
JOIN in general: a CROSS PRODUCT followed by a SELECTION
inner joins: joins where tuples that are NULL or do not satisfy the join condition are eliminated from the result
outer joins: joins where all of the tuples from one or both relations are included in the results regardless of whether or not they satisfy the join condition; pad the mismatches with NULLs as needed
THETA JOIN: a type of inner join with a general condition like R \bowtie_{\theta} S where theta is the join condition
EQUIJOIN: a type of inner join where the only operator used in the join condition (theta) is equals
NATURAL JOIN *: a type of inner join; an EQUIJOIN that then removes one set of the duplicate attributes from one of the relations in the result; requires that the attributes in the join condition have the same name in both relations; join condition is constructed by equating each pair of join attributes that have the same name in both relations and combining these conditions with AND
LEFT OUTER JOIN: include all of the tuples from the left relation in the result regardless of whether or not they satisfy the join condition
RIGHT OUTER JOIN: include all of the tuples from the right relation in the result regardless of whether or not they satisfy the join condition
FULL OUTER JOIN: preserve all of the tuples from the left and right relations in the result regardless of whether or not they satisfy the JOIN condition
What is the difference between EQUIJOIN and NATURAL JOIN operation? Explain with the help of an example.
an EQUIJOIN (R \bowtie_{\theta}) is a type of inner join where the only operator used in the join condition (\theta) is the equals sign
a NATURAL JOIN (R * S) is a type of inner join and it’s basically just an EQUIJOIN with one set of the superfluous attributes removed
ex: USER name,email Hope,hmiller40 Dan,dlalich90 Morley,morleyt
INTEREST user_email,interest hmiller40,running hmiller40,cats hmiller40,football dlalich90,football dlalich90,space dlalich90,video games morleyt,sleeping morleyt,eating morleyt,getting on the counter
EQUIJOIN:
USER \bowtie_{email=user_email} INTEREST
name,email,user_email,interest
Hope,hmiller40,hmiller40,running
Hope,hmiller40,hmiller40,cats
Hope,hmiller40,hmiller40,football
Dan,dlalich90,dlalich90,football
Dan,dlalich90,dlalich90,space
Dan,dlalich90,dlalich90,video games
Morley,morleyt,morleyt,sleeping
Morley,morleyt,morleyt,eating
Morley,morleyt,morleyt,getting on the counterNATURAL JOIN: **IF INTEREST.user_email were just called "email" instead** USER * INTEREST name,email,interest Hope,hmiller40,running Hope,hmiller40,cats Hope,hmiller40,football Dan,dlalich90,football Dan,dlalich90,space Dan,dlalich90,video games Morley,morleyt,sleeping Morley,morleyt,eating Morley,morleyt,getting on the counter
What is the FUNCTION operation? For what is it used?
not really sure where this came from… assuming “FUNCTION” == “AGGREGATE FUNCTION”…
AGGREGATE FUNCTION is used to group tuples in a relation by applying an aggregate function independently to each group.
examples of functions: COUNT, AVERAGE, MIN, MAX, SUM
How are the OUTER JOIN operations different from the INNER JOIN operations? How is the OUTER UNION operation different from UNION?
in INNER JOINs: tuples that are NULL or do not satisfy the join condition are not included in the result
in OUTER JOINs: all tuples from one or both relations are included in the result regardless of whether they satisfy the join condition or not
in UNIONs (ex: R U S): only tuples that are in both R and S are included in the result
in OUTER UNIONs: takes the unions of two relations R(X,Y) and S(X,Z) that are only partially compatible (i.e. not type compatible); the result includes tuples where the attributes are union compatible (each such tuple is only represented once) and tuples where the attributes are not union compatible
* OUTER JOIN == FULL OUTER JOIN on the common attributes of both relations
In what sense does relational calculus differ from relational algebra, and in what sense are they similar?
RELATIONAL ALGEBRA:
RELATIONAL CALCULUS:
BOTH:
* same expressive power (both are relationally complete)
How is a query tree representation different from a query graph representation of queries?
Discuss the transformation of the universal and existential quantifiers.
“One general transformation can be described informally as follows:
Define the following terms with respect to the tuple calculus: tuple variable
t
Define the following terms with respect to the tuple calculus: range relation
R
Define the following terms with respect to the tuple calculus: atom
Define the following terms with respect to the tuple calculus: formula
AKA condition
AKA well-formed formula
AKA WFF
COND
* made up of one or more atoms connected via the logical operators AND OR NOT and defined recursively by the following two rules:
Define the following terms with respect to the tuple calculus: expression
a general expression of tuple calculus is of the form:
{t_1.A_j, t_2.A_k, …, t_n.A_m | COND(t_1, t_2, …, t_n, t_{n+1}, t_{n+2}, …, t_{n+m} ) }
where:
* t_1, t_2, …, t_n, t_{n+1}, t_{n+2}, …, t_{n+m} are tuple variables
* each A_i is an attribute of the relation on which t_i ranges
* COND is a condition or formula [made up of atoms, meets the rules that every atom is a formula and that if F1 and F2 are formulas then so are (F1 AND F2), (F1 OR F2), NOT(F1), NOT(F2) ]
What is meant by a safe expression in relational calculus?
a safe expression is one where all of the results come from the domain of the specified relation (vs the entire universe of all possible tuples ever)
Give two reasons why relational calculus is important. Why is it considered as a nonprocedural language?
relational calculus is important because
1. it has its basis in real math in set theory and logic