What is the CHECK clause?
Defines a constraint on the values of a particular attribute
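What is Jaccard bag similarity?
Jaccard similarity computed on bags (multisets), so repeated elements are counted
The intersection takes the minimum count of each element and the union takes the sum of the counts, so the max is 0.5 (two identical bags)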
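A minimal Python sketch of the bag version, assuming the definition in which the bag union adds the counts (which is why the maximum is 0.5). The function name `jaccard_bag_similarity` is made up for illustration.

```python
from collections import Counter


def jaccard_bag_similarity(a, b):
    """Bag Jaccard: sum of per-element minimum counts / sum of both bag sizes."""
    ca, cb = Counter(a), Counter(b)
    intersection = sum((ca & cb).values())       # & keeps the minimum count per element
    total = sum(ca.values()) + sum(cb.values())  # bag union adds the counts
    return intersection / total if total else 0.0


# {a,a,a,b} vs {a,a,b,b,c}: intersection = 2 + 1 = 3, union size = 4 + 5 = 9
print(jaccard_bag_similarity("aaab", "aabbc"))  # 3 / 9
# Identical bags reach the maximum of 0.5
print(jaccard_bag_similarity("aab", "aab"))
```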
What are personalisation and customisation in terms of data?
What is the official definition of a lossless decomposition?
Let R be a relation schema (with constraints). A decomposition R1, R2, …, Rn of R is called lossless iff for each valid relation (instance) r(R): r = π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r)
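The definition can be tried out on a single instance with a small Python sketch. This only tests one given relation r, whereas the definition quantifies over every valid instance, so a True result here is not a proof of losslessness. The helper names and the example schema (sid, name, course) are hypothetical.

```python
def relation(rows):
    """Build a relation (a set of tuples) from a list of dicts."""
    return {frozenset(r.items()) for r in rows}


def project(rel, attrs):
    """Projection onto attrs; duplicates vanish because the result is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out


def is_lossless_for(rel, schemas):
    """Check r == join of the projections of r, for this one instance only."""
    joined = project(rel, schemas[0])
    for schema in schemas[1:]:
        joined = natural_join(joined, project(rel, schema))
    return joined == rel


r = relation([
    {"sid": 1, "name": "Ann", "course": "DB"},
    {"sid": 2, "name": "Ann", "course": "ML"},
])
print(is_lossless_for(r, [{"sid", "name"}, {"sid", "course"}]))   # True
print(is_lossless_for(r, [{"sid", "name"}, {"name", "course"}]))  # False: join adds spurious tuples
```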
What functional dependencies can be derived from Armstrong's axioms? (4)
Union: if X -> Y and X -> W, then X -> YW
Decomposition: if X -> YW, then X -> Y and X -> W
Pseudotransitivity: if X -> Y and WY -> Z, then XW -> Z
If X is a candidate key, then X -> Y for all Y
How do you create a view in SQL?
CREATE VIEW YoungActiveStudents (name, grade)
AS SELECT S.name, E.grade
FROM Students S, Enrolled E
WHERE S.Sid = E.Sid AND S.age < 21;
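To try the view out, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table layouts (including the `cid` column) and the sample rows are assumptions made up for the example; only the view definition comes from the card above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (Sid INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE Enrolled (Sid INTEGER, cid TEXT, grade TEXT);
INSERT INTO Students VALUES (1, 'Ann', 19), (2, 'Bob', 25);
INSERT INTO Enrolled VALUES (1, 'DB', 'A'), (2, 'DB', 'B');

-- The view stores only this definition, not any tuples.
CREATE VIEW YoungActiveStudents (name, grade) AS
    SELECT S.name, E.grade
    FROM Students S, Enrolled E
    WHERE S.Sid = E.Sid AND S.age < 21;
""")

# Querying the view evaluates the stored definition against the base tables.
rows = conn.execute("SELECT name, grade FROM YoungActiveStudents").fetchall()
print(rows)  # [('Ann', 'A')]
```

Bob is enrolled but is 25, so only Ann appears in the view.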
Why is entity resolution useful? (3)
improves data quality and integrity, fosters re-use of existing data sources, optimises space
What is the standard blocking algorithm?
What is the similarity of two signatures?
the fraction of the hash functions in which they agree (for which signatures have the same value)
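As a small Python sketch, assuming each signature is a list holding one value per hash function (the name `signature_similarity` is made up for illustration):

```python
def signature_similarity(sig1, sig2):
    """Fraction of positions (hash functions) where the two signatures agree."""
    if len(sig1) != len(sig2):
        raise ValueError("signatures must use the same hash functions")
    if not sig1:
        return 0.0
    agree = sum(1 for a, b in zip(sig1, sig2) if a == b)
    return agree / len(sig1)


# The signatures agree on 3 of the 4 hash functions.
print(signature_similarity([1, 3, 0, 2], [1, 2, 0, 2]))  # 0.75
```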
What is entity integrity?
No attribute of the primary key can be null
How do you delete an attribute from an SQL table?
ALTER TABLE Students
DROP COLUMN firstYear;
What is a legal instance of a relation?
one that satisfies all specified ICs
What is a view in SQL and what is it used for?
A view is just a relation, but we store a definition, rather than a set of tuples
Views can be used to present necessary information (or a summary) while hiding details in underlying relations
What is locality sensitive hashing?
Generate, from the collection of all elements (signatures), a small list of candidate pairs: pairs of elements whose similarity must be evaluated
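One common way to produce the candidate pairs is the banding technique, which these notes do not spell out, so the sketch below is an assumption about the method rather than the definition itself. Each signature is split into bands, and two elements become a candidate pair if they are identical in at least one band. For simplicity the band itself is used as the bucket key instead of a separate hash of it. The function name and the sample signatures are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def lsh_candidate_pairs(signatures, bands):
    """signatures: dict name -> list of ints, all of equal length divisible by bands."""
    length = len(next(iter(signatures.values())))
    rows = length // bands  # rows of the signature per band
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for name, sig in signatures.items():
            # Elements with an identical band land in the same bucket.
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(name)
        for names in buckets.values():
            candidates.update(combinations(sorted(names), 2))
    return candidates


sigs = {"d1": [1, 2, 3, 4], "d2": [1, 2, 9, 9], "d3": [7, 8, 3, 5]}
print(lsh_candidate_pairs(sigs, bands=2))  # {('d1', 'd2')}
```

Only d1 and d2 share a whole band, so only that pair would need its similarity evaluated.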
What is a similarity metric for documents?
Represent document as a set of its k-shingles
Use Jaccard set similarity
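A short Python sketch of both steps, assuming character-level shingles (word-level shingles are also used in practice). The function names are made up for illustration.

```python
def shingles(text, k):
    """Set of all length-k substrings (k-shingles) of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Jaccard set similarity: |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


print(sorted(shingles("abcab", 2)))  # ['ab', 'bc', 'ca']
# {ab, bc, cd} vs {bc, cd, de}: 2 shared shingles out of 4 distinct ones
print(jaccard(shingles("abcd", 2), shingles("bcde", 2)))  # 0.5
```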
What are the three ways of enforcing referential integrity in SQL?
What is the purpose of integrity constraints?
Integrity constraints guard against accidental damage to the database, by ensuring that authorised changes do not result in a loss of data consistency
What is a candidate key in terms of closure?
A minimal set of attributes whose closure is the set of all attributes in the relation
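The closure-based test can be sketched in Python as follows. The sketch assumes single-character attribute names and functional dependencies given as (lhs, rhs) string pairs. The function names and the example schema R(A, B, C, D) are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_candidate_key(attrs, all_attrs, fds):
    """True if the closure of attrs is everything and no attribute can be dropped."""
    attrs, all_attrs = set(attrs), set(all_attrs)
    if closure(attrs, fds) != all_attrs:
        return False
    # Closure is monotonic, so checking single-attribute removals is enough.
    return all(closure(attrs - {a}, fds) != all_attrs for a in attrs)


fds = [("A", "B"), ("B", "C")]  # A -> B, B -> C
print(sorted(closure("A", fds)))            # ['A', 'B', 'C']
print(is_candidate_key("AD", "ABCD", fds))  # True
print(is_candidate_key("ABD", "ABCD", fds)) # False: a superkey, but not minimal
```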
How do you find the similarity of two sets (documents) from the shingle matrix?
no. of rows where both columns are 1 / no. of rows where either column is 1
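As a Python sketch, assuming the shingle matrix is a list of rows (one per shingle) with a 0/1 entry per document column. The function name and the sample matrix are made up for illustration.

```python
def column_similarity(matrix, c1, c2):
    """Jaccard similarity of two columns of a 0/1 shingle matrix."""
    both = sum(1 for row in matrix if row[c1] == 1 and row[c2] == 1)
    either = sum(1 for row in matrix if row[c1] == 1 or row[c2] == 1)
    return both / either if either else 0.0


# Rows are shingles, columns are documents.
matrix = [
    [1, 0],
    [1, 1],
    [0, 1],
    [0, 0],  # rows where both columns are 0 do not affect the result
    [1, 1],
]
print(column_similarity(matrix, 0, 1))  # 2 / 4 = 0.5
```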
How do you add a check into an SQL table?
CREATE TABLE section (
    semester VARCHAR(6) CHECK (semester IN ('Fall', 'Winter', 'Spring', 'Summer')),
    year NUMERIC(4,0) CHECK (year > 1990)
);
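To see the checks being enforced, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. SQLite is an assumption here, and other systems report the violation with their own error types. The inserted values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE section (
    semester VARCHAR(6) CHECK (semester IN ('Fall', 'Winter', 'Spring', 'Summer')),
    year NUMERIC(4,0) CHECK (year > 1990)
)
""")

# Satisfies both checks, so the row is accepted.
conn.execute("INSERT INTO section VALUES ('Fall', 2020)")

# 'Autumn' is not in the allowed list, so the insert is rejected.
try:
    conn.execute("INSERT INTO section VALUES ('Autumn', 2020)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM section").fetchone()[0]
print(rejected, count)  # True 1
```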
What is functional dependence?
What are covering constraints?
The instances of the children of an entity together include all instances of their parent (i.e. they cover it)
What is the key idea of hashing?
"Hash" each column C to a small signature h(C), such that:
What is the gap distance?
Cost = insert + open + extend