What are the main goals of OLAP?
Compare OLAP applications with OLTP applications.
OLTP: many simple (point update) queries, small amounts of data, current data, fast update + high (transaction) throughput
OLAP: few complex queries (repeated), large data, current+historical data, fast calculation + load throughput + query tthroughput
What is a Data Warehouse and its basic architecture?
Separate datastore modeled by multidimensional historical data from various sources, supporting OLAP.
SOURCES -> ETL -> WAREHOUSE -> TOOLS
Explain the goals of multidimensional Data Modelling.
MOLAP (specific storage for multidimensional data) vs ROLAP (relies on RDBMS capabilities, enhanced SQL).
What is the role of the multidimensional cube in OLAP and its supported operations?
What are the main concepts of Multidimensional Data Modelling?
MD Schema: set of dimensions + set of measures (of a cube)
**dimensions: partially ordered set of category attributes (primary(finest granule), classification (hierarchy), dimension(additional info))
Explain and compare the terms Star Schema and Snowflake Schema.
Ways of modelling MD data in RDBMS
Star:
* fact table on the center has pointers to dimension tables and associated measures
* dimension attributes on dimension table -> more performance
Snowflake:
* Dimension tables are normalized, reflecting dimension hierarchy
* less redundant
What is the idea of slice and dice operations?
cub: join fact table with dimension tables
- > retrieve portions of interest from the cube (in WHERE clause)
- > slice: fix one dimension -> dice: restrict values to an internal
drill-down: finer partitions
rollup: coarser partitions
What is the main functionality provided by the SQL OLAP extensions?
extesions to GROUP BY: ROLLUP, CUBE, GROUPING SETS (grouping by different sets of attributes)
Give an overview over the ROLLUP operator.
Extends grouping semantics to combination of groups
GROUP BY ROLLUP (month, city, producer)
Give an overview over the CUBE operator.
Extends ROLLUP to produce multidimensional grouping (all group combinations)
GROUP BY CUBE (month, city, producer) = m,c,p+m,c+m,p+c,p+m+c+p+0
What are grouping sets and how do they generalize other group operators?
Allow multiple groupings on a single scan (pass)
-> arbitrary definition of desired groups
ROLLUP (a,b,c) = GROUPING SETS ((a,b,c), (a,b), (a), ())
**() -> special syntax for grand total
CUBE(a,b,c) = GROUPING SETS((a,b,c),(a,b),(a,c),(b,c),(a),(b), (c), ())
How does the analysis of functional dependencies provide more flexibility to expressing aggregations in SQL queries?
GROUP BY rule: columns projected must either be agg functions or grouped
-> functional dependency analysis may detect that column value is unique within group, allowing violation of the rule.