What is a data warehouse?
A subject-oriented, integrated, time-variant, non-updateable collection of data used in support of management decision-making processes.
(Subject-oriented = customers, patients, etc)
(Integrated = consistent naming conventions formats, encoding sturctures; from multiple data sources)
(Time-variant = can study trends and changes)
(Non-updatable = read-only, periodically refreshed)
What is a data mart?
A data warehouse that is limited in scope
Why is there a need for data warehousing?
What are some issues with company-wide view?
What organizational trends drive data warehouses?
What is an operational system?
A system that is used to run a business in real time, based on current data; also called a system of record
What is an informational system?
A system designed to support decision making based on historical point-in-time and prediction data for complex queries or data-mining applications
Comparison of operational and inofrmational systems
What are the data warehouse architectures?
(all involve some form of extract, transform and load (ETL)
Independent Data Mart visual
What are the limitations of an independent data mart?
Dependent data mart visual
logical data mart visual
Differences in Data warehouse and Data mart
Visual of three-layer data architecture for a data warehouse
What is an event?
a database action (creat/update/delete) that results from a transaction
Status vs event visual
What happens with transient data?
Changes to existing records are written over previous records, thus destroying the previous data content.
What happens with periodic data?
They are never physically altered or delected once they have been added to the store.
Besides transient data coverting to periodic data what other six changes must happen when going from a data mart to data warehouse?
What is derived data?
Data that have been selected, formatted, and aggregate for end-user decision support applications
What are the objectives of derived data?
What are the characteristics of derived data?
What is the most common data model?
Dimensional model (usually implemented as a star schema)