What is Data Warehousing?
A subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management’s decision-making process.
Characteristics of a Data Warehouse?
Benefits of Data Warehousing?
Comparison of OLTP Systems?
Main purpose: support operational processing
Data age: current
Data latency: real-time
Data granularity: detailed data
Data processing: predictable pattern of data operations and queries; high level of transaction throughput
Reporting: predictable, one-dimensional, static reporting
Users: serves a large number of operational users
Comparison of Data Warehousing?
Main purpose: support analytical processing
Data age: historic
Data latency: periodic (depends on how often the warehouse is refreshed)
Data granularity: detailed data, plus lightly and highly summarized data
Data processing: less predictable pattern of queries; medium to low level of transaction throughput
Reporting: unpredictable, multidimensional, dynamic reporting
Users: serves a smaller number of managerial users
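The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. The `sales` table, its columns, and its rows are hypothetical and exist only to illustrate the two access patterns; a real warehouse would normally be a separate database from the operational system.

```python
import sqlite3

# Hypothetical sales table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (order_id, region, year, amount) VALUES (?, ?, ?, ?)",
    [
        (1, "North", 2022, 100.0),
        (2, "North", 2023, 150.0),
        (3, "South", 2022, 80.0),
        (4, "South", 2023, 120.0),
    ],
)

# OLTP-style access: a predictable lookup of one current, detailed record.
oltp_row = conn.execute(
    "SELECT order_id, amount FROM sales WHERE order_id = ?", (2,)
).fetchone()

# Warehouse-style access: summarize historical data across two dimensions.
dw_rows = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()

print(oltp_row)
print(dw_rows)
```

The first query touches a single row by key, which is typical of operational processing; the second scans and aggregates many rows, which is typical of analytical processing.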
Problems of Data Warehousing?
Data Warehouse Architecture?
End-User Access Tools?
What is Data Mart?
A database that contains a subset of corporate data to support the analytical requirements of a particular business unit (such as the Sales department) or to support users who share the same requirements to analyse a particular business process (such as property sales).
Benefits:
Data Warehousing Tools and Technologies?
The requirements for a data warehouse DBMS?
ETL (Extraction, Transformation, Loading) processes?
The data destined for a data warehouse must first be extracted from one or more data sources, transformed into a form that is easy to analyze and consistent with data already in the warehouse, and then finally loaded into the data warehouse.
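A minimal sketch of the three ETL steps, using only the Python standard library. The source CSV text, the `fact_sales` table, and the cleaning rules (trim whitespace, normalize region names, skip rows with no amount) are assumptions made for illustration, not a prescribed design.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice it would come from an operational system.
SOURCE_CSV = """order_id,region,amount
1, north ,100.50
2,SOUTH,80
3,North,
"""

def extract(text):
    """Extraction: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: clean values and drop rows that cannot be made consistent."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # assumed rule: skip rows with a missing amount
        cleaned.append(
            (int(row["order_id"]), row["region"].strip().title(), float(amount))
        )
    return cleaned

def load(conn, rows):
    """Loading: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute("SELECT * FROM fact_sales ORDER BY order_id").fetchall()
print(loaded)
```

Keeping each step as its own function makes it easy to test the transformation rules separately from the source and the target.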
4 main operations in Data Mining?
Example of application:
Retail / Marketing
- Identifying buying patterns of customers
- Finding associations among customer demographic characteristics
- Predicting response to mailing campaigns
- Market basket analysis
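As one illustration of the retail examples above, market basket analysis can be sketched by counting how often items are bought together. The transactions are invented, and the support and confidence measures used here are the usual association-rule definitions, shown as a simplified sketch rather than a full mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)
    # Sort so that each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))

def support(pair):
    """Fraction of all baskets that contain both items of the (sorted) pair."""
    return pair_counts[pair] / len(transactions)

def confidence(antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

print(support(("bread", "milk")))
print(confidence("milk", "bread"))
print(confidence("butter", "bread"))
```

A rule such as "butter implies bread" is interesting when both its support and its confidence are high enough for the business question at hand.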
What is OLAP?
Online analytical processing (OLAP) is the dynamic synthesis, analysis, and consolidation of large volumes of multidimensional data.
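The idea of consolidating multidimensional data can be sketched in plain Python. The fact rows, the three dimensions (region, product, year), and the helper names `roll_up` and `slice_` are all hypothetical; real OLAP servers use dedicated storage and query engines.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, year, sales).
facts = [
    ("North", "Flat", 2022, 10),
    ("North", "House", 2022, 5),
    ("North", "Flat", 2023, 12),
    ("South", "Flat", 2022, 7),
    ("South", "House", 2023, 9),
]
DIMENSIONS = ("region", "product", "year")

def roll_up(rows, keep):
    """Consolidate sales over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]
    return dict(totals)

def slice_(rows, dimension, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]

by_region = roll_up(facts, ["region"])
by_region_year = roll_up(facts, ["region", "year"])
flats_by_year = roll_up(slice_(facts, "product", "Flat"), ["year"])

print(by_region)
print(by_region_year)
print(flats_by_year)
```

The same facts answer different questions depending on which dimensions are kept, which is what makes the reporting multidimensional and dynamic.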
Phases of the CRISP-DM Model?
Data mining and data warehousing?
a. A major challenge is identifying suitable data to mine.
b. Data mining requires a single, separate, clean, integrated, and self-consistent source of data.
c. A data warehouse is well equipped to provide data for mining.