Characteristics
Data Operations
○ Two primary data operations in a data warehouse are Data Loading and Data Access.
○ Data loading involves extracting, transforming, and loading data into the warehouse. (ETL Operation)
○ Data access enables users to retrieve and analyze integrated data for decision-making and reporting.
Other Characteristics
→ Concurrent Use:
○ Data warehouses support concurrent use, allowing many different users to access and utilize the data simultaneously.
○ This is particularly useful in scenarios where multiple users want to interact with the data at the same time, such as when booking airline tickets to the same destination
→ Metadata: ○ Metadata is crucial in a data warehouse environment. ○ It provides information about the data, such as its source, meaning, relationships, and transformations. ○ Metadata helps users understand and manage the data within the warehouse. → Multi-dimensional Data Model: ○ Data warehouses often use a multi-dimensional data model, which means they organize data into dimensions and facts. ○ Dimensions are the perspectives by which data can be analyzed (e.g., time, geography), and facts are the measurable metrics (e.g., sales, revenue). → Aggregated Data: ○ Data in warehouses is often aggregated, meaning that it is collected and summarized from various sources to provide a higher-level perspective and support efficient querying and reporting. ○ Aggregations can include summaries, averages, totals, or other statistical measures. → Dimension Tables: ○ Data warehouses organize information into dimension tables, where each table represents data from a specific perspective. ○ For instance, understanding the effect of inflation on the Euro from a time-based perspective. → Synchronization Periodically: ○ Data warehouses synchronize periodically, typically through periodic transactions rather than after every transaction. ○ This approach is chosen to prevent potential crashes that might occur if synchronization happens too frequently. → Software Independent: ○ Definition: Data warehouses are software-independent, meaning they are not tied to a specific database software, and they can handle various data, relationships, schemas, and database languages.
OLAP and Cube Operations
→ OLAP stands for Online Analytical Processing, which is a technology that enables multi-dimensional analysis of business data.
→ OLAP is used to support business intelligence and decision-making processes.
→ Grouping of data in a multidimensional matrix is called data cubes.
→ A data cubes are used to handle multidimensional data that allows for complex analysis and querying.
→ Operations on data cubes typically involve manipulating and aggregating data along different dimensions.
Data cube Operations
Commands used in slicing
N dimensions data model (Data warehouse Schema Types)
Business Intelligence
encompasses a range of activities and tools aimed at transforming raw data into meaningful insights for business decision-making.
Steps for Business Intelligence
Goals of Data Warehouse
Data Warehouse vs Data Models
(Dominant Operations, Goals, Type of model, Technology)
OLTP vs Warehousing
(Dominant Operations, Goals, Type of model, Technology)
OLTP:
Dominant Operations
Optimal Goals
Improving Maintenance Efficiency: Achieving acceptable transaction processing times for routine maintenance operations (insertion, modification, deletion).
Model
Normalized model with a focus on maintaining data integrity, often in a 2D structure
DB Technology
VS.
Data Warehousing:
Dominant Operations
Optimal Goals
Improving Query Efficiency: Achieving acceptable response times for complex queries and reporting.
Model
Multi-dimensional model with N dimensions and data blocks, supporting efficient querying and analysis.
DB Technology