What are 5 architecture properties that are essential for a data warehouse system?
What is meant by separation?
Analytical and transactional processing should be kept apart as much as possible
What is meant by scalability?
Hardware and software architectures should be easy to upgrade as the data volume and the number of user requirements increase
What is meant by extensibility?
It should be able to host new applications and technologies without redesigning the whole system
What is meant by security?
Monitoring access is essential because of the strategic data stored in data warehouses
What is meant by administerability?
Management of the system should not be overly difficult
What are the characteristics of a single-layer data warehouse?
When can a single-layer data warehouse be successful?
If analysis needs are particularly restricted and the data volume to analyze is huge.
What are the stages in the two-layer architecture?
-> So it has four stages, but only two physical layers, namely the source layer and the data warehouse layer
What is the source layer?
Data that is originally stored in corporate relational databases or legacy databases
What is meant by data staging?
Here, data that is stored in the source layer is extracted, cleansed and integrated into one common schema.
For this, ETL tools are used.
What are ETL Tools?
Extraction, Transformation and Loading tools - used to merge, extract, transform, cleanse, validate, filter and load source data into a data warehouse.
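The three ETL steps can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two source row sets (`crm_rows`, `legacy_rows`), their field names and the in-memory list standing in for the warehouse are hypothetical, and real ETL tools work against actual databases.

```python
# Hypothetical rows from two operational sources with different schemas.
crm_rows = [
    {"customer": " Alice ", "country": "nl", "revenue": "120.50"},
    {"customer": "Bob", "country": "DE", "revenue": "n/a"},  # fails validation
]
legacy_rows = [
    {"cust_name": "Carol", "ctry": "NL", "rev_cents": 9900},
]


def extract():
    """Extraction: pull the raw rows from each source."""
    return crm_rows, legacy_rows


def transform(crm, legacy):
    """Transformation: cleanse, validate, filter and map to one common schema."""
    unified = []
    for row in crm:
        try:
            revenue = float(row["revenue"])
        except ValueError:
            continue  # filter out rows whose revenue cannot be parsed
        unified.append({
            "customer": row["customer"].strip(),
            "country": row["country"].upper(),
            "revenue": revenue,
        })
    for row in legacy:
        unified.append({
            "customer": row["cust_name"].strip(),
            "country": row["ctry"].upper(),
            "revenue": row["rev_cents"] / 100,  # cents to currency units
        })
    return unified


def load(rows, warehouse):
    """Loading: append the integrated rows to the warehouse store."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(*extract()), [])
```

After running this, `warehouse` holds only the rows that passed validation, all in the same schema regardless of which source they came from.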
What is the data warehouse layer?
The place where the information is stored in one logically centralized repository.
This data warehouse can sometimes be accessed directly; in other cases data marts are used.
What is a data mart?
A subset or an aggregation of the data stored in a primary data warehouse. It includes a set of information pieces relevant to a specific business area, department or group of users
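The two ways of deriving a data mart (subset and aggregation) can be sketched as follows. The warehouse rows and the helper names `subset_mart` and `aggregate_mart` are hypothetical and only serve to illustrate the definition.

```python
from collections import defaultdict

# Hypothetical primary data warehouse contents.
warehouse = [
    {"department": "sales", "region": "north", "amount": 100},
    {"department": "sales", "region": "south", "amount": 250},
    {"department": "hr", "region": "north", "amount": 40},
]


def subset_mart(rows, department):
    """Data mart as a subset: keep only the rows relevant to one department."""
    return [row for row in rows if row["department"] == department]


def aggregate_mart(rows, key):
    """Data mart as an aggregation: total amount per value of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


sales_mart = subset_mart(warehouse, "sales")
region_totals = aggregate_mart(warehouse, "region")
```

Here `sales_mart` serves one group of users with detailed rows, while `region_totals` serves another with summarized figures; both are derived from the same primary warehouse.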
What happens in the analysis stage?
Integrated data is efficiently and flexibly accessed to issue reports, analyze information and simulate business scenarios
What are three reasons data marts are useful?
Which two types of data marts are there?
How can you avoid data inconsistencies between independent data marts?
By creating a primary data warehouse that is populated by the individual marts (instead of the other way around).
What are five advantages of the two-layer architecture?
What is the key characteristic of the three-layer architecture?
It has all the stages from the two-layer architecture, but in between the data staging and the data warehouse there is another layer called ‘reconciled data’.
-> So the data warehouse is populated by the reconciled data, not by the operational sources
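As a rough sketch of this data flow, the warehouse-building step below only ever receives reconciled data and never touches the operational sources directly. The source rows, field names and the functions `reconcile` and `populate_warehouse` are hypothetical, chosen purely for illustration.

```python
def reconcile(*sources):
    """Integrate and cleanse operational rows into one consistent, detailed data set."""
    reconciled = []
    for source in sources:
        for row in source:
            reconciled.append({
                "product": row["product"].strip().lower(),  # unify naming
                "qty": int(row["qty"]),                     # unify types
            })
    return reconciled


def populate_warehouse(reconciled):
    """Build the warehouse (here: total quantity per product) from reconciled data only."""
    totals = {}
    for row in reconciled:
        totals[row["product"]] = totals.get(row["product"], 0) + row["qty"]
    return totals


# Hypothetical operational sources with inconsistent formatting.
shop_db = [{"product": "Widget ", "qty": "3"}]
legacy_db = [{"product": "widget", "qty": 2}, {"product": "Gadget", "qty": 1}]

reconciled = reconcile(shop_db, legacy_db)   # reconciled data layer
warehouse = populate_warehouse(reconciled)   # data warehouse layer
```

In this sketch the reconciled layer keeps detailed, cleaned rows, and the warehouse is derived from it in a separate step.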
What are the advantages of the added ‘reconciled data’ layer?
Which main architectural principles are used for data warehouse systems?
Which five main types of system that also include the aforementioned layers are distinguished in the scientific literature?
What is the independent data mart architecture?
Different data marts are separately designed and built in a non-integrated fashion.