Data cube
A multidimensional array that organizes data along several dimensions (for example time, region, and product), showing how data attributes are arranged.
Columnar storage
Stores data by columns instead of rows to provide faster analytics and better compression.
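The difference between row and column layouts can be sketched in a few lines of plain Python. This is only an illustration under assumed data: the records, the helper names `to_columns` and `run_length_encode`, and the choice of run-length encoding as the compression scheme are all hypothetical, not how any particular engine works.

```python
# Hypothetical sample records; names and values are illustrative only.
rows = [
    {"region": "east", "product": "pen", "sales": 10},
    {"region": "west", "product": "pen", "sales": 7},
    {"region": "east", "product": "ink", "sales": 5},
]


def to_columns(records):
    """Convert a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# An aggregate over one attribute reads a single column list
# instead of visiting every field of every row.
total_sales = sum(columns["sales"])

# Values in one column share a type and often repeat, which is
# what makes simple schemes such as run-length encoding effective.
encoded_product = run_length_encode(columns["product"])
```

Here `total_sales` is computed from the `sales` column alone, and the `product` column shrinks because its neighbouring values repeat.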
Data lake
A central repository that holds raw data (structured and unstructured) in its original format.
Data warehouse
A central database that stores cleaned, structured data and is optimized for analysis and reporting.
Roll-up
An operation that aggregates data by grouping similar attribute values along a dimension, for example by moving up a hierarchy from city to country.
Dicing
Selects a range of values on two or more dimensions, resulting in a subcube.
Slicing
Fixes one dimension to a single value and filters out the rest of the data, so the analysis focuses on that particular attribute value.
Drill down
The reverse of roll-up. It subdivides information for finer-granularity analysis, zooming into more detail.
Pivot
Rotates the data cube to present a different view without changing the data, allowing the user to change the viewpoint.
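The cube operations defined above can be sketched on a tiny in-memory cube. This is a minimal illustration, assuming a hypothetical cube stored as a dictionary keyed by (quarter, region, product); the data and all function names are made up for the example and do not come from any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact cells: (quarter, region, product) -> sales.
cube = {
    ("2023-Q1", "east", "pen"): 10,
    ("2023-Q1", "west", "pen"): 7,
    ("2023-Q2", "east", "ink"): 5,
    ("2023-Q2", "west", "pen"): 3,
    ("2024-Q1", "east", "pen"): 8,
    ("2024-Q1", "west", "ink"): 4,
}


def roll_up_to_year(cells):
    """Roll-up: aggregate quarters into years along the time hierarchy."""
    totals = defaultdict(int)
    for (quarter, region, product), sales in cells.items():
        year = quarter.split("-")[0]
        totals[(year, region, product)] += sales
    return dict(totals)


def slice_cube(cells, region):
    """Slice: fix the region dimension to a single value."""
    return {(q, p): s for (q, r, p), s in cells.items() if r == region}


def dice_cube(cells, quarters, regions):
    """Dice: keep cells that match chosen value sets on two dimensions."""
    return {
        key: sales
        for key, sales in cells.items()
        if key[0] in quarters and key[1] in regions
    }


def pivot(cells):
    """Pivot: reorder the axes to (product, region, quarter); values unchanged."""
    return {(p, r, q): s for (q, r, p), s in cells.items()}


yearly = roll_up_to_year(cube)
east_only = slice_cube(cube, "east")
subcube = dice_cube(cube, {"2023-Q1", "2023-Q2"}, {"west"})
rotated = pivot(cube)
```

In this sketch, drilling down from `yearly` simply means going back to the quarterly cells in `cube`, which is why drill down requires the finer-grained data to be kept.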
Multidimensional data cube
Uses multidimensional arrays to store data; queries are typically faster and more efficient.
Relational data cube
Uses relational tables to store data; it is typically slower than a multidimensional cube.
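The two storage approaches can be contrasted with a small sketch. It assumes a hypothetical two-dimensional cube (region by product); the array layout, the fact table, and the lookup helpers are illustrative names only, and real systems add indexes and precomputed aggregates that this example leaves out.

```python
# Hypothetical dimension members.
regions = ["east", "west"]
products = ["pen", "ink"]

# Multidimensional storage: a dense array addressed by position.
region_index = {name: i for i, name in enumerate(regions)}
product_index = {name: i for i, name in enumerate(products)}
array_cube = [
    [10, 5],  # east: pen, ink
    [7, 0],   # west: pen, ink (0 marks an empty cell)
]


def array_lookup(region, product):
    """Find a cell by computing its position; no scan is needed."""
    return array_cube[region_index[region]][product_index[product]]


# Relational storage: one row per non-empty cell in a fact table.
fact_table = [
    ("east", "pen", 10),
    ("east", "ink", 5),
    ("west", "pen", 7),
]


def table_lookup(region, product):
    """Find a cell by scanning the rows, as an unindexed query would."""
    return sum(s for r, p, s in fact_table if r == region and p == product)
```

Both lookups return the same values; the array version reaches a cell directly, while the table version stores only the non-empty cells but has to search for them.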
Amazon Redshift (AWS columnar data service)
Stores data in columnar format to speed up analytics.
Amazon Athena (AWS columnar data service)
Queries data stored in columnar formats like Parquet and ORC directly on S3.
AWS Glue (AWS columnar data service)
Supports ETL (Extract, Transform, Load) jobs that read and write columnar formats.
Amazon S3 Select (AWS columnar data service)
Reads specific columns from Parquet files stored in S3 instead of retrieving the whole object.
Graph processing
The computational process of analyzing data structured as a graph (vertices and edges) to extract insights, such as finding the shortest path or influential users.
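Both example tasks named in this definition can be sketched with the standard library. The graph below is a hypothetical friendship network, breadth-first search is used for the shortest path because the edges are unweighted, and "influential" is approximated here by the number of connections, which is only one of several possible measures.

```python
from collections import deque

# Hypothetical undirected graph stored as adjacency lists.
graph = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "dan"],
    "dan": ["bob", "cat", "eve"],
    "eve": ["dan"],
}


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns a path with the fewest edges, or None."""
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def most_connected(adjacency):
    """Return the vertex with the most edges (a simple influence proxy)."""
    return max(adjacency, key=lambda name: len(adjacency[name]))


path = shortest_path(graph, "ann", "eve")
hub = most_connected(graph)
```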
Graph databases
A type of NoSQL database that uses graph theory to store, map, and query relationships.
Amazon SageMaker
A fully managed service that simplifies building, training, and deploying machine learning models.
Google Cloud AutoML
Used to train and develop custom machine learning models automatically.
Google Cloud Speech-to-Text
A speech recognition service that converts speech to text, supporting 120 languages.
Google Cloud Vision AI
Used to create machine learning models for image analysis that detect text and other image content.
Microsoft Azure Machine Learning
Used to create and deploy machine learning models on the cloud.
Microsoft Azure Databricks
Provides Apache Spark-based analytics.
Microsoft Azure Bot Service
Provides intelligent, scalable bot services.