A unified analytics engine that provides a consistent interface for large-scale data processing across multiple domains.
Apache Spark
Provides the foundation for all Spark applications, handling memory management, fault recovery, scheduling, and task distribution.
Spark Core Engine
The brain of a Spark application, responsible for planning and coordinating execution.
Driver
Manages cluster resources and allocates them to the application at the Driver's request.
Cluster Manager (Master)
Nodes in the cluster that host Executor processes.
Worker
Processes on Worker nodes that execute tasks assigned by the Driver.
Executors
Groups of tasks that can be executed in parallel; a new stage begins at each shuffle boundary.
Stages
The individual units of work executed by Executors; each task processes a single partition of the data.
Tasks
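The relationship between Stages and Tasks can be sketched without Spark at all. The following is a schematic, plain-Python illustration (not Spark's actual scheduler): one task per partition within a stage, with a second stage after a shuffle-style aggregation:

```python
# Schematic sketch: a job splits into stages at shuffle boundaries,
# and each stage runs one task per partition in parallel.
from concurrent.futures import ThreadPoolExecutor

data = list(range(8))
partitions = [data[i::4] for i in range(4)]       # 4 partitions -> 4 tasks per stage

def stage1_task(part):                            # narrow work: transform each element
    return [x * x for x in part]

def stage2_task(parts):                           # after the "shuffle": aggregate everything
    return sum(sum(p) for p in parts)

with ThreadPoolExecutor(max_workers=4) as pool:   # the pool stands in for Executors
    mapped = list(pool.map(stage1_task, partitions))  # Stage 1: parallel tasks

result = stage2_task(mapped)                      # Stage 2: a single reduce task
print(result)                                     # sum of squares 0..7 = 140
```

In Spark itself the same pattern appears when, for example, a `map` is followed by a `groupBy`: the map tasks form one stage, the shuffle ends it, and the aggregation runs as the next stage.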
This Spark UI provides per-application monitoring for a running SparkSession, offering details on job progress, DAG visualization, resource usage, and more.
Application UI
This Spark UI gives a cluster-wide view for monitoring multiple applications, showing the health status of nodes and overall resource allocation.
Master UI
Interactive clusters that support notebooks, jobs, and dashboards with configurable auto-termination.
All Purpose Clusters
Ephemeral clusters that start when a job runs and terminate automatically upon completion, optimized for non-interactive workloads.
Job Clusters
Clusters optimized for SQL query performance, with instant startup and auto-scaling to balance cost and performance.
SQL Warehouses
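Unlike an All Purpose cluster, a Job cluster is declared inside the job definition itself, so it exists only for that run. A hedged sketch of such a cluster spec in the style of the Databricks Jobs API (the field names follow that API; the specific values shown are hypothetical):

```json
{
  "new_cluster": {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2
  }
}
```

Because the cluster is created per run and torn down afterwards, there is no idle time to pay for, which is why Job clusters suit scheduled, non-interactive workloads.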