Intro to Apache Spark Flashcards

Question 1

Q

A unified analytics framework providing a consistent interface for handling big data across multiple domains.

Answer

A

Apache Spark

Question 2

Q

Provides the foundation for all Spark applications, handling memory management, fault recovery, scheduling, and task distribution.

Answer

A

Spark Core Engine

Question 3

Q

The brain of a Spark application, responsible for planning and coordinating execution.

Answer

A

Driver

Question 4

Q

Manages cluster resources and allocates them to the Driver (internal).

Answer

A

Cluster Manager (Master)

Question 5

Q

Nodes in the cluster that host Executors.

Answer

A

Worker Nodes

Question 6

Q

Processes on Worker nodes that execute tasks assigned by the Driver.

Answer

A

Executors

Question 7

Q

Groups of tasks that can be executed in parallel.

Answer

A

Stages

Question 8

Q

The individual units of work executed by Executors.

Answer

A

Tasks
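
Example

The relationship between the Driver, Executors, stages, and tasks can be sketched with a toy model in plain Python. This is not Spark code: the names `driver` and `run_stage` are hypothetical, and a thread pool stands in for the Executors. It only illustrates the idea that the Driver plans the work, each stage is a group of tasks that may run in parallel, and each task processes one partition.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(executor_pool, task_fn, partitions):
    """Run one stage: one task per partition, all eligible to run in parallel."""
    return list(executor_pool.map(task_fn, partitions))


def driver(data, num_partitions=4, num_executors=2):
    """Toy 'Driver': plans the work and coordinates the stages (hypothetical)."""
    # Split the data into partitions; each partition becomes one task per stage.
    partitions = [data[i::num_partitions] for i in range(num_partitions)]

    # The thread pool plays the role of the Executors in this sketch.
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        # Stage 1: each task squares the values in its partition.
        squared = run_stage(pool, lambda part: [x * x for x in part], partitions)
        # Stage 2: each task sums its partition; it starts after stage 1 finishes.
        partial_sums = run_stage(pool, sum, squared)

    # The driver combines the per-task results into the final answer.
    return sum(partial_sums)


result = driver(list(range(10)))
print(result)
```

In real Spark the Driver builds the stages from a DAG of transformations and the cluster manager supplies the Executors; the sketch leaves both out.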

Question 9

Q

This Spark UI provides per-application monitoring through the SparkSession, offering details on progress, DAG visualization, resource usage, and more.

Answer

A

Application UI

Question 10

Q

This Spark UI gives a cluster-wide view for monitoring multiple applications, showing the health status of nodes and overall resource allocation.

Answer

A

Master UI

Question 11

Q

Interactive clusters that support notebooks, jobs, and dashboards with configurable auto-termination.

Answer

A

All-Purpose Clusters

Question 12

Q

Ephemeral clusters that start when a job runs and terminate automatically upon completion, optimized for non-interactive workloads.

Answer

A

Job Clusters

Question 13

Q

Clusters optimized for SQL query performance, with instant startup and auto-scaling to balance cost and performance.

Answer

A

SQL Warehouses

(13 cards)