Component Roles Flashcards

(59 cards)

1
Q

What are the three main responsibility planes in Kafka?

A
  1. Control plane: cluster metadata and leadership management, handled by the controller. 2. Data plane: partition reads, writes, and replication, handled by partition leaders and followers. 3. Consumer-group plane: membership, rebalancing, and offset commits, handled by the group coordinator.
2
Q

What does the controller do in Kafka?

A

The controller manages cluster metadata and control-plane decisions such as topic and partition metadata, broker membership changes, and partition leader elections. It is not in the normal produce/fetch data path.

3
Q

What does a partition leader do?

A

A partition leader is the authoritative broker for one partition. Producers write to it, consumers fetch from it, and followers replicate from it.

4
Q

What does a partition follower do?

A

A follower is a non-leader replica that actively fetches data from the leader to stay caught up and can become leader if elected.

5
Q

Is Kafka replication push-based or pull-based?

A

Pull-based. Followers send fetch requests to the leader and copy data from it.
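
As a rough illustration of the pull model, the toy sketch below has a follower request records starting at its own log end offset. The `Leader` and `Follower` classes and their methods are hypothetical stand-ins for this card, not Kafka's API, and the batch size of 2 is arbitrary.

```python
class Leader:
    """Toy leader holding an in-memory log (a list of records)."""

    def __init__(self):
        self.log = []

    def append(self, record):
        self.log.append(record)

    def fetch(self, from_offset, max_records=2):
        # The leader only answers fetch requests; it never pushes data.
        return self.log[from_offset:from_offset + max_records]


class Follower:
    """Toy follower that pulls from the leader to catch up."""

    def __init__(self, leader):
        self.leader = leader
        self.log = []

    def fetch_once(self):
        # Ask for records starting at this follower's log end offset.
        batch = self.leader.fetch(from_offset=len(self.log))
        self.log.extend(batch)
        return len(batch)


leader = Leader()
for record in ["a", "b", "c"]:
    leader.append(record)

follower = Follower(leader)
while follower.fetch_once() > 0:
    pass  # keep pulling until the leader has nothing new

print(follower.log == leader.log)  # True once the follower has caught up
```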

6
Q

What is ISR in Kafka?

A

ISR stands for In-Sync Replicas. It is the set of replicas, the leader included, that are sufficiently caught up with the leader and therefore eligible for safe leader election.

7
Q

Is every follower automatically in the ISR?

A

No. A follower is any non-leader replica. It is only in the ISR if it is sufficiently caught up.

8
Q

Does replication factor include the leader?

A

Yes. Replication factor is the total number of replicas for a partition, including the leader. RF=3 means 1 leader and 2 followers.

9
Q

What is the difference between replication factor and ISR size?

A

Replication factor is the configured total number of replicas. ISR size is the number of replicas currently in sync. RF stays fixed unless replicas are reassigned; the ISR shrinks and grows as replicas fall behind and catch up.

10
Q

What does acks=all mean?

A

The leader waits for all replicas currently in the ISR to acknowledge the write, not all assigned replicas in the replication factor.

11
Q

What does min.insync.replicas do?

A

It defines the minimum ISR size required for a write with acks=all to succeed. If the ISR is smaller than that threshold, the leader rejects the write.
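
The interplay of replication factor, ISR, `acks=all`, and `min.insync.replicas` can be sketched with a toy check. The function names below are hypothetical and the logic is a simplified model of the rule stated on these cards, not broker code.

```python
def acks_all_write_accepted(isr, min_insync_replicas):
    """Return True if a toy acks=all write can be accepted.

    isr: set of broker ids currently in sync (leader included).
    min_insync_replicas: the configured minimum ISR size.
    """
    return len(isr) >= min_insync_replicas


def required_acks(isr):
    # With acks=all, the write waits on the current ISR only,
    # not on every assigned replica.
    return set(isr)


assigned_replicas = {1, 2, 3}  # replication factor 3
isr = {1, 2}                   # broker 3 has fallen behind

print(acks_all_write_accepted(isr, min_insync_replicas=2))  # True
print(sorted(required_acks(isr)))                           # [1, 2]
print(acks_all_write_accepted({1}, min_insync_replicas=2))  # False
```

Note that broker 3 is still an assigned replica in this example; it simply does not count toward the acknowledgement while it is out of the ISR.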

12
Q

Why is acks=all often misunderstood?

A

Because many think it means all assigned replicas. In reality, it means all replicas currently in the ISR.

13
Q

What does the group coordinator do?

A

The group coordinator manages consumer-group state: membership, joins, heartbeats, rebalances, and offset commits.

14
Q

Where are consumer offsets stored?

A

In the internal compacted Kafka topic `__consumer_offsets`.

15
Q

Who handles an offset commit?

A

The consumer sends the offset commit to the group coordinator, which persists it in `__consumer_offsets`.

16
Q

Why are offset commits handled by the group coordinator and not the partition leader?

A

Because committed offsets are consumer-group progress state, not part of the business topic’s data stream.

17
Q

Can a broker have multiple roles at the same time?

A

Yes. A broker can be leader for some partitions, follower for others, coordinator for some consumer groups, and in combined KRaft mode also have the controller role.

18
Q

Are Kafka roles global or scoped?

A

Scoped. A broker is leader for specific partitions, coordinator for specific groups, and controller only if configured and currently active in the controller quorum.

19
Q

What is the difference between controller and partition leader?

A

The controller manages cluster metadata and leader elections. The partition leader handles actual reads, writes, and replication for one partition.

20
Q

What is the difference between controller and group coordinator?

A

The controller manages cluster-level metadata and leadership. The group coordinator manages consumer-group membership, rebalances, and offset commits.

21
Q

What is the difference between partition leader and group coordinator?

A

A partition leader serves actual topic data for one partition. A group coordinator manages group control functions such as join, heartbeat, rebalance, and commit.

22
Q

How does a producer find where to send records?

A

It first gets cluster metadata from a broker, learns which broker is leader for the target partition, then sends the record to that partition leader.
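
A minimal sketch of that lookup, assuming a made-up cluster snapshot. `CLUSTER_METADATA`, `fetch_metadata`, and `route` are hypothetical names, and the broker ids are invented for illustration.

```python
# Hypothetical cluster state: topic -> {partition: leader broker id}.
CLUSTER_METADATA = {"orders": {0: 101, 1: 102, 2: 103}}


def fetch_metadata(bootstrap_broker):
    # In this toy model every broker returns the same metadata snapshot.
    return CLUSTER_METADATA


def route(topic, partition, bootstrap_broker=101):
    # Step 1: ask a broker for metadata. Step 2: look up the partition leader.
    metadata = fetch_metadata(bootstrap_broker)
    return metadata[topic][partition]


print(route("orders", 1))  # 102
```

The record for partition 1 goes to broker 102 even though the metadata came from broker 101, which is the point of the card.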

23
Q

Does a producer send records to the controller?

A

No. A producer sends records to the leader of the target partition.

24
Q

How does a consumer get actual records?

A

It fetches records from the leader of its assigned partitions.

25
Does a consumer fetch records from the group coordinator?
No. It uses the group coordinator for membership and commits, but fetches actual records from partition leaders.
26
What does a consumer use the group coordinator for?
Joining the group, participating in rebalance, sending heartbeats, and committing offsets.
27
What is a bootstrap broker?
The initial broker a client contacts for metadata. It helps the client learn leaders, coordinators, and broker endpoints.
28
Can admin requests go to any broker?
Many metadata-changing admin requests can be sent to any broker, which forwards them to the active controller. This does not hold for all Kafka traffic: produce, fetch, and consumer-group requests must reach specific brokers.
29
Which Kafka traffic must go to specific brokers?
Produce and fetch requests must go to the correct partition leader. Consumer-group membership and offset-commit operations must go to the correct group coordinator.
30
What is the classic consumer-group rebalance flow?
Find coordinator, join group, one consumer becomes group leader, assignment is computed, assignments are distributed via SyncGroup, consumers fetch from partition leaders, and heartbeats keep them in the group.
31
In the classic group protocol, what is the group leader?
A temporary consumer-side role during rebalance. One consumer member helps compute partition assignments for the group.
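
To make the group leader's job concrete, here is a toy round-robin assignment. It is a sketch only: `assign_round_robin` is a hypothetical helper, real assignment strategies are configurable and more involved, and picking the first list entry as group leader is an assumption of this example.

```python
def assign_round_robin(members, partitions):
    """Spread partitions over members one at a time, in sorted order."""
    assignment = {member: [] for member in members}
    ordered = sorted(members)
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


members = ["consumer-a", "consumer-b"]
group_leader = members[0]  # assumption: this sketch just picks the first member

# The group leader computes the plan; the coordinator distributes it via SyncGroup.
plan = assign_round_robin(members, partitions=[0, 1, 2])
print(plan)  # {'consumer-a': [0, 2], 'consumer-b': [1]}
```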
32
What is the difference between group coordinator and group leader?
The group coordinator is a broker-side role managing the group protocol. The group leader is a temporary consumer member chosen during classic rebalance to help compute assignments.
33
What are the three different 'leader-like' concepts that are easy to confuse?
1. Active controller: metadata leader at cluster level. 2. Partition leader: data leader for one partition. 3. Group leader: temporary consumer member during classic rebalance.
34
What usually triggers a rebalance?
A new consumer joining, a consumer leaving, a consumer timing out, or any other change affecting partition ownership in the group, such as a change to the subscribed topics or their partition count.
35
What are heartbeats for?
They tell the group coordinator that the consumer is still alive as a member of the group.
36
What is `session.timeout.ms` about?
Liveness timeout. If heartbeats are not received within this time, the coordinator considers the consumer dead and triggers rebalance.
37
What is `max.poll.interval.ms` about?
Processing-progress timeout. If the consumer does not call `poll()` again within this interval, Kafka considers it stuck or too slow and rebalances the group.
38
What is the difference between heartbeat liveness and poll progress?
Heartbeats prove the consumer is alive to the coordinator. Poll progress proves it is still actively participating in the processing loop.
39
Can a consumer be alive but still lose its assignment?
Yes. If it does not call `poll()` again before `max.poll.interval.ms`, rebalance can happen even if the process is still running.
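
The two timeouts can be compared side by side in a toy classifier. `membership_status` is a hypothetical function, and the timeout values are arbitrary example numbers rather than recommended settings.

```python
def membership_status(now_ms, last_heartbeat_ms, last_poll_ms,
                      session_timeout_ms, max_poll_interval_ms):
    """Classify a consumer in a toy model of the two timeouts."""
    if now_ms - last_heartbeat_ms > session_timeout_ms:
        return "dead: no heartbeat within session.timeout.ms"
    if now_ms - last_poll_ms > max_poll_interval_ms:
        return "stuck: no poll() within max.poll.interval.ms"
    return "active"


timeouts = {"session_timeout_ms": 10_000, "max_poll_interval_ms": 60_000}

# Heartbeats are recent, but the last poll() was 70 seconds ago.
print(membership_status(100_000, 99_000, 30_000, **timeouts))
# stuck: no poll() within max.poll.interval.ms

# Heartbeats and polls are both recent.
print(membership_status(100_000, 99_000, 90_000, **timeouts))
# active
```

The first call is the "alive but still loses its assignment" case: liveness is fine, processing progress is not.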
40
What happens to partition ownership during rebalance?
Previously owned partitions can be revoked from one consumer and reassigned to another.
41
Why is revocation important for offset commits?
If a consumer loses a partition before committing its latest processed offset, the next consumer may resume from an older committed offset and reprocess messages.
42
What does a committed offset represent?
The next offset to read, not the last record already processed.
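
A tiny worked example of that rule, using hypothetical helper names and arbitrary offsets:

```python
def offset_to_commit(last_processed_offset):
    # Commit the position of the next record to read.
    return last_processed_offset + 1


def resume_position(committed_offset):
    # A consumer taking over the partition starts exactly at the committed offset.
    return committed_offset


committed = offset_to_commit(last_processed_offset=41)
print(committed)                   # 42
print(resume_position(committed))  # 42
```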
43
If a consumer committed offset 850, from which offset will the next consumer resume?
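Offset 850. The committed offset is the next offset to read, so the next consumer resumes exactly at the committed value.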
44
What is `commitSync()`?
A blocking offset commit. The consumer waits until the commit result is known.
45
What is `commitAsync()`?
A non-blocking offset commit. The consumer continues without waiting synchronously for success.
46
Why use `commitAsync()` during normal processing?
To avoid slowing the processing loop with blocking commits and improve throughput.
47
What is the risk of `commitAsync()` around rebalance?
A partition may be revoked before the latest processed offsets were durably committed, so another consumer may resume from an older offset and reprocess messages.
48
What is a common manual commit pattern?
Use `commitAsync()` during normal processing for throughput, and `commitSync()` in revocation or shutdown handling for the latest completed work.
49
Why use `commitSync()` in `onPartitionsRevoked()`?
To durably save the latest fully processed offsets for partitions about to be lost, reducing duplicate processing after reassignment.
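
The pattern can be sketched with a toy consumer. `ToyConsumer` is a hypothetical stand-in, not the Kafka client API, and it pessimistically assumes that no asynchronous commit has been confirmed by the time the partition is revoked.

```python
class ToyConsumer:
    """Toy stand-in for a consumer; not the Kafka client API."""

    def __init__(self):
        self.durable = {}    # offsets the coordinator has recorded
        self.in_flight = []  # async commits not yet acknowledged

    def commit_async(self, offsets):
        # Returns immediately; the commit may or may not land later.
        self.in_flight.append(dict(offsets))

    def commit_sync(self, offsets):
        # Blocks until the coordinator has recorded the offsets.
        self.durable.update(offsets)


def on_partitions_revoked(consumer, processed):
    # Last chance to record finished work before ownership moves.
    consumer.commit_sync(processed)


consumer = ToyConsumer()
processed = {}
for offset in range(3):
    processed["orders-0"] = offset + 1  # next offset to read
    consumer.commit_async(processed)    # cheap, non-blocking

print(consumer.durable)  # {} (nothing confirmed yet)
on_partitions_revoked(consumer, processed)
print(consumer.durable)  # {'orders-0': 3}
```

Without the synchronous commit in the revocation handler, the next owner of `orders-0` in this model would find no committed offset and would reprocess all three records.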
50
What is the difference between last processed offset and last committed offset?
Last processed offset is what the application handled in memory. Last committed offset is what the group coordinator durably recorded as progress.
51
Why can duplicate processing happen after rebalance or crash?
Because the next consumer resumes from the last committed offset, not from the last record processed only in memory.
52
What is KRaft?
Kafka’s modern metadata architecture in which Kafka manages its own metadata using a Raft-based controller quorum instead of ZooKeeper.
53
What is the difference between KRaft vs ZooKeeper and classic vs consumer protocol?
KRaft vs ZooKeeper is about cluster metadata architecture. Classic vs consumer protocol is about consumer-group rebalance protocol. They are separate concerns.
54
Which consumer-group protocol should be prioritized for CCDAK study?
The classic consumer-group protocol.
55
What election or selection mechanisms are important to know in Kafka?
Active controller election, partition leader election, and in the classic protocol the temporary selection of a consumer group leader during rebalance.
56
How is a new partition leader chosen after failure?
From eligible in-sync replicas. Only ISR members are eligible for safe leader election.
57
What is the preferred leader for a partition?
The first replica in the configured replica assignment list. Kafka may later move leadership back to it.
58
Is the new partition leader always just the first ISR entry?
No. The safe rule is that the new leader is elected from eligible ISR replicas. The preferred leader is separately the first replica in the assignment list.
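
One way to picture the difference between "elected from the ISR" and "preferred leader" is the simplified model below. `elect_leader` is a hypothetical function that walks the assignment list and takes the first replica that is both alive and in sync; it ignores unclean leader election and other broker-side details.

```python
def elect_leader(assignment, isr, live_brokers):
    """Pick the first assigned replica that is both alive and in the ISR."""
    for broker in assignment:
        if broker in isr and broker in live_brokers:
            return broker
    return None  # no safe candidate in this toy model


assignment = [1, 2, 3]  # preferred leader is assignment[0], broker 1

# Broker 1 has failed, so the next eligible replica takes over.
print(elect_leader(assignment, isr={2, 3}, live_brokers={2, 3}))        # 2

# Broker 1 is back and in sync, so it is the first eligible candidate.
print(elect_leader(assignment, isr={1, 2, 3}, live_brokers={1, 2, 3}))  # 1

# The only in-sync replica is down: no safe leader.
print(elect_leader(assignment, isr={1}, live_brokers={2, 3}))           # None
```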
59
What should be memorized about leader elections for the exam?
Know that controller leadership and partition leadership are different, partition leaders are elected from ISR, and the preferred leader is the first replica in the replica list.