References:
- Jay Kreps, Neha Narkhede, Jun Rao. "Kafka: a Distributed Messaging System for Log Processing." NetDB 2011.
This article summarizes my personal understanding after watching the aforementioned video and reading the Kafka paper. I am still a student, so any inaccuracies or misunderstandings are unintentional—corrections are welcome.
Kafka is a widely adopted distributed streaming platform. In this article, I will explore the key trade-offs Kafka makes in order to achieve its design goal of efficient log processing, and how these decisions contribute to Kafka’s architecture and performance.
Kafka’s Design Objective
Kafka was originally designed to enable fast, real-time processing of log messages—a need driven by modern social media applications that require low-latency log handling. Logs are no longer merely passive records for offline analysis; they now directly influence user-facing features, such as personalized recommendations based on recent user interactions.
This context implies several characteristics:
- High volume of data with small individual message sizes
- Non-critical nature of the data (some message loss or duplication is acceptable)
- High timeliness requirements (second-level latency)
These traits lead to several implications:
- The overhead per message (e.g., header size) must be minimized
- Per-message transmission must be efficient (transmission time ≪ TCP handshake time), which favors long-lived connections and batching of small messages
- Low requirements for consistency, allowing architectural simplifications
- High throughput is essential
Kafka Architecture Overview
Kafka is composed of several core components: producers, brokers, and consumers. Messages are the basic unit of data in Kafka—a message is simply a byte array with an optional key and metadata.
- Producer: Responsible for publishing data (messages) to Kafka topics. It determines which topic (and possibly which partition) the data should go to.
- Broker: A Kafka server that stores published messages and serves them to consumers. It handles disk storage and responds to consumer fetch requests.
- Consumer: Subscribes to topics and reads messages from the broker at its own pace.
- Topic: A category or feed name to which messages are published. Each topic is split into partitions for parallelism and scalability.
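To make these roles concrete, here is a minimal producer sketch using Kafka's Java client. The broker address, topic name, and key/value payload are illustrative assumptions, not anything prescribed by Kafka itself:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The key is optional; when present it determines the partition,
            // so records with the same key land in the same partition.
            producer.send(new ProducerRecord<>("user-activity", "user-42", "clicked:home"));
        }
    }
}
```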
Trade-off 1: No Application-Level Cache
Kafka brokers do not implement their own in-memory caching. Instead, Kafka fully relies on the operating system’s file system page cache to buffer disk reads and writes.
This is feasible due to the following reasons:
- In many use cases, consumers lag only slightly behind producers, so reads mostly hit data that was just written and is still resident in the page cache; the access pattern is sequential, which OS read-ahead handles well.
- Kafka brokers are responsible only for message storage and delivery, not processing, so the bytes kept in the file system are exactly the bytes served to consumers; there is no transformed in-memory representation worth caching separately.
Advantages of this design include:
- Reduced memory management overhead within Kafka (no manual memory allocation or GC pressure)
- Cache contents that survive broker restarts (the page cache belongs to the OS, not the broker process)
- Elimination of double caching (application-level + file system-level)
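As a loose sketch of what relying on the page cache means at the file level (this is not Kafka's actual storage code), an append-only write can simply go through a FileChannel and leave flushing to the OS; the file path and payload below are assumptions:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PageCacheAppend {
    public static void main(String[] args) throws IOException {
        Path log = Path.of("/tmp/segment-0.log"); // hypothetical log segment file
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer msg = ByteBuffer.wrap("event-payload".getBytes(StandardCharsets.UTF_8));
            ch.write(msg); // the write lands in the OS page cache first
            // Deliberately no ch.force(true): deciding when dirty pages
            // reach disk is left entirely to the operating system.
        }
    }
}
```

The absence of an explicit flush is the point of the sketch: the broker trades immediate durability for throughput and lets the OS schedule writeback.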
Trade-off 2: Stateless Broker
Kafka brokers maintain minimal state, allowing them to remain lightweight and easy to scale. Their core responsibilities are limited to:
- Serving messages based on a given topic and offset
- Deleting messages based on a time-based retention policy
Let’s examine several important architectural choices that support this stateless design.
Pull-Based Consumption
Kafka adopts a pull model rather than pushing messages from broker to consumer. Consumers explicitly request messages from a given topic and offset.
This design has several advantages:
- Consumers control their own pace of consumption, reducing the risk of being overwhelmed by incoming data.
- Brokers are relieved from having to track consumer state or implement flow control.
- Consumers can retry or re-read messages by requesting the same offset again, which is essential for fault tolerance.
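To make the pull model concrete, here is a minimal fetch loop with the Java client. The consumer names an explicit partition and offset and polls at its own pace; the broker address, topic, and starting offset are illustrative assumptions:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PullConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("user-activity", 0);
            consumer.assign(List.of(tp));
            consumer.seek(tp, 0L); // start (or re-read) from an explicit offset
            while (true) {
                // The consumer, not the broker, decides when to ask for more data.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("offset=%d value=%s%n", r.offset(), r.value());
                }
            }
        }
    }
}
```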
Time-Based Message Retention
Unlike traditional message queues that delete a message only after it’s acknowledged by a consumer, Kafka retains all messages for a configurable time period, regardless of whether they are consumed.
This strategy has two key benefits:
- Brokers remain stateless with respect to consumer progress, eliminating the need to track acknowledgments.
- Consumers can recover from failures by simply re-reading messages from the last known offset.
Given that Kafka was designed for log data—which is typically time-sensitive and often processed shortly after ingestion—discarding old messages based on time is both practical and efficient.
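In current Kafka distributions this policy surfaces as broker configuration. A sketch of the relevant settings (values are illustrative; 168 hours is a common one-week retention):

```properties
# server.properties (broker side), illustrative values
# Delete log segments older than 7 days, whether or not they were consumed:
log.retention.hours=168
# Roll to a new segment file once the current one reaches ~1 GiB:
log.segment.bytes=1073741824
```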
Partition-Based Distribution
To support multiple consumers reading the same topic in parallel without duplicating effort, Kafka divides each topic into partitions.
- Each partition is consumed by only one consumer in a consumer group.
- Kafka does not require brokers to manage consumer-partition assignments directly.
- Instead, consumers coordinate among themselves to claim responsibility for partitions (details in next section).
This design enables parallelism, fault isolation, and even load distribution, all while preserving Kafka’s stateless broker model.
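A topic's partition count is fixed when the topic is created, which is what bounds the parallelism of a consumer group. A sketch with the Java AdminClient, where the topic name and counts are assumptions:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        try (AdminClient admin = AdminClient.create(props)) {
            // 8 partitions -> up to 8 consumers in one group can read in parallel;
            // replication factor 1 keeps the sketch single-broker.
            admin.createTopics(List.of(new NewTopic("user-activity", 8, (short) 1)))
                 .all().get(); // wait for the broker to acknowledge
        }
    }
}
```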
Trade-off 3: No Master Node
Kafka avoids introducing a central coordinator or master node to manage consumer coordination. Instead, it leverages Apache ZooKeeper for managing consumer group metadata.
A consumer group is a set of consumers that collaboratively consume messages from a topic. Kafka guarantees that each partition is consumed by at most one consumer in a group at a time. Unless stated otherwise, the consumers discussed below are assumed to belong to the same consumer group.
Here’s how this decentralized assignment works:
- Brokers publish topic and partition metadata to ZooKeeper.
- Each consumer registers its presence and subscribes to topics via ZooKeeper.
- Consumers exchange information (via ZooKeeper) to determine who is responsible for which partitions.
Typically, every consumer runs the same deterministic partition assignment strategy (such as range assignment), so each one can independently compute the full consumer-to-partition mapping and arrive at the same result, as sketched below.
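As a rough illustration of a range-style strategy (simplified, not Kafka's actual assignor code): each consumer sorts the group membership the same way and computes the identical mapping locally.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeAssignor {
    /**
     * Simplified range assignment: partitions and consumer IDs are sorted
     * identically on every consumer, so each one can compute the full
     * mapping independently and keep only its own slice.
     */
    static List<Integer> partitionsFor(String consumerId, List<String> sortedConsumers, int numPartitions) {
        int i = sortedConsumers.indexOf(consumerId);
        int per = numPartitions / sortedConsumers.size();
        int extra = numPartitions % sortedConsumers.size();
        int start = i * per + Math.min(i, extra);
        int count = per + (i < extra ? 1 : 0);
        List<Integer> mine = new ArrayList<>();
        for (int p = start; p < start + count; p++) mine.add(p);
        return mine;
    }

    public static void main(String[] args) {
        List<String> consumers = List.of("c0", "c1", "c2"); // already sorted
        // With 8 partitions: c0 -> [0, 1, 2], c1 -> [3, 4, 5], c2 -> [6, 7]
        for (String c : consumers) {
            System.out.println(c + " -> " + partitionsFor(c, consumers, 8));
        }
    }
}
```

Because the computation is deterministic and every consumer sees the same inputs from ZooKeeper, no central node has to hand out assignments.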
Offset Tracking and Rebalancing
Each consumer commits its current offset (the last-read message) to ZooKeeper. However, there is often a delay between reading a message and committing its offset.
If a consumer crashes before committing, the next assigned consumer may reprocess messages from the last committed offset, potentially resulting in duplicate processing.
This is acceptable within Kafka’s target use cases (log data), where at-least-once delivery semantics are often sufficient, and occasional duplicate messages are tolerable.
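The pattern is easy to see in a manual-commit consumer loop: offsets are committed only after processing, so a crash between the two steps causes reprocessing, not loss. This sketch uses the modern Java client, which commits offsets to Kafka itself rather than ZooKeeper, but the timing gap is the same; the topic, group ID, and process() method are illustrative:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtLeastOnceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "log-processors");
        props.put("enable.auto.commit", "false"); // commit only after processing
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("user-activity"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    process(r); // hypothetical application logic
                }
                // A crash between process() and commitSync() means these records
                // are re-delivered to the next consumer: at-least-once semantics.
                consumer.commitSync();
            }
        }
    }

    static void process(ConsumerRecord<String, String> r) {
        System.out.println(r.value());
    }
}
```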
Conclusion
Kafka’s architecture is a masterclass in fit-for-purpose system design. By making specific trade-offs for log-based streaming—such as tolerating message duplication, relying on time-based retention, and offloading cache and state management—Kafka achieves a system that is both elegant and highly performant.
Many of Kafka’s design choices reinforce each other:
- Pull-based consumption works well with persisted logs
- Persisted logs enable rewind and consumer-controlled state
Kafka demonstrates how focusing on a well-defined use case can enable a highly optimized and streamlined distributed system design.