This video provides a clear and essential breakdown of Kafka's core mechanics that many developers overlook in practice. It effectively demystifies the trade-offs between partitioning and rebalancing with practical, high-level precision.
Deep Dive
Kafka Interview Questions That Confuse Even Experienced Developers
Most of the time, Kafka interview questions are not difficult, but Kafka's internal behavior is so smart that it can easily confuse any developer, and trust me, it doesn't matter whether you are a fresher or an experienced developer. If your Kafka fundamentals are not clear, the interviewer can easily catch you with a few tricky questions. So in this video, let's discuss some very tricky Kafka interview questions with simple examples. After watching the complete video, rate yourself on how well you really understand Kafka internals, and drop a comment. All right, so without any further delay, let's get started.
So imagine we have a Kafka topic called order topic, and it has six partitions: P0, P1, P2, P3, P4, and P5. Now the first question from the interviewer: how many consumer instances are recommended here? Since your topic has six partitions, how many consumer instances do you need to configure?
Before I answer this, I would request you to open your notes and write your answer down, so that you can validate at the end how strong you are in Kafka internals. Since we have six partitions, the recommendation is to use six consumer instances. Why six? Because Kafka follows the rule: one partition, one active consumer. Like this: P0 will point to C1.
P1 will point to C2, and so on. Each partition is assigned to its own consumer instance. But make sure they are all part of a single consumer group. Let's say "order consumer product", something like that. So the answer here is that it's recommended to keep six consumers, matching the partition count, so that you can achieve the maximum parallelism. Now let's say you have these six partitions but you define 10 consumer instances. Will it improve the performance? Not at all.
Because Kafka follows the rule of one active consumer per partition within a group. Fine. Now let's move to the next question. What happens if you have only two consumer instances?
If you have six, then each partition is assigned to its own consumer instance.
But if you have only two consumers, then what will happen? Let me copy this image so that I can explain it to you.
Now, what happens if you have only two consumer instances? Let me remove the others:
C3, C4, C5, and C6.
Now in this case we have two consumer instances and six partitions, and Kafka always distributes partitions as evenly as possible.
So P0, P1, and P2 will be consumed by C1, and P3, P4, and P5 will be consumed by C2.
Since six divides evenly by two, it will distribute three partitions to each consumer instance. You can see this, right? Each consumer instance points to three partitions. Kafka just works out an even split and distributes them equally.
Now again the interviewer will try to confuse you: what if I have four consumer instances? What will you answer? With two consumers it was divided equally, three partitions each, but with four consumer instances, how will it work? The answer is that it doesn't matter how many consumer instances you have; Kafka will always try to balance the partitions based on the consumer instance count. Here C1 can point to P0 and P1, C2 can point to P2 and P3, C3 can point to P4, and C4 can point to P5. Or any other consumer instance can be the one that points to two partitions. So it could also look something like this:
C1 points to two partitions, C2 to a single partition, C3 to a single partition, and C4 to two partitions. It depends. It could equally be C3 that points to two partitions and C4 that points to a single one. Kafka internally will try to balance the mapping between partitions and consumer instances.
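The even split described above can be sketched in a few lines of Python. This is only an illustration, assuming a simple range-style split where the first consumers in the list receive the leftover partitions; the function name `assign_partitions` is hypothetical, and the real outcome depends on the assignor the consumer group is configured with.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers as evenly as possible (range-style).

    Illustrative only: real Kafka delegates this to a pluggable assignor.
    """
    assignment = {c: [] for c in consumers}
    if not consumers:
        return assignment
    # Every consumer gets `base` partitions; the first `extra` consumers get one more.
    base, extra = divmod(len(partitions), len(consumers))
    start = 0
    for i, consumer in enumerate(consumers):
        count = base + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


partitions = [f"P{i}" for i in range(6)]
two = assign_partitions(partitions, ["C1", "C2"])
four = assign_partitions(partitions, ["C1", "C2", "C3", "C4"])
print(two)
print(four)
```

With two consumers each one ends up with three partitions; with four consumers two of them carry two partitions and two carry one, which is one of the valid layouts mentioned above.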
So let's move to the next question. Now you might have a question: "Hey Vasan, how do you know C1 will point to two partitions and C4 will point to two partitions? It could be C2 and C3 that point to two partitions while C1 and C4 point to a single one. How do you know? What principle does Kafka follow internally to distribute partitions to different consumers?" So what happens is this: Kafka follows partition-based parallelism, and internally it uses consumer group rebalancing. Whenever a consumer joins, leaves, or crashes, Kafka redistributes the partitions again. So always remember the rule of thumb: one partition, one active consumer inside the same consumer group. Now let's move to the next question. What happens if we have eight consumers? If you have six consumers, each will map to one partition, right?
What happens if we have eight consumer instances? Let's say we have something like this.
Now in this case, what will happen? Your topic has six partitions, but you have eight consumer instances.
Only six consumers can actively consume, and the remaining two will sit idle. For example, it will be something like this.
Now if you see, there are two consumers, C5 and C6. These two consumer instances will stay idle, because six partitions can map to only six consumer instances. If you keep adding more consumer instances beyond that, it doesn't make any sense. It will never improve performance. Remember that, because people always assume, "If I spin up more instances, I'll get a better performance benefit." But that is not true. Okay, fine. Now let's move to the next question. What if one active consumer in this group dies? Let's say C3 dies.
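A minimal sketch of the "extra consumers sit idle" rule, assuming for illustration that the first consumers in the list are the ones that receive partitions. Which members actually end up idle is decided by the assignor, so the concrete names below need not match the diagram; `active_and_idle` is a hypothetical helper.

```python
def active_and_idle(num_partitions, consumers):
    """Split a group into consumers that own a partition and consumers that do not.

    At most `num_partitions` members of one group can be active at a time.
    """
    active = consumers[:num_partitions]
    idle = consumers[num_partitions:]
    return active, idle


consumers = [f"C{i}" for i in range(1, 9)]  # eight consumers in one group
active, idle = active_and_idle(6, consumers)
print(len(active), "active;", len(idle), "idle:", idle)
```

Adding a ninth or tenth consumer only makes the idle list longer; the number of active consumers stays capped at the partition count.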
Now what will happen in that case? Once Kafka detects that the consumer is gone, it triggers rebalancing. What happens? There are two idle consumers, right, which are not doing anything. One of them becomes active, and P2 will point to that consumer instance. Kafka does the rebalancing so that your partition is assigned to an available consumer instance.
In our case, it is assigned to C5.
Right? So this C5 instance was idle before, and it acted as a standby member of my consumer group, which makes the rebalancing easy.
Remember this term. If something goes wrong in a consumer instance, if one consumer dies or crashes, the process Kafka executes internally is called rebalancing, or you can say consumer rebalancing. Now think and answer me: does this rebalance happen only for the affected partition? Because in our case the C3 consumer instance died, and earlier C3 was pointing to partition P2.
Correct. So does this rebalance happen only for this specific consumer instance, or for all the consumer instances present inside this consumer group? This should be consumer instance.
In traditional Kafka, rebalancing happens at the whole consumer group level, not at the individual consumer instance level. What actually happens when one consumer dies, let's say C3 in our case? All consumers pause temporarily: C1, C2, C4, C5, all the ones that are active and pointing to a partition. Then Kafka recalculates the assignment. It checks how many consumer instances are available and does the recalculation. The entire group participates in the rebalance, not only the single consumer instance. Trust me, this is one of the most confusing questions if you don't understand this concept.
Fine. Now let's move to the next question. In the previous use case we had two backup idle consumer instances.
That is the reason why, when one died, Kafka assigned its partition to an available one. But what if we have only six consumer instances?
Now the question is: we have exactly six consumer instances, and let's say one dies. What will happen? Earlier we had backups, right, C7 and C8, so Kafka assigned the partition to another consumer instance. But if you have exactly as many consumer instances as partitions, then what will happen? In that case Kafka pauses the whole group and redistributes again. So for example, here C3 is dead, and P2 can now be consumed by consumer instance C1.
Kafka doesn't simply move only one partition. Instead, the entire assignment may change. Partition P2 can be picked up by C1, or by any other consumer that is available, until another instance spins up and joins this consumer group. But do we see any problem in this approach?
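The "pause everyone, then recompute" behavior can be sketched as follows. This is a toy model, assuming a round-robin assignor invented for the example; the helper names `assign`, `owners`, and `eager_rebalance` are hypothetical, and a real assignor may shuffle fewer partitions than this one does.

```python
def assign(partitions, consumers):
    """Round-robin partitions over the live consumers (illustrative only)."""
    result = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        result[consumers[i % len(consumers)]].append(partition)
    return result


def owners(assignment):
    """Invert {consumer: [partitions]} into {partition: consumer}."""
    return {p: c for c, ps in assignment.items() for p in ps}


def eager_rebalance(partitions, consumers, dead):
    before = assign(partitions, consumers)
    survivors = [c for c in consumers if c != dead]
    # Eager protocol: every surviving member gives up all of its partitions...
    revoked = {c: list(before[c]) for c in survivors}
    # ...and the whole assignment is computed again from scratch.
    after = assign(partitions, survivors)
    return before, revoked, after


partitions = [f"P{i}" for i in range(6)]
consumers = [f"C{i}" for i in range(1, 7)]  # exactly six consumers, no standby
before, revoked, after = eager_rebalance(partitions, consumers, "C3")
moved = sorted(p for p, c in owners(after).items() if owners(before)[p] != c)
print(after)
print("partitions that changed owner:", moved)
```

Only P2 lost its consumer, yet in this toy run four partitions change owner and all five survivors stop consuming while the new assignment is computed.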
My consumer C3 died. Why should I do the rebalancing at the whole consumer group level? Why should I expect some downtime here? If consumer C3 is down, let C1, C2, and the other available consumer instances keep processing records. Why should everything stop while it takes time to rebalance?
That is one problem in traditional Kafka. But modern Kafka introduced something called cooperative rebalancing. What does it do? It only touches the affected partitions and moves them gradually, instead of revoking everything from every consumer.
This reduces the downtime.
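The idea behind cooperative rebalancing can be sketched like this: survivors keep what they already own, and only the orphaned partitions are handed out. This is a simplified model, not the actual protocol (which negotiates the moves over more than one rebalance round); `cooperative_rebalance` is a hypothetical name. With the Java client, this behavior is typically selected by setting `partition.assignment.strategy` to `CooperativeStickyAssignor`.

```python
def cooperative_rebalance(assignment, dead):
    """Keep survivors' partitions; give only the orphaned ones to the least-loaded member."""
    after = {c: list(ps) for c, ps in assignment.items() if c != dead}
    moved = []
    for partition in assignment.get(dead, []):
        # Pick the surviving consumer that currently owns the fewest partitions.
        target = min(after, key=lambda c: len(after[c]))
        after[target].append(partition)
        moved.append(partition)
    return after, moved


before = {
    "C1": ["P0"], "C2": ["P1"], "C3": ["P2"],
    "C4": ["P3"], "C5": ["P4"], "C6": ["P5"],
}
after, moved = cooperative_rebalance(before, "C3")
print(after)
print("partitions that moved:", moved)
```

Here only P2 changes owner, and C2, C4, C5, and C6 are never interrupted.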
This is really interesting, right? I don't have much idea about it yet. I found it on the internet. But I will definitely explore cooperative rebalancing, and if I find a good resource, I'll try to make a video on it. Okay, fine. Now let's move to the next question: how many copies of a Kafka message are stored in the Kafka ecosystem?
Now let's say I publish a message to this topic.
Let's say "hello world". My question is: how many copies of this "hello world" message will be stored in the Kafka infrastructure?
How many copies of this particular text?
Can anyone try to answer before I respond? It depends on your replication factor.
Let's say I have a replication factor of three. It means three copies of your message will be stored in Kafka: one leader copy and two follower copies.
It depends on the replication factor you define while creating the topic. Based on that, Kafka keeps that many copies of your events or messages. Now you might have a question: "Hey Bashan, what is this leader and follower?" In Kafka we create replicas to achieve fault tolerance. The leader copy handles all the reads and writes by default. The follower copies hold the replicated data, which provides the fault tolerance. In case something goes wrong with the leader, one of the followers takes over as the new leader, so the messages are still available. That is the reason we have a leader and followers.
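A small sketch of the leader and follower idea for one partition, assuming a replication factor of three. The class `PartitionReplicas` and the broker names are hypothetical, and the promotion rule is simplified: real Kafka elects the new leader from the in-sync replicas.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionReplicas:
    """Replica placement for a single partition (illustrative model)."""
    leader: str
    followers: list = field(default_factory=list)

    def copies(self):
        # Number of copies of each message = leader + followers = replication factor.
        return 1 + len(self.followers)

    def fail_leader(self):
        # Simplified failover: promote the first follower to leader.
        if not self.followers:
            raise RuntimeError("no follower left to promote")
        self.leader = self.followers.pop(0)


p0 = PartitionReplicas(leader="broker-1", followers=["broker-2", "broker-3"])
copies_before = p0.copies()
p0.fail_leader()
print(copies_before, "copies; new leader:", p0.leader, "; copies left:", p0.copies())
```

After the leader fails, the message is still served from a surviving replica, which is the fault tolerance described above.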
Now the next question is a really interesting one: how does Kafka guarantee ordering? You need to understand this very carefully. Imagine the topic has only a single partition. Any event I publish, let's say order created, payment done, order shipped, will go to this one partition. Since Kafka stores messages sequentially using offsets, the consumer will read them in the same order in which they were stored in the topic. First order created, and the consumer receives order created. Then payment done, and the consumer receives payment done. Then order shipped, and the consumer receives order shipped. With a single partition, this works and the ordering is maintained. Now let's say this topic has two partitions, P0 and P1.
If the topic has multiple partitions, the messages can be distributed across different partitions. For example, order created goes to P0.
Order shipped goes to P1, and payment done goes to P1 as well.
Now in that case, what will happen?
What will consumers C1 and C2 do?
They will read their assigned partitions, P0 and P1, independently and in parallel. So the application may see C1 handle order created first, or C2 might get its messages, order shipped and payment done, first. There is no overall order. C1 may run faster or C2 may run faster; it depends on how they execute in parallel. That is the reason Kafka does not guarantee which partition finishes first: the messages are spread across two partitions, P0 and P1, and the consumer instances process them in parallel. So in Kafka, global ordering is not guaranteed across multiple partitions.
If you have multiple partitions and you are just sending events without any key, Kafka will not guarantee ordering. Then you might have a question: "Hey Basan, then how do companies maintain ordering?"
There is a smart trick everyone uses with Kafka. They use the same partition key for related events. When I say partition key, I mean something like an order ID or customer ID; any unique ID can be the key. Let's say 101.
This is my key. Now what happens when you define a key, let's say order ID 101? Kafka's hashing ensures that all events for this specific order ID go to the same partition, as long as the number of partitions does not change.
All events related to order 101 will always go to the same partition. It could be P0 or P1, let's assume P1. Once 101 starts going to P1, no matter how many events you send with this key, they will always go to P1 only. And if a specific order ID always goes to the same partition, Kafka can maintain the sequence for that order, because everything is inside one partition, right? That is the smart approach the industry follows: define a key when working with Kafka. Remember, this is where the interviewer will try to confuse you, but you always respond: Kafka guarantees ordering only within a partition, not across the entire topic.
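The key-to-partition idea can be sketched as a hash of the key modulo the partition count. This is an illustration only: `zlib.crc32` is used here as a stand-in hash (the Java client's default partitioner uses murmur2 on the key bytes), and `partition_for` plus the sample events are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition: same key, same partition (stand-in hash, not murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("101", "order created"),
    ("102", "order created"),
    ("101", "payment done"),
    ("101", "order shipped"),
]

# Append each event to the log of the partition its key hashes to.
log = {p: [] for p in range(2)}
for key, value in events:
    log[partition_for(key, 2)].append((key, value))

target = partition_for("101", 2)
order_101 = [value for key, value in log[target] if key == "101"]
print("order 101 lives in partition", target, "with events", order_101)
```

Every event for order 101 lands in one partition in the order it was sent, regardless of where order 102 ends up. Changing `num_partitions` can change the result of the modulo, which is why the mapping is only stable while the partition count stays the same.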
This is what you need to respond.
So I hope you now have a little more confidence in Kafka concepts, because just knowing the producer and consumer is not enough to say you know Kafka as a backend developer. These fundamentals are extremely important, and honestly, in this video we just scratched the surface of Kafka. There are still a lot of interesting and tricky concepts to explore more deeply. So if you are interested, let me know in the comments and we can continue this Kafka interview series. Or if you have any specific Kafka scenarios or confusion, drop them in a comment and we'll definitely discuss those as well. If you found this video helpful, make sure to like, subscribe, and share it with your friends. I'll see you in the next video.