This video demonstrates a senior system design interview where a candidate designs an e-commerce platform with three core features: keyword search, cart management with checkout, and price change notifications. The candidate implements a cart reservation system using Redis with a 10-minute TTL to prevent overselling, handles concurrent access through optimistic locking, and uses Change Data Capture (CDC) with queue-based workers to notify subscribers of price changes. The design addresses scalability through queue-based notification processing, handles edge cases like popular items with many subscribers, and discusses trade-offs between different approaches including Redis vs. database-level locking and Server-Sent Events vs. WebSockets for real-time notifications.
Deep Dive
Senior System Design Mock Interview: Design an e-commerce platform
So um let's design an e-commerce website. What I have in mind and um the main three features I want to do is ability to search based on keywords.
>> Okay.
>> And ability to add items to the cart and check out with it.
>> But at the time of adding items to the cart, we cannot oversell. If there is an item whose stock is close to zero, >> yeah, >> only one person can add it. We cannot allow two people to add it to the cart.
>> Okay. And then the last one is price changes. Similar to Airbnb and other companies, some people will subscribe to price changes, and I would like the system to notify them when the price changes.
>> Oh okay. Okay. So let me write some of this down. So we have an e-commerce site.
We want to search items, right? You said that first.
>> You said we also wanted people to add items to cart only if there is stock, right?
>> Uhhuh. And the last thing you said is uh ability to track prices and have notifications.
>> Yeah. And the way this would work is people will subscribe for certain products.
>> Yep. Yep. Okay.
>> And then Yeah. Once the price drops.
Yeah. And then the last functionality is checking out with whatever is in the cart.
But let's treat that as optional, if we really master the other three. It depends how it goes. I just want to add that, but it's not that critical.
>> Okay. Absolutely. So I see that um yeah I see that this question touches a lot of stuff.
>> All right.
Yeah. Yeah. I just want to really practice more with you. We'll see how >> of course. Okay. So these are the main four things, right? Nothing else. No returns or cancellations.
One thing I want to ask about the cart only if there's stock. Will the cart have a time to live?
>> Yeah. And let me just ask you: what would you do here? What would be your recommendation?
>> Okay. So if I think about it like a Ticketmaster booking: when I make a reservation, they say you have 10 minutes to check out.
>> Yeah. So that's my initial gut instinct, a 10-minute TTL, because we don't want someone to add something to a cart and hold it indefinitely so that no one else can buy it. Right.
>> Exactly.
>> I'm just going to say 10-minute TTL.
>> Exactly.
>> But make sure we can tune that.
>> Okay. Nonfunctional.
Okay. Nonfunctional. So search of items.
We want the search to be low latency and available.
It doesn't have to be exact about whether an item is in stock or not. If it shows up, they click through, and it turns out to be out of stock:
oh, crap. But we want search to be available >> people to add on. So for checkout and cart, we want this to be consistent because we're dealing with a stock amount. We have a real-world asset behind it. We want people to be able to check out only if there's actually inventory left.
And then notifications.
We also want this to be available and relatively low latency when sending out the notification.
People are subscribing to this, and they want to hop on it quickly.
So we want them to get the notification quickly: we send it out and they can make their purchases.
But that doesn't have to be as robust as a financial transaction. We do want checkout and cart to be consistent because those are financial and have real-world assets behind them.
Okay, cool. I think how do you feel about these nonfunction requirements?
>> This is solid. Let's also just add fault tolerance. Of course, fault tolerant, fault tolerant.
>> If we get to check out, we can practice that on the check out.
>> Yes.
>> More specifically, >> right? Fault tolerant on the process of purchase, the stages of purchasing. Cool. Okay. So, let's define a quick API for each thing. For search items, let's define this as a store. Let's call it my store >> just to practice through a real-life example, we're going to try to do everything within 40 to 45 minutes. Okay.
>> Yes. 40 minutes. Okay. So quick >> I have I have all the time. I'm not gonna rush us out but I just want to have like a real time practice with you.
>> Okay. Virtual store search and then when we talk our search we can talk about like the keyword equals word. So that could be the API for search.
Uh pretty straightforward.
And then adding items. It could be a POST to store add-to-cart, with an item ID, and we also probably need a user ID.
Then we also might need a cart ID. Maybe yes, maybe no. I don't know yet.
And then checkout: I think that's a POST to virtual store checkout, and that could take a cart ID. Notification could be a PUT or a POST, it doesn't really matter at this point. That can be subscribe, and that takes a user ID and also an item. You subscribe to a specific item, I'm guessing, right? Okay. Checkout, add to cart, search of items, and then subscription. Okay, I think that's enough for the API. I think they're pretty fine. Okay, let's talk about entities. We obviously have an items table with an item ID.
Um, I'm just going to assume all items are deliverable everywhere the users can do. So, you don't have to worry about this place. Okay, item ID. We obviously need a quantity.
Quantity. Now, I'm getting a bit ahead of myself, but I might do an optimistic locking mechanism here. So we're going to have a version ID, depending on what the scale is, which we haven't talked about yet. That's the item. And then we have a subscriptions table.
This will be about a user ID and an item ID. This probably has to be normalized. I'm just guessing.
And then we also need a cart.
This will have a user ID and a cart ID, and then we might need another table, cart items, which would have a cart ID and an item ID. Okay, so I think those are my entities right off the top of my head. I think we should go to the high-level design. I will be adding to this as we get into the database schema and I come up with things we need to do. So we're at seven minutes. So we have like 35 minutes.
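The entities named so far can be sketched as plain Python dataclasses. This is only an illustration: the field names and types are assumptions, since the interview names the tables and a handful of columns but no concrete schema.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    quantity: int
    version: int = 0  # bumped on every write, used for optimistic locking


@dataclass(frozen=True)
class Subscription:
    user_id: str
    item_id: str


@dataclass
class Cart:
    cart_id: str
    user_id: str


@dataclass
class CartItem:
    cart_id: str
    item_id: str
    quantity: int = 1  # hypothetical default: one unit per add-to-cart
```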
>> You have a lot of time. Don't don't rush. Okay.
>> Okay. Sounds good. Okay. So I'm going to start out with like a simple design and then expand from there as to what we do.
Oh, I forgot to add notification here too. Oh no, subscription right there. So we have our user.
Now this is interesting. I usually say, oh, we have a dumb load balancer connected to a service, but we might actually need a little bit more intelligence here, because we have a cart and the cart does have a sense of state. So we'll connect to an API gateway, and it could potentially be a WebSocket connection. I'm not super sure if we need it yet.
And this will communicate with our service, our >> I want to rule this out now. I usually let you decide on this later, or set everything up first, but I want to rule this out.
>> Why would you use a websocket?
>> Why?
>> So because the idea of a cart has a TTL and we need to maintain a connection during the checkout phase.
>> I'm just I'm thinking we might need that. We might be able to reconstruct it from the database and keep it stateless.
But I guess it depends how I designed it. So >> yeah, I think I mean it if users put something on the cart and go use another device or close >> catch it, right?
>> Yeah.
>> And there is very little communication between the user and the cart. So even, like you say, they will add something to the cart, maybe update the cart, but it's not like every second there will be communication back and forth >> then you're right, I think a WebSocket will be overkill >> okay, okay, let's rule this out then >> sounds good, thank you. Okay, so the connection to the virtual store for search is going to look at, I'm guessing, Elasticsearch, and I don't think that has to be too complicated. It can just be the keywords they're looking for, and then the documents here would be the item IDs.
>> So Elasticsearch for search, and then now for actual carts and items and stuff.
It will go to a database here, some type of database solution.
Okay. So what would it look like? Let's say they get the item ID and they want to know if it's in stock. They go to the database, get the item, get the quantity, and see if it is available. We also probably want some metadata in here, like descriptions, and S3 or object store links for images.
So we can get all that information, and let's say the user wants to start a cart with that TTL of 10 minutes that I mentioned earlier. What I would suggest is that, besides the database, we have an in-memory cart state, an in-memory soft purchase.
So let's see how that will look like. So first things first before the in-memory thing what we have to do is create a new cart.
Uh this will go to the database. So let me write this down.
First things first, create a new cart.
This will add the cart to carts table.
The user now has a cart.
Now we have a cart. That's in the database, and there's nothing inside the cart items. Now I want to purchase something. So what we want is a soft purchase: not necessarily decrementing the quantity, but making a kind of reservation. So in this in-memory soft purchase store, we can have a key of user ID plus item ID (I don't think we need the cart ID in the key), and this entry has a TTL of ten minutes.
So let's talk about what happens. And then inside here we also have another KV table, which would be item ID to a quantity.
Okay. So let's go through it. First, a user says, I want to buy a 12-pack of Coke. I get the item ID from the items table (that can be cached too, as a caching optimization) and see there's a quantity, there's availability. I create a cart, so now the cart exists. Then I want to add the item to my cart, and it should only be added if there's quantity left. Since I'm the first person, I populate this quantity field, decrement it by one, and create the record saying that I, as a user, am reserving this item ID with a TTL of 10 minutes. If the TTL expires, we increase the quantity back in this other in-memory table. Now let's say that was the last piece. I check the database: quantity one. I add the item ID with a quantity of one, I add myself, and I decrement the quantity to zero. Let's say someone else also wants to buy this item. They check the quantity in the database and it's there. They add it to the cart, but then it checks this store and sees that the quantity for the item ID is now zero. It will say try again later.
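The reservation flow just described can be sketched with a small in-memory class standing in for the Redis store. Everything here is an assumption made for illustration: the class and method names are hypothetical, expiry is checked lazily on each call rather than by Redis itself, and a real Redis deployment would need the check-and-decrement to be atomic (for example inside a Lua script).

```python
import time
from typing import Callable, Dict, Tuple


class SoftReservations:
    """In-memory stand-in for the reservation store (not a Redis client)."""

    def __init__(self, ttl_seconds: float = 600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self.available: Dict[str, int] = {}             # item_id -> unreserved units
        self.holds: Dict[Tuple[str, str], float] = {}   # (user_id, item_id) -> expiry

    def _release_expired(self) -> None:
        now = self.clock()
        for (user_id, item_id), expires_at in list(self.holds.items()):
            if expires_at <= now:
                del self.holds[(user_id, item_id)]
                self.available[item_id] += 1  # give the unit back

    def reserve(self, user_id: str, item_id: str, db_quantity: int) -> bool:
        self._release_expired()
        # The first request for an item seeds the counter from the database.
        self.available.setdefault(item_id, db_quantity)
        if (user_id, item_id) in self.holds:
            return True  # this user already holds a unit
        if self.available[item_id] <= 0:
            return False  # caller should answer "try again later"
        self.available[item_id] -= 1
        self.holds[(user_id, item_id)] = self.clock() + self.ttl
        return True
```

With a fake clock, the last-unit scenario plays out as in the walkthrough: the first user gets the hold, the second is refused, and the second succeeds once the first hold has expired.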
>> Okay. And I just want to make sure about the time to live: where is it? Is it in a database?
Just, where is it? >> That will be configured with something like Redis. This is an in-memory Redis solution.
>> This is in memory. Okay.
>> Yes, this is in memory. And then we'll talk about what happens if this fails in a minute, I think.
>> Sure. Sure. Yeah. Yeah. But just want to make sure one thing is clear.
>> Yes.
>> So in the item table >> we have quantity there.
>> Yes.
>> Meaning one item might have a thousand available quantity.
>> Right. So, let's say we have 500 available right now >> and I try to add another item.
>> Mhm.
>> How would it, just walk me through?
Is it just going to check and see if there are 500 available?
Uh, which part are you talking about?
Like another user trying to buy it?
>> Yes. Yes.
>> So, another user tries to buy it and somebody else has added it to the cart. Mhm.
>> For now it would still show the 500, which would be unfortunate, but when they try to add it to this memory cache they will get an error like not available.
>> Why would they get the error?
>> No. So we have to change it now. We have to change it to check the in-memory cache.
>> Okay.
>> And then and then they'll get the error.
>> It will go to the in-memory cache. Where would it go, what would be the logic again? Let's just say there are 500 available right now.
>> So let's go through it. Let's say there are 500 available. Now in the in-memory cache here we have the item ID and the quantity available.
>> So >> so if we okay let's let's go with the actual ideal flow.
>> Oh I see I see. So >> there's two flows.
>> You have the item and the quantity >> and then you will have the item the user and the TTL.
>> Yes. Yeah. And then we also need a quantity here too.
>> Sure, sure. But really quickly: I think I totally understand what you're doing, but when an item's TTL expires, how would that change the quantity? How would you update the quantity >> upon eviction? Upon eviction, we increase the item ID quantity.
>> No, no. Yes, that makes sense. But how would that work within Redis?
>> How would that work under the hood? We can configure custom TTL logic with Redis, I think.
>> Okay. So, what you're saying is there is some sort of a trigger on Reddus. Once >> Yep.
>> fires up, you can write some logic and then you will do the rec. Okay, sounds good. Let's keep going. That makes sense.
>> Okay, cool. So now we're going to change the arrows. This is not necessarily a cache. It will certainly serve as a cache, but it's also a soft reservation system. So now we have this. We do need to persist this state in the event that this memory storage goes down. We have to be able to reconstruct it if this thing dies, and I think the key to that is the cart items table. So upon adding this to the in-memory solution, we will now also add the items and quantity to the cart items table. So, oh, sorry, I accidentally clicked backspace.
>> Okay it's okay.
>> So sorry. So now in here, in cart items, we have a cart ID, an item ID, and then we also have a quantity.
So now if this in-memory thing dies, users can get their cart ID, we can reconstruct the state of this, and the users can rebuild what their cart looked like.
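One way to picture the rebuild after an in-memory failure is a function that recomputes the unreserved quantity per item from the persisted rows. This is a hedged sketch: the function name and row shape are hypothetical, and it assumes the cart items table only contains rows for carts that are still within their TTL.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def rebuild_available(
    db_quantities: Dict[str, int],
    cart_items: Iterable[Tuple[str, str, int]],  # (cart_id, item_id, quantity)
) -> Dict[str, int]:
    """Recompute item_id -> unreserved quantity from the database tables."""
    reserved: Counter = Counter()
    for _cart_id, item_id, quantity in cart_items:
        reserved[item_id] += quantity
    # Items with no cart rows keep their full database quantity.
    return {item_id: qty - reserved[item_id]
            for item_id, qty in db_quantities.items()}
```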
Cool. Does that make sense?
>> Makes sense so far. Let's keep going.
>> All right. Cool. Now we have built the cart. It has items, and it has a reservation attached to it. The first request will be a cache miss that populates the item ID and quantity and then puts the slot in, and subsequent requests will find it there. For item metadata, we could also cache that. But let's talk now about a more complicated workflow.
So, we have two different things we can go through. Uh, you want to go through notifications and subscriptions or do you want to go to the actual checkout?
>> Um, let's do the the subscription and the notification the price changes.
>> Okay, perfect, perfect, perfect, perfect.
Okay. So now item ID, we also probably need a price.
We need a price. So let's see. Let's go through it. We have a subscription service and item ID. And um we will also want and this would just have sounds like this same thing as subscription.
I'm trying to figure this out. So let's say now this would be a different workflow. This would be some sort of admin, I don't know, an items admin, like the actual owner, and he's going to talk to our virtual store and say, update the price of this item. The service is intelligent enough to know, okay, this item's price changed, so I will communicate with our subscription notification service.
So these notifications are not scheduled. It feels like they just happen ad hoc, right?
>> Yeah. So so let me just make sure the the the requirement flow is clear. Mhm.
>> Someone will change the price value in the item table >> and let's say those services are a total black box for us. We don't know who. There might be multiple services, but the database will learn about that.
>> Oh okay. So we don't manage this. We don't manage the prices.
>> We do manage the price changes.
>> Someone else maybe it's one service maybe it's multiple services. We don't know. We don't care.
>> Okay.
>> Once that happens, and even before that, I already know who is subscribing to what products.
Maybe I have a million products.
>> Mhm.
>> Maybe a million users subscribe to thousands of those, you know, distributed in different ways. Right. Right.
>> Like maybe you and I love Nike shoes and right now they're expensive. We both subscribe to them. Maybe someone else subscribed to a piece of jewelry, whatever it is. Right.
>> Right.
>> So once the price change happened.
>> Yes.
>> Your service your solution should understand who subscribed to that particular price and notify them as soon as possible.
>> How do we know the price changed in our database?
>> Yeah. So this is part of what I want you to solve for. Okay. The expectation is that price field in the item table >> we don't control it.
>> It will change right. So we need to observe something on the database that we know real time or near real time or as soon as possible that it did change.
>> Are there database solutions that can spark an event when something changes? Is that a possibility that we can use? >> Of course. Okay. So, since you maybe did not know about that, which is totally okay, that can be a simple thing to add to your knowledge base.
>> It's called CDC, change data capture.
>> Change data capture. Okay. Thank you.
>> So databases can emit, basically, price changes, and there are many other ways, right? Another classical way is to have a cron job that runs every minute or so and keeps checking on those. Also a trigger: you can just go old school. I mean, CDC is probably implemented behind the scenes through a trigger. I don't know if you've used triggers before. They are not popular now, but they are still effective. You'll say, anytime that particular field, that particular row, is edited, you can check the value before and after the change, and then you log that somewhere.
>> Okay.
>> But basically CDC is a method for you to push those changes somewhere.
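For contrast with CDC, the cron-style alternative mentioned above amounts to comparing a previous price snapshot with the current one on every run. The sketch below is an assumption about how such a job could look; the function name and the dictionary-based snapshots are hypothetical and stand in for real database reads.

```python
from typing import Dict, List, Tuple


def detect_price_changes(
    previous: Dict[str, float],
    current: Dict[str, float],
) -> List[Tuple[str, float, float]]:
    """Return (item_id, old_price, new_price) for every item whose price moved."""
    changes = []
    for item_id, new_price in current.items():
        old_price = previous.get(item_id)
        if old_price is not None and old_price != new_price:
            changes.append((item_id, old_price, new_price))
    return changes
```

The cost of this approach is that every run scans all items even when nothing changed, which is part of why a push-style mechanism such as CDC is attractive here.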
>> Okay, perfect. So this kind of segues us into the scale of the service. Okay, so the scale here: you said we have millions of items, right?
>> Uh-huh. And then you said, how many million users here?
>> So let's say we have 10 million products and 100 million users.
>> But not of course not all of them will subscribe to our products, right?
>> Of course. Of course. And then on a given day, on a high-interest item, we could have thousands of concurrent purchases, right?
Does that sound reasonable to you?
>> Yes.
>> Regular two probably transactions per second.
Okay. And then subscriptions we would >> those are not just p. Yeah. Yeah. Go ahead. Go ahead.
>> Okay. And then for subscriptions, how many users actually track prices? I'd say maybe 10%, so 10 million users subscribing.
>> I say this is a popular feature.
>> Yeah. You know >> okay I >> mean 1 to 10% probably what I would estimate.
>> Okay. So 10 million users subscribe to 10 items each. So we're talking about around 100 million subscriptions.
So 100 million rows in these tables.
Okay. Not the worst thing. Okay. So right. So let's say we set up that CDC event on the database and say like hey price change for this item and then we would have another our subscription service.
So if we're talking how often do these prices change um for 100 million items they don't change that.
>> This is a good question. Some of them change very frequently like every day, sometimes twice a day.
>> Some of them rarely change and there are a lot in between.
>> How often does the price change? Very good question. Okay, so the price changes. Let's say, as you say, some update every day, many times a day. So we have 10 million products, right? Let's say how many of them change on a daily basis? 5%?
>> No, no, no. Much less than that.
Probably one over a thousand.
>> One out of a thousand.
>> Yes.
>> Okay, cool. Let me get the calculator out. We have 10 million items, divided by 1,000.
So we have 10,000 items that change price daily. Still a decent amount.
10k items change daily. And then you said some of them, an even smaller subset, change a lot, right?
>> Yeah.
>> So I would say, of these 10,000 items, let's say 10% are the ones changing a lot, and they change maybe 20 times a day. So that's 20,000. Add the rest changing about once a day, and we're talking about roughly 30,000 events. 30,000 price change events.
>> Yeah.
>> Okay. Cool.
>> Okay. So 30,000 is nothing to sweat at.
Yeah.
>> Daily, though. So with the calculator: 30,000 daily, divided by 24, divided by 60, divided by 60 again. So we're talking about on the order of one event per second. Not too bad. Okay. So the CDC event triggers, the change goes to the subscription service, and we want to keep this relatively available. Some subscription events will be a lot bigger than others. If a lot of people are subscribed to the new Nike shoes, it could be a million subscribers in theory, right? So we want to be careful with this. Okay. So we get the item ID.
I will make this table keyed by the item ID, because that's what we're going to get the event about. And then we have a whole list of users. We've got to be very careful about fetching that subscription list of item ID to user IDs, because there could be a million, right?
So we want to make sure the subscription service paginates that properly.
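The back-of-the-envelope numbers from this part of the conversation can be checked with a few lines of Python. The inputs are the figures quoted in the interview; the one added assumption is that the less active 90% of changing items change once a day.

```python
# Figures quoted in the interview.
products = 10_000_000
subscribing_users = 10_000_000      # roughly 10% of 100 million users
items_per_subscriber = 10

subscription_rows = subscribing_users * items_per_subscriber

changing_daily = products // 1_000  # one in a thousand items changes per day
frequent = changing_daily // 10     # 10% of those change about 20 times a day
# Assumption: the remaining items change once a day.
events_per_day = frequent * 20 + (changing_daily - frequent)

events_per_second = events_per_day / (24 * 60 * 60)

print(subscription_rows, events_per_day, round(events_per_second, 2))
```

Under these assumptions the result is about 29,000 events per day, or roughly one event every three seconds on average, so the sustained event rate is small. The load comes from the fan-out per event, not from the number of events.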
>> Yeah, I would I would like I would just not worry about that yet.
>> Okay.
>> Just just make sure it's clear.
>> How would you implement a service that will really observe for those changes and send them >> right away? Yeah.
>> Right. Okay. So now we know the item changed through the CDC event, and >> I would decouple the subscription service with a queue to a worker that talks to the notification service, because like I said, one event could be really fat, like Nike shoes, or it could be really small, like a small niche item. The worker will interface with our database safely. There will probably be a little lightweight service here that can paginate. That worker is going to get the item ID, ask the database which people are subscribed to it, and then talk to an external notification service.
All right. So that's the main idea there.
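The queue-and-worker idea can be sketched with the standard library `queue` module. This is a minimal single-process illustration, not the interview's actual design: `run_worker`, `paginate`, the in-memory `subscriptions` mapping, and the `notify` callback are hypothetical stand-ins for the message queue, the subscriptions table, and the external notification service.

```python
import queue
from typing import Callable, Dict, Iterator, List


def paginate(user_ids: List[str], page_size: int) -> Iterator[List[str]]:
    """Yield subscribers in fixed-size pages so one hot item never loads at once."""
    for start in range(0, len(user_ids), page_size):
        yield user_ids[start:start + page_size]


def run_worker(
    events: "queue.Queue[str]",
    subscriptions: Dict[str, List[str]],
    notify: Callable[[str, List[str]], None],
    page_size: int = 1000,
) -> None:
    """Drain price-change events (item IDs) and fan out to subscribers page by page."""
    while True:
        try:
            item_id = events.get_nowait()
        except queue.Empty:
            return
        for page in paginate(subscriptions.get(item_id, []), page_size):
            notify(item_id, page)
        events.task_done()
```

Because the event only carries the item ID, the worker looks up subscribers itself, which keeps the producer side cheap regardless of how many people follow an item.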
>> Okay. So let me just make sure I follow.
>> So people will subscribe.
>> Yep. and and when the subscription happens that subscription will be recorded in the database. So there will be like a table for user product subscription, right?
>> It's right here. Subscription user item ID to user ID. Right.
>> Yes. Exactly.
>> Exactly. And then the database will produce price changes.
>> Mhm.
>> Um where those will go like in a price change. Does it say the pair of Nike shoes that you and I subscribe to >> have just changed right now?
>> Yes.
>> You said there is a CDC event. So the database would emit the event to the subscription service.
>> Oh, so the database will send that to the subscription service and then >> I see.
>> And then we want to make the subscription service really available. We don't want to hang the database at all. So this is kind of a dumb thing that goes, "Oh, okay. I got it. Your thing changed." And it puts that in the queue for the actual workers to do the tough work.
>> Okay. Okay. Sounds good. Um I see. And then that subscription will put it in a queue.
>> Yes.
>> And then then the worker picks up. The worker right now only knows the item ID.
>> Okay. It doesn't know what the delta is between the price. It just knows that the price changed.
>> Okay. Sounds good. Sounds good. And I like that. This is actually a cool solution. Now let's say I change the functionality on you. I just changed my mind and want to do the notification not through an email. I want it to be shown in the browser.
>> In the browser >> like, you're already on Amazon.com or whatever the e-commerce site is >> and I want this alert to show: hey, the pair of Nike shoes you've subscribed to, the price has dropped. Right now the price is $90.
>> Well so okay so well that's interesting now. So we want to expand on how this notification service would work then right.
>> Mhm.
>> Now this notification service >> or do something different. Do you need to go to the notification service?
>> Oh, so you don't want me to use an abstract notification?
>> It's up to you. It's up to you. It's up to you. But I'm just saying you know >> we want to >> and you may be able to do it without going to the notification service.
>> To have that banner, at that point we would need some sort of WebSocket connection.
What other alternatives do you have other than WebSocket?
>> Well, an alternative to that is we would have a pub/sub, and the channel ID could be the user ID. So when the user is logged on, their browser could subscribe to that pub/sub channel and get notifications that way.
>> Yeah. But what is it? I I think you're on the right track or >> this could work here or this can be a good solution. So let's just split two things. I think there is the decision about the technology to be used for sending the change websocket or something else.
>> Mhm.
>> And then there is the preparation for the information to be available for the websocket or the other technology.
>> Mhm.
>> To be used to be sent. Right.
>> Right. So let's just put the WebSocket and the other technologies aside. If you use pub/sub, what will you put on the pub/sub? >> The pub/sub. So the publish there >> what would I publish there?
>> Yes.
>> I would just publish "price change on item ID X".
>> It's just the item ID. Super simple.
>> So basically the channel is the item or the topic is the item.
>> Oh the topic is the item. So stupid.
Okay, that's so obvious. Okay. Yes, the item is so the channel is the item.
Sure.
>> When the user logs on, they can subscribe to that item while they're logged on.
>> Exactly. Okay. So we've resolved that. Which service is the one that is going to subscribe to that pub/sub?
>> It could be a sidecar service in the user UI.
>> Yeah. Or it could be the same subscription service, right?
>> Or it could be a subscription service.
Yeah.
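The "topic is the item" idea can be illustrated with a tiny in-process publish/subscribe class. It is a sketch only: `PubSub`, the `item:<id>` topic naming, and the message text are hypothetical, and a production system would use a real broker rather than callbacks in one process.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PubSub:
    """Minimal in-process pub/sub keyed by topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver message to every subscriber of topic; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


bus = PubSub()
inbox: List[str] = []
bus.subscribe("item:nike-42", inbox.append)       # a logged-on user follows one item
bus.publish("item:nike-42", "price_changed")      # one publish per price change
```

Keying the topic by item means a price change is published once, no matter how many users follow the item. Keying by user would require one publish per subscriber.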
>> Right. Okay. Sounds good. Now we've settled on that. What would be your choice of server-to-client communication? WebSocket is one solution.
>> What are some of the other solutions and let's talk about you know the pros and cons for each one.
>> Right. So WebSockets: it's a heavy thing and it has state. What we could also do is polling. Uh-huh.
>> I don't want to do long exponential-backoff polling because, what if the user only cares about a couple of items, right? We still don't want them to get notified only five minutes later. That kind of sucks. We could just have a sensible polling interval of a minute, but we've got to be very careful because, well, we said how many subscriptions there could be.
>> Yeah. And what I'm worried about polling is let's say I subscribe to products and they never changed.
>> Exactly. And you're polling for a bunch of things that don't happen.
>> This is why I was saying WebSocket might be >> there is one in between >> one between WebSocket and polling.
>> So it's called server sent events.
>> So server sent events.
>> Okay. Thank you. Once you subscribe, you will establish that SSE connection, and then basically you're telling your service: hey, once the price changes, let me know, you have my connection.
>> So, it will be more of just a server to client communication.
>> So, making the client a server here.
So the client will establish that with the server once it subscribe for a product >> and then the the service will use the existing connection >> to send this once it knows about a price change.
>> How does the service know there's a connection alive?
So, I mean, I can send that to you in the feedback, but once you have the intent to start a server-sent event stream, there are some techniques you can use to set it up, and then you tell the server: hey, next time this changes, let me know.
>> Oh you can uh oh wow >> yeah this is how server sent events work.
>> Okay, I haven't read about server-sent events. Yeah, it's the third technique to communicate between the server and the client. I mean, you mentioned polling.
>> We talked about polling. Polling is not a good choice. WebSockets is really >> too much here.
>> Yeah.
>> Because you cannot keep a WebSocket open for 10 million users forever while a price might never change, and so on >> that does suck. Yeah. Server-sent events is going to be the optimal solution here because you're telling the server: only when something changes, let me know >> how have I not heard about this? Okay >> it's okay >> thank you so much >> so yeah, basically this is good. We're going to debrief shortly, but before that I'm going to catch a moment to reflect on something interesting. I want you to reflect on this: the server has changes.
>> The database knew about changes. The server has changes and now we wanted to deliver those to the client.
>> Yeah.
>> When we wanted to do kind of email notification, you did the right thing.
You put them in a queue and you had workers working on top of those queues, and then you called the notification service. When I tweaked that solution for you, you immediately reacted nicely and said, I'm going to use a pub/sub because I want a more real-time push instead of a pull technique. So those are two very popular, repeatable patterns when a change happens in the database and you want to send it to the client. And then the method of transmission is always a choice of WebSocket, polling, or SSE.
SSE is usually the most effective and lightest option when you just want to send updates from the server to the client.
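For reference, a server-sent events stream is a long-lived HTTP response with content type `text/event-stream`, where each message is a few `field: value` lines followed by a blank line. The helper below formats one such message; the function name, the `price_change` event name, and the payload fields are assumptions for illustration.

```python
import json


def format_sse(data: dict, event: str = "", event_id: str = "") -> str:
    """Serialize one server-sent event in the text/event-stream wire format."""
    lines = []
    if event_id:
        lines.append(f"id: {event_id}")
    if event:
        lines.append(f"event: {event}")
    # json.dumps emits a single line by default, so one data: line is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"  # the blank line terminates the event


message = format_sse({"item_id": "nike-42", "price": 90}, event="price_change")
```

On the client side, a browser would typically open the stream with `EventSource` and listen for the named event. The server writes a message like this to the open response only when a price actually changes.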
>> Okay. I will read up on server >> And we can talk more during the debrief.
For sure. For sure.
>> Awesome.
>> Sounds good. I want to go back to the uh solution about time to live.
>> Um Actually, you know what?
Let me just ask you a question there.
Yeah. First, what would be an alternative solution to do this instead of Redis?
>> An alternative to Redis to have the soft reservation?
>> Correct.
>> I don't want to have a field. I don't want to have a separate column in that database of items with the quantity. That's just going to overload that table too much. So we still want to separate it. We could have, you know, when I'm thinking about this, my head keeps going back to Redis TTL, because there's a TTL of 10 minutes. I'm trying to think here. >> Very early on you talked about optimistic concurrency control.
>> Yeah.
>> Would that be an alternative solution?
>> Well, optimistic concurrency control here helps us when the purchase quantity actually goes down.
I didn't want to actually touch that quantity going down until the full purchase flow is done. We could have that quantity go down when you add it to the cart.
And then we still need to keep some sort of TTL thing, which is why I don't know how to do this away from the Redis solution.
>> I don't, yeah, I don't know how to. >> Using purely Postgres, you can just do concurrency control. You don't need to rely on Redis, and then you can have a cron job >> to look for those items that have been in the cart for longer than 10 minutes, or whatever the threshold is, and then you release them from there.
>> Okay. Okay. Yeah, that makes sense. You need several cron jobs coordinated based on the item table, because we have 10 million items.
We can't just have one do everything, right?
>> Yeah. Yeah.
>> So, yeah, that is an absolutely.
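The database-only alternative described above (decrement stock when the item is added to the cart, then let a periodic job release stale reservations) can be sketched roughly as follows. This is a minimal illustration that uses SQLite in place of Postgres. The table names, column names, and the `release_expired` function are assumptions made for the example.

```python
import sqlite3

RESERVATION_TTL_SECONDS = 10 * 60  # the 10-minute threshold from the discussion


def release_expired(conn: sqlite3.Connection, now: int) -> int:
    """Return stock held by stale cart rows to the items table (hypothetical cron body)."""
    cutoff = now - RESERVATION_TTL_SECONDS
    with conn:  # commit on success, roll back on error
        rows = conn.execute(
            "SELECT item_id, SUM(quantity) FROM cart_items "
            "WHERE reserved_at < ? GROUP BY item_id",
            (cutoff,),
        ).fetchall()
        for item_id, qty in rows:
            conn.execute(
                "UPDATE items SET quantity = quantity + ? WHERE id = ?", (qty, item_id)
            )
        conn.execute("DELETE FROM cart_items WHERE reserved_at < ?", (cutoff,))
    return len(rows)  # number of distinct items restocked


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL)")
conn.execute(
    "CREATE TABLE cart_items (user_id INTEGER, item_id INTEGER, quantity INTEGER, reserved_at INTEGER)"
)
conn.execute("INSERT INTO items VALUES (1, 0)")  # the last unit is sitting in a cart
conn.execute("INSERT INTO cart_items VALUES (7, 1, 1, 1000)")  # reserved at t=1000

released = release_expired(conn, now=1000 + 601)  # one second past the TTL
```

With 10 million items, one way to coordinate several such jobs is to give each job a disjoint range of item IDs, which is only one possible partitioning.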
>> Okay. So now, just the last thing, for checkout. What would you do for checkout? And let's assume checkout is a heavy-duty process that takes time to finish.
>> Absolutely. Payment and so on and so forth. So in that checkout, let's say the user has the cart. Can we have a third-party service provider like Stripe? >> Sure, sure. >> So we could also have here a third-party checkout token. Okay, so we have the thing in the TTL, the cart is populated, we have all the items. What we do now: the user clicks purchase right here. We have a POST, yeah, to the virtual store for checkout. What that does is talk first to the payment provider. It will say, hey, initiate a checkout for this quantity, and then the payment provider will give back a token. We store that token in the cart database.
So if the user loses connection, we have the token and the cart all still there. And then the payment provider will provide a 302, I think, a 302 redirect for the user to pay. Okay. The user checks out, ta da da da da, and then the payment provider will respond to our server, hey, this payment token is paid. And now we can add a state here on the cart and call it state. It could be pending, cancelled, or paid. And then the payment has gone through.
>> Okay. And then at that point we know the payment's done, and the virtual store can respond with a redirect again to the homepage or to a confirmation page.
Here's your confirmation.
And then you have here a confirmation.
Yeah.
And then this is nicely wrapped as like as like an order. That's the order right there.
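The token-plus-state flow described above can be sketched as a small state machine. This is only an illustration: the `Cart` class, its method names, and the extra `OPEN` state (the cart before checkout starts) are assumptions, and the token value is made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CartState(Enum):
    OPEN = "open"            # assumption: the cart before checkout starts
    PENDING = "pending"      # token issued, waiting on the payment provider
    PAID = "paid"
    CANCELLED = "cancelled"


# Which transitions are legal from each state.
ALLOWED = {
    CartState.OPEN: {CartState.PENDING},
    CartState.PENDING: {CartState.PAID, CartState.CANCELLED},
    CartState.PAID: set(),
    CartState.CANCELLED: set(),
}


@dataclass
class Cart:
    cart_id: str
    state: CartState = CartState.OPEN
    payment_token: Optional[str] = None

    def _move(self, new_state: CartState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state

    def start_checkout(self, token: str) -> None:
        # Store the provider's token so a dropped connection can resume.
        self._move(CartState.PENDING)
        self.payment_token = token

    def on_provider_callback(self, token: str, paid: bool) -> None:
        if token != self.payment_token:
            raise ValueError("unknown payment token")
        self._move(CartState.PAID if paid else CartState.CANCELLED)


cart = Cart("cart-1")
cart.start_checkout("tok_abc")
cart.on_provider_callback("tok_abc", paid=True)
```

Rejecting transitions out of `PAID` means a repeated provider callback cannot be applied twice, which matters once retries enter the picture.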
>> Okay. Okay. Sounds good. But I just want to see, are you going to make this a synchronous or an asynchronous process?
The checkout.
>> The checkout.
>> Yes.
>> The checkout. So the check out will be async. Well, part of it is it will say, hey, start a checkout here. The payment process will will linger there and return with a token and then we just have to return that token and then we're done for now.
>> Okay. But um if if that process takes a lot of time and you know there might be multiple retries with the payment and so on and so forth.
>> Oh yeah that does like we're we're assuming that even just the initial like initiation could be slow.
>> Mhm.
>> Then yeah we will have to decouple it.
>> Okay. Sounds good. So just um one other question.
>> What could fail in this design and how would you solve for it? >> What could fail in this design? I mean, if the payment provider fails, then we're kind of not great. But I'm guessing that's not what you're talking about. Well, a lot of things can fail here. If my virtual store fails, this whole thing fails, so we need that to be stateless and scalable, right? If the payment provider fails, then we would just have a bunch of carts pending, for which we would need some sort of cleanup solution too.
What would you do? What would be the solution?
>> Is that the failure you're thinking about?
>> Yeah.
>> Okay.
>> I mean, just one of them. I just Yeah.
Like you said, many things could fail here. >> Right. So, in the cart here, we have a timestamp too.
I would suggest we only keep a cart alive for, like, carts usually stay alive for a while, I've noticed, but we can have a timestamp here of last time updated, last time touched. This could be last touched, and then we can have a cron job go through these carts and delete the carts that are older than five days.
>> Okay. Okay. Sounds good. What would be some of the edge cases in the price notification that you can think of?
>> The first one that came to my mind is like, you know, if we are the new Nike dropped and there's a million subscribers. That's the first one that came to my mind. Um, an edge case could be that >> I just want to make sure I understand it. Repeat it again.
>> So, a really popular item, let's say the new Grand Theft Auto 6 dropped, right?
>> Uhhuh.
That could be the edge case, and I accounted for that one with the pagination here. There are so many subscribers and we don't want to overload that worker, so we can paginate.
>> Another edge case is the price not really changing. There could be a price change of cents, or zero. We've got to make sure we don't actually carry out a CDC event on that.
We'd have a threshold. Other edge cases...
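The threshold idea can be sketched as a small filter applied before a price change is queued for notification. The function name `should_notify` and the 1% default are assumptions for illustration, since the interview does not fix a value.

```python
from decimal import Decimal


def should_notify(old: Decimal, new: Decimal,
                  min_relative_change: Decimal = Decimal("0.01")) -> bool:
    """Suppress no-op and tiny price changes (hypothetical 1% default threshold)."""
    if old <= 0:
        # No sensible ratio; notify only when the price actually differs.
        return new != old
    return abs(new - old) / old >= min_relative_change


unchanged = should_notify(Decimal("100.00"), Decimal("100.00"))
one_cent = should_notify(Decimal("100.00"), Decimal("99.99"))
five_percent = should_notify(Decimal("100.00"), Decimal("95.00"))
```

`Decimal` is used instead of `float` so that cent-level differences compare exactly.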
>> Yeah, I think this is good. Just last question. I promise I keep saying last question.
>> No, you're good. I'm I'm loving this.
This is really testing me.
>> If you want to manage and monitor latency, particularly for the price updates, because that is the most critical functionality where you want to make sure latency is low, what would you log, and what would be your strategy to monitor latency? >> For the price changes, right?
>> Correct.
>> Right. Absolutely. So we could have metrics based on user ID events. When a CDC event happens, we can have a metric of item ID: received price change for this item at this time. Another thing we really care about here is the queue size. If the queue size is big, we care about that.
And then each event that gets thrown into the queue will have a timestamp of the time it got queued, and we could obviously also generate a timestamp of when the worker got it. So we could basically generate timestamps at each event and track the latency in between events.
>> Okay. And that would be, going back to my Google experience, all of this is centrally managed by Monarch and each metric is updated via RPC.
So the target field here could be a UUID, the ID of the item. And then the metric field could be the event, and each event could be: CDC sent, subscription added to queue, worker received from queue. Those are all the events that happened. They're managed centrally by Monarch, and we have the timestamps in between each one, basically.
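A minimal sketch of the per-stage timing idea: given one timestamp per pipeline event for a single price change, compute the latency between consecutive stages. The stage names mirror the events listed above, while the `stage_latencies` function and the millisecond values are assumptions for illustration. This does not model Monarch itself.

```python
from typing import Dict

# Pipeline events in the order they are expected to occur.
STAGES = ["cdc_sent", "added_to_queue", "worker_received"]


def stage_latencies(timestamps_ms: Dict[str, int]) -> Dict[str, int]:
    """Latency between consecutive stages that both have a recorded timestamp."""
    result = {}
    for earlier, later in zip(STAGES, STAGES[1:]):
        if earlier in timestamps_ms and later in timestamps_ms:
            result[f"{earlier}->{later}"] = timestamps_ms[later] - timestamps_ms[earlier]
    return result


latencies = stage_latencies(
    {"cdc_sent": 100, "added_to_queue": 130, "worker_received": 450}
)
```

A large gap between `added_to_queue` and `worker_received` would point at queue backlog, which matches the concern about queue size.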
>> I see. I see. Okay. Cool. So, let's reflect on this design. I want to give you a lot of >> I want to think with you about a lot of this.
>> Okay.
>> Before before I start, do you have time after uh the top of the hour? I just want to >> Absolutely. I have my afternoon free.
>> Okay, sounds good. So, I think this is great job. I I like I like your way of thinking and I like what you have done.
>> Um, but let me let's first start with self-reflection maybe if you don't mind.
What are your thoughts? How do you think it went? >> Okay, so I think I went a little too fast. I started the high-level design, and I realized while we were talking that I did not go over scalability at an early point.
So I should have done that earlier. I just got carried away with the high-level design. Another thing that I didn't go deep into, and that I do want to, is our database solutions. I didn't explicitly say whether we go SQL or NoSQL. By the way, I would go NoSQL for most of it, especially for carts, cart items, and user subscriptions. Even items can be NoSQL too. So we didn't talk about that. I do feel like I was wishy-washy on the payments part, with the third-party token and so on. I'm just guessing that's how it would work.
>> Yeah. So that's my that's my self-reflection.
>> Yeah. I think um let me do this if it's okay with you. I want to give you my comments on the solution itself.
>> Mhm.
>> And then I'll give you general coaching comments. Is that okay?
>> Yeah, absolutely. Uh before we do that, can I take a quick bathroom break?
>> Of course. Of course. Of course.
>> I will be back. I'll be back in like two seconds.
>> Take your time.
>> Thank you.
Hi Tom, I'm back.
>> Yeah, no problem.
>> Yeah. So, uh so so let me quickly uh draw something here. So you you see my you see my drawing at the bottom, right?
>> Uh yes.
>> Okay. So a service for search and a database, and here we would have some sort of Elasticsearch, right? >> I think, you know, we're using Elasticsearch, and maybe there is room for you to articulate why you're using it. You can say all the search capabilities are built in. >> It's pre-indexed, so it will be much faster, and I will not add indices on my primary database, so I will ease the pressure on add, edit, and update on my primary database. Those are just some reasons you can give. And then, how would you sync it with the primary database? You can say, I'm going to use CDC for that. Okay, make sense?
>> Yes, >> you can also say, you know, I'm going to cache some of the popular queries and popular results. Maybe, you know, running shoes is a very popular query.
Christmas toys, you know, those are things that you can cache, right?
>> Of course, >> those are just um, you know, extra things that you can add. Now, the second one is cart. I'm not going to repeat the card here. All my comments about the cart is during design it will be good for you to talk about multiple options and talk about the tradeoff and the pros and cons for each option. So here you could have talked about you know um you could have talked about doing this on the database using like optimistic concurrency control versus doing that with a redis cache, right? Um and we can we can dive deep into this distributed lock. So basically used you you know this has a distributed lock here.
>> Mhm.
>> Right. Um the optimistic concurrency control is probably needed at the database anyways.
You know how it works, right? On the >> Yeah. But that actually means you're going to have to use some sort of SQL database, right? But let's put that aside. >> Right. Let's put that aside. I think it's really needed because of some edge cases >> or the edge cases you really have to solve for.
>> I mean, at checkout time you probably have to check that the quantity has not gone down to zero anyway, >> because, you know, think about some edge cases like, while the item was expiring, >> Mhm. >> it went out of Redis, someone else got it, added it to the cart, and purchased super fast. >> Yeah. >> While the other user is still trying to enter their payment and so on.
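The checkout-time check described here is the classic optimistic concurrency control pattern: read a version, then make the write conditional on that version being unchanged. The sketch below uses SQLite for illustration. The `version` column and the function names are assumptions and not part of the interview's schema.

```python
import sqlite3


def read_item(conn: sqlite3.Connection, item_id: int):
    """Return (quantity, version) as seen by one buyer."""
    return conn.execute(
        "SELECT quantity, version FROM items WHERE id = ?", (item_id,)
    ).fetchone()


def commit_purchase(conn: sqlite3.Connection, item_id: int, qty: int,
                    expected_version: int) -> bool:
    """Decrement stock only if nobody changed the row since it was read."""
    with conn:
        cur = conn.execute(
            "UPDATE items SET quantity = quantity - ?, version = version + 1 "
            "WHERE id = ? AND version = ? AND quantity >= ?",
            (qty, item_id, expected_version, qty),
        )
    return cur.rowcount == 1  # zero rows updated means we lost the race


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO items VALUES (1, 1, 0)")  # one unit left

# Two buyers read the same row before either one pays.
_, version_a = read_item(conn, 1)
_, version_b = read_item(conn, 1)

first = commit_purchase(conn, 1, 1, version_a)   # wins the last unit
second = commit_purchase(conn, 1, 1, version_b)  # stale version, rejected
```

The losing buyer gets a clean failure instead of an oversell, regardless of whether the Redis reservation had already expired.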
>> Absolutely. No, I Yeah, I I do want to say that like for my solution like I do say we need to update the quantity anyway.
>> Sure. Okay.
>> Yeah. But my point here is: find an area of the design where you can really talk about pros and cons for multiple options and do a trade-off discussion. Do you see what I mean?
>> Yeah.
>> Okay. Now for um for the price change I think what you did is is great but I just want to organize it here a little bit if that's okay with you. So here you can put the changes through CDC into a queue and you have workers that will pull from that queue.
>> And for those workers you have two choices here, right? You can put them in another queue, and this queue is really to fan out the changes. So those workers will go and ask who is subscribing to the product that has just changed. Right?
>> Mhm.
>> And it will say and sell. Right.
>> Right. So it will say Q2 items in the queue here to notify Jana cell.
>> Right.
>> Right.
>> Yep.
>> Notification service. Right.
>> And then that notification service will pull from the queue >> and it will send it directly to the >> the user >> customer. Right.
I see.
>> Okay. So, you're actually doing the queueing for the notification service.
>> Yeah. Yeah. Here just as a first solution, right?
>> Okay.
>> And basically I'm leveraging multiple queues. The first queue is just to capture the products >> so really this is, you know, product.
>> Yep. Makes sense.
>> Price, maybe timestamp, right? Something like that. The >> other queue is really, you know, user.
>> Yep.
>> Just like user and product, right?
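The two-queue fan-out described above can be sketched with in-memory queues: the first queue carries one message per changed product, and a worker expands each message into one message per subscriber on the second queue, reading subscribers a page at a time. The names `fan_out`, `subscribers_page`, and `PAGE_SIZE`, and the sample data, are assumptions for illustration. A real system would use a message broker and a database query.

```python
from collections import deque
from typing import Deque, Dict, List

PAGE_SIZE = 2  # hypothetical page size, kept tiny so paging is visible

# Stand-in for the subscriptions table.
SUBSCRIPTIONS: Dict[str, List[str]] = {"sku-1": ["u1", "u2", "u3"], "sku-2": []}


def subscribers_page(product_id: str, offset: int, limit: int) -> List[str]:
    """Stand-in for a paginated subscribers query."""
    return SUBSCRIPTIONS.get(product_id, [])[offset:offset + limit]


def fan_out(change_queue: Deque[dict], notify_queue: Deque[dict]) -> None:
    """Expand each product change into one notification message per subscriber."""
    while change_queue:
        change = change_queue.popleft()
        offset = 0
        while True:
            page = subscribers_page(change["product_id"], offset, PAGE_SIZE)
            if not page:
                break
            for user_id in page:
                notify_queue.append(
                    {"user_id": user_id,
                     "product_id": change["product_id"],
                     "price": change["price"]}
                )
            offset += len(page)


change_queue: Deque[dict] = deque([{"product_id": "sku-1", "price": 49.0, "ts": 1}])
notify_queue: Deque[dict] = deque()
fan_out(change_queue, notify_queue)
```

Paging keeps a single worker from loading a million subscribers at once, which is the popular-item edge case raised earlier.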
>> You see what I mean? Now, you can also leverage another queue for the real-time notification that we've talked about.
You can use the same queue, but I want to put in another queue to articulate the solution here.
>> If if we want to do, you know, real time notification here. Mhm.
>> Right.
Let me just fix this. Right.
I'll use pub/sub like you said.
>> Mhm.
>> And then that subscription service here.
Oh, there is a subscription service that we have to have, right? Let's just put it here.
It will subscribe, and then those will also subscribe to this pub/sub, and then the pub/sub will publish >> to >> those events to them >> with server-sent events, right? >> The server-sent events come here. So this service will send the results back through server-sent events. Do you see what I mean?
>> Yep.
>> That pub/sub notification is just the nature of pub/sub.
>> You know, CDC will publish events saying that products have changed. These guys have already subscribed to those topics, >> right? Once a price changes for a product they have subscribed to, they will be notified about it. And this is probably going to vertically scale, right?
>> Yeah.
>> So any service that has users who have subscribed to certain products, they will they will all be notified.
>> Make sense?
>> Makes sense.
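The topic-based delivery described above can be sketched with a tiny in-memory pub/sub: each connection-holding server subscribes to a topic per product its users follow, and a published price change reaches only the servers subscribed to that topic. The `PubSub` class and the `product:<id>` topic naming are assumptions for illustration, not a real broker API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class PubSub:
    """Minimal in-memory topic router (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver to every subscriber of the topic; return how many received it."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


bus = PubSub()
server_a_outbox: List[dict] = []  # messages server A would push over SSE
server_b_outbox: List[dict] = []

bus.subscribe("product:sku-1", server_a_outbox.append)
bus.subscribe("product:sku-1", server_b_outbox.append)
bus.subscribe("product:sku-2", server_b_outbox.append)

delivered = bus.publish("product:sku-1", {"product_id": "sku-1", "price": 49.0})
```

Each server would then forward what lands in its outbox to its connected clients over their open SSE streams.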
>> Yeah. Now, after you're done with all of this and you do the proper deep dive to scale it and so on, and here we're kind of doing both together, the scaling and the functional requirements, but for example this latency thing here, this is super optimized in terms of latency: we're pushing the changes to queues and those will work very fast, right? >> Now, after you do all of that, you can talk about failure scenarios, and here you're tapping into the territory of getting a guaranteed yes for system design. So after you do proper coverage of functional and non-functional requirements, if you talk to me about failure scenarios, edge cases, observability, and supportability, that will guarantee a pass for you. Right? So, some of the things that could fail here: maybe this will blow up. Right? You talked about other things that are all valid, but just pick one or two that you think are valid >> and common to fail, and then say how you would solve them. For example, for this you can say, I'm already going to rely on OCC at the back end, so even if that fails OCC will catch those, and I can have an active-active or a primary and a secondary for that pub/sub. >> Okay. >> Right. Edge cases: I think you've talked about some edge cases for the price changes. Maybe a popular product will change and a lot of people are subscribed to it, or maybe a user is subscribed to a lot of products that change very frequently. How many notifications do I need to send them? You're not expected to provide a full-fledged solution for those edge cases and failure scenarios, but at least mentioning them and mentioning some ideas signals to the interviewer that you think of various scenarios and edge cases while you're doing design.
Now, after all of that, if you can catch something important for the non-functional requirements, such as latency, and say, hey, to be able to create P50, P90, and P99 for that, I'm going to capture the time a product's price changed and the time I actually sent the notification, through the server or something else, >> and that is something that has to be the lowest possible.
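As a small worked example of that measurement, the sketch below turns pairs of (price changed at, notification sent at) timestamps into P50, P90, and P99 using the standard library. The function name and the synthetic sample data are assumptions for illustration.

```python
import statistics
from typing import Dict, List, Tuple


def latency_percentiles(events_ms: List[Tuple[int, int]]) -> Dict[str, float]:
    """events_ms holds (price_changed_at, notification_sent_at) pairs in milliseconds."""
    samples = [sent - changed for changed, sent in events_ms]
    # n=100 yields 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}


# Synthetic data: 101 notifications with end-to-end latencies of 1..101 ms.
events = [(1_000, 1_000 + latency) for latency in range(1, 102)]
percentiles = latency_percentiles(events)
```

Tracking the tail (P99) alongside the median makes the popular-item case visible, since a large fan-out mostly hurts the slowest deliveries.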
>> Yes.
>> Right. So again just for you to mention that and come forward and talk about it.
It will be a bonus for you.
>> Okay.
>> Um those are just some very specific comments about the solution itself. Does it make sense? Any comments? Any questions?
>> No, this is amazing. Thank you so much.
>> Okay. Beautiful. Now, in terms of like some general coaching, and by the way, I'll type that for you.
>> I think you did really well. You're very logical, skilled, and experienced. I really like what you did. We talked about this: when it comes to the main crux of the problem, make sure you spend quality time to detail the solution and close the circle.
>> Okay? Usually things related to concurrency, or the server pushing updates to the client, are super important not to be hand-wavy about, so dive deep on them. We also talked about finding an opportunity for a trade-off discussion; that is also very much needed and expected during the deep dive. >> The other one, we talked about failure scenarios and edge cases. But the other one is that there are some patterns of practice today: the server pushing updates to the client; Elasticsearch and why you would use it; distributed lock versus optimistic concurrency control or row locking at the database; as well as WebSocket versus SSE versus polling. Those are very traditional, basic patterns. Please master them. I'll send you some links. >> Yeah, for >> But yeah, please master those. They will just be good tools for you to use during system design to solve some problems. >> Thank you, S. >> This will be a pass for me, maybe with light confidence or >> Okay. >> But if you do those things I think it will get much stronger. I think you are there. You really have what it takes.
>> Okay. Thank you so much. That is this was this is one of my toughest ones.
>> Okay.
>> Thank you. So I have one technical question for you.
>> Yeah. Ask me ask me all the questions that you like.
>> What database would you would you go for SQL or NoSQL here? I was like >> I would go for SQL >> for everything even the cart and the sub. What about subscriptions?
>> Yeah. I mean, for all of those, this is not really a massive database that's going to scale or that needs to be flexible. Everything is predetermined and there are good relationships between things, so relational will be good. And we're not talking about trillions or billions of transactions. >> Right. At what quantities do you start really pulling yourself away from SQL? Like hundreds of millions or billions? >> Yeah, so this is actually, in my mind, a common misunderstanding.
It is really not just the quantity, right? The quantity is one factor, but it's really not the primary factor because you can shard your database, >> right?
>> Like the right quantity versus right, right?
>> Right. I think it's the nature of how you would use it. Are there good relationships between the entities? Is the schema predetermined and predefined? Does it have to be flexible? Those are more of the deciding factors to me.
>> Okay. Thank you so much.
>> Yeah. And maybe another one: there are some databases that have a very specific nature and purpose, or that are very good at certain patterns. For example, if you have a massive number of writes and streams of writes and you just need to keep them somewhere, maybe Cassandra will be good there. If you have a need for analytics, maybe use OLAP. And maybe that will be OLAP on top of SQL or OLAP on top of, you know, NoSQL. >> But in general, the SQL versus NoSQL thing is overrated, to be honest with you. Okay?
>> And I would not just look at the dimension of number of records. I mean, NoSQL scales horizontally and vertically better by nature, by default, out of the box, so that is definitely a plus for it. It's really more: do you have enough good relationships between the tables? If yes, then maybe SQL is better. Do you need the schema to be relaxed and extensible and so on? Then NoSQL is better.
>> Okay. Definitely. Okay. That could really clear my head that this was definitely a SQL solution.
Cool. Yeah. What other comments, questions you have?
>> Um, >> when is your interview now?
>> It's on Tuesday.
>> Sounds good.
>> Yeah, I don't know. I just have to read your feedback carefully. You've said a lot of good stuff already.
I'm glad. I hope that you find this helpful, and let me know if you need me to help you in any way, shape, or form.
>> How was my pacing?
>> Your pacing was good. You can slow down a little bit.
>> Slow down. Okay.
>> Yeah, you can slow down a little bit.
Slow down again when it comes to the high level and the deep dive in particular.
Yeah. At the high level and in the deep dive, take your time, slow down, and provide quality content, >> not necessarily quantity. You know, a lot of people start to explain things that they shouldn't explain. >> Okay. >> But here, for example, how the solution for concurrency should work, you should elaborate more on that. >> Okay. >> Sweet. Okay, thank you so much, S. >> Was the in-memory solution not great for you?
>> The in-memory solution it was a good solution. It was a very good solution.
>> But anytime we introduce another solution, it's probably worth justifying it, and I wanted to use that as a point of exercise to discuss trade-offs. Why would we use that versus, you know, concurrency control in the database itself? And if we use that, is it going to be enough on its own, or do we still need to do optimistic concurrency control? That is what I wanted to press on. Okay.
Absolutely. Sweet. Thank you so much.