System design mastery comes from understanding six core concepts: (1) Access patterns determine database selection—key-value databases suit simple lookups like URL shorteners, while relational databases handle strongly related data like social media apps; (2) Reads and writes scale differently—read-heavy systems use caching, read replicas, and precomputation, while write-heavy systems use async queues and partitioning; (3) Caching improves performance but introduces complexity with stale data and cache stampede risks; (4) Queues enable asynchronous processing, absorbing traffic spikes and allowing retry mechanisms; (5) Partitioning (sharding) scales writes by splitting databases based on a shard key, requiring careful key selection to ensure even distribution; (6) Consistency models range from strong consistency (bank balances) to eventual consistency (social media feeds), with the choice depending on whether temporary incorrect data is acceptable.
System Design was hard until I learned these 6 concepts
I used to think that system design was insanely hard compared to things like LeetCode or the basics of learning how to code. But I've spent the last two years of my career basically dedicating myself to system design, and now I feel like I have a good handle on how things work. So today I want to walk you through six core concepts which help me design any system, whether it's for a personal project, on the job, or during an interview. Now, to be clear, I'm not going to be starting from the absolute basics.
I'm assuming you know things like what caching is, what horizontal scaling is, and what a load balancer is. If you don't know this stuff, pause the video. I have a free 42-page guide in the description which you can download and read, which will walk you through that. Then come back here and I'll teach you the patterns. So let's get right into it.
The first thing I want to cover is how access patterns define your database. I see way too many people asking questions like "Should I use SQL or NoSQL for this project?" The actual question you should ask yourself is: what are the access patterns that you're expecting? How will the data be read and written? Once you've defined that, the answer will naturally emerge. Let's look at a simple example. One of the most common system design questions you'll see in interviews, and a great case study if you want to practice, is "design a URL shortener." In the case of a URL shortener like Bitly or TinyURL, the primary access pattern is to find some long URL given some short URL. So when somebody wants to go to a website using that shortened URL, well, you just look up the long URL using the short URL.
That's pretty much the only access pattern. There are no joins. There's no complicated data analytics you have to do or anything like that. In this case, the best fit for your database is a key-value database, something like DynamoDB or Redis. Let's look at another case.
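Before moving on, the URL shortener's single access pattern can be sketched in a few lines. This is a minimal illustration, not how Bitly or TinyURL actually work: the `UrlShortener` class, the in-memory dict standing in for a key-value store, and the 7-character hash-based code are all assumptions made for the example.

```python
import hashlib


class UrlShortener:
    """In-memory stand-in for a key-value store (hypothetical example)."""

    def __init__(self):
        self._store = {}  # short code -> long URL

    def shorten(self, long_url: str) -> str:
        # Assumed scheme: first 7 hex characters of a SHA-256 digest.
        # A real system would also need to handle collisions.
        code = hashlib.sha256(long_url.encode("utf-8")).hexdigest()[:7]
        self._store[code] = long_url
        return code

    def resolve(self, code: str):
        # The only access pattern: a single get-by-key, no joins.
        return self._store.get(code)
```

Every read is one lookup by key, which is exactly the shape of work a key-value database is built for.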
Say you're designing a social media app like Instagram. In this case, data is strongly related. Users have posts, users have followers, followers have posts, yada yada yada. In this case, a relational database is going to be a much better fit. The data is fundamentally related. Take that structure and pick your database based off of it. It sounds obvious, but people try to shoehorn their data into the wrong database all the time. If you have to do that, do it into a SQL database like Postgres, but please don't do it.
Now, the second thing is that reads and writes scale differently, and there are easy ways to scale both. The first thing you want to do is identify whether your system is going to be read-heavy or write-heavy.
Read-heavy systems are things like URL shorteners, where the most common access pattern is a lookup; a news feed, where you have millions of people looking at the same news story but that story only gets written once; or a product page, which never changes and people just come and look at images of your product every day. On the write-heavy side, we have things like analytics, where a bunch of stuff is getting populated every day into a dashboard that your CEO looks at once a week, or a chat app. Of course, people are texting back and forth.
There's pretty much one write for every read. And of course, logging. Designing a distributed logging system is a great case study if you want to learn a little bit about this. Logging is the textbook example of where you're just writing a ton of stuff and people probably aren't going to read most of it unless there's a huge issue.
Scaling reads is not difficult if you know what you're doing. There are four major things in your toolkit that I want you to consider anytime you're looking at a read-heavy system. The first of which is obviously caching. As I mentioned before, I'm expecting you to know what caching is, but caching can be applied differently in different types of systems. You can cache inside the browser or on a user's device. You could cache inside the server in memory. You could cache between the server and the database with something like Redis. All of these are viable options; depending on where most of your work is happening, you select where to place your cache. Related, but not exactly the same, is using a CDN. This is pretty much a given: for most systems where you need to serve static content, use a CDN.
Another thing you have in your toolkit that you should almost always consider is read replicas. Very self-explanatory.
Instead of just having one database which serves all your traffic, you can have one primary database which serves reads and writes, and then duplicate that database multiple times into, let's just say, five different databases. Anyone can read from any database at any given time, but you only write to one. This takes a lot of the read traffic off that one database. And of course, the last thing you can do is precompute stuff and then serve the precomputed information.
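As a rough sketch of what precomputing can look like, the class below keeps a running sum and count so that reading an average never rescans the records. The `RunningAverage` name and its methods are hypothetical, chosen only for illustration.

```python
class RunningAverage:
    """Keeps a sum and a count so the average is O(1) to read."""

    def __init__(self):
        self._total = 0.0
        self._count = 0

    def add(self, value: float) -> None:
        # Pay a small, constant cost on every write...
        self._total += value
        self._count += 1

    def average(self) -> float:
        # ...so that reads never have to scan all the records.
        if self._count == 0:
            raise ValueError("no records yet")
        return self._total / self._count
```

The trade-off is that every write now does a little extra work, which is usually a good deal when reads far outnumber writes.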
Here's a simple example. Let's say you have a database of student GPAs with 1 million records, and the most common query is to get the average GPA. Well, you could sum the million records and divide by a million every time somebody makes a query. Or you could just calculate the average once, and when somebody wants to get that information, it's a single query. Classic case where you might want to precompute.
Now, for writes, we have a different set of strategies for scaling. My personal favorite is making things async and adding a queue. Instead of trying to serve a million write requests that come in at the exact same time, we basically just put them in a line like you would at a restaurant. Each request has to wait for the requests in front of it to be completed before it can be served. This reduces the load on your database, but it also makes your users wait for the answers. So, you can't use it in all cases.
If you can't use it, then you should consider partitioning. Partitioning is a strategy to scale out database writes. It's quite simple. Let's say you're building a system that stores information about students at a school. To begin with, you store all students and all their information in one database. But now you have 100 million students for whatever reason. This is a massive school, and your database can't handle all the writes that are coming in when an exam happens, because all the grades are coming in at once. What you can do is split your database into four different databases. And let's just say we split it by the first letter of your last name. And just to go completely wild here, let's say we have one database for every single letter. Now each database is handling only its own share of the writes. You've basically just gotten 26 times more write capacity. Of course, partitioning is not free, ever. It's a nightmare, because now you have to route requests to the correct partition. Scaling becomes interesting. Logging now has to be aggregated.
All this fun stuff. But it is a great strategy to scale writes.
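The routing step mentioned above can be sketched as follows. This is a toy model: the shards are plain dicts, `ShardedGrades` and `shard_for_last_name` are hypothetical names, and the code assumes last names start with an ASCII letter.

```python
import string

NUM_SHARDS = 26  # one shard per letter, as in the example above


def shard_for_last_name(last_name: str) -> int:
    """Map a last name to a shard index using its first letter."""
    first = last_name.strip()[:1].upper()
    if len(first) != 1 or first not in string.ascii_uppercase:
        # Assumption for this sketch: only ASCII initials are supported.
        raise ValueError(f"cannot route last name {last_name!r}")
    return ord(first) - ord("A")


class ShardedGrades:
    """Each dict stands in for one database partition."""

    def __init__(self):
        self.shards = [dict() for _ in range(NUM_SHARDS)]

    def write_grade(self, last_name: str, student_id: str, grade: float) -> None:
        # Every write touches exactly one shard.
        self.shards[shard_for_last_name(last_name)][student_id] = grade

    def read_grade(self, last_name: str, student_id: str):
        # Reads must be routed with the same key, or the data is not found.
        return self.shards[shard_for_last_name(last_name)].get(student_id)
```

Note that the caller has to supply the shard key on every request, which is part of the routing cost described above.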
Now, quick plug. If you're learning system design or already know it and just want to practice every single day, I've been building out the Daily Dev, which is kind of like my Duolingo for system design. You get a system design question every day, a couple other ways to practice, mini lessons covering the basics, and of course, a question history showing you where you're doing well and what you need to improve on.
You can check it out on the web and the app store and I'll link it in the description down below. Back to the video. Now, I know we talked a little bit about caching, but I want to bring it back up because it's something that comes up so often in interviews and when people are designing systems. Caching is just remembering expensive answers.
Baseline refresher. Without a cache, requests directly hit whatever service you're trying to protect, which eventually becomes the bottleneck. So, you add a cache to precompute or store information a little bit closer to the end user than the current database. Hooray, it works. It's super simple. It solves all our problems, right? Not really. In actual systems, I'm very hesitant to add caching. Yes, it helps with performance, but it becomes a huge pain in terms of all the bugs you now have to deal with.
You get stale data. It's difficult to know when to kick information out of the cache. And of course, if a key in the cache expires, a bunch of requests can hit your underlying system at once and take it down, which is known as a cache stampede. It sucks. So, yes, caching is powerful, and if you have a lot of heavy requests that are coming to your database, then by all means, go ahead and add a cache. But just keep in mind that it's going to make your life a lot harder. Caching is something I tend to reach for a lot more in interviews than I do in real life. It sounds great, but when you're actually going to implement it yourself, it kind of sucks.
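To make the expiry problem concrete, here is a minimal sketch of a cache with a time-to-live and a per-key lock, so that when a key expires only one caller reloads it while the others wait. This is one common mitigation, not the only one; `TTLCache` and the `loader` callback are hypothetical names for this example.

```python
import threading
import time


class TTLCache:
    """Cache with expiry; a per-key lock lets only one caller reload a key."""

    def __init__(self, loader, ttl_seconds: float):
        self._loader = loader            # function that hits the slow backend
        self._ttl = ttl_seconds
        self._data = {}                  # key -> (value, expires_at)
        self._locks = {}                 # key -> lock guarding reloads
        self._guard = threading.Lock()   # protects the _locks dict

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def _fresh(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry
        return None

    def get(self, key):
        entry = self._fresh(key)
        if entry is not None:
            return entry[0]              # fast path: cache hit
        with self._lock_for(key):
            entry = self._fresh(key)     # another caller may have reloaded it
            if entry is not None:
                return entry[0]
            value = self._loader(key)    # only one caller reaches the backend
            self._data[key] = (value, time.monotonic() + self._ttl)
            return value
```

Even with this in place, stale data remains possible for up to `ttl_seconds`, so the sketch addresses only the stampede, not the other bugs mentioned above.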
Next, let's talk about queues. If you've seen my channel before, you know that I'm a huge fan of queues. I have an entire video on them. Let's look at this simple example system where a queue might help us.
In this case, let's say we're uploading a post to an app like Instagram. You upload the image, it gets processed, maybe you check it for harmful content, then you create the thumbnail for the image, so a lower resolution thing that you might see on someone's page when you're scrolling. You notify the user that it's complete or their followers.
You index that post in your database and then you respond back to the user. Now the user has to wait that entire time, and all those steps could take, I mean, 2 minutes. Imagine making an API request and waiting 2 minutes. Doesn't really make sense. Okay. So now let's introduce a queue and make it asynchronous. When a user goes to upload a post, we immediately respond to them with an HTTP 202 code, which means Accepted.
We're going to start working on your request, but we don't really have any information for you except for a tracking link which you can use to figure out the status of it. Then we take that information and we put it in a queue. So, hey, there's a bunch of posts that we need to process and yours is now in the queue. We have workers, which can be servers essentially, that process requests from the queue and handle all of that additional work. This could take 2 minutes in the best case, or it could take 30 minutes if there's a bunch of stuff in the queue, or something goes wrong, or a dependency fails and we need to retry, whatever. But now, the user is not sitting there waiting for that request to complete before they can continue with some other work. Much better system.
These are some of the main benefits worth noting when you want to add a queue to your system or you're thinking about whether it's the right solution. First of all, you can absorb spikes in traffic. Without a queue, if you get a million requests at the same time, your server is going to try to respond to a million requests at the same time and probably break or go down. With a queue, well, you put those million requests in the queue, and say your server can only process 100 per second.
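Pausing here for a sketch of the flow described so far: the request handler enqueues the work and returns 202 with a tracking ID right away, and a worker drains the queue at its own pace. Everything named here (`handle_upload`, `process_post`, `worker`, the in-memory `status` dict) is hypothetical; a real system would use a durable queue and a real web framework.

```python
import queue
import threading
import uuid

jobs = queue.Queue()   # stand-in for a durable message queue
status = {}            # tracking id -> "queued" | "done"


def handle_upload(image_bytes: bytes):
    """Accept the upload and return immediately with a tracking id."""
    tracking_id = str(uuid.uuid4())
    status[tracking_id] = "queued"
    jobs.put((tracking_id, image_bytes))
    return 202, {"tracking_id": tracking_id}   # 202 Accepted


def process_post(image_bytes: bytes) -> None:
    """Placeholder for moderation, thumbnails, notifications, indexing."""


def worker() -> None:
    """Pull jobs one at a time; a None item tells the worker to stop."""
    while True:
        item = jobs.get()
        if item is None:
            jobs.task_done()
            break
        tracking_id, image_bytes = item
        try:
            process_post(image_bytes)
            status[tracking_id] = "done"
        finally:
            jobs.task_done()
```

To try it, start `worker` in a `threading.Thread`, call `handle_upload`, and look up the returned tracking ID in `status` once the worker has caught up.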
It pops 100 off the top of the queue, processes them, pops another 100, processes them, no issues. Another thing you get is easy retry on failure. Most queue systems like SQS don't actually remove the items from the queue when the server starts processing them. They just make them invisible to other workers. Then once your server actually completes them and notifies the queue, they're finally removed. But if they fail for some reason, those messages become visible again and another worker can pick them up and retry automatically. Queues also give you the ability to decouple services. Let's say you're working in a microservice-based system and you have three different microservices which all depend on each other. Well, if service A has to directly call service B and that call fails, this is very risky for you.
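Going back to the retry behavior for a moment, the "invisible until deleted" idea can be modeled in a few lines. This is a toy in-memory model of a visibility timeout, not the SQS API; the `VisibilityQueue` class and its `now` parameter (used to keep the example deterministic) are assumptions for illustration.

```python
import time
import uuid


class VisibilityQueue:
    """Toy queue: received messages are hidden, not removed, until deleted."""

    def __init__(self, visibility_timeout: float):
        self._timeout = visibility_timeout
        self._messages = {}  # message id -> (body, visible_at)

    def send(self, body) -> str:
        msg_id = str(uuid.uuid4())
        self._messages[msg_id] = (body, 0.0)  # visible immediately
        return msg_id

    def receive(self, now=None):
        now = time.monotonic() if now is None else now
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                # Hide the message from other workers for a while.
                self._messages[msg_id] = (body, now + self._timeout)
                return msg_id, body
        return None

    def delete(self, msg_id: str) -> None:
        # Called by the worker only after processing succeeded.
        self._messages.pop(msg_id, None)
```

If a worker crashes before calling `delete`, the message simply reappears after the timeout, which is where the automatic retry comes from.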
Instead, you can just have service A put an item into the queue, and then service B can pop the item from the queue. No direct dependency. And of course, background work: like I mentioned, if you want to do thumbnail generation, processing checks, all that kind of stuff can happen in the background easily with a queue.
So, I know I mentioned partitioning earlier, but I want to talk about it again because I feel like people don't understand what it is, or they reach for it in the wrong situations. Here's a more realistic example for my students database. We can sort students by last name, and instead of storing all students in one database, each student is put into one of several databases based on their last name. One of the hardest parts about doing this correctly is picking your shard key, or what we decide to split the database based on. For example, let's say we did pick the first letter of your last name as a shard key and actually did split out into 26 databases, like I mentioned earlier in this video. In the US, 9% of people have a last name that starts with B, while only 0.02% of people have a last name that starts with X. So that B database is serving way more traffic than that X database, and we're going to run into problems there first.
So what do we do about this? Well, we could shard by something like a unique identifier that we create for each student, right? With that, we know our data is going to be evenly distributed. But now, if we want to find two brothers, they might be stuck in different shards, even though they have the same last name. Querying becomes more difficult. I don't think there's any magic when it comes to picking your shard key, but just make sure you consider all the trade-offs like this. This is not an easy decision, and it's going to be one of the most important things you do when it comes to scaling your database for writes. And if you pick the wrong shard key to begin with, rebalancing is not trivial.
You have to move data across all these different servers and it just sucks.
Trust me, you do not want to do this.
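For comparison with the last-name approach, here is a minimal sketch of sharding by a hashed unique identifier. The function names and the `student-N` ID format are hypothetical; SHA-256 is used only because it gives a stable hash across processes, unlike Python's built-in `hash` for strings.

```python
import hashlib
from collections import Counter


def shard_for_id(student_id: str, num_shards: int) -> int:
    """Map a unique id to a shard using a stable hash."""
    digest = hashlib.sha256(student_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_counts(keys, shard_fn) -> Counter:
    """Count how many keys land on each shard, to eyeball the skew."""
    return Counter(shard_fn(key) for key in keys)
```

Running `shard_counts` over a sample of real keys before committing to a shard key is a cheap way to spot a hot shard early. Note that with plain modulo hashing like this, changing `num_shards` remaps most keys, which is one reason rebalancing is painful.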
Last thing I'm going to talk about is consistency. This was one of the most confusing concepts to me when I first joined Amazon. It felt like this thing that people were always talking about and I never really understood.
Consistency is just the lies your system can tell. With an ACID-compliant database or a strongly consistent database, you get strong consistency.
Meaning, if I write, any subsequent read request is going to get the information that I immediately wrote. But in a lot of large scale systems that you interact with every day, this is just not true.
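One way to see how a read can miss a recent write is a primary database with a replica that applies changes late. The sketch below is a toy model under that assumption; `Primary`, `Replica`, and `catch_up` are hypothetical names, and real replication is far more involved.

```python
class Primary:
    """Accepts writes and records them in an ordered log."""

    def __init__(self):
        self.data = {}
        self.log = []  # writes waiting to be replicated

    def write(self, key, value) -> None:
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own copy, which may lag behind the primary."""

    def __init__(self, primary: Primary):
        self._primary = primary
        self._applied = 0  # how many log entries have been applied so far
        self.data = {}

    def read(self, key):
        return self.data.get(key)

    def catch_up(self) -> None:
        # Apply every write the replica has not seen yet.
        for key, value in self._primary.log[self._applied:]:
            self.data[key] = value
        self._applied = len(self._primary.log)
```

Between a write to the primary and the next `catch_up`, a read from the replica returns the old value. That window is what eventual consistency allows and strong consistency rules out.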
For example, if you look at an Instagram post, the like count is probably not correct. You might be seeing the like count from 10 minutes ago, 20 minutes ago, even 30 minutes ago. This happens to me all the time when I look at my own posts on Instagram. Why? Because it just doesn't make sense for Instagram to be checking the total like count from the server every single time somebody likes a post, especially on a popular person's account.
But when it comes to something like a bank balance, you definitely want strong consistency. What if I wire transferred $100 out of my account, it wasn't reflected yet, and then I wire transferred another $100 out of my account, but my total account balance was only $100 in the first place? Now I've withdrawn $200 from an account with only $100 of balance. Not good.
So, when should you actually prioritize strong consistency? This becomes pretty intuitive the more you practice with it. But basically, when you're okay with having temporarily incorrect information, you can go ahead and have looser consistency. And in fact, you probably should. Feeds, like counts, social media: eventual consistency is completely fine. For things like inventory in an e-commerce store, bank account balances, or permission updates in a user's account settings, you should definitely focus on strong consistency.
All righty. Well, that about does it for today. Hopefully, this was helpful for you guys. And again, if you want to practice system design every day, check out the Daily Dev in the description down below. Comment what you want me to cover next and follow for more.