An API gateway is a centralized server that acts as a single entry point for all API traffic, handling routing, authentication, rate limiting, request/response transformation, caching, load balancing, SSL termination, and observability, thereby simplifying backend service management by consolidating cross-cutting concerns that would otherwise need to be implemented in each individual service.
Deep Dive
API GATEWAYS Deep Dive (System Design for Beginners – Episode 50)
Today we learn about API gateways. I have a bunch of diagrams that we can go through, but here's the simplest way to understand what an API gateway does. Imagine you're building an app like Uber and you've got a backend with lots of services: one for users, one for trips, one for payments, one for drivers, one for notifications, maybe 20 different services in total. Now your mobile app wants to make a request to the server. Without a gateway, your mobile app would need to know the address of every single one of those services. It would need its own login code and its own retry logic. It would need to make multiple round trips to assemble just one screen. And it would need its own way to handle one service being down. On top of that, every single service would need to do its own authentication, its own logging, and its own rate limiting, so the same code would be written 20 different times, once in each service. With an API gateway, all of that becomes simpler. The mobile app talks to just one address, which is the API gateway. The gateway itself handles authentication, rate limiting, logging, and routing, and it has to do it only once for all the services in our software. Now each of the backend services just has to focus on its actual job. So that's what an API gateway is.
It's just a single front door for your entire backend. And today we're going to walk through every job it does so that when you see an API gateway in a system design diagram, you'll know exactly what's happening inside that box. At a fundamental level, an API gateway is a server that sits between your clients and your backend services and acts as a single entry point for all API traffic. So every request from your web app, your mobile app, or a third-party integration comes in through the API gateway first. The gateway then decides what to do with it: authenticate it, rate limit it, route it to the right service, transform it, log it, and then send a response back. Now let's look at the plan for this video. We'll go through each of the jobs that an API gateway does one by one: routing, authentication and authorization, rate limiting and throttling, request and response transformation, caching, load balancing, SSL termination and security, observability, and resilience patterns like retries and circuit breakers. So there are about eight of these jobs that we'll go through, as you can see on the diagram. Then we'll do a quick tour of the popular tools that you'll see out there in the wild. And finally, we'll talk about when you actually need a gateway and when you don't. By the end of this video, you'll have a complete mental model of what API gateways do. So let me walk you through every job. The first job is routing. Routing is the part where the gateway decides which backend service handles each request. The most common way is by the URL path. Anything starting with /users goes to the user service. Anything starting with /orders goes to the order service. Anything starting with /payments goes to the payment service, and so on. So the gateway looks at the path of each incoming request, matches it against your rules, and forwards it accordingly. Routing also handles API versioning. 
You might have two versions of your user service running at the same time, version one and version two. The gateway can route requests for /v1/users to the old service and /v2/users to the new service. This is super useful when you're rolling out a new version and want to keep the old version running for clients that haven't upgraded to the new API yet. Now, all of this routing logic lives in one place.
And the best part is your services don't need to know about each other. Adding a new service means just adding a new routing rule, and moving a service means just updating one rule. Nothing else has to change. So an API gateway makes things really straightforward for us.
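The path-based routing and versioning described above can be sketched in a few lines. This is only a minimal illustration, not how any particular gateway product implements it: the route table, the internal hostnames, and the function name `resolve_upstream` are all hypothetical, and longest-prefix matching is one reasonable assumption about how overlapping rules would be resolved.

```python
# Hypothetical routing table: path prefix -> internal service address.
# The hostnames below are made up for illustration.
ROUTES = {
    "/v1/users": "http://users-v1.internal:8080",
    "/v2/users": "http://users-v2.internal:8080",
    "/orders": "http://orders.internal:8080",
    "/payments": "http://payments.internal:8080",
}


def resolve_upstream(path, routes=ROUTES):
    """Return the upstream for the longest matching path prefix, or None."""
    # Try longer prefixes first so the most specific rule wins.
    for prefix in sorted(routes, key=len, reverse=True):
        # Match the prefix exactly or at a path-segment boundary,
        # so "/ordersx" does not accidentally match "/orders".
        if path == prefix or path.startswith(prefix + "/"):
            return routes[prefix]
    return None  # No rule matched; a gateway would typically answer 404.
```

Adding a new service in this sketch means adding one entry to `ROUTES`, which mirrors the point that routing changes live in a single place.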
The second job is authentication and authorization. These are two related but different things. And the gateway handles both. Authentication checks who is making the request. Authorization checks what they're allowed to do.
Without a gateway, every backend service would need its own authentication code.
It would be the same logic written 20 different times. If there's a security bug, you have to fix it in 20 different places, and if you change how authentication works, you have to update 20 services. But with a gateway, authentication happens only once, and authorization can work the same way. The gateway can check things like: is this user allowed to call this endpoint, or is this API key allowed to access this resource? If not, it returns a 403 Forbidden error. The best part is that the backend never sees the rejected request. All of this rejection happens at the API gateway level, so your backend services don't have to deal with raw tokens or permission rules anymore.
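Here is a rough sketch of that two-step check, assuming API keys and a simple scope model. The key table, the scope names, and the function `check_request` are hypothetical, and the 401 response for an unknown key is an addition that follows standard HTTP semantics rather than something stated in the video.

```python
# Hypothetical API key store: key -> who the caller is and what they may do.
API_KEYS = {
    "key-mobile-app": {"client": "mobile-app", "scopes": {"users:read", "orders:read"}},
    "key-partner": {"client": "partner", "scopes": {"orders:read"}},
}

# Hypothetical permission rules: (method, path) -> scope required to call it.
REQUIRED_SCOPE = {
    ("GET", "/users"): "users:read",
    ("GET", "/orders"): "orders:read",
    ("POST", "/payments"): "payments:write",
}


def check_request(method, path, api_key):
    """Return (status, client): 200 to forward, 401 or 403 to reject."""
    # Authentication: who is making the request?
    identity = API_KEYS.get(api_key)
    if identity is None:
        return 401, None  # Unknown credentials (assumed behavior).

    # Authorization: is this caller allowed to call this endpoint?
    required = REQUIRED_SCOPE.get((method, path))
    if required is None or required not in identity["scopes"]:
        return 403, None  # Deny by default; the backend never sees this request.

    return 200, identity["client"]
```

Only requests that come back with 200 would be forwarded, so the backend services in this sketch never handle raw keys or permission rules.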
Now the third job is rate limiting and throttling. By the way, you might know that this video is part of an ongoing playlist called System Design for Beginners, which has more than 50 videos, and in that playlist we have done a video on rate limiting and throttling as well. So I won't go very deep into rate limiting and throttling concepts. You can watch those videos, and there's also a playlist on intermediate system design on this channel. If you binge watch both of those playlists, you'll know system design better than 99% of the people on this planet. Rate limiting protects your backend from too many requests by rejecting them, and throttling is the idea of slowing down requests instead of rejecting them outright. The gateway is the perfect place for both. Every request flows through it, so it can keep counters for each client. Requests that are over the limit get rejected or throttled at the gateway before they ever touch your backend services, so your services only see traffic that's already been filtered. Now if you want to learn the algorithms behind how rate limiting works, like the token bucket algorithm, the sliding window, and all of that, go check out the dedicated rate limiting video on this channel. Now before we continue, I want to take a few seconds to go a bit off topic and tell you about the Algorok cohort, which is a live cohort that I run with usually 10 to 15 engineers, because I'm trying to solve a problem, which is that there are no cohorts for senior engineers: people who already know the basics of architecture and system design and now want to build the next generation of production software in the post-AI world. It's a 12-week program, and we cover topics like advanced system design, AI-native system design, architecture for the post-AI world, and tips and tricks for using AI coding tools, but from a senior engineer's perspective. 
So I'm planning the next cohort, which is actually going to be my third cohort, around the mid or end of May 26. I wanted to go through the algor.io website which is algorq.io, and I want you to check out the topics that we'll be covering in this cohort. If this interests you, you can fill out the form that opens up when you hit the enroll button. Once you fill out the form, I'll check to see if you're a great fit and we'll set up a call with you. Now, it's $2,500 for the entire live cohort, which I teach personally, and it's a great experience. Once you fill out the form, you can actually talk to any of the people who have attended the cohort previously, and then decide if you want to go ahead or not.
Now, that's the live cohort, but if you just wanted access to pre-recorded cohort videos, that'll be about $800.
All right, now back to the video. The fourth job is transforming requests and responses on their way through.
Sometimes what a client sends isn't what the backend expects, and sometimes what the backend returns isn't what the client needs. The gateway can handle both. On the request side, it can transform JSON into the format a service expects, for example by renaming fields, injecting headers, or translating protocols. Say your mobile app sends REST over HTTP/1.1 and your internal service uses gRPC. The gateway translates between them, and the client never knows. On the response side, the API gateway can filter out internal fields that should not reach the client and combine responses from multiple services into one payload, shaping the data for specific client types.
The fifth job is caching. A lot of API responses don't change very often, for example the list of product categories, the current weather, or a user's profile picture URL. If the same data is being requested over and over again, why hit the backend service every single time? The gateway can just cache the responses. When a request comes in, the gateway first checks: do I already have a cached response for this exact request? If yes, it returns the cached version immediately. If no, it forwards the request to the backend, gets the response, saves it in the cache, and returns it to the client. We have seen multiple videos on caching in this playlist, and there was a video specifically about caching at different places in our infrastructure that covered API caching in more detail. So make sure you check out those videos.
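The check-then-forward caching flow above can be sketched like this. It is a simplified illustration under a few assumptions that are not from the video: entries expire after a fixed time-to-live, the cache key is just the method and URL, and the class name `ResponseCache` and its `fetch` callback are hypothetical stand-ins for the real call to the backend.

```python
import time


class ResponseCache:
    """In-memory response cache keyed by (method, url) with a fixed TTL."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # Injectable clock, which makes expiry testable.
        self._entries = {}      # (method, url) -> (expires_at, response)

    def get_or_fetch(self, method, url, fetch):
        """Return a cached response if still fresh, else call fetch() and store it."""
        key = (method, url)
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]     # Cache hit: the backend is not contacted.

        response = fetch()      # Cache miss or expired: forward to the backend.
        self._entries[key] = (now + self.ttl, response)
        return response
```

A real gateway would usually cache only safe, repeatable requests such as GET and would bound the cache size; both are left out here to keep the sketch short.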
The sixth job is load balancing. In production, you don't usually run just one copy of each service. You run multiple copies in parallel so that if one crashes, the others can take over, and this is how you get reliability and scale. But then there's a question: when a request comes in for the user service, which instance should handle it? The gateway makes that decision for us. It keeps a list of all the healthy instances of each service, and when a request arrives, it picks one to forward to. We've already covered load balancing in a dedicated video in this playlist, along with the algorithms used to make these routing decisions. So you can check it out.
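As a small sketch of "keep a list of healthy instances and pick one", here is a round-robin picker. Round robin is just one of the possible algorithms and is chosen here as an assumption for illustration; the class name `RoundRobinBalancer` and its methods are hypothetical, and real health checking is reduced to explicit `mark_down` and `mark_up` calls.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through the healthy instances of one service."""

    def __init__(self, instances):
        self.instances = list(instances)    # Fixed order of known instances.
        self.healthy = set(self.instances)  # Everything starts out healthy.
        self._counter = itertools.count()   # Monotonic request counter.

    def mark_down(self, instance):
        """Stop sending traffic to an instance that failed a health check."""
        self.healthy.discard(instance)

    def mark_up(self, instance):
        """Resume sending traffic to a known instance that has recovered."""
        if instance in self.instances:
            self.healthy.add(instance)

    def pick(self):
        """Return the next healthy instance, or raise if none are available."""
        candidates = [i for i in self.instances if i in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy instances")
        return candidates[next(self._counter) % len(candidates)]
```

When an instance is marked down, later calls to `pick()` simply skip it, which is the "others can take over" behavior described above.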
The seventh job is SSL termination and basic security. Your backend services don't need to handle encryption individually. The gateway handles the TLS handshake at the edge, decrypts the traffic, and forwards plain HTTP internally. Your services communicate over the private network where you control trust. This offloads CPU cost from every backend service and centralizes your certificate management in one place.
The eighth job is observability: logging, metrics, and tracing. Since every request passes through the gateway, it's the best place to collect data about your system's behavior. The gateway logs every request: who called what, when it was called, how long it took, and what status code came back. It also produces metrics such as request rates, error rates, and latency percentiles that feed your dashboards.
It also injects trace IDs that propagate through your entire service chain, so you can reconstruct exactly what happened during a slow request. Without a gateway, you'd have to add this instrumentation to each service individually.
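To make the logging and trace ID injection concrete, here is a minimal sketch of a gateway wrapper around a single request. Everything named here is an assumption for illustration: the header name `X-Trace-Id`, the dictionary shape of the request, the `forward` callback standing in for the backend call, and the plain list used as a log sink.

```python
import time
import uuid


def handle_with_observability(request, forward, log):
    """Inject a trace ID, forward the request, and record one log entry."""
    headers = dict(request.get("headers", {}))
    # Reuse an incoming trace ID if present, otherwise create a new one.
    trace_id = headers.setdefault("X-Trace-Id", uuid.uuid4().hex)

    started = time.perf_counter()
    status, body = forward({**request, "headers": headers})
    duration_ms = (time.perf_counter() - started) * 1000.0

    # Who called what, how long it took, and what status code came back.
    log.append({
        "trace_id": trace_id,
        "client": request.get("client"),
        "method": request["method"],
        "path": request["path"],
        "status": status,
        "duration_ms": duration_ms,
    })
    return status, body
```

Because the backend receives the same trace ID that appears in the gateway's log entry, log lines from different services can be joined on it later.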
And with a gateway, you get system-wide observability from just one place. Now, we covered quite a bit about API gateways: how they perform a lot of jobs for us in our infrastructure, and how we can combine load balancers, rate limiters, authentication, and all of that at the API gateway level itself. By now, you have a good intuitive understanding of what API gateways are, how they work, and how they act as a single interface for all your backend services. I hope you appreciate the amount of effort it takes to build these diagrams and make these videos. So make sure you share this video with someone and comment on it if you have any questions. Make sure you have subscribed to this channel and liked this video as well. And I'll see you in the next video.